GUC Parameters

This section introduces PG-Strom's configuration parameters.

Enables/disables a particular feature

pg_strom.enabled [type: bool / default: on]
Enables/disables entire PG-Strom features at once
pg_strom.enable_gpuscan [type: bool / default: on]
Enables/disables GpuScan
pg_strom.enable_gpuhashjoin [type: bool / default: on]
Enables/disables JOIN by GpuHashJoin
pg_strom.enable_gpugistindex [type: bool / default: on]
Enables/disables JOIN by GpuGiSTIndex
pg_strom.enable_gpujoin [type: bool / default: on]
Enables/disables entire GpuJoin features (including GpuHashJoin and GpuGiSTIndex)
pg_strom.enable_gpupreagg [type: bool / default: on]
Enables/disables GpuPreAgg
pg_strom.enable_numeric_aggfuncs [type: bool / default: on]
Enables/disables support for aggregate functions that take the numeric data type.
Note that aggregate functions on GPU map the numeric data type to a 128-bit fixed-point representation, so they raise an error when given extremely large or highly precise values. You can turn this parameter off to force these aggregate functions to run on the CPU.
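For example, if a numeric aggregate overflows on GPU, you can force it onto the CPU for the current session (a minimal sketch; the table and column names are hypothetical):
  SET pg_strom.enable_numeric_aggfuncs = off;
  SELECT sum(price) FROM sales;   -- numeric aggregation now runs on CPU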
pg_strom.enable_brin [type: bool / default: on]
Enables/disables BRIN index support on table scans
pg_strom.cpu_fallback [type: enum / default: notice]
Controls whether CPU fallback operations actually run when the GPU program returns a "CPU ReCheck Error"
notice ... Runs CPU fallback operations with a notice message
on, true ... Runs CPU fallback operations with no message output
off, false ... Disables CPU fallback operations and raises an error instead
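For example, to surface "CPU ReCheck" conditions as errors while testing a query (a session-level sketch):
  SET pg_strom.cpu_fallback = off;   -- raise an error instead of falling back to CPU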
pg_strom.regression_test_mode [type: bool / default: off]
It disables some EXPLAIN command output that depends on the software execution platform, such as GPU model names. This avoids false positives in the regression tests, so users usually do not need to touch this parameter.

Optimizer Configuration

pg_strom.gpu_setup_cost [type: real / default: 100 * DEFAULT_SEQ_PAGE_COST]
Cost value for initialization of GPU device
pg_strom.gpu_tuple_cost [type: real / default: DEFAULT_CPU_TUPLE_COST]
Cost value to send a tuple to, or receive a tuple from, the GPU.
pg_strom.gpu_operator_cost [type: real / default: DEFAULT_CPU_OPERATOR_COST / 16]
Cost value to process an expression formula on GPU. If a value larger than cpu_operator_cost is configured, PG-Strom is never chosen, regardless of table size.
pg_strom.enable_partitionwise_gpujoin [type: bool / default: on]
Enables/disables push-down of GpuJoin to the partition children.
pg_strom.enable_partitionwise_gpupreagg [type: bool / default: on]
Enables/disables push-down of GpuPreAgg to the partition children.
pg_strom.pinned_inner_buffer_threshold [type: int / default: 0]
If the INNER side of a GpuJoin is either GpuScan or another GpuJoin, and the estimated size of its result is larger than this value, the result is retained on the GPU device without being returned to the CPU, then reused as part of the INNER buffer of the subsequent GpuJoin.
If the configured value is 0, this feature is disabled.
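An illustrative postgresql.conf sketch for these optimizer parameters (the values are arbitrary examples, not tuning advice; pinned_inner_buffer_threshold is assumed here to accept a memory-size unit):
  pg_strom.gpu_setup_cost = 4000                    # discourage GPU plans for small tables
  pg_strom.gpu_operator_cost = 0.0001               # keep this below cpu_operator_cost
  pg_strom.pinned_inner_buffer_threshold = '100MB'  # keep large INNER results on the GPU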

Executor Configuration

pg_strom.max_async_tasks [type: int / default: 12]
Maximum number of asynchronous tasks PG-Strom can submit to the GPU execution queue; this is also the number of GPU Service worker threads.
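For example, in postgresql.conf (the value is illustrative; the optimal number depends on your GPU and workload):
  pg_strom.max_async_tasks = 16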

GPUDirect SQL Configuration

pg_strom.gpudirect_driver [type: text]
It shows the driver software name used by GPUDirect SQL (read-only): either cufile, nvme-strom, or vfs.
pg_strom.gpudirect_enabled [type: bool / default: on]
Enables/disables GPUDirect SQL feature.
pg_strom.gpu_direct_seq_page_cost [type: real / default: DEFAULT_SEQ_PAGE_COST / 4]
The cost of scanning a table using GPUDirect SQL, used instead of seq_page_cost when the optimizer estimates the cost of an execution plan.
pg_strom.gpudirect_threshold [type: int / default: auto]
Controls the table-size threshold to invoke the GPUDirect SQL feature.
The default is automatic configuration: a threshold calculated from the system's physical memory size and the shared_buffers setting.
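For example (illustrative values; gpudirect_threshold is assumed here to accept a memory-size unit):
  pg_strom.gpudirect_enabled = on
  pg_strom.gpudirect_threshold = '2GB'   # use GPUDirect SQL only for larger tables
You can check the detected driver at any time with: SHOW pg_strom.gpudirect_driver;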
pg_strom.manual_optimal_gpus [type: text / default: none]
It manually configures the closest GPU for the target storage volume, such as an NVME device or NFS volume.
Its format string is: {<nvmeX>|/path/to/tablespace}=gpuX[:gpuX...]. It describes the relationship between the closest GPU and an NVME device or tablespace directory path. Multiple configurations can be given, separated by commas.
Example: pg_strom.manual_optimal_gpus = 'nvme1=gpu0,nvme2=gpu1,/mnt/nfsroot=gpu0'
  • <gpuX> means a GPU with device identifier X.
  • <nvmeX> means a local NVME-SSD or a remote NVME-oF device.
  • /path/to/tablespace means full-path of the tablespace directory.

Automatic configuration is often sufficient for local NVME-SSD drives; however, you should manually configure the closest GPU for NVME-oF or NFS-over-RDMA volumes.

Arrow_Fdw Configuration

arrow_fdw.enabled [type: bool / default: on]
It turns Arrow_Fdw on/off by adjusting its estimated cost value. Note that only Foreign Scan (Arrow_Fdw) can scan Arrow files; GpuScan is not capable of running on them.
arrow_fdw.stats_hint_enabled [type: bool / default: on]
When an Arrow file has min/max statistics, this parameter controls whether unnecessary record batches (those that cannot match the scan conditions) are skipped.
arrow_fdw.metadata_cache_size [type: int / default: 512MB]
Size of shared memory to cache metadata of Arrow files.
Once consumption of the shared memory exceeds this value, older metadata entries are released on an LRU basis.
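For example, if you scan a large number of Arrow files, you may enlarge the cache in postgresql.conf (the value is illustrative):
  arrow_fdw.metadata_cache_size = '1GB'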

GPU Cache Configuration

pg_strom.enable_gpucache [type: bool / default: on]
Controls whether search/analytic queries try to use GPU Cache.
Note that GPU Cache trigger functions continue to update the REDO Log buffer, even if this parameter is turned off.
pg_strom.gpucache_auto_preload [type: text / default: null]
It specifies the table names to be loaded onto GPU Cache just after PostgreSQL startup.
Its format is DATABASE_NAME.SCHEMA_NAME.TABLE_NAME; entries are separated by commas if multiple tables are preloaded.
Initial loading of GPU Cache usually takes a long time, so preloading avoids delaying the response time of the first search/analytic queries.
If this parameter is '*', PG-Strom tries to load all the configured tables onto GPU Cache sequentially.
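Example (the database and table names are hypothetical):
  pg_strom.gpucache_auto_preload = 'postgres.public.orders,postgres.public.lineitem'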

GPU Device Configuration

pg_strom.gpu_mempool_segment_sz [type: int / default: 1GB]
The segment size used when GPU Service allocates GPU device memory for the memory pool.
GPU device memory allocation is a relatively heavy operation, so it is recommended to use the memory pool to reuse allocated memory.
pg_strom.gpu_mempool_max_ratio [type: real / default: 50%]
It specifies the percentage of device memory that can be used for the GPU device memory pool.
This suppresses excessive GPU device memory consumption by the memory pool and ensures sufficient working memory.
pg_strom.gpu_mempool_min_ratio [type: real / default: 5%]
It specifies the percentage of GPU device memory that is preserved as memory pool segments and retained even while unused.
By maintaining a minimum memory pool, the next query can be executed quickly.
pg_strom.gpu_mempool_release_delay [type: int / default: 5000]
GPU Service does not release a memory pool segment immediately, even when it becomes empty. When the time specified by this parameter (in milliseconds) has elapsed since the segment was last used, it is released and returned to the system.
Inserting this delay reduces the frequency of GPU device memory allocation/release.
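An illustrative postgresql.conf sketch for the memory pool (arbitrary example values; the ratio parameters are assumed here to take a fraction between 0 and 1):
  pg_strom.gpu_mempool_segment_sz = '2GB'
  pg_strom.gpu_mempool_max_ratio = 0.6        # up to 60% of device memory for the pool
  pg_strom.gpu_mempool_release_delay = 10000  # keep empty segments for 10 seconds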
pg_strom.gpuserv_debug_output [type: bool / default: false]
Enables/disables GPU Service debug message output. These messages may be useful for debugging, but normally you should not change this parameter from its default value.
pg_strom.cuda_visible_devices [type: text / default: null]
Comma-separated list of GPU device numbers, if you want PostgreSQL to recognize only particular GPUs on startup.
It is equivalent to the environment variable CUDA_VISIBLE_DEVICES.
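For example, to expose only the first and third GPUs to PostgreSQL (device numbers are illustrative):
  pg_strom.cuda_visible_devices = '0,2'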