GUC Parameters

This section introduces PG-Strom's configuration parameters.

Enables/disables a particular feature

pg_strom.enabled [type: bool / default: on]
Enables/disables entire PG-Strom features at once
pg_strom.enable_gpuscan [type: bool / default: on]
Enables/disables GpuScan
pg_strom.enable_gpuhashjoin [type: bool / default: on]
Enables/disables JOIN by GpuHashJoin
pg_strom.enable_gpugistindex [type: bool / default: on]
Enables/disables JOIN by GpuGiSTIndex
pg_strom.enable_gpujoin [type: bool / default: on]
Enables/disables entire GpuJoin features (including GpuHashJoin and GpuGiSTIndex)
pg_strom.enable_gpupreagg [type: bool / default: on]
Enables/disables GpuPreAgg
pg_strom.enable_gpusort [type: bool / default: on]
Enables/disables GPU-Sort
See [here](gpusort.md) for details of GPU-Sort.
pg_strom.enable_numeric_aggfuncs [type: bool / default: on]
Enables/disables support for aggregate functions that take the numeric data type.
Note that aggregate functions on the GPU map the numeric data type onto a 128bit fixed-point representation, so they raise an error when run with extremely large or highly precise values. You can turn off this configuration to force these aggregate functions to run on the CPU.
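For example, when a query aggregates very large numeric values, a session can force those aggregates back onto the CPU (a minimal sketch; the table and column names are hypothetical):

```sql
-- Avoid 128bit fixed-point overflow errors on the GPU by
-- running numeric aggregates on the CPU for this session.
SET pg_strom.enable_numeric_aggfuncs = off;
SELECT sum(amount) FROM ledger;   -- hypothetical table/column
```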
pg_strom.enable_brin [type: bool / default: on]
Enables/disables BRIN index support on tables scan
pg_strom.cpu_fallback [type: enum / default: notice]
Controls whether CPU fallback operations actually run when the GPU program returns a "CPU ReCheck Error"; see the example after this list.
notice ... Runs CPU fallback operations with a notice message
on, true ... Runs CPU fallback operations with no message output
off, false ... Disables CPU fallback operations and raises an error instead
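For example (an illustrative session-level setting; pick the value that matches your error-handling policy):

```sql
-- Fail fast: report an error instead of falling back to the CPU
SET pg_strom.cpu_fallback = off;
-- Or run CPU fallbacks silently, without notice messages
SET pg_strom.cpu_fallback = on;
```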
pg_strom.regression_test_mode [type: bool / default: off]
It disables some EXPLAIN command output that depends on the software execution platform, such as GPU model names. This avoids "false positives" in the regression tests, so users usually do not need to touch this configuration.
pg_strom.explain_developer_mode [type: bool / default: off]
Among the various information displayed by EXPLAIN VERBOSE, this option adds information that is useful for developers. Since this information is cumbersome for general users and DB administrators, we recommend leaving it at the default value.
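A quick way to inspect the extra output is to toggle the parameter per session (illustrative; the table name is hypothetical and the extra fields depend on the PG-Strom build):

```sql
SET pg_strom.explain_developer_mode = on;
EXPLAIN VERBOSE SELECT count(*) FROM lineorder;  -- hypothetical table
RESET pg_strom.explain_developer_mode;
```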

Optimizer Configuration

pg_strom.gpu_setup_cost [type: real / default: 100 * DEFAULT_SEQ_PAGE_COST]
Cost value for initialization of GPU device
pg_strom.gpu_tuple_cost [type: real / default: DEFAULT_CPU_TUPLE_COST]
Cost value to send each tuple to, or receive each tuple from, the GPU.
pg_strom.gpu_operator_cost [type: real / default: DEFAULT_CPU_OPERATOR_COST / 16]
Cost value to process an expression formula on the GPU. If a value larger than cpu_operator_cost is configured, PG-Strom will never be chosen, regardless of table size.
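These three cost values are usually tuned together in postgresql.conf. The values below are purely illustrative, not recommendations:

```
# postgresql.conf -- illustrative values only
pg_strom.gpu_setup_cost    = 4000.0    # penalize GPU plans for small tables
pg_strom.gpu_tuple_cost    = 0.01
pg_strom.gpu_operator_cost = 0.0000625 # keep well below cpu_operator_cost
```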
pg_strom.enable_partitionwise_gpujoin [type: bool / default: on]
Controls whether GpuJoin is pushed down to the partition children.
pg_strom.enable_partitionwise_gpupreagg [type: bool / default: on]
Controls whether GpuPreAgg is pushed down to the partition children.
pg_strom.pinned_inner_buffer_threshold [type: int / default: 0]
If the INNER table of GpuJoin is either GpuScan or GpuJoin, and the estimated size of its processing result is larger than this configured value, the result is retained on the GPU device without being returned to the CPU, and then reused as a part of the INNER buffer of the subsequent GpuJoin.
If the configured value is 0, this feature is disabled.
pg_strom.pinned_inner_buffer_partition_size [type: int / default: auto]
When using Pinned Inner Buffer with GPU-Join, if the buffer size exceeds the threshold specified by this parameter, it will attempt to split the buffer into multiple pieces. This parameter is automatically set to about 70% to 80% of the GPU memory, and usually does not need to be specified by the user.
Check [here](http://buri.heterodb.in/operations/#inner-pinned-buffer-of-gpujoin) for more information.
pg_strom.extra_ereport_level [type: int / default: auto]
Specifies the error level reported by the heterodb-extra module from 0 to 2.
The initial value is taken from the environment variable HETERODB_EXTRA_EREPORT_LEVEL; if it is not set, the level defaults to 0.
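For example (a sketch; it assumes the GUC is also settable at runtime, and superuser privileges may be required):

```sql
-- 2 is the most verbose level of heterodb-extra error reporting
SET pg_strom.extra_ereport_level = 2;
```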

Executor Configuration

pg_strom.max_async_tasks [type: int / default: 12]
Maximum number of asynchronous tasks PG-Strom can submit to the GPU execution queue; this is also the number of GPU Service worker threads.
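An entry in postgresql.conf such as the following raises the worker count (an illustrative value; a suitable number depends on the GPU and workload, and a reload or restart may be required to apply it):

```
# postgresql.conf -- illustrative
pg_strom.max_async_tasks = 16
```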

GPUDirect SQL Configuration

pg_strom.gpudirect_driver [type: text]
It shows the driver software name of GPUDirect SQL (read-only).
Its value is one of cufile, nvme-strom, or vfs.
pg_strom.gpudirect_enabled [type: bool / default: on]
Enables/disables GPUDirect SQL feature.
pg_strom.gpu_direct_seq_page_cost [type: real / default: DEFAULT_SEQ_PAGE_COST / 4]
The cost of scanning a table using GPU-Direct SQL, used in place of seq_page_cost when the optimizer calculates the cost of an execution plan.
pg_strom.gpudirect_threshold [type: int / default: auto]
Controls the table-size threshold at which the GPUDirect SQL feature is invoked.
The default is automatic configuration: a threshold calculated from the system's physical memory size and the shared_buffers configuration.
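The active driver and the feature switch can be inspected and toggled as below (a sketch; it assumes pg_strom.gpudirect_enabled is settable at session level):

```sql
SHOW pg_strom.gpudirect_driver;        -- read-only: cufile, nvme-strom, or vfs
SET pg_strom.gpudirect_enabled = off;  -- temporarily bypass GPUDirect SQL
```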
pg_strom.manual_optimal_gpus [type: text / default: none]
It manually configures the closest GPU for the target storage volume, such as an NVME device or NFS volume.
Its format string is: {<nvmeX>|/path/to/tablespace}=gpuX[:gpuX...]. It describes the relationship between the closest GPU and an NVME device or tablespace directory path. Multiple configurations can be given, separated by commas.
Example: pg_strom.manual_optimal_gpus = 'nvme1=gpu0,nvme2=gpu1,/mnt/nfsroot=gpu0'
  • <gpuX> means a GPU with device identifier X.
  • <nvmeX> means a local NVME-SSD or a remote NVME-oF device.
  • /path/to/tablespace means full-path of the tablespace directory.

Automatic configuration is often sufficient for local NVME-SSD drives; however, you should manually configure the closest GPU for NVME-oF or NFS-over-RDMA volumes.

Arrow_Fdw Configuration

arrow_fdw.enabled [type: bool / default: on]
It turns Arrow_Fdw on/off by adjusting its estimated cost value. Note that this does not prohibit scans on Arrow files entirely, because Foreign Scan (Arrow_Fdw) is the only way to scan them.
arrow_fdw.stats_hint_enabled [type: bool / default: on]
When an Arrow file has min/max statistics, this parameter controls whether unnecessary record-batches are skipped or not.
arrow_fdw.metadata_cache_size [type: int / default: 512MB]
Size of shared memory to cache metadata of Arrow files.
Once consumption of the shared memory exceeds this value, the older metadata is released based on LRU.
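A typical postgresql.conf entry to enlarge the cache for a workload with many Arrow files might look as follows (illustrative; whether a unit suffix is accepted depends on the GUC definition):

```
# postgresql.conf -- illustrative
arrow_fdw.metadata_cache_size = '1GB'
```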

GPU Cache configuration

pg_strom.enable_gpucache [type: bool / default: on]
Controls whether search/analytic query tries to use GPU Cache.
Note that GPU Cache trigger functions continue to update the REDO Log buffer, even if this parameter is turned off.
pg_strom.gpucache_auto_preload [type: text / default: null]
It specifies the table names to be loaded onto GPU Cache just after PostgreSQL startup.
Its format is DATABASE_NAME.SCHEMA_NAME.TABLE_NAME, and separated by comma if multiple tables are preloaded.
Initial loading of the GPU Cache usually takes a long time, so preloading avoids delayed response times for the first search/analytic queries.
If this parameter is '*', PG-Strom tries to load all the configured tables onto GPU Cache sequentially.
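Since preloading happens just after startup, the parameter is set in postgresql.conf and needs a restart to take effect (the database/schema/table names below are hypothetical):

```
# postgresql.conf -- illustrative
pg_strom.gpucache_auto_preload = 'postgres.public.orders,postgres.public.items'
```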

GPU Device Configuration

pg_strom.gpu_mempool_segment_sz [type: int / default: 1GB]
The segment size when GPU Service allocates GPU device memory for the memory pool.
GPU device memory allocation is a relatively heavy process, so it is recommended to use memory pools to reuse memory.
pg_strom.gpu_mempool_max_ratio [type: real / default: 50%]
It specifies the percentage of device memory that can be used for the GPU device memory pool.
It works to suppress excessive GPU device memory consumption by the memory pool and ensure sufficient working memory.
pg_strom.gpu_mempool_min_ratio [type: real / default: 5%]
It specifies the percentage of GPU device memory that is preserved as memory pool segments and retained even after use.
By maintaining a minimum memory pool, the next query can be executed quickly.
pg_strom.gpu_mempool_release_delay [type: int / default: 5000]
GPU Service does not release a segment of a memory pool immediately, even if it becomes empty. When the time specified by this parameter (in milliseconds) has elapsed since the segment was last used, it is released and returned to the system.
By inserting a certain delay, you can reduce the frequency of GPU device memory allocation/release.
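The four memory-pool parameters are usually considered together. The sketch below simply restates the defaults described above (illustrative; whether the ratios are written as fractions or percentages depends on the GUC definition):

```
# postgresql.conf -- defaults restated for illustration
pg_strom.gpu_mempool_segment_sz    = '1GB'
pg_strom.gpu_mempool_max_ratio     = 0.50   # 50% of device memory
pg_strom.gpu_mempool_min_ratio     = 0.05   # keep 5% pooled
pg_strom.gpu_mempool_release_delay = 5000   # milliseconds
```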
pg_strom.cuda_toolkit_basedir [type: text / default: /usr/local/cuda]
PG-Strom uses the CUDA Toolkit to build GPU code on its start-up. Specify the installation path of the CUDA Toolkit to be used at that time. Normally, the CUDA toolkit is installed under /usr/local/cuda, but if you want to use a different directory, you can change the setting with this parameter.
pg_strom.cuda_stack_limit [type: int / default: 32]
When PG-Strom executes SQL workloads on the GPU, it automatically configures the size of the stack space used by GPU threads depending on the complexity of the processing. For example, it allocates a relatively large stack for PostGIS functions or expressions that include recursive calls.
This parameter specifies the upper limit, in kB, of that stack size.
pg_strom.cuda_visible_devices [type: text / default: null]
Comma-separated list of GPU device numbers, used when you want PostgreSQL to recognize only particular GPUs at startup.
It is equivalent to the environment variable CUDA_VISIBLE_DEVICES.
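For example, to expose only the first and third GPUs to PostgreSQL (illustrative; it takes effect at PostgreSQL startup, so a restart is required):

```
# postgresql.conf -- illustrative
pg_strom.cuda_visible_devices = '0,2'
```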