GUC Parameters

This section introduces PG-Strom's configuration parameters.

Enables/disables a particular feature

Parameter Type Default Description
pg_strom.enabled bool on Enables/disables all PG-Strom features at once
pg_strom.enable_gpuscan bool on Enables/disables GpuScan
pg_strom.enable_gpuhashjoin bool on Enables/disables GpuJoin by HashJoin
pg_strom.enable_gpunestloop bool on Enables/disables GpuJoin by NestLoop
pg_strom.enable_gpupreagg bool on Enables/disables GpuPreAgg
pg_strom.enable_brin bool on Enables/disables BRIN index support on table scans
pg_strom.enable_partitionwise_gpujoin bool on Controls whether GpuJoin is pushed down to the partition children. Available only on PostgreSQL v10 or later.
pg_strom.enable_partitionwise_gpupreagg bool on Controls whether GpuPreAgg is pushed down to the partition children. Available only on PostgreSQL v10 or later.
pg_strom.pullup_outer_scan bool on Enables/disables pulling up a full-table scan that sits just below GpuPreAgg/GpuJoin, to reduce data transfer between CPU/RAM and GPU.
pg_strom.pullup_outer_join bool on Enables/disables pulling up a table join when GpuJoin sits just below GpuPreAgg, to reduce data transfer between CPU/RAM and GPU.
pg_strom.enable_numeric_aggfuncs bool on Enables/disables support for aggregate functions that take the numeric data type.
pg_strom.cpu_fallback bool off Controls whether CPU fallback operations are actually run when a GPU program returns "CPU ReCheck Error"
pg_strom.regression_test_mode bool off Disables EXPLAIN command output that depends on the software execution platform, such as the GPU model name. This avoids false positives in the regression test, so users usually do not need to touch this setting.
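
As a rough illustration of how the switches above might be used, the following sketch turns one GPU operator off for the current session and compares the resulting plans. It assumes these parameters can be changed with SET at session level; the tables t_fact and t_dim and their columns are hypothetical.

```sql
-- Hypothetical tables t_fact and t_dim; substitute your own.
-- 1. Plan with all PG-Strom switches at their current values.
EXPLAIN
SELECT d.category, count(*)
  FROM t_fact f JOIN t_dim d ON f.dim_id = d.id
 GROUP BY d.category;

-- 2. Assumption: the parameter is settable per session.
SET pg_strom.enable_gpuhashjoin = off;

-- 3. Same query again; compare the join node with the first plan.
EXPLAIN
SELECT d.category, count(*)
  FROM t_fact f JOIN t_dim d ON f.dim_id = d.id
 GROUP BY d.category;

-- 4. Return the switch to its configured value.
RESET pg_strom.enable_gpuhashjoin;
```

Toggling a single switch this way keeps the experiment local to one session and leaves the server-wide configuration untouched.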

Optimizer Configuration

Parameter Type Default Description
pg_strom.chunk_size int 65534kB Size of the data block processed by a single GPU kernel invocation. It used to be configurable, but tuning it made little sense, so it is fixed at about 64MB in the current version.
pg_strom.gpu_setup_cost real 4000 Cost value for initialization of GPU device
pg_strom.gpu_dma_cost real 10 Cost value for DMA transfer over PCIe bus per data-chunk (64MB)
pg_strom.gpu_operator_cost real 0.00015 Cost value to process an expression formula on GPU. If a value larger than cpu_operator_cost is configured, PG-Strom will never be chosen, regardless of table size.
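
The relationship between pg_strom.gpu_operator_cost and cpu_operator_cost described above can be checked directly. The sketch below is only an illustration: it assumes the cost parameters are settable per session, and the value 2000 is an arbitrary example, not a recommendation.

```sql
-- cpu_operator_cost is a PostgreSQL core parameter.
SHOW cpu_operator_cost;
SHOW pg_strom.gpu_operator_cost;

-- A ratio above 1.0 means GPU expressions are costed higher than CPU ones,
-- in which case the planner has no reason to pick PG-Strom.
SELECT current_setting('pg_strom.gpu_operator_cost')::float8
     / current_setting('cpu_operator_cost')::float8 AS gpu_to_cpu_cost_ratio;

-- Assumption: settable per session. Arbitrary example value.
SET pg_strom.gpu_setup_cost = 2000;
```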

Executor Configuration

Parameter Type Default Description
pg_strom.max_async_tasks int 5 Number of asynchronous tasks PG-Strom can submit to the GPU's execution queue per process. When CPU parallel execution is used in combination, this limit applies to each background worker, so more than pg_strom.max_async_tasks asynchronous tasks may run in parallel across the entire batch job.
pg_strom.reuse_cuda_context bool off If on, PG-Strom tries to reuse the CUDA context constructed for the previous query execution on the next query execution. Constructing a CUDA context usually takes 100-200ms, so reuse may improve query response time; on the other hand, the context continues to occupy part of the GPU device memory. We therefore do not recommend enabling this parameter except for benchmarking and similar purposes.
Also, this setting has no effect if the query uses CPU parallel execution.
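
Because pg_strom.max_async_tasks is a per-process limit, the total number of in-flight tasks for one parallel query can be estimated from the current settings. The query below is a rough sketch: it assumes the leader process also participates and that every planned worker is actually launched, so the result is only an upper bound.

```sql
-- Upper-bound estimate: per-process limit times (workers + leader).
-- max_parallel_workers_per_gather is a PostgreSQL core parameter.
SELECT current_setting('pg_strom.max_async_tasks')::int
     * (current_setting('max_parallel_workers_per_gather')::int + 1)
       AS async_tasks_upper_bound;
```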

GPUDirect SQL Configuration

Parameter Type Default Description
pg_strom.gpudirect_driver text auto Shows the name of the driver software used for GPUDirect SQL (read-only).
pg_strom.gpudirect_enabled bool on Enables/disables GPUDirect SQL feature.
pg_strom.gpudirect_threshold int auto Controls the table-size threshold for invoking the GPUDirect SQL feature.
pg_strom.cufile_io_unitsz int 16MB Unit size of read I/O when PG-Strom uses the cuFile API. In most cases there is no need to change the default setting. It is available only if PG-Strom was built with WITH_CUFILE=1.
pg_strom.nvme_distance_map string NULL Manually configures the closest GPU for particular NVME devices. The format is <nvmeX>:<gpuX>[,...], a comma-separated list of NVME-GPU pairs. (Example: nvme0:gpu0,nvme1:gpu0)
Automatic configuration is often sufficient for local NVME-SSD drives; however, if NVME-oF devices are in use, you need to configure the closest GPU manually.
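
A manual NVME-GPU mapping in the format described above might be configured as follows. This is a sketch under assumptions: PG-Strom is already loaded so that the parameter name is known to the server, the setting is treated as server-level and picked up after a restart, and the device names are simply those from the example string.

```sql
-- Assumption: server-level setting, applied after a restart.
ALTER SYSTEM SET pg_strom.nvme_distance_map = 'nvme0:gpu0,nvme1:gpu0';

-- After restarting PostgreSQL, confirm what the server sees.
SHOW pg_strom.nvme_distance_map;
SHOW pg_strom.gpudirect_driver;   -- read-only, reports the driver in use
```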

Arrow_Fdw Configuration

Parameter Type Default Description
arrow_fdw.enabled bool on Turns Arrow_Fdw on/off by adjusting the estimated cost value. Note that only Foreign Scan (Arrow_Fdw) can scan Arrow files when GpuScan is not able to run.
arrow_fdw.metadata_cache_size int 128MB Size of shared memory to cache metadata of Arrow files.
Once consumption of the shared memory exceeds this value, older metadata is released on an LRU basis.
arrow_fdw.record_batch_size int 256MB Threshold of RecordBatch size when an Arrow_Fdw foreign table is written. When the total buffer size exceeds this value, Arrow_Fdw writes the buffer out to the Apache Arrow file, even if the INSERT command has not completed yet.

Gstore_Fdw Configuration

Parameter Type Default Description
gstore_fdw.enabled bool on Turns Gstore_Fdw on/off by adjusting the estimated cost value. Note that only Foreign Scan (Gstore_Fdw) can scan the GPU memory store when GpuScan is not able to run.
gstore_fdw.auto_preload bool on Controls whether the GPU memory store is pre-loaded to GPU devices right after PostgreSQL startup. If not pre-loaded, the GPU memory store is loaded on demand when it is first referenced, which may slow down query response time on the first call.
gstore_fdw.default_base_dir text NULL Directory to create base files, if base_file was not specified in the foreign-table options. By default, they are created in the default tablespace of the current database.
gstore_fdw.default_redo_dir text NULL Directory to create redo-log files, if redo_log_file was not specified in the foreign-table options. By default, they are created in the default tablespace of the current database.

Configuration of GPU code generation and build

Parameter Type Default Description
pg_strom.program_cache_size int 256MB Amount of shared memory used to cache GPU programs that have already been built. Changing this parameter requires a restart.
pg_strom.num_program_builders int 2 Number of background workers to build GPU programs asynchronously. Changing this parameter requires a restart.
pg_strom.debug_jit_compile_options bool off Controls whether debug options (line numbers and symbol information) are included in JIT compilation of GPU programs. It is valuable for analyzing complicated bugs using a GPU core dump, but should not be enabled in daily use because it degrades performance.
pg_strom.extra_kernel_stack_size int 0 Extra stack size, in bytes, allocated for each GPU kernel thread on execution. Usually there is no need to change the default value.
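
For the two parameters above that need a restart, one possible workflow is sketched below. The values 512MB and 4 are arbitrary examples rather than recommendations, and the sketch assumes PG-Strom is already loaded so that ALTER SYSTEM recognizes the parameter names.

```sql
-- Arbitrary example values.
ALTER SYSTEM SET pg_strom.program_cache_size = '512MB';
ALTER SYSTEM SET pg_strom.num_program_builders = 4;

-- Re-read the configuration files; restart-only parameters are not
-- applied yet, but pg_settings flags them as pending.
SELECT pg_reload_conf();

SELECT name, setting, pending_restart
  FROM pg_settings
 WHERE name IN ('pg_strom.program_cache_size',
                'pg_strom.num_program_builders');
```

Rows with pending_restart set to true indicate that the new value is recorded but will only take effect after PostgreSQL is restarted.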

GPU Device Configuration

Parameter Type Default Description
pg_strom.cuda_visible_devices string '' Comma-separated list of GPU device numbers, used when only particular GPUs should be recognized on PostgreSQL startup. It is equivalent to the environment variable CUDA_VISIBLE_DEVICES
pg_strom.gpu_memory_segment_size int 512MB Specifies the amount of device memory to be allocated per CUDA API call. A larger value reduces the overhead of API calls, but uses device memory less efficiently.

System Shared Memory Configuration

Parameter Type Default Description
shmbuf.segment_size int 256MB
shmbuf.num_logical_segments int auto The default logical segment size is double the size of the system's physical memory.
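
The current values of the parameters listed in this section can be inspected through the standard pg_settings view. The query below is a minimal sketch; it assumes PG-Strom is loaded, since otherwise these names are not registered and the query returns no rows.

```sql
-- List every parameter under the prefixes used in this section.
-- The context column shows when a value can be changed
-- (for example 'user' for SET, 'postmaster' for restart only).
SELECT name, setting, unit, context
  FROM pg_settings
 WHERE name ~ '^(pg_strom|arrow_fdw|gstore_fdw|shmbuf)\.'
 ORDER BY name;
```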