PG-Strom v2.3 Release
PG-Strom Development Team (1-Apr-2020)
Overview
Major changes in PG-Strom v2.3 include:
- GpuJoin supports parallel construction of inner buffer
- Arrow_Fdw is now writable; it supports INSERT/TRUNCATE.
- pg2arrow command supports 'append' mode.
- mysql2arrow command was added.
Prerequisites
- PostgreSQL v10, v11, v12
- CUDA Toolkit 10.1 or later
- Linux distributions supported by CUDA Toolkit
- Intel x86 64bit architecture (x86_64)
- NVIDIA GPU CC 6.0 or later (Pascal or Volta)
New Features
- GpuJoin supports parallel construction of inner buffer
- In older versions, only the backend process constructed the inner buffer of GpuJoin. This restriction caused a problem: a parallel scan of a partitioned table was extremely delayed.
- This version allows both the backend and the worker processes to construct the inner buffer. When a partitioned table is scanned, any process assigned to a particular child table can start GpuJoin operations immediately.
- Refactoring of the partition-wise asymmetric GpuJoin
- As a result of this refactoring, the optimizer now prefers multi-level GpuJoin when it offers a cheaper execution cost.
- Arrow_Fdw becomes writable; INSERT/TRUNCATE supported
- Arrow_Fdw foreign tables allow bulk-loading by INSERT and data elimination by pgstrom.arrow_fdw_truncate.
- pg2arrow command supports 'append' mode.
- We added the --append option to the pg2arrow command. As the name says, it appends query results to an existing Apache Arrow file.
- Also, the -t table option was added as an alias of SELECT * FROM table.
- mysql2arrow command was added.
- We added the mysql2arrow command, which connects to a MySQL server instead of PostgreSQL and writes out SQL query results as Apache Arrow files.
- It has functionality equivalent to pg2arrow except for the enum data type: mysql2arrow saves enum values as flat Utf8 values without DictionaryBatch chunks.
- Regression tests were added
- Several test cases were added using the PostgreSQL regression test framework.
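The enum difference noted for mysql2arrow can be pictured with a small pure-Python sketch. It does not call Apache Arrow, pg2arrow, or mysql2arrow; it only contrasts a dictionary-style layout (unique labels plus per-row indexes, the general idea behind dictionary batches) with a flat list of strings. The function name dictionary_encode and the sample column are hypothetical.

```python
def dictionary_encode(values):
    """Split a column into (dictionary, indices).

    dictionary: each distinct label once, in order of first appearance.
    indices:    one small integer per row, pointing into the dictionary.
    """
    dictionary = []
    positions = {}   # label -> index in dictionary
    indices = []
    for v in values:
        if v not in positions:
            positions[v] = len(dictionary)
            dictionary.append(v)
        indices.append(positions[v])
    return dictionary, indices


# Hypothetical enum column as it might come back from a query.
enum_column = ["red", "green", "red", "blue", "green"]

# Dictionary-style layout: labels stored once, rows store indexes.
dictionary, indices = dictionary_encode(enum_column)

# Flat layout: every row stores its label as a plain string,
# which is the shape the notes describe for mysql2arrow enum values.
flat_utf8 = list(enum_column)

print(dictionary, indices)
print(flat_utf8)
```

The flat layout repeats each label per row, so it is simpler to produce but usually larger when a column has few distinct labels.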
Significant bug fixes
- Revised cache invalidation logic for GPU device functions / types
- Older versions invalidated all the metadata cache entries of GPU device functions / types whenever an ALTER command was executed. This was revised to invalidate only the entries that are actually updated.
- Fixed extreme performance degradation when GROUP BY has the same grouping key twice, or any even number of times.
- GpuPreAgg combined the hash values of the GROUP BY grouping keys using XOR. So, when the same column appeared an even number of times, the hash index was always 0. Now we add randomization for better hash distribution.
- Potential infinite loop on GpuScan
- Due to uninitialized values, GpuScan could fall into an infinite loop when SSD2GPU Direct SQL is available.
- Potential GPU kernel crash on GpuJoin
- Due to uninitialized values, GpuJoin could crash the GPU kernel when 3 or more tables are joined.
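The GROUP BY hashing problem described above is easy to reproduce outside the GPU. The sketch below is an illustration in plain Python, not PG-Strom code: column_hash is a toy per-column hash, and combine_mixed is a rotate-then-XOR combiner chosen here only as a stand-in, since these notes do not say which randomization PG-Strom actually applies.

```python
MASK32 = 0xFFFFFFFF


def column_hash(value):
    """Toy 32-bit hash of one grouping-key value (hypothetical stand-in)."""
    return (value * 2654435761) & MASK32


def combine_xor(hashes):
    """Combine per-column hashes with plain XOR (the scheme that caused the bug)."""
    acc = 0
    for h in hashes:
        acc ^= h
    return acc


def rotl32(x, n):
    """Rotate a 32-bit value left by n bits."""
    return ((x << n) | (x >> (32 - n))) & MASK32


def combine_mixed(hashes):
    """Rotate the accumulator before each XOR, so equal inputs no longer cancel.

    This is an assumed fix for illustration, not PG-Strom's actual function.
    """
    acc = 0
    for h in hashes:
        acc = rotl32(acc, 5) ^ h
    return acc


# Equivalent of GROUP BY k, k: the same column appears twice in the key.
keys = range(1, 1001)
xor_slots = {combine_xor([column_hash(k), column_hash(k)]) for k in keys}
mixed_slots = {combine_mixed([column_hash(k), column_hash(k)]) for k in keys}

print(len(xor_slots))    # 1: every group lands on hash value 0
print(len(mixed_slots))  # many distinct hash values
```

With plain XOR, h ^ h is 0 for every key, so all 1000 groups share one hash slot and lookups degrade to a linear search. Any combiner that depends on position avoids this cancellation.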
Deprecated Features
- PostgreSQL v9.6 Support
- The CustomScan API in PostgreSQL v9.6 lacks a few APIs to handle dynamic shared memory (DSM), which made it difficult to share common code with v10 or later. To avoid the problem, we dropped PostgreSQL v9.6 support in this version.
- PL/CUDA
- According to our use-case analysis, users prefer a familiar programming language environment like Python over a special-purpose environment of our own.
- A combination of Arrow_Fdw's GPU export functionality and CuPy invocation from PL/Python is the successor to PL/CUDA for in-database machine learning / statistical analytics.
- Gstore_Fdw
- This feature is replaced by the writable Arrow_Fdw and its GPU export functionality.
- Largeobject export to/import from GPU
- According to our use-case analysis, we determined that this feature is not needed.