PG-Strom v2.3 Release

PG-Strom Development Team (1-Apr-2020)


Major changes in PG-Strom v2.3 include:

  • GpuJoin supports parallel construction of inner buffer
  • Arrow_Fdw is now writable; it supports INSERT/TRUNCATE.
  • pg2arrow command supports 'append' mode.
  • mysql2arrow command was added.


Prerequisites

  • PostgreSQL v10, v11, v12
  • CUDA Toolkit 10.1 or later
  • Linux distributions supported by CUDA Toolkit
  • Intel x86 64bit architecture (x86_64)
  • NVIDIA GPU CC 6.0 or later (Pascal or Volta)

New Features

  • GpuJoin supports parallel construction of inner buffer
    • Older versions constructed the inner buffer of GpuJoin in the backend process only. This restriction caused a problem: parallel scans of a partitioned table were delayed significantly.
    • This version allows both the backend and the worker processes to construct the inner buffer. When scanning a partitioned table, any process assigned to a particular child table can start GpuJoin operations immediately.
  • Refactoring of the partition-wise asymmetric GpuJoin
    • As a result of this refactoring, the optimizer now prefers multi-level GpuJoin when it offers a cheaper execution cost.
  • Arrow_Fdw is now writable; INSERT/TRUNCATE supported
    • Arrow_Fdw foreign tables now allow bulk loading by INSERT and removal of data by pgstrom.arrow_fdw_truncate.
  • pg2arrow command supports 'append' mode.
    • We added the --append option to the pg2arrow command. As the name suggests, it appends query results to an existing Apache Arrow file.
    • Also, the -t table option was added as an alias of SELECT * FROM table.
  • mysql2arrow command was added.
    • We added the mysql2arrow command, which connects to a MySQL server instead of PostgreSQL and writes out SQL query results as Apache Arrow files.
    • It has functionality equivalent to pg2arrow, except for the enum data type. mysql2arrow saves enum values as flat Utf8 values without DictionaryBatch chunks.
  • Regression tests were added
    • Several test cases were added, following the PostgreSQL regression test framework.
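
The difference between dictionary-encoded and flat storage of enum values, mentioned for mysql2arrow above, can be pictured with a small conceptual sketch. It only models the idea in plain Python; it does not read or write Apache Arrow files, and the helper names dictionary_encode and flat_utf8 are hypothetical, not part of pg2arrow or mysql2arrow.

```python
def dictionary_encode(values):
    """Conceptual dictionary encoding: each distinct label is stored once,
    and every row holds only an integer index into that dictionary."""
    dictionary = []   # distinct labels, in order of first appearance
    codes = {}        # label -> index in `dictionary`
    indices = []      # one integer per row
    for v in values:
        if v not in codes:
            codes[v] = len(dictionary)
            dictionary.append(v)
        indices.append(codes[v])
    return dictionary, indices


def flat_utf8(values):
    """Conceptual flat storage: every row carries its own string value."""
    return list(values)


# Hypothetical enum column with three labels.
labels = ["small", "large", "small", "medium", "small"]

dictionary, indices = dictionary_encode(labels)
flat = flat_utf8(labels)

# Both forms carry the same information; only the layout differs.
decoded = [dictionary[i] for i in indices]
```

In this model, the flat form repeats each label per row, while the dictionary form keeps a separate lookup table, which is the role the release note attributes to DictionaryBatch chunks.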

Significant bug fixes

  • Revised cache invalidation logic for GPU device functions / types
    • Older versions invalidated all the metadata cache entries of GPU device functions / types on execution of an ALTER command. This was revised to invalidate only the entries that are actually updated.
  • Fixed extreme performance degradation when GROUP BY has the same grouping key twice, or any even number of times.
    • GpuPreAgg combined the hash values of the GROUP BY grouping keys using XOR. So, when the same column appeared an even number of times, its values cancelled out and the hash index was always 0. We now add randomization for a better hash distribution.
  • Potential infinite loop on GpuScan
    • Due to uninitialized values, GpuScan could fall into an infinite loop when SSD2GPU Direct SQL is available.
  • Potential GPU kernel crash on GpuJoin
    • Due to uninitialized values, GpuJoin could crash the GPU kernel when 3 or more tables are joined.
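
The GROUP BY hash problem described above can be reproduced with a minimal sketch. It is not PG-Strom's code: the hash function (truncated SHA-256) and the per-position salt are assumptions chosen only to illustrate why a plain XOR combine collapses and how some form of randomization avoids it.

```python
import hashlib


def column_hash(value: str, salt: str = "") -> int:
    """Stand-in 32-bit hash of one grouping-key value.
    Assumption: SHA-256 is used here only for illustration."""
    digest = hashlib.sha256((salt + value).encode()).digest()
    return int.from_bytes(digest[:4], "little")


def xor_combine(keys):
    """Plain XOR combine: identical values cancel each other out."""
    h = 0
    for k in keys:
        h ^= column_hash(k)
    return h


def salted_combine(keys):
    """XOR combine with a per-position salt (hypothetical fix), so the
    same value hashes differently at different key positions."""
    h = 0
    for pos, k in enumerate(keys):
        h ^= column_hash(k, salt=f"{pos}:")
    return h


# 1000 rows for something like GROUP BY x, x (the same column twice).
rows = [(f"x{i}", f"x{i}") for i in range(1000)]

plain = [xor_combine(r) for r in rows]
salted = [salted_combine(r) for r in rows]

print(sum(1 for h in plain if h == 0))  # every plain XOR hash is 0
print(len(set(salted)))                 # salted hashes are spread out
```

With the plain combine, every row lands in the same hash slot, so lookups degrade to a linear scan of one long chain; that is the kind of degradation the fix addresses.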

Deprecated Features

  • PostgreSQL v9.6 Support
    • The CustomScan API in PostgreSQL v9.6 lacks a few APIs for handling dynamic shared memory (DSM), which made it difficult to share common code with v10 or later. To avoid this problem, we dropped PostgreSQL v9.6 support in this version.
  • PL/CUDA
    • According to our use-case analysis, users prefer a familiar programming language environment such as Python over a special-purpose environment of its own.
    • The combination of Arrow_Fdw's GPU export functionality and CuPy invocation from PL/Python is the successor of PL/CUDA for in-database machine learning / statistical analytics.
  • Gstore_Fdw
    • This feature is replaced by the writable Arrow_Fdw and its GPU export functionality.
  • Largeobject export to/import from GPU
    • According to our use-case analysis, we determined that this feature is not needed.