This chapter gives an overview of PG-Strom and its developer community.
What is PG-Strom?
PG-Strom is an extension module for PostgreSQL, designed for version 11 or later. By utilizing a GPU (Graphics Processing Unit) device, which has thousands of cores per chip, it accelerates SQL workloads for data analytics and batch processing over large data sets.
Its core features are a GPU code generator, which automatically generates GPU programs according to the SQL commands, and an asynchronous parallel execution engine that runs SQL workloads on the GPU device. The latest version supports SCAN (evaluation of the WHERE clause), JOIN and GROUP BY workloads. Where GPU processing has an advantage, PG-Strom replaces the vanilla PostgreSQL implementation and works transparently to users and applications.
PG-Strom has two storage options. The first one is the heap storage system of PostgreSQL. Because of its row-oriented data format, it is not always optimal for aggregation / analysis workloads; on the other hand, it has the advantage that aggregation workloads can run without transferring data out of the transactional database. The other one is Apache Arrow files, which have a structured columnar format. Although this format is not suitable for row-by-row updates, it allows large amounts of data to be imported efficiently, and the data can be searched / aggregated efficiently through a foreign data wrapper (FDW).
One of the characteristic features of PG-Strom is GPUDirect SQL, which bypasses the CPU/RAM and reads data from NVMe / NVMe-oF devices directly into the GPU. SQL processing on the GPU makes full use of the bandwidth of these devices. PG-Strom v3.0 newly supports NVIDIA GPUDirect Storage, which makes it possible to support SDS (Software Defined Storage) over the NVMe-oF protocol and shared filesystems.
Also, v3.0 newly supports execution of some PostGIS functions and GiST index search on the GPU side. Together with the GPU cache, which keeps a duplicate of the contents of frequently updated tables, this enables search / analysis processing based on real-time location information.
License and Copyright
PG-Strom is open source software distributed under the PostgreSQL License. See LICENSE for the license details.
Please post your questions, requests and trouble reports to the Discussion board on GitHub.
Please note that it is a public board open to the whole world, so it is your own responsibility not to disclose confidential information.
The primary language of the discussion board is English. On the other hand, we know that a major portion of PG-Strom users are Japanese because of its development history, so discussion in Japanese is also accepted. In that case, please don't forget to attach a (JP) prefix to the subject, so that non-Japanese speakers can skip the messages.
Bug or trouble reports
If you run into trouble such as incorrect results, a system crash / lockup, or any other strange behavior, please open a new issue with the bug tag at the PG-Strom Issue Tracker.
Please check the items below before reporting a bug.
- Can you reproduce the same problem on the latest revision?
  - Where possible, we recommend testing on the latest OS, CUDA, PostgreSQL and related software.
- Can you reproduce the same problem when PG-Strom is disabled?
  - The GUC option pg_strom.enabled turns PG-Strom on and off.
- Are there any known issues on the GitHub issue tracker?
  - Please don't forget to search closed issues as well.
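The second check in the list above can be sketched as a short psql session. This is only an illustration: the table my_table and its column cat are hypothetical placeholders, so substitute the statement that actually misbehaves. Only the pg_strom.enabled option comes from this document.

```sql
-- Hypothetical table and query; replace with the statement in trouble.

-- 1. Run the query with PG-Strom disabled for this session only.
SET pg_strom.enabled = off;
SELECT cat, count(*) FROM my_table GROUP BY cat;

-- 2. Run the same query again with PG-Strom enabled.
SET pg_strom.enabled = on;
SELECT cat, count(*) FROM my_table GROUP BY cat;

-- 3. Restore the option to its configured value.
RESET pg_strom.enabled;
```

If the two runs behave differently, mentioning that in the report helps narrow the problem down to PG-Strom rather than PostgreSQL itself.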
The information below is helpful for bug reports.
- Output of EXPLAIN VERBOSE for the queries in trouble.
- Data structure of the tables involved, as shown by \d+ <table name> in psql.
- Log messages (verbose messages are more helpful).
- GUC options you changed from the default configuration, and their values.
- Hardware configuration, especially the GPU model and host RAM size.
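Most of the items above can be collected from a single psql session. The following is a minimal sketch, assuming a hypothetical table my_table with columns cat and x. The pg_settings query is a common PostgreSQL idiom for listing non-default settings, not something specific to PG-Strom.

```sql
-- Hypothetical table and query; replace with the statement in trouble.

-- Execution plan of the problematic query.
EXPLAIN VERBOSE
SELECT cat, count(*) FROM my_table WHERE x > 0 GROUP BY cat;

-- Definition of each table involved (psql meta-command).
\d+ my_table

-- GUC options whose values differ from the built-in defaults.
SELECT name, setting, source
  FROM pg_settings
 WHERE source NOT IN ('default', 'override')
 ORDER BY name;
```

Log messages and the hardware configuration have to be gathered outside of psql.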
If you are not certain whether the strange behavior you see is a bug or not, please report it to the discussion board before opening a new issue ticket. Developers may be able to suggest your next action, such as providing extra information.
New feature proposals
If you have an idea for a new feature, please open a new issue with the feature tag at the PG-Strom Issue Tracker, then discuss it with other developers.
A good design proposal will contain the items below.
- What problem do you want to solve / improve?
- How serious is it for your workloads / use case?
- How would you implement your idea?
- Expected downsides, if any.
Once we reach a consensus about its necessity, the coordinator will attach the accepted tag, and the issue ticket is then used to track the rest of the development. Otherwise, the issue ticket gets the rejected tag and is closed.
Even if a proposal is rejected, we may make a different decision in the future. If the overall circumstances change, don't hesitate to submit a revised proposal.
At the development stage, please attach the patch file to the issue ticket. We don't use pull requests.
The PG-Strom development team supports only the latest release distributed from the HeteroDB Software Distribution Center. So, people who run into trouble need to confirm first that the problem can be reproduced with the latest release.
Please note that this is a volunteer-based community support policy, so support is provided on a best-effort basis with no SLA.
If you need commercial support, contact HeteroDB,Inc (contact@heterodb.com).