This chapter introduces the steps to install PG-Strom.

Checklist

  • Server Hardware
    • PG-Strom requires generic x86_64 hardware that can run a Linux operating system supported by CUDA Toolkit. There are no special requirements for CPU, storage, or network devices.
    • note002:HW Validation List may help you choose the hardware.
    • SSD-to-GPU Direct SQL Execution needs SSD devices that support the NVMe specification and are installed under the same PCIe Root Complex as the GPU.
  • GPU Device
    • PG-Strom requires at least one GPU device on the system that is supported by CUDA Toolkit and has compute capability 6.0 (Pascal generation) or later.
    • note001:GPU Availability Matrix shows more detailed information. Check this list for the support status of SSD-to-GPU Direct SQL Execution.
  • Operating System
    • PG-Strom requires a Linux operating system for the x86_64 architecture, in a distribution supported by CUDA Toolkit. We recommend Red Hat Enterprise Linux or CentOS version 7.x series.
    • SSD-to-GPU Direct SQL Execution needs Red Hat Enterprise Linux or CentOS version 7.3 or later.
  • PostgreSQL
    • PG-Strom requires PostgreSQL version 9.6 or later. PostgreSQL v9.6 renewed the custom-scan interface for CPU-parallel execution and GROUP BY planning, which allows custom plans provided by extension modules to cooperate.
  • CUDA Toolkit
    • PG-Strom requires CUDA Toolkit version 9.1 or later.
    • PG-Strom provides a half-precision floating point type (float2) that internally uses the half_t type of CUDA C, so it cannot be built with an older CUDA Toolkit.
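
The version minimums in this checklist can be checked mechanically. The following is a minimal sketch and not part of PG-Strom: version_ge is a hypothetical helper that compares dotted version numbers field by field, so that 10.2 is treated as newer than 9.6.

```shell
# version_ge A B: succeed when dotted version A is equal to or newer than B.
# Hypothetical helper for illustration; fields are compared numerically.
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        n = split(a, x, "."); m = split(b, y, ".")
        len = (n > m) ? n : m
        for (i = 1; i <= len; i++) {
            xi = (i <= n) ? x[i] + 0 : 0
            yi = (i <= m) ? y[i] + 0 : 0
            if (xi > yi) exit 0
            if (xi < yi) exit 1
        }
        exit 0
    }'
}

# Minimums taken from the checklist above.
version_ge 10.2 9.6 && echo "PostgreSQL 10.2 is new enough"
version_ge 9.2 9.6 || echo "PostgreSQL 9.2 is too old"
version_ge 7.0 6.0 && echo "compute capability 7.0 is supported"
```

On a real system, the first argument could be taken from the output of pg_config --version or from the CUDA Toolkit version file.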

OS Installation

Choose a Linux distribution supported by CUDA Toolkit, then install the system according to the installation process of the distribution. NVIDIA DEVELOPER ZONE lists the Linux distributions supported by CUDA Toolkit.

For the Red Hat Enterprise Linux 7.x or CentOS 7.x series, choose "Minimal installation" as the base environment, and also check the following add-ons.

  • Debugging Tools
  • Development Tools

Post OS Installation Configuration

After the OS installation, a few additional configurations are required to install the GPU drivers and the NVMe-Strom driver in later steps.

Setup EPEL Repository

Several software modules required by PG-Strom are distributed as a part of EPEL (Extra Packages for Enterprise Linux). You need to add a repository definition of EPEL packages to the yum system to obtain this software.

One of the packages we will get from the EPEL repository is DKMS (Dynamic Kernel Module Support). It is a framework to build Linux kernel modules on demand for the running Linux kernel; it is used for NVIDIA's GPU driver and for NVMe-Strom, a kernel module that supports SSD-to-GPU Direct SQL Execution.

The epel-release package provides the repository definition of EPEL. You can obtain this package from the public FTP site of Fedora Project. Download epel-release-<distribution version>.noarch.rpm and install the package. Once the epel-release package is installed, the yum system configuration is updated to get software from the EPEL repository.

Tip

Walk down the directory: Packages --> e, from the above URL.

Install the epel-release package as follows.

$ sudo yum install https://dl.fedoraproject.org/pub/epel/7/x86_64/Packages/e/epel-release-7-11.noarch.rpm
          :
================================================================================
 Package           Arch        Version     Repository                      Size
================================================================================
Installing:
 epel-release      noarch      7-11        /epel-release-7-11.noarch       24 k

Transaction Summary
================================================================================
Install  1 Package
          :
Installed:
  epel-release.noarch 0:7-11

Complete!

HeteroDB-SWDC Installation

PG-Strom and related packages are distributed from HeteroDB Software Distribution Center. You need to add a repository definition of HeteroDB-SWDC to your system to obtain this software.

The heterodb-swdc package provides the repository definition of HeteroDB-SWDC. Access the HeteroDB Software Distribution Center using a Web browser, download heterodb-swdc-1.0-1.el7.noarch.rpm from the top of the file list, then install this package. Once the heterodb-swdc package is installed, the yum system configuration is updated to get software from the HeteroDB-SWDC repository.

Install the heterodb-swdc package as follows.

$ sudo yum install https://heterodb.github.io/swdc/yum/rhel7-x86_64/heterodb-swdc-1.0-1.el7.noarch.rpm
          :
================================================================================
 Package         Arch     Version       Repository                         Size
================================================================================
Installing:
 heterodb-swdc   noarch   1.0-1.el7     /heterodb-swdc-1.0-1.el7.noarch   2.4 k

Transaction Summary
================================================================================
Install  1 Package
          :
Installed:
  heterodb-swdc.noarch 0:1.0-1.el7

Complete!
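
After both repository packages are installed, it may be worth confirming that yum actually sees them. The sketch below assumes the usual column layout of yum repolist output, where the first column is the repository id. has_repo is a hypothetical helper, and the sample text is illustrative rather than captured output; the ids epel and heterodb are the ones that appear in the transcripts of this chapter.

```shell
# has_repo ID: read "yum repolist" output on stdin and succeed when the
# repository ID is listed. A leading "!" or "*" and a trailing "/..."
# suffix on the id column are ignored.
has_repo() {
    awk -v id="$1" '
        NR > 0 { r = $1; sub(/^[!*]/, "", r); sub(/\/.*/, "", r) }
        r == id { found = 1 }
        END { exit (found ? 0 : 1) }'
}

# Illustrative sample only; on a real system use:
#   yum repolist enabled | has_repo heterodb
sample='repo id        repo name                               status
epel/x86_64    Extra Packages for Enterprise Linux 7   12,000
heterodb       HeteroDB Software Distribution Center   10'

for repo in epel heterodb; do
    if printf '%s\n' "$sample" | has_repo "$repo"; then
        echo "$repo: enabled"
    else
        echo "$repo: missing"
    fi
done
```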

CUDA Toolkit Installation

This section introduces the installation of CUDA Toolkit. If you already installed the latest CUDA Toolkit, you can skip this section.

NVIDIA offers two approaches to install CUDA Toolkit: one is a self-extracting archive (called runfile), and the other is RPM packages. We recommend the RPM installation because it allows simple software updates.

You can download the installation package for CUDA Toolkit from NVIDIA DEVELOPER ZONE. Choose your OS, architecture, distribution and version, then choose "rpm(network)" edition.

CUDA Toolkit download

The "rpm(network)" edition contains only the yum repository definition to distribute CUDA Toolkit, similar to the EPEL repository definition set up after the OS installation. So, you need to install the related RPM packages over the network after the registration of the CUDA repository. Run the following commands.

$ sudo rpm -i cuda-repo-<distribution>-<version>.x86_64.rpm
$ sudo yum clean all
$ sudo yum install cuda

Once installation completed successfully, CUDA Toolkit is deployed at /usr/local/cuda.

$ ls /usr/local/cuda
bin     include  libnsight         nvml       samples  tools
doc     jre      libnvvp           nvvm       share    version.txt
extras  lib64    nsightee_plugins  pkgconfig  src
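
The version.txt file in the listing above records which CUDA Toolkit release was deployed. The sketch below extracts the version number from it, assuming the file holds a line of the form "CUDA Version 9.1.85"; cuda_version is a hypothetical helper name.

```shell
# cuda_version [FILE]: print the version number recorded in version.txt.
# Assumes the file holds a line such as "CUDA Version 9.1.85".
cuda_version() {
    sed -n 's/^CUDA Version \([0-9][0-9.]*\).*/\1/p' \
        "${1:-/usr/local/cuda/version.txt}"
}

# Only query the default location when the file is actually there.
if [ -r /usr/local/cuda/version.txt ]; then
    echo "Installed CUDA Toolkit: $(cuda_version)"
fi
```

The printed version can then be compared against the 9.1 minimum stated in the checklist.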

Once the installation is completed, ensure the system recognizes the GPU devices correctly. The nvidia-smi command shows information on the GPUs installed on your system, as follows.

$ nvidia-smi
Wed Feb 14 09:43:48 2018
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 387.26                 Driver Version: 387.26                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla V100-PCIE...  Off  | 00000000:02:00.0 Off |                    0 |
| N/A   41C    P0    37W / 250W |      0MiB / 16152MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
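
For scripted checks, the one-line-per-device listing of nvidia-smi -L is easier to parse than the table above. The following sketch counts the devices in that listing; count_gpus is a hypothetical helper, and the sample line (including its UUID placeholder) is illustrative.

```shell
# count_gpus: read "nvidia-smi -L" output on stdin and print how many
# GPU devices it lists (lines of the form "GPU <n>: <name> (UUID: ...)").
count_gpus() {
    awk '/^GPU [0-9]+:/ { n++ } END { print n + 0 }'
}

# Illustrative input; on a real system use: nvidia-smi -L | count_gpus
n=$(printf 'GPU 0: Tesla V100-PCIE-16GB (UUID: GPU-00000000)\n' | count_gpus)
if [ "$n" -ge 1 ]; then
    echo "$n GPU device(s) found"
else
    echo "no GPU device found; PG-Strom needs at least one" >&2
fi
```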

Tip

If the nouveau driver, which conflicts with the nvidia driver, is loaded, the system cannot load the nvidia driver immediately. In this case, reboot the operating system once, then confirm whether you can run the nvidia-smi command successfully. The CUDA installer also disables the nouveau driver, so it will not be loaded on the next boot.
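
Whether a reboot is needed can be decided by looking at the list of loaded kernel modules. The sketch below parses lsmod output, whose first column is the module name; module_loaded is a hypothetical helper and the sample listing is illustrative.

```shell
# module_loaded NAME: read "lsmod" output on stdin and succeed when a
# module called NAME is listed (the header line is skipped).
module_loaded() {
    awk -v m="$1" 'NR > 1 && $1 == m { found = 1 }
                   END { exit (found ? 0 : 1) }'
}

# Illustrative input; on a real system use: lsmod | module_loaded nouveau
sample='Module                  Size  Used by
nouveau              1622010  0
nvme                   32382  2'

if printf '%s\n' "$sample" | module_loaded nouveau; then
    echo "nouveau is loaded: reboot before running nvidia-smi"
fi
```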

PostgreSQL Installation

This section introduces PostgreSQL installation with RPM. We do not cover installation from the source because many documents describe that approach, and the ./configure script has various options.

PostgreSQL is also distributed in the packages of Linux distributions; however, it is not the latest version, and is often older than the versions PG-Strom supports. For example, Red Hat Enterprise Linux 7.x or CentOS 7.x distributes the PostgreSQL v9.2.x series, which has already reached EOL in the PostgreSQL community.

PostgreSQL Global Development Group provides a yum repository to distribute the latest PostgreSQL and related packages. As with the configuration of EPEL, you can install a small package to set up the yum repository, then install PostgreSQL and related software.

Here is the list of yum repository definition: http://yum.postgresql.org/repopackages.php.

Repository definitions are per PostgreSQL major version and Linux distribution. You need to choose the one for your Linux distribution, and for PostgreSQL v9.6 or later.

All you need to install are the yum repository definition and the PostgreSQL packages. If you choose PostgreSQL v10, the packages below are required to install PG-Strom.

  • postgresql10-devel
  • postgresql10-server

$ sudo yum install -y https://download.postgresql.org/pub/repos/yum/10/redhat/rhel-7-x86_64/pgdg-redhat10-10-2.noarch.rpm
$ sudo yum install -y postgresql10-server postgresql10-devel
          :
================================================================================
 Package                  Arch        Version                 Repository   Size
================================================================================
Installing:
 postgresql10-devel       x86_64      10.2-1PGDG.rhel7        pgdg10      2.0 M
 postgresql10-server      x86_64      10.2-1PGDG.rhel7        pgdg10      4.4 M
Installing for dependencies:
 postgresql10             x86_64      10.2-1PGDG.rhel7        pgdg10      1.5 M
 postgresql10-libs        x86_64      10.2-1PGDG.rhel7        pgdg10      354 k

Transaction Summary
================================================================================
Install  2 Packages (+2 Dependent packages)
          :
Installed:
  postgresql10-devel.x86_64 0:10.2-1PGDG.rhel7
  postgresql10-server.x86_64 0:10.2-1PGDG.rhel7

Dependency Installed:
  postgresql10.x86_64 0:10.2-1PGDG.rhel7
  postgresql10-libs.x86_64 0:10.2-1PGDG.rhel7

Complete!

The RPM packages provided by PostgreSQL Global Development Group install software under the /usr/pgsql-<version> directory, so make sure that the PATH environment variable is configured appropriately.
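
One way to configure PATH by hand is sketched below for PostgreSQL v10. path_prepend is a hypothetical helper that adds a directory to the front of PATH only when it is not already listed, so it is safe to put in a shell profile that may be sourced more than once.

```shell
# path_prepend DIR: put DIR at the front of PATH unless already listed.
path_prepend() {
    case ":$PATH:" in
        *":$1:"*) ;;               # already present: nothing to do
        *) PATH="$1:$PATH" ;;
    esac
}

# Directory used by the PGDG packages for PostgreSQL v10.
path_prepend /usr/pgsql-10/bin
export PATH
```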

The postgresql-alternatives package sets up symbolic links to the related commands under /usr/local/bin, which simplifies operations. It also lets you switch the target version using the alternatives command when multiple versions of PostgreSQL are installed.

$ sudo yum install postgresql-alternatives
          :
Resolving Dependencies
--> Running transaction check
---> Package postgresql-alternatives.noarch 0:1.0-1.el7 will be installed
--> Finished Dependency Resolution

Dependencies Resolved
          :
================================================================================
 Package                      Arch        Version           Repository     Size
================================================================================
Installing:
 postgresql-alternatives      noarch      1.0-1.el7         heterodb      9.2 k

Transaction Summary
================================================================================
          :
Installed:
  postgresql-alternatives.noarch 0:1.0-1.el7

Complete!

PG-Strom Installation

RPM Installation

PG-Strom and related packages are distributed from HeteroDB Software Distribution Center. If the repository definition has already been added, few tasks remain.

We provide individual RPM packages of PG-Strom for each base PostgreSQL version. The pg_strom-PG96 package is built for PostgreSQL 9.6, and pg_strom-PG10 is built for PostgreSQL v10.

$ sudo yum install pg_strom-PG10
          :
================================================================================
 Package              Arch          Version               Repository       Size
================================================================================
Installing:
 pg_strom-PG10        x86_64        1.9-180301.el7        heterodb        320 k

Transaction Summary
================================================================================
          :
Installed:
  pg_strom-PG10.x86_64 0:1.9-180301.el7

Complete!

That's all for package installation.
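
The package name follows the PostgreSQL major version: pg_strom-PG96 for 9.6 and pg_strom-PG10 for v10. The sketch below derives the name from a version string. pgstrom_pkg is a hypothetical helper, and applying the same pattern to versions other than 9.6 and 10 is an assumption, not something this chapter states.

```shell
# pgstrom_pkg VERSION: print the PG-Strom package name for a PostgreSQL
# version such as "9.6.8" or "10.2". Before v10 the major version has
# two components (9.6); from v10 on it has one (10).
pgstrom_pkg() {
    major=${1%%.*}
    if [ "$major" -ge 10 ]; then
        echo "pg_strom-PG${major}"
    else
        rest=${1#*.}
        echo "pg_strom-PG${major}${rest%%.*}"
    fi
}

pgstrom_pkg 10.2     # prints: pg_strom-PG10
pgstrom_pkg 9.6.8    # prints: pg_strom-PG96
```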

Installation from the source

For developers, we also introduce the steps to build and install PG-Strom from the source code.

Getting the source code

Like RPM packages, you can download a tarball of the source code from HeteroDB Software Distribution Center. On the other hand, because there is a certain time lag before the tarball is released, it may be preferable to check out the master branch of PG-Strom on GitHub to use the latest development version.

$ git clone https://github.com/heterodb/pg-strom.git
Cloning into 'pg-strom'...
remote: Counting objects: 13797, done.
remote: Compressing objects: 100% (215/215), done.
remote: Total 13797 (delta 208), reused 339 (delta 167), pack-reused 13400
Receiving objects: 100% (13797/13797), 11.81 MiB | 1.76 MiB/s, done.
Resolving deltas: 100% (10504/10504), done.

Building PG-Strom

The build configuration of PG-Strom must strictly match that of the target PostgreSQL. For example, if a particular struct has an inconsistent layout because of a difference in build configuration, it may lead to bugs that are hard to track down. To avoid such inconsistency, PG-Strom does not have its own configure script, but references the build configuration of PostgreSQL using the pg_config command.

If the PATH environment variable includes the pg_config command of the target PostgreSQL, run make and make install. Otherwise, give the PG_CONFIG=... parameter to the make command to specify the full path of the pg_config command.

$ cd pg-strom
$ make PG_CONFIG=/usr/pgsql-10/bin/pg_config
$ sudo make install PG_CONFIG=/usr/pgsql-10/bin/pg_config
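
The choice between PATH lookup and an explicit PG_CONFIG can be wrapped in a small function so that both make invocations use the same pg_config. This is a sketch under that assumption; resolve_pg_config is a hypothetical helper and not part of the PG-Strom build system.

```shell
# resolve_pg_config: print the pg_config path to build against.
# An explicit PG_CONFIG wins but must be executable; otherwise the
# first pg_config found on PATH is used. Fails when neither exists.
resolve_pg_config() {
    if [ -n "${PG_CONFIG:-}" ]; then
        [ -x "$PG_CONFIG" ] || return 1
        printf '%s\n' "$PG_CONFIG"
    else
        command -v pg_config
    fi
}

# Possible use from the pg-strom source directory:
#   cfg=$(resolve_pg_config) || { echo "pg_config not found" >&2; exit 1; }
#   make PG_CONFIG="$cfg" && sudo make install PG_CONFIG="$cfg"
```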

Post Installation Setup

Creation of database cluster

The database cluster is not constructed yet, so run the initdb command to set up the initial database of PostgreSQL.

The default path of the database cluster on RPM installation is /var/lib/pgsql/<version number>/data. If you installed the postgresql-alternatives package, this default path can be referenced by /var/lib/pgdata regardless of the PostgreSQL version.

$ sudo su - postgres
$ initdb -D /var/lib/pgdata/
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.

The database cluster will be initialized with locale "en_US.UTF-8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".

Data page checksums are disabled.

fixing permissions on existing directory /var/lib/pgdata ... ok
creating subdirectories ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting dynamic shared memory implementation ... posix
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... ok
syncing data to disk ... ok

WARNING: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.

Success. You can now start the database server using:

    pg_ctl -D /var/lib/pgdata/ -l logfile start
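
Because initdb writes a PG_VERSION file into the data directory, a setup script can use that file to avoid running initdb a second time. A minimal sketch, with cluster_initialized as a hypothetical helper name:

```shell
# cluster_initialized DIR: succeed when DIR already holds a database
# cluster, detected by the PG_VERSION file that initdb creates.
cluster_initialized() {
    [ -f "$1/PG_VERSION" ]
}

if cluster_initialized /var/lib/pgdata; then
    echo "database cluster already exists"
else
    echo "no database cluster yet: run initdb -D /var/lib/pgdata/"
fi
```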

Setup postgresql.conf

Next, edit postgresql.conf, the configuration file of PostgreSQL. At least the parameters below must be edited for PG-Strom to work. Review other parameters according to the usage of the system and expected workloads.

  • shared_preload_libraries
    • The PG-Strom module must be loaded at startup of the postmaster process through shared_preload_libraries; it cannot be loaded on demand. Therefore, you must add the configuration below.
    • shared_preload_libraries = '$libdir/pg_strom'
  • max_worker_processes
    • PG-Strom internally uses several background workers, so the default configuration (= 8) leaves too few for other uses. We recommend raising the value to leave a certain margin.
    • max_worker_processes = 100
  • shared_buffers
    • Although it depends on the workload, the initial configuration of shared_buffers is too small for the data sizes PG-Strom is meant to handle; storage workloads then restrict overall performance, and the GPU may not be used efficiently.
    • So, we recommend raising the value to leave a certain margin.
    • shared_buffers = 10GB
    • Consider applying SSD-to-GPU Direct SQL Execution to process data larger than the system's physical RAM size.
    • Consider applying Columnar Cache if you want to cache particular tables.
  • work_mem
    • Although it depends on the workload, the initial configuration of work_mem is too small to choose the optimal query execution plan for analytic queries.
    • A typical example is that a disk-based merge sort may be chosen instead of an in-memory quick sort.
    • So, we recommend raising the value to leave a certain margin.
    • work_mem = 1GB
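
The four settings above can be appended to postgresql.conf in one step. The sketch below is one possible way to do so; append_pgstrom_conf is a hypothetical helper, and the values are simply the examples from the list, not tuned recommendations for any particular machine.

```shell
# append_pgstrom_conf FILE: append the example settings to FILE.
# The quoted here-document keeps "$libdir" literal.
append_pgstrom_conf() {
    cat >> "$1" <<'EOF'

# PG-Strom settings (example values; tune for your workload)
shared_preload_libraries = '$libdir/pg_strom'
max_worker_processes = 100
shared_buffers = 10GB
work_mem = 1GB
EOF
}

# Possible use, as the postgres user:
#   append_pgstrom_conf /var/lib/pgdata/postgresql.conf
```

When a parameter appears more than once in postgresql.conf, PostgreSQL uses the last occurrence, so the appended lines take precedence over earlier entries. If shared_preload_libraries already lists other modules, merge the values by hand instead of appending.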

Start PostgreSQL

Start PostgreSQL service.

If PG-Strom is set up appropriately, it writes out a log message that shows the GPU devices PG-Strom recognized. In the example below, it recognized a Tesla V100 (PCIe; 16GB edition) device.

# systemctl start postgresql-10
# systemctl status -l postgresql-10
* postgresql-10.service - PostgreSQL 10 database server
   Loaded: loaded (/usr/lib/systemd/system/postgresql-10.service; disabled; vendor preset: disabled)
   Active: active (running) since Sat 2018-03-03 15:45:23 JST; 2min 21s ago
     Docs: https://www.postgresql.org/docs/10/static/
  Process: 24851 ExecStartPre=/usr/pgsql-10/bin/postgresql-10-check-db-dir ${PGDATA} (code=exited, status=0/SUCCESS)
 Main PID: 24858 (postmaster)
   CGroup: /system.slice/postgresql-10.service
           |-24858 /usr/pgsql-10/bin/postmaster -D /var/lib/pgsql/10/data/
           |-24890 postgres: logger process
           |-24892 postgres: bgworker: PG-Strom GPU memory keeper
           |-24896 postgres: checkpointer process
           |-24897 postgres: writer process
           |-24898 postgres: wal writer process
           |-24899 postgres: autovacuum launcher process
           |-24900 postgres: stats collector process
           |-24901 postgres: bgworker: PG-Strom ccache-builder2
           |-24902 postgres: bgworker: PG-Strom ccache-builder1
           `-24903 postgres: bgworker: logical replication launcher

Mar 03 15:45:19 saba.heterodb.com postmaster[24858]: 2018-03-03 15:45:19.195 JST [24858] HINT:  Run 'nvidia-cuda-mps-control -d', then start server process. Check 'man nvidia-cuda-mps-control' for more details.
Mar 03 15:45:20 saba.heterodb.com postmaster[24858]: 2018-03-03 15:45:20.509 JST [24858] LOG:  PG-Strom: GPU0 Tesla V100-PCIE-16GB (5120 CUDA cores; 1380MHz, L2 6144kB), RAM 15.78GB (4096bits, 856MHz), CC 7.0
Mar 03 15:45:20 saba.heterodb.com postmaster[24858]: 2018-03-03 15:45:20.510 JST [24858] LOG:  NVRTC - CUDA Runtime Compilation vertion 9.1
Mar 03 15:45:23 saba.heterodb.com postmaster[24858]: 2018-03-03 15:45:23.378 JST [24858] LOG:  listening on IPv6 address "::1", port 5432
Mar 03 15:45:23 saba.heterodb.com postmaster[24858]: 2018-03-03 15:45:23.378 JST [24858] LOG:  listening on IPv4 address "127.0.0.1", port 5432
Mar 03 15:45:23 saba.heterodb.com postmaster[24858]: 2018-03-03 15:45:23.442 JST [24858] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
Mar 03 15:45:23 saba.heterodb.com postmaster[24858]: 2018-03-03 15:45:23.492 JST [24858] LOG:  listening on Unix socket "/tmp/.s.PGSQL.5432"
Mar 03 15:45:23 saba.heterodb.com postmaster[24858]: 2018-03-03 15:45:23.527 JST [24858] LOG:  redirecting log output to logging collector process
Mar 03 15:45:23 saba.heterodb.com postmaster[24858]: 2018-03-03 15:45:23.527 JST [24858] HINT:  Future log output will appear in directory "log".
Mar 03 15:45:23 saba.heterodb.com systemd[1]: Started PostgreSQL 10 database server.
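
To pick the GPU detection lines out of a long startup log, a filter like the one below may help. It is a sketch that relies only on the message format shown in the log above; pgstrom_gpus is a hypothetical helper name.

```shell
# pgstrom_gpus: read server log lines on stdin and print "GPUn <name>"
# for each device reported by PG-Strom at startup.
pgstrom_gpus() {
    sed -n 's/.*PG-Strom: \(GPU[0-9][0-9]*\) \(.*\) ([0-9][0-9]* CUDA cores.*/\1 \2/p'
}

# Sample line shortened from the log above.
echo 'LOG:  PG-Strom: GPU0 Tesla V100-PCIE-16GB (5120 CUDA cores; 1380MHz, L2 6144kB), RAM 15.78GB (4096bits, 856MHz), CC 7.0' | pgstrom_gpus
# prints: GPU0 Tesla V100-PCIE-16GB
```

With systemd, the input could come from journalctl -u postgresql-10. Empty output suggests that PG-Strom was not loaded or found no supported GPU.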

Finally, create database objects related to PG-Strom, such as SQL functions. These steps are packaged using the EXTENSION feature of PostgreSQL, so all you need to run is CREATE EXTENSION on the SQL command line.

Please note that this step is needed for each new database. If you want PG-Strom to be pre-configured when a new database is created, create the PG-Strom extension in the template1 database; its configuration is copied to the new database by the CREATE DATABASE command.

$ psql postgres -U postgres
psql (10.2)
Type "help" for help.

postgres=# CREATE EXTENSION pg_strom ;
CREATE EXTENSION
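
The same statement can be issued non-interactively, for example against template1 so that databases created later inherit the extension. The wrapper below is a sketch: create_pgstrom_extension and the PSQL override are hypothetical, and IF NOT EXISTS is added so that repeated runs do not fail.

```shell
# create_pgstrom_extension DBNAME: run CREATE EXTENSION in one database.
# PSQL may be overridden, for example with "echo" for a dry run.
create_pgstrom_extension() {
    "${PSQL:-psql}" -U postgres -d "$1" \
        -c 'CREATE EXTENSION IF NOT EXISTS pg_strom'
}

# Dry run: print the psql arguments instead of connecting to a server.
( PSQL=echo; create_pgstrom_extension template1 )
# prints: -U postgres -d template1 -c CREATE EXTENSION IF NOT EXISTS pg_strom
```

Without the PSQL override, create_pgstrom_extension template1 connects to the running server and creates the extension there.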

That's all for the installation.

NVME-Strom module

This section introduces the NVME-Strom Linux kernel module. Although it is an independent software module, it cooperates closely with core features of PG-Strom such as SSD-to-GPU Direct SQL Execution.

Getting the module and installation

Like other PG-Strom related modules, NVME-Strom is distributed at the HeteroDB Software Distribution Center (https://heterodb.github.io/swdc/) as free software; that is, it is free of charge, but it is not open source software.

If the heterodb-swdc package is already set up on your system, the yum install command downloads the RPM file and installs the nvme_strom package.

$ sudo yum install nvme_strom
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
 * base: mirrors.cat.net
 * epel: ftp.iij.ad.jp
 * extras: mirrors.cat.net
 * ius: mirrors.kernel.org
 * updates: mirrors.cat.net
Resolving Dependencies
--> Running transaction check
---> Package nvme_strom.x86_64 0:1.3-1.el7 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

================================================================================
 Package             Arch            Version            Repository         Size
================================================================================
Installing:
 nvme_strom          x86_64          1.3-1.el7          heterodb          273 k

Transaction Summary
================================================================================
Install  1 Package

Total download size: 273 k
Installed size: 1.5 M
Is this ok [y/d/N]: y
Downloading packages:
No Presto metadata available for heterodb
nvme_strom-1.3-1.el7.x86_64.rpm                            | 273 kB   00:00
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
  Installing : nvme_strom-1.3-1.el7.x86_64                                  1/1
  :
<snip>
  :
DKMS: install completed.
  Verifying  : nvme_strom-1.3-1.el7.x86_64                                  1/1

Installed:
  nvme_strom.x86_64 0:1.3-1.el7

Complete!

License activation

License activation is needed to use all the features of the NVME-Strom module provided by HeteroDB,Inc. You can operate the system without a license, but the features below are restricted.

  • Striping support (md-raid0) at SSD-to-GPU Direct SQL Execution
  • Compression support at in-memory columnar cache

You can obtain a license file, a plain text file like the one below, from HeteroDB,Inc.

IAwPOAC44m8LPMoV7bMykhxM27LAVrktspcaMHki8pI1fXrxq0KzqPDK4LzAA9n26IRAr/4ymB6QJ3/JxZOfYTVsbWq66vEtTAIuZVmJ/I888zRATj1hoofh1WbIwd3/ix28Cy1v16KCgLrlqPsra6NJScMOOHnuYoWWmWe4ml+n6GVEIb7ChUJvZbEZSO/DiLXosFc0N+MD4JTEU/XsBUP9ufacpbosW/YG2nOib3mpvhkfn7RQy2T5CVQeuGjM9Taj7DN5xipqU/Q0hZaZKA8EsZwsB6b4c7usdmPILyIpuTrWnEbjJ6worOQWHA+nL87xkDL1XYGH6UVc291QPLwk=
----
VERSION:1
SERIAL_NR:HDB-TRIAL
ISSUED_AT:2018-08-16
EXPIRED_AT:2018-09-15
NR_GPUS:1
LICENSEE_ORG:Capybara Kingdom
LICENSEE_NAME:Herr.Wassershweine
LICENSEE_MAIL:capybara@examplecom

Copy the license file to /etc/heterodb.license, then restart PostgreSQL.

The startup log messages of PostgreSQL include the license information, which tells us that the license activation succeeded.

$ pg_ctl restart
   :
LOG:  PG-Strom built for PostgreSQL 11
LOG:  PG-Strom: GPU0 Tesla V100-PCIE-16GB (5120 CUDA cores; 1380MHz, L2 6144kB), RAM 15.78GB (4096bits, 856MHz), CC 7.0
   :
   :
LOG:  HeteroDB License: { "version" : 1, "serial_nr" : "HDB-TRIAL", "issued_at" : "16-Aug-2018", "expired_at" : "15-Sep-2018", "nr_gpus" : 1, "licensee_org" : "Capybara Kingdom", "licensee_name" : "Herr.Wassershweine", "licensee_mail" : "capybara@examplecom" }
LOG:  listening on IPv6 address "::1", port 5432
LOG:  listening on IPv4 address "127.0.0.1", port 5432
LOG:  listening on Unix socket "/tmp/.s.PGSQL.5432"
   :
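
The readable part of the license file consists of KEY:value lines, so the expiry date can also be checked from a script before restarting the server. The sketch below relies only on the layout of the example file above. license_field and license_valid_on are hypothetical helpers, and treating EXPIRED_AT as the last valid day is an assumption.

```shell
# license_field FILE KEY: print the value of the "KEY:value" line.
license_field() {
    awk -F: -v key="$2" \
        '$1 == key { sub(/^[^:]*:/, ""); print; exit }' "$1"
}

# license_valid_on FILE YYYY-MM-DD: succeed when the given day is not
# after EXPIRED_AT (assumed to be the last valid day).
license_valid_on() {
    expiry=$(license_field "$1" EXPIRED_AT | tr -d '-')
    day=$(printf '%s' "$2" | tr -d '-')
    [ -n "$expiry" ] && [ "$day" -le "$expiry" ]
}

if [ -r /etc/heterodb.license ]; then
    if license_valid_on /etc/heterodb.license "$(date +%F)"; then
        echo "license is valid until $(license_field /etc/heterodb.license EXPIRED_AT)"
    else
        echo "license has expired or could not be parsed" >&2
    fi
fi
```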

Kernel module parameters

NVME-Strom Linux kernel module has some parameters.

Parameter      Type  Default  Description
verbose        int   0        Enables detailed debug output
stat_info      int   1        Enables performance statistics
fast_ssd_mode  int   0        Operating mode for fast NVME-SSD
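
Linux kernel modules generally take their parameters from "options" lines in files under /etc/modprobe.d/, and often expose the current values under /sys/module/<module>/parameters/. The sketch below uses those generic mechanisms. Whether nvme_strom exposes each parameter through sysfs is an assumption, and module_param, modprobe_options, and the SYSFS_ROOT override are hypothetical names.

```shell
# module_param MODULE NAME: print the current value of a module
# parameter. SYSFS_ROOT is an override for dry runs and tests only.
module_param() {
    cat "${SYSFS_ROOT:-/sys}/module/$1/parameters/$2"
}

# modprobe_options MODULE NAME=VALUE...: print an "options" line
# suitable for a file under /etc/modprobe.d/.
modprobe_options() {
    mod=$1
    shift
    printf 'options %s' "$mod"
    for kv in "$@"; do
        printf ' %s' "$kv"
    done
    printf '\n'
}

modprobe_options nvme_strom fast_ssd_mode=1 verbose=0
# prints: options nvme_strom fast_ssd_mode=1 verbose=0
```

A line written this way takes effect the next time the module is loaded.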

Here is an extra explanation for the fast_ssd_mode parameter.

When the NVME-Strom Linux kernel module gets a request for SSD-to-GPU direct data transfer, it first checks whether the required data blocks are cached in the page cache of the operating system. If fast_ssd_mode is 0, NVME-Strom writes the cached pages of the required data blocks back to the userspace buffer of the caller, then tells the application to invoke a normal host-->device data transfer through the CUDA API. This is suitable for NVME-SSDs that are not fast, such as PCIe x4 grade.

On the other hand, if you use a fast PCIe x8 grade NVME-SSD or multiple SSDs in striping mode, SSD-to-GPU direct data transfer may be faster than a normal host-->device data transfer after the buffer copy. If fast_ssd_mode is not 0, NVME-Strom kicks off SSD-to-GPU direct data transfer regardless of the page cache state.

However, it never kicks off SSD-to-GPU direct data transfer if the page cache is dirty.
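
The rules for fast_ssd_mode can be summarized as a small decision function. This is only a restatement of the description in this section, not the kernel module's code. transfer_path is a hypothetical name, and the case of blocks that are not cached at all (direct transfer) is inferred rather than stated.

```shell
# transfer_path CACHED DIRTY FAST_SSD_MODE
#   CACHED: 1 when the requested blocks are in the OS page cache, else 0
#   DIRTY:  1 when those cached pages are dirty, else 0
# Prints "direct" for SSD-to-GPU direct transfer, or "host-copy" for the
# copy to a userspace buffer followed by a normal host-->device transfer.
transfer_path() {
    cached=$1
    dirty=$2
    fast=$3
    if [ "$cached" -eq 0 ]; then
        echo direct          # nothing cached (inferred case)
    elif [ "$dirty" -ne 0 ]; then
        echo host-copy       # dirty pages are never transferred directly
    elif [ "$fast" -ne 0 ]; then
        echo direct          # fast SSD: ignore the clean page cache
    else
        echo host-copy       # default mode: reuse the cached pages
    fi
}

transfer_path 1 0 0    # prints: host-copy
transfer_path 1 0 1    # prints: direct
transfer_path 1 1 1    # prints: host-copy
```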