SQL Objects

This chapter introduces SQL objects additionally provided by PG-Strom.

System Information

pgstrom.device_info System View

It shows properties of GPU devices installed for PG-Strom.

Below is schema definition of the view.

name type description
gpu_id int GPU device number
att_name text Attribute name
att_value text Attribute value
att_desc text Attribute description

There are various kind of GPU device properties, but depending on the CUDA driver version where system is running. So, pgstrom.device_info system view identifies the target property by GPU device identifier (gpu_id) and attribute name (att_name).

Below is an example of pgstrom.device_info system view.

postgres=# select * from pgstrom.gpu_device_info limit 10;
 gpu_id |       att_name        |                att_value                 |              att_desc
--------+-----------------------+------------------------------------------+-------------------------------------
      0 | DEV_NAME              | NVIDIA A100-PCIE-40GB                    | GPU Device Name
      0 | DEV_ID                | 0                                        | GPU Device ID
      0 | DEV_UUID              | GPU-13943bfd-5b30-38f5-0473-78979c134606 | GPU Device UUID
      0 | DEV_TOTAL_MEMSZ       | 39.39GB                                  | GPU Total RAM Size
      0 | DEV_BAR1_MEMSZ        | 64.00GB                                  | GPU PCI Bar1 Size
      0 | NUMA_NODE_ID          | -1                                       | GPU NUMA Node Id
      0 | MAX_THREADS_PER_BLOCK | 1024                                     | Maximum number of threads per block
      0 | MAX_BLOCK_DIM_X       | 1024                                     | Maximum block dimension X
      0 | MAX_BLOCK_DIM_Y       | 1024                                     | Maximum block dimension Y
      0 | MAX_BLOCK_DIM_Z       | 64                                       | Maximum block dimension Z
(10 rows)

Arrow_Fdw

fdw_handler pgstrom.arrow_fdw_handler()
FDW handler function of Arrow_Fdw. Usually, users don't need to invoke this function.
void pgstrom.arrow_fdw_validator(text[], oid)
FDW options validation function of Arrow_Fdw. Usually, users don't need to invoke this function.
event_trigger pgstrom.arrow_fdw_precheck_schema()
Event trigger function to validate schema definition of Arrow files. Usually, users don't need to invoke this function.
void pgstrom.arrow_fdw_import_file(text, text, text = null)

This function tries to import Apache Arrow file, and defines a new foreign table. Its first argument is name of the new foreign table, the second argument is path of the Apache Arrow file, and the optional third argument is the schema name.

This function is similar to IMPORT FOREIGN SCHEMA statement, but allows to import Apache Arrow files that have wider fields than the limitation of number of columns in PostgreSQL (MaxTupleAttributeNumber = 1664). So, we recommend to use IMPORT FOREIGN SCHEMA statement for most cases.

The example below shows the steps to import an Apache Arrow file with 2000 of Int16 fields by the pgstrom.arrow_fdw_import_file. The result of \d mytest shows this foreign table has 2000 fields.

Due to the internal data format of PostgreSQL, it is not possible to read all the columns at once, but possible to read a part of columns like the last example.

=# select pgstrom.arrow_fdw_import_file('mytest', '/tmp/wide2000.arrow');
 arrow_fdw_import_file
-----------------------

(1 row)

=# \d
            List of relations
 Schema |  Name  |     Type      | Owner
--------+--------+---------------+--------
 public | mytest | foreign table | kaigai
(1 row)


=# \d mytest
                    Foreign table "public.mytest"
  Column   |   Type   | Collation | Nullable | Default | FDW options
-----------+----------+-----------+----------+---------+-------------
 object_id | integer  |           |          |         |
 c000      | smallint |           |          |         |
 c001      | smallint |           |          |         |
 c002      | smallint |           |          |         |
 c003      | smallint |           |          |         |
   :             :          :          :          :            :
 c1997     | smallint |           |          |         |
 c1998     | smallint |           |          |         |
 c1999     | smallint |           |          |         |
Server: arrow_fdw
FDW options: (file '/tmp/wide2000.arrow')

=# select * from mytest ;
ERROR:  target lists can have at most 1664 entries

=# select c0010,c1234,c1999 from mytest limit 3;
 c0010 | c1234 | c1999
-------+-------+-------
   232 |   232 |   232
   537 |   537 |   537
   219 |   219 |   219
(3 rows)

GPU Cache

pgstrom.gpucache_info System View
It shows the current status of GPU Cache.
Below is schema definition of the view.
name type description
database_oid oid Database OID where the table with GPU Cache belongs to.
database_name text Database name where the table with GPU Cache belongs to.
table_oid oid Table OID that has GPU Cache. Note that it may not be in the current database.
table_name text Table name that has GPU Cache. Note that it may not be in the current database.
signature int8 An identifier hash value of GPU Cache. It may be changed after ALTER TABLE for example.
phase text Phase of GPU cache construction: either not_built, is_empty, is_loading, is_ready, or corrupted
rowid_num_used int8 Number of allocated row-id
rowid_num_free int8 Number of free row-id
gpu_main_sz int8 Size of the fixed-length values area on the GPU Cache.
gpu_main_nitems int8 Number of tuples on the GPU Cache.
gpu_extra_sz int8 Size of the variable-length values area on the GPU Cache.
gpu_extra_usage int8 Size of the used variable-length values area on the GPU Cache.
gpu_extra_dead int8 Size of the free variable-length values area on the GPU Cache.
redo_write_ts timestamptz Last update timestamp on the REDO Log buffer
redo_write_nitems int8 Total number of REDO Log entries written to the REDO Log buffer.
redo_write_pos int8 Total bytes of REDO Log entries written to the REDO Log buffer.
redo_read_nitems int8 Total number of REDO Log entries read from REDO Log buffer, and already applied to.
redo_read_pos int8 Total bytes of REDO Log entries read from REDO Log buffer, and already applied to.
redo_sync_pos int8 The latest position on the REDO Log buffer, where it is already required the background worker to synchronize onto the GPU Cache. When free space of REDO Log buffer becomes tight, it is internally used to avoid flood of simultaneous asynchronized requests by many sessions.
config_options text Options string of the GPU Cache

Below is an example of pgstrom.gpucache_info system view.

=# select * from pgstrom.gpucache_info ;
 database_oid | database_name | table_oid |    table_name    | signature  |  phase   | rowid_num_used | rowid_num_free | gpu_main_sz | gpu_main_nitems | gpu_extra_sz | gpu_extra_usage | gpu_extra_dead |         redo_write_ts         | redo_write_nitems | redo_write_pos | redo_read_nitems | redo_read_pos |
redo_sync_pos |                                                   config_options
--------------+---------------+-----------+------------------+------------+----------+----------------+----------------+-------------+-----------------+--------------+-----------------+----------------+-------------------------------+-------------------+----------------+------------------+---------------+---------------+---------------------------------------------------------------------------------------------------------------------
       193450 | hoge          |    603029 | cache_test_table | 4529357070 | is_ready |           4000 |           6000 |      439904 |            4000 |      3200024 |          473848 |              0 | 2023-12-18 01:25:42.850193+09 |              4000 |         603368 |             4000 |        603368 |       603368 | gpu_device_id=0,max_num_rows=10000,redo_buffer_size=157286400,gpu_sync_interval=4000000,gpu_sync_threshold=10485760
(1 row)
trigger pgstrom.gpucache_sync_trigger()
A trigger function to synchronize GPU Cache on table updates. See GPU Cache chapter for more details.
bigint pgstrom.gpucache_apply_redo(regclass)
If the given table has GPU Cache configured, it forcibly applies the REDO log entries onto the GPU Cache.
bigint pgstrom.gpucache_compaction(regclass)
If the given table has GPU Cache configured, it forcibly run compaction of the variable-length data buffer.
bigint pgstrom.gpucache_recovery(regclass)
It tries to recover the corrupted GPU cache.

Test Data Generator

void pgstrom.random_setseed(int)
It initializes the random seed.
bigint pgstrom.random_int(float=0.0, bigint=0, bigint=INT_MAX)
It generates random data in bigint type within the range.
float pgstrom.random_float(float=0.0, float=0.0, float=1.0)
It generates random data in float type within the range.
date pgstrom.random_date(float=0.0, date='2015-01-01', date='2025-12-31')
It generates random data in date type within the range.
time pgstrom.random_time(float=0.0, time='00:00:00', time='23:59:59')
It generates random data in time type within the range.
timetz pgstrom.random_timetz(float=0.0, time='00:00:00', time='23:59:59')
It generates random data in timetz type within the range.
timestamp pgstrom.random_timestamp(float=0.0, timestamp='2015-01-01', timestamp='2025-01-01')
It generates random data in timestamp type within the range.
macaddr pgstrom.random_macaddr(float=0.0, macaddr='ab:cd:00:00:00', macaddr='ab:cd:ff:ff:ff:ff')
It generates random data in macaddr type within the range.
inet pgstrom.random_inet(float=0.0, inet='192.168.0.1/16')
It generates random data in inet type within the range.
text pgstrom.random_text(float=0.0, text='test_**')
It generates random data in text type. The '*' characters in 2nd argument shall be replaced randomly.
text pgstrom.random_text_len(float=0.0, int=10)
It generates random data in text type within the specified length.
int4range pgstrom.random_int4range(float=0.0, bigint=0, bigint=INT_MAX)
It generates random data in int4range type within the range.}
int8range pgstrom.random_int8range(float=0.0, bigint=0, bigint=LONG_MAX)
It generates random data in int8range type within the range.
tsrange pgstrom.random_tsrange(float=0.0, timestamp='2015-01-01', timestamp='2025-01-01')
It generates random data in tsrange type within the range.
tstzrange pgstrom.random_tstzrange(float=0.0, timestamptz='2015-01-01', timestamptz='2025-01-01')
It generates random data in tstzrange type within the range.
daterange pgstrom.random_daterange(float=0.0, date='2015-01-01', date='2025-12-31')
It generates random data in daterange type within the range.

Other Functions

text pgstrom.githash()
It displays the hash value of the source code revision from the currently loaded PG-Strom module is based. This value is useful in determining the software revision in the event of a failure.
postgres=# select pgstrom.githash();
                 githash
------------------------------------------
 103984be24cafd1e7ce6330a050960d97675c196
text pgstrom.license_query()
It displays the active commercial subscription, if loaded.
=# select pgstrom.license_query();
                                                                               license_query
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 { "version" : 2, "serial_nr" : "HDB-TRIAL", "issued_at" : "2020-11-24", "expired_at" : "2025-12-31", "gpus" : [ { "uuid" : "GPU-8ba149db-53d8-c5f3-0f55-97ce8cfadb28" } ]}
(1 row)