SQL Objects
This chapter introduces SQL objects additionally provided by PG-Strom.
System Information
pgstrom.device_info
System View- It shows properties of GPU devices installed for PG-Strom.
Below is schema definition of the view.
name | type | description |
---|---|---|
device_nr | int |
GPU device number |
aindex | int |
Attribute index |
attribute | text |
Attribute name |
value | text |
Attribute value |
There are various kind of GPU device properties, but depending on the CUDA driver version where system is running.
So, pgstrom.device_info
system view identifies the target property by GPU device number (device_nr
) and attribute index (aindex
).
Below is an example of pgstrom.device_info
system view.
postgres=# select * from pgstrom.device_info limit 10;
device_nr | aindex | attribute | value
-----------+--------+-------------------------------------+------------------------------------------
0 | 0 | GPU Device Name | NVIDIA Tesla V100-PCIE-16GB
0 | 1 | GPU Device Brand | TESLA
0 | 2 | GPU Device UUID | GPU-8ba149db-53d8-c5f3-0f55-97ce8cfadb28
0 | 3 | GPU Total RAM Size | 15.78GB
0 | 4 | GPU PCI Bar1 Size | 16.00GB
0 | 5 | Maximum number of threads per block | 1024
0 | 6 | Maximum block dimension X | 1024
0 | 7 | Maximum block dimension Y | 1024
0 | 8 | Maximum block dimension Z | 64
0 | 9 | Maximum grid dimension X | 2147483647
(10 rows)
Arrow_Fdw
fdw_handler pgstrom.arrow_fdw_handler()
- FDW handler function of Arrow_Fdw. Usually, users don't need to invoke this function.
void pgstrom.arrow_fdw_validator(text[], oid)
- FDW options validation function of Arrow_Fdw. Usually, users don't need to invoke this function.
void pgstrom.arrow_fdw_import_file(text,text,text = null)
- This function tries to import Apache Arrow file, and defines a new foreign table. Its first argument is name of the new foreign table, the second argument is path of the Apache Arrow file, and the optional third argument is the schema name.
- This function is similar to
IMPORT FOREIGN SCHEMA
statement, but allows to import Apache Arrow files that have wider fields than the limitation of number of columns in PostgreSQL (MaxTupleAttributeNumber
= 1664). So, we recommend to useIMPORT FOREIGN SCHEMA
statement for most cases. - The example below shows the steps to import an Apache Arrow file with 2000 of Int16 fields by the
pgstrom.arrow_fdw_import_file
. The result of\d mytest
shows this foreign table has 2000 fields. - Due to the internal data format of PostgreSQL, it is not possible to read all the columns at once, but possible to read a part of columns like the last example.
=# select pgstrom.arrow_fdw_import_file('mytest', '/tmp/wide2000.arrow');
arrow_fdw_import_file
-----------------------
(1 row)
=# \d
List of relations
Schema | Name | Type | Owner
--------+--------+---------------+--------
public | mytest | foreign table | kaigai
(1 row)
=# \d mytest
Foreign table "public.mytest"
Column | Type | Collation | Nullable | Default | FDW options
-----------+----------+-----------+----------+---------+-------------
object_id | integer | | | |
c000 | smallint | | | |
c001 | smallint | | | |
c002 | smallint | | | |
c003 | smallint | | | |
: : : : : :
c1997 | smallint | | | |
c1998 | smallint | | | |
c1999 | smallint | | | |
Server: arrow_fdw
FDW options: (file '/tmp/wide2000.arrow')
=# select * from mytest ;
ERROR: target lists can have at most 1664 entries
=# select c0010,c1234,c1999 from mytest limit 3;
c0010 | c1234 | c1999
-------+-------+-------
232 | 232 | 232
537 | 537 | 537
219 | 219 | 219
(3 rows)
GPU Cache
pgstrom.gpucache_info
System View- It shows the current status of GPU Cache.
Below is schema definition of the view.
name | type | description |
---|---|---|
database_oid |
oid |
Database OID where the table with GPU Cache belongs to. |
database_name |
text |
Database name where the table with GPU Cache belongs to. |
table_oid |
oid |
Table OID that has GPU Cache. Note that it may not be in the current database. |
table_name |
text |
Table name that has GPU Cache. Note that it may not be in the current database. |
signature |
int8 |
An identifier hash value of GPU Cache. It may be changed after ALTER TABLE for example. |
refcnt |
int4 |
Reference counter of the GPU cache. It does not always reflect the latest status. |
corrupted |
bool |
It shows whether GPU cache is corrupted, or not. |
gpu_main_sz |
int8 |
Size of the fixed-length values area on the GPU Cache. |
gpu_extra_sz |
int8 |
Size of the variable-length values area on the GPU Cache |
redo_write_ts |
timestamptz |
Last update timestamp on the REDO Log buffer |
redo_write_nitems |
int8 |
Total number of REDO Log entries written to the REDO Log buffer. |
redo_write_pos |
int8 |
Total bytes of REDO Log entries written to the REDO Log buffer. |
redo_read_nitems |
int8 |
Total number of REDO Log entries read from REDO Log buffer, and already applied to. |
redo_read_pos |
int8 |
Total bytes of REDO Log entries read from REDO Log buffer, and already applied to. |
redo_sync_pos |
int8 |
The latest position on the REDO Log buffer, where it is already required the background worker to synchronize onto the GPU Cache. When free space of REDO Log buffer becomes tight, it is internally used to avoid flood of simultaneous asynchronized requests by many sessions. |
config_options |
text |
Options string of the GPU Cache |
Below is an example of pgstrom.gpucache_info
system view.
=# select * from pgstrom.gpucache_info ;
database_oid | database_name | table_oid | table_name | signature | gpu_main_sz | gpu_extra_sz | redo_write_ts | redo_write_nitems | redo_write_pos | redo_read_nitems | redo_read_pos | redo_sync_pos |
config_options
--------------+---------------+-----------+------------+------------+-------------+--------------+----------------------------+-------------------+----------------+------------------+---------------+---------------+------------------------------------------------------------------------------------------------------------------------
12728 | postgres | 25244 | mytest | 6295279771 | 675028992 | 0 | 2021-05-14 03:00:18.623503 | 500000 | 36000000 | 500000 | 36000000 | 36000000 | gpu_device_id=0,max_num_rows=10485760,redo_buffer_size=167772160,gpu_sync_interval=5000000,gpu_sync_threshold=41943040
12728 | postgres | 25262 | dpoints | 5985886065 | 772505600 | 0 | 2021-05-14 03:00:18.524627 | 8000000 | 576000192 | 8000000 | 576000192 | 576000192 | gpu_device_id=0,max_num_rows=12000000,redo_buffer_size=167772160,gpu_sync_interval=5000000,gpu_sync_threshold=41943040
(2 rows)
trigger pgstrom.gpucache_sync_trigger()
- A trigger function to synchronize GPU Cache on table updates. See GPU Cache chapter for more details.
bigint pgstrom.gpucache_apply_redo(regclass)
- If the given table has GPU Cache configured, it forcibly applies the REDO log entries onto the GPU Cache.
bigint pgstrom.gpucache_compaction(regclass)
- If the given table has GPU Cache configured, it forcibly run compaction of the variable-length data buffer.
bigint pgstrom.gpucache_recovery(regclass)
- It tries to recover the corrupted GPU cache.
HyperLogLog Functions
bigint pg_catalog.hll_count(TYPE)
- An aggregate function to estimate cardinarity of the key value, using HyperLogLog algorithm.
TYPE
is any ofint1
,int2
,int4
,int8
,numeric
,date
,time
,timetz
,timestamp
,timestamptz
,bpchar
,text
, oruuid
.- See HyperLogLog for more details of HyperLogLog functionality of PG-Strom.
bytea pg_catalog.hll_sketch(TYPE)
- An aggregate function to build HLL Sketch, used for HyperLogLog algorithm, then return as
bytea
datum. TYPE
is any ofint1
,int2
,int4
,int8
,numeric
,date
,time
,timetz
,timestamp
,timestamptz
,bpchar
,text
, oruuid
.bigint pg_catalog.hll_merge(bytea)
- An aggregate function that estimate cardinarity of the key values that are the source of the supplied HLL Sketch.
bytea pg_catalog.hll_combine(bytea)
- An aggregate function that combines multiple HLL Sketches, then returns a consolicated HLL Sketch. It is expected to transform HLL Sketch of weekly data to monthly data, for example.
int4[] pg_catalog.hll_sketch_histogram(bytea)
- A function to generate a histogram based on the register values of the supplied HLL Sketch. This is not an aggregate function. It expects to visualize the contents of HLL Sketch generated by
hll_sketch()
and so on.
Test Data Generator
void pgstrom.random_setseed(int)
- It initializes the random seed.
bigint pgstrom.random_int(float=0.0, bigint=0, bigint=INT_MAX)
- It generates random data in
bigint
type within the range. float pgstrom.random_float(float=0.0, float=0.0, float=1.0)
- It generates random data in
float
type within the range. date pgstrom.random_date(float=0.0, date='2015-01-01', date='2025-12-31')
- It generates random data in
date
type within the range. time pgstrom.random_time(float=0.0, time='00:00:00', time='23:59:59')
- It generates random data in
time
type within the range. timetz pgstrom.random_timetz(float=0.0, time='00:00:00', time='23:59:59')
- It generates random data in
timetz
type within the range. timestamp pgstrom.random_timestamp(float=0.0, timestamp='2015-01-01', timestamp='2025-01-01')
- It generates random data in
timestamp
type within the range. macaddr pgstrom.random_macaddr(float=0.0, macaddr='ab:cd:00:00:00', macaddr='ab:cd:ff:ff:ff:ff')
- It generates random data in
macaddr
type within the range. inet pgstrom.random_inet(float=0.0, inet='192.168.0.1/16')
- It generates random data in
inet
type within the range. text pgstrom.random_text(float=0.0, text='test_**')
- It generates random data in
text
type. The '*' characters in 2nd argument shall be replaced randomly. text pgstrom.random_text_len(float=0.0, int=10)
- It generates random data in
text
type within the specified length. int4range pgstrom.random_int4range(float=0.0, bigint=0, bigint=INT_MAX)
- It generates random data in
int4range
type within the range. int8range pgstrom.random_int8range(float=0.0, bigint=0, bigint=LONG_MAX)
- It generates random data in
int8range
type within the range. tsrange pgstrom.random_tsrange(float=0.0, timestamp='2015-01-01', timestamp='2025-01-01')
- It generates random data in
tsrange
type within the range. tstzrange pgstrom.random_tstzrange(float=0.0, timestamptz='2015-01-01', timestamptz='2025-01-01')
- It generates random data in
tstzrange
type within the range. daterange pgstrom.random_daterange(float=0.0, date='2015-01-01', date='2025-12-31')
- It generates random data in
daterange
type within the range.
Other Functions
pgstrom.license_query
- It shows the active commercial subscription, if loaded.
=# select pgstrom.license_query();
license_query
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
{ "version" : 2, "serial_nr" : "HDB-TRIAL", "issued_at" : "2020-11-24", "expired_at" : "2025-12-31", "gpus" : [ { "uuid" : "GPU-8ba149db-53d8-c5f3-0f55-97ce8cfadb28" } ]}
(1 row)