Data Types

PG-Strom supports the following data types for use on GPU devices.

Numeric types

int1 [length: 1byte]
8bit integer data type; an extended data type provided by PG-Strom
int2 (alias smallint) [length: 2bytes]
16bit integer data type
int4 (alias int) [length: 4bytes]
32bit integer data type
int8 (alias bigint) [length: 8bytes]
64bit integer data type
float2 [length: 2bytes]
Half precision floating-point data type; an extended data type provided by PG-Strom

Note

Although GPUs support half-precision floating-point numbers in hardware, CPUs (x86_64 processors) do not support them yet. When the CPU processes float2 values, it converts them to float or double for calculation, so, unlike the GPU, the CPU gains no calculation performance from float2. The type is intended to save storage and memory capacity in machine-learning and statistical-analytics workloads.
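As a minimal sketch of using float2 for compact storage, the statements below assume the pg_strom extension is already installed in the database. The table name sensor_log and its columns are hypothetical and serve only as an illustration.

```sql
-- Assumes the pg_strom extension is installed, since it provides float2.
-- 'sensor_log' and its columns are hypothetical names.
CREATE TABLE sensor_log (
    id      int8,
    reading float2   -- 2 bytes per value, versus 4 bytes for float4
);

-- A quoted literal is parsed by the type's own input function.
INSERT INTO sensor_log VALUES (1, '0.5');

SELECT id, reading FROM sensor_log;
```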

float4 (alias real) [length: 4bytes]
Single precision floating-point data type
float8 (alias double precision) [length: 8bytes]
Double precision floating-point data type
numeric [length: variable]
Real number data type; handled as a 128bit fixed-point value on the GPU

Note

When the GPU processes numeric values, they are converted to an internal 128bit fixed-point format for implementation reasons. (This layout is identical to the Decimal type in Apache Arrow.) The conversion to and from the internal format is transparent. However, PG-Strom cannot convert numeric datums with a large number of digits, so it falls back to CPU processing for them. Supplying numeric data with a large number of digits to the GPU device may therefore cause a slowdown. To avoid the problem, turn off the GUC option pg_strom.enable_numeric_type so that expressions involving numeric data types are not run on GPU devices.
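As a minimal sketch, the option can be turned off for the current session with a standard SET command. Only the option name pg_strom.enable_numeric_type comes from the note above; whether to set it per session or in postgresql.conf is a deployment choice.

```sql
-- Stop running expressions that involve numeric values on the GPU
-- for the current session (option name taken from the note above).
SET pg_strom.enable_numeric_type = off;

-- Confirm the current setting.
SHOW pg_strom.enable_numeric_type;
```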

Date and time types

date [length: 4bytes]
Date data type
time (alias time without time zone) [length: 8bytes]
Time data type
timetz (alias time with time zone) [length: 12bytes]
Time with timezone data type
timestamp (alias timestamp without time zone) [length: 8bytes]
Timestamp data type
timestamptz (alias timestamp with time zone) [length: 8bytes]
Timestamp with timezone data type
interval [length: 16bytes]
Interval data type

Variable length types

bpchar [length: variable]
variable length text with whitespace padding
varchar [length: variable]
variable length text type
text [length: variable]
variable length text type
bytea [length: variable]
variable length binary type

Unstructured data types

jsonb [length: variable]
JSON data type with binary indexed keys

Note

Pay attention to the two points below when the GPU processes jsonb data. First, jsonb is not a performance-efficient data type because unreferenced attributes must also be loaded from storage onto the GPU, so it tends to consume I/O bandwidth on junk data. Second, when the length of a jsonb value exceeds the TOAST threshold, the entire value is written out to the TOAST table; the GPU cannot process such values and invokes inefficient CPU-fallback operations. For the second problem, you can raise the table's storage option toast_tuple_target to enlarge the TOAST threshold.
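The second point can be addressed with an ordinary ALTER TABLE statement, sketched below. The table name my_table is hypothetical, and 8160 is used only because it is the upper limit PostgreSQL accepts for this parameter with the default block size. Choose a value that fits the actual jsonb sizes.

```sql
-- 'my_table' is a hypothetical table name.
-- toast_tuple_target accepts values from 128 up to 8160 bytes
-- with the default 8kB block size.
ALTER TABLE my_table SET (toast_tuple_target = 8160);
```

The new setting affects rows written after the change. Existing rows are not rewritten by this command alone.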

Miscellaneous types

boolean [length: 1byte]
Boolean data type
money [length: 8bytes]
Money data type
uuid [length: 16bytes]
UUID data type
macaddr [length: 6bytes]
Network MAC address data type
inet [length: 7 or 19bytes]
Network address data type
cidr [length: 7 or 19bytes]
Network address data type
cube [length: variable]
Extra data type provided by contrib/cube

Geometry data types

geometry [length: variable]
Geometry object of PostGIS
box2df [length: 16bytes]
2-dimensional bounding box (used for GiST indexes)