This document describes the extended data types supported by YMatrix, which can be used to build wide/narrow table variant models.
YMatrix is a schema-based relational database. All tables must have a predefined schema defined via DDL statements. Each column in the schema represents an attribute value, such as name, height, or weight. However, in certain business scenarios, the following issues often arise:
To address these issues, multiple attributes can be serialized and stored in variable-length fields.
In such cases, extended data types are required—these allow multiple attributes to be stored within a single column.
There are many ways to implement extended data types. For example, custom serialization formats and encoding methods can be defined and stored in string-type columns. YMatrix provides the following extended data types:
Comparison:
| Data Type | Storage Format | Supported Storage Formats | Supported Data Formats |
|---|---|---|---|
| JSON | Text | All formats | INT/NUMERIC/BOOLEAN/NULL, etc. Supports storing different data types in arrays within a single JSON column |
| JSONB | Binary | All formats | INT/NUMERIC/BOOLEAN/NULL, etc. Supports storing different data types in arrays within a single JSON column |
| MXKV | Binary | All formats | MXKV_INT4/MXKV_FLOAT4/MXKV_FLOAT8/MXKV_TEXT Only supports flat key-value structure; no nesting or arrays allowed. Multiple columns are required to store metrics of different data types |
| MXKV2 | Binary | MARS2/3 | MXKV_INT8/MXKV_FLOAT8/MXKV_TEXT/MXKV_BOOL Only supports flat key-value structure; no nesting or arrays allowed. Multiple columns are required to store metrics of different data types |
The JSON and JSONB data types are inherited from PostgreSQL. Unlike storing JSON strings directly in text columns, JSON/JSONB types provide built-in functions for common JSON operations such as extracting key-values, merging, and deletion.
JSON and JSONB are used similarly and can be converted between each other. The differences are:
Below are common operations for JSON/JSONB types:
=# SELECT '{"a":1, "b":2, "c":3}'::json->'a';
?column?
----------
1
(1 row)
For nested JSON/JSONB structures, use -> multiple times:
=# SELECT '{"a":1, "b":2, "c":{"d":2}}'::json->'c'->'d';
?column?
----------
2
(1 row)
Use the - operator to delete a key. When performing write operations on JSON objects, the data must be cast to JSONB:
=# SELECT '{"a":1, "b":2, "c":3}'::jsonb - 'a';
?column?
------------------
{"b": 2, "c": 3}
(1 row)
Use the || operator to merge two JSONB objects:
=# SELECT '{"a":1, "b":2}'::jsonb || '{"c":3}'::jsonb;
?column?
--------------------------
{"a": 1, "b": 2, "c": 3}
(1 row)
When keys are the same, using || updates the value:
=# SELECT '{"a":1, "b":2}'::jsonb || '{"b":3}'::jsonb;
?column?
------------------
{"a": 1, "b": 3}
(1 row)
For more JSON operations, refer to the PostgreSQL documentation.
MXKV and MXKV2 are YMatrix-developed high-performance, scalable storage types with usage similar to JSON/JSONB. Their advantage lies in binary storage combined with YMatrix's encoding chain compression algorithms, enabling smaller storage footprint and higher query efficiency.
In addition to the differences listed in the comparison table above, MXKV and MXKV2 differ in the following ways:
Tutorial videos for this section are available in YMatrix Data Modeling and Spatiotemporal Distribution Models
Note!
This feature is available only in the Enterprise Edition.
The MXKV and MXKV2 custom data types are included in the MatrixTS extension. First, create the extension:
=# CREATE EXTENSION matrixts;
Verify that MXKV/MXKV2 is enabled:
// MXKV
=# SELECT '{}'::mxkv_text;
mxkv_text
-----------
{}
(1 row)
// MXKV2
=# SELECT '{}'::mxkv2_text;
mxkv2_text
-----------
{}
(1 row)
As shown, MXKV is enabled.
Create a new table and define a column of integer type to store integer-valued KV data:
// MXKV
=# CREATE TABLE data_1(
time timestamp with time zone,
tag_id int,
kv mxkv_int4
)
USING MARS3
DISTRIBUTED BY (tag_id)
ORDER BY (time,tag_id);
// MXKV2
=# CREATE TABLE data_2(
time timestamp with time zone,
tag_id int,
kv mxkv2_int8 encoding(compresstype=mxcustom,encodechain='auto')
)
USING MARS3
DISTRIBUTED BY (tag_id)
ORDER BY (time,tag_id);
Before using MXKV, keys must be defined to optimize compression and query performance. This step pre-defines the set of keys expected in the data.
MXKV provides the UDF mxkv_import_keys to define keys.
Two methods are available:
Provide a JSON-formatted KV sample. MXKV automatically extracts and defines the keys.
The following SQL extracts and defines keys 'a' and 'b' from the JSON:
=# SELECT mxkv_import_keys('{"a": 1, "b": 2}');
mxkv_import_keys
------------------
a
b
(2 rows)
Assume a table t_json exists with a JSON column c2:
=# SELECT c2 FROM t_json ;
c2
----------
{"k1":1}
{"k2":2}
(2 rows)
The result shows two keys: k1 and k2.
Use mxkv_import_keys to define keys from table data:
=# SELECT mxkv_import_keys('t_json'::regclass, 'c2');
mxkv_import_keys
------------------
k1
k2
(2 rows)
As shown, pass the table name and column name as parameters.
This method is more convenient and faster, but requires sample data to already exist in another table.
Note!
Keys in MXKV must be defined before data insertion. Otherwise, the following error occurs:
DETAIL: The key is not imported yet HINT: Import the keys with the mxkv_import_keys() function
After defining keys, insert KV data:
// MXKV
=# INSERT INTO data_1 VALUES(now(), 1, '{"a":1, "b":2}');
INSERT 0 1
// MXKV2
=# INSERT INTO data_2 VALUES(now(), 1, '{"a":1, "b":2}');
INSERT 0 1
As shown, the inserted KV data contains the previously defined keys 'a' and 'b'.
Reading key-values in MXKV uses the same -> operator as JSON:
// MXKV
=# SELECT time, tag_id, kv->'a' as a, kv->'b' as b FROM data_1;
time | tag_id | a | b
-------------------------------+--------+---+---
2023-10-19 10:20:05.334169+00 | 1 | 1 | 2
(1 row)
// MXKV2
=# SELECT time, tag_id, kv->'a' as a, kv->'b' as b FROM data_2;
time | tag_id | a | b
-------------------------------+--------+---+---
2023-10-19 10:20:05.334169+00 | 1 | 1 | 2
(1 row)
Extracted key-values can be used like regular columns—for comparisons or joins with other tables. For example, create a table data_1/data_2 with the same structure as data_1 and insert identical data. The following SQL performs a join:
// MXKV
=# SELECT * FROM data_1, data_3 WHERE data_1.kv->'a' = data_3.kv->'a';
time | tag_id | kv | time | tag_id | kv
-------------------------------+--------+------------------+-------------------------------+--------+------------------
2023-10-19 10:20:05.334169+00 | 1 | {"a": 1, "b": 2} | 2023-10-19 10:20:05.334169+00 | 1 | {"a": 1, "b": 2}
(1 row)
// MXKV2
=# SELECT * FROM data_2, data_3 WHERE data_2.kv->'a' = data_3.kv->'a';
time | tag_id | kv | time | tag_id | kv
-------------------------------+--------+------------------+-------------------------------+--------+------------------
2023-10-19 10:20:05.334169+00 | 1 | {"a": 1, "b": 2} | 2023-10-19 10:20:05.334169+00 | 1 | {"a": 1, "b": 2}
(1 row)
Note!
In MXKV/MXKV2,->and->>are equivalent.
| Data Type | Supported Data Formats |
|---|---|
| MXKV | MXKV_INT4: Stores 32-bit integers (INT4/INT) MXKV_FLOAT4: Stores 32-bit floating-point numbers (FLOAT4/REAL) MXKV_FLOAT8: Stores 64-bit floating-point numbers (FLOAT8/FLOAT/DOUBLE PRECISION) MXKV_TEXT: Stores TEXT strings |
| MXKV2 | MXKV2_INT8: Stores 64-bit integers (INT8/INT) MXKV2_FLOAT8: Stores 64-bit floating-point numbers (FLOAT8/FLOAT/DOUBLE PRECISION) MXKV2_TEXT: Stores TEXT strings MXKV2_BOOL: Stores BOOLEAN values |
MXKV_FLOAT4 and MXKV_FLOAT8 support specifying decimal scale. For example:
=# CREATE TABLE data_4(
time timestamp with time zone,
tag_id int,
kv mxkv_float4(2)
)
USING MARS3
DISTRIBUTED BY (tag_id)
ORDER BY (time,tag_id);
With this definition, digits beyond two decimal places are rounded.
Specifying scale converts the internal storage to integer format, improving compression and efficiency. However, the value range is reduced accordingly.
Note!
MXKV and MXKV2 support only flat key-value structures. Nesting and arrays are not allowed.
For more technical details, see the blog MXKV.