This document mainly introduces the extended data types supported by YMatrix, which can be used to build [Wide table/Narrow table variant model] (/doc/5.2/datamodel/guidebook).
YMatrix is a schema relational database, and all tables need to predefined a clear schema, namely Schema, through DDL statements. Each column in the pattern represents the value of an attribute, such as: name, height, weight. However, for some business scenarios, the following situations often occur:
Serialize multiple columns and save them in variable length field
At this time, you need to use an extended data type: it supports storing multiple attributes in a single column.
There are very diverse ways to implement extended data types, such as customizing serialized storage formats and encoding methods, and then storing them into string types, etc. YMatrix provides the following extended data types:
For example:
| Data Type | Storage Method | Supported Storage Format | Supported Data Format |
|--|--|--|--|--|
| JSON | Text | All formats | INT/NUMBERIC/BOOLEAN/NULL, etc.
Support different data formats to store in an array into a JSON column |
| JSONB | Binary Number | All Formats | INT/NUMBERIC/BOOLEAN/NULL, etc.
Support different data formats to store in an array into a JSON column |
| MXKV | Binary Number | All Formats | MXKV_INT4/MXKV_FLOAT4/MXKV_FLOAT8/MXKV_TEXT
Only supports first-level KV structures, nesting and arrays are not allowed. If you want to store metrics in different data formats, you need to create multiple columns |
| MXKV2 | Binary number | MARS2/3 | MXKV_INT8/MXKV_FLOAT8/MXKV_TEXT/MXKV_BOOL
Only supports first-level KV structures, nesting and arrays are not allowed. If you want to store metrics in different data formats, you need to create multiple columns |
The JSON/JSONB type inherits from the PostgreSQL database. Unlike directly saving JSON strings with string-type columns, the JSON/JSONB type provides commonly used JSON operation functions for reading key-values, as well as performing operations such as merging and deleting.
The usage methods of JSON and JSONB are basically the same and can be directly converted to each other. The difference is:
The following are the commonly used JSON/JSONB type operation methods:
=# SELECT '{"a":1, "b":2, "c":3}'::json->'a';
?column?
----------
1
(1 row)
If the JSON/JSONB hierarchy is nested, use -> multiple times:
=# SELECT '{"a":1, "b":2, "c":{"d":2}}'::json->'c'->'d';
?column?
----------
2
(1 row)
Use the - sign to delete key values. When writing to a JSON object, it must be converted to the JSONB type:
=# SELECT '{"a":1, "b":2, "c":3}'::jsonb - 'a';
?column?
------------------
{"b": 2, "c": 3}
(1 row)
Use the || operator to merge two JSONB type keys and values together:
=# SELECT '{"a":1, "b":2}'::jsonb || '{"c":3}'::jsonb;
?column?
------------------------------
{"a": 1, "b": 2, "c": 3}
(1 row)
When the key values are the same, use the || operator to update the value:
=# SELECT '{"a":1, "b":2}'::jsonb || '{"b":3}'::jsonb;
?column?
------------------
{"a": 1, "b": 3}
(1 row)
For more JSON operation methods, please refer to PostgreSQL Document.
MXKV and MXKV2 are efficient and scalable storage types developed by YMatrix, and their usage methods are similar to JSON/JSONB. Its advantage is that it uses binary storage and combines with the coding chain compression algorithm in YMatrix, which can achieve smaller storage space and higher query efficiency.
In addition to the items listed in the comparison table of the first 4 data types in this document, there are the following differences between the two:
This part of the teaching video is included in YMatrix Data Modeling and Spatial-temporal Distribution Model
Notes!
This feature is only available in the Enterprise Edition.
MXKV/MXKV2 Custom data types are included in the MatrixTS extension, and you must first create the extension:
=# CREATE EXTENSION matrixts;
Verify that MXKV/MXKV2 is enabled:
// MXKV
=# SELECT '{}'::mxkv_text;
mxkv_text
--------------
{}
(1 row)
// MXKV2
=# SELECT '{}'::mxkv2_text;
mxkv2_text
--------------
{}
(1 row)
As shown above, MXKV is turned on.
Create a new table and define the integer value data type in the new table to store KV data with integer values:
// MXKV
=# CREATE TABLE data_1(
time timestamp with time zone,
tag_id int,
kv mxkv_int4
)
USING MARS3
DISTRIBUTED BY (tag_id)
ORDER BY (time,tag_id);
// MXKV2
=# CREATE TABLE data_2(
time timestamp with time zone,
tag_id int,
kv mxkv2_int8 encoding(compresstype=mxcustom,encodechain='auto')
)
USING MARS3
DISTRIBUTED BY (tag_id)
ORDER BY (time,tag_id);
Before using the MXKV type, in order to optimize compression and query performance, you must first make key-value definitions to determine the key-value collection contained in the data in advance.
MXKV provides UDF mxkv_import_keys
to complete the key value definition.
There are two ways to define it:
Manually provide KV samples in JSON form, MXKV will automatically extract key values and complete definitions.
The following SQL will extract the key values 'a' and 'b' in JSON and define:
=# SELECT mxkv_import_keys('{"a": 1, "b": 2}');
mxkv_import_keys
------------------
a
b
(2 rows)
Existing t_json
table, including field c2
of type JSON:
=# SELECT c2 FROM t_json ;
c2
----------
{"k1":1}
{"k2":2}
(2 rows)
From the results, we can see that c2 includes two key values, k1 and k2.
Use mxkv_import_keys
to define through table data:
=# SELECT mxkv_import_keys('t_json'::regclass, 'c2');
mxkv_import_keys
------------------
k1
k2
(2 rows)
As shown above, provide table names and column names in the parameters.
Compared to manual definition, this method is more convenient and faster, but the sample data has been stored in other tables as a prerequisite.
Notes!
The key values in MXKV must be defined before they can be inserted correctly, otherwise the following error will occur:DETAIL: The key is not imported yet HINT: Import the keys with the mxkv_import_keys() function
After the key value is determined, the following starts to insert KV data:
// MXKV
=# INSERT INTO data_1 VALUES(now(), 1, '{"a":1, "b":2}');
INSERT 0 1
// MXKV2
=# INSERT INTO data_2 VALUES(now(), 1, '{"a":1, "b":2}');
INSERT 0 1
As shown above, the inserted KV data contains the key values 'a' and 'b' that just defined.
The reading method of MXKV key-value content is the same as that of JSON, and both use the ->
symbol:
// MXKV
=# SELECT time, tag_id, kv->'a' as a, kv->'b' as b FROM data_1;
time | tag_id | a | b
---------------------------+------------------------------------------------------------------------------------------------------
2023-10-19 10:20:05.334169+00 | 1 | 1 | 2
(1 row)
// MXKV2
=# SELECT time, tag_id, kv->'a' as a, kv->'b' as b FROM data_2;
time | tag_id | a | b
---------------------------+------------------------------------------------------------------------------------------------------
2023-10-19 10:20:05.334169+00 | 1 | 1 | 2
(1 row)
The key values taken out are the same as ordinary columns, and can also be compared and joined with other tables. For example, create a table with the same structure as data_1
/data_2
table and insert the same data. Use the following SQL to do the connection:
// MXKV
=# SELECT * FROM data_1, data_3 WHERE data_1.kv->'a' = data_3.kv->'a';
time | tag_id | kv | time | tag_id | kv
----------------------------------------------------------------------------------------------------------------------------------
2023-10-19 10:20:05.334169+00 | 1 | {"a": 1, "b": 2} | 2023-10-19 10:20:05.334169+00 | 1 | {"a": 1, "b": 2}
(1 row)
// MXKV2
=# SELECT * FROM data_2, data_3 WHERE data_2.kv->'a' = data_3.kv->'a';
time | tag_id | kv | time | tag_id | kv
----------------------------------------------------------------------------------------------------------------------------------
2023-10-19 10:20:05.334169+00 | 1 | {"a": 1, "b": 2} | 2023-10-19 10:20:05.334169+00 | 1 | {"a": 1, "b": 2}
(1 row)
Notes!
In MXKV/MXKV2, the effects of->
are the same as->>
.
Data Type | Supported Data Formats |
---|---|
MXKV | MXKV_INT4: Store 32-bit integer value of type INT4/INT MXKV_FLOAT4: Store 32-bit floating point value of type FLOAT4/REAL MXKV_FLOAT8: Store 64-bit floating point value of type FLOAT8/FLOAT/DOUBLE PRECISION MXKV_TEXT: Store string value of type TEXT |
MXKV2 | MXKV2_INT8: Stores 64-bit integer value of type INT8/INT MXKV2_FLOAT8: Stores 64-bit floating point value of type FLOAT8/FLOAT/DOUBLE PRECISION MXKV2_TEXT: Stores string value of type TEXT MXKV2_BOOL: Stores boolean value of type BOOLEAN |
Where MXKV_FLOAT4 and MXKV_FLOAT8 can specify the number of decimal places, for example:
=# CREATE TABLE data_4(
time timestamp with time zone,
tag_id int,
kv mxkv_float4(2)
)
USING MARS3
DISTRIBUTED BY (tag_id)
ORDER BY (time,tag_id);
After this definition, the decimal point will be rounded after two digits.
After specifying the number of decimal places, the internal storage type will be converted to an integer, which is more conducive to optimization and compression. But the numerical range will also be reduced accordingly.
Notes!
MXKV/MXKV2 only supports first-level KV structures, and nesting and arrays are not allowed. For more in-depth technical information, please see the blog MXKV.