Extended Data Types

This document describes the extended data types supported by YMatrix.

In the relational model, common data types represent individual attributes in columns, such as name, height, and weight. However, in time-series scenarios, the following issues often arise:

The set of metrics is not known in advance, meaning the table schema may need frequent changes.
The metric set is too large, exceeding YMatrix's maximum column limit of 1600.

In such cases, extended data types are required—these allow multiple attributes to be stored within a single column.

There are many ways to implement extended data types, such as using custom serialization formats encoded into string-type columns. YMatrix provides two extended data types:

JSON / jsonb
MXKV

1 JSON / jsonb

The JSON and jsonb data types are inherited from PostgreSQL. Unlike storing raw JSON strings in text columns, JSON and jsonb provide built-in functions for common operations such as extracting key-value pairs, merging, and deleting elements.

JSON and jsonb are functionally similar and can be converted between each other. The differences are:

JSON stores data as plain text; jsonb stores it in binary format.
JSON has faster write performance because no binary conversion is needed during insertion.
jsonb offers faster parsing due to its internal binary representation.

Below are common operations for JSON / jsonb:

1.1 Extracting Values Using ->

Use the -> operator to extract values by key:

SELECT '{"a":1, "b":2, "c":3}'::json->'a';
 ?column?
----------
 1
(1 row)

For nested JSON structures, chain multiple -> operators:

SELECT '{"a":1, "b":2, "c":{"d":2}}'::json->'c'->'d';
 ?column?
----------
 2
(1 row)

1.2 Deleting Keys

Use the - operator to delete a key. Note that write operations require the value to be of type jsonb:

SELECT '{"a":1, "b":2, "c":3}'::jsonb - 'a';
     ?column?
------------------
 {"b": 2, "c": 3}
(1 row)

1.3 Merging Key-Value Pairs

Use the || operator to merge two jsonb objects:

SELECT '{"a":1, "b":2}'::jsonb || '{"c":3}'::jsonb;
         ?column?
--------------------------
 {"a": 1, "b": 2, "c": 3}
(1 row)

1.4 Updating Values

When keys overlap, use || to update values:

SELECT '{"a":1, "b":2}'::jsonb || '{"b":3}'::jsonb;
     ?column?
------------------
 {"a": 1, "b": 3}
(1 row)

For more JSON operations, refer to the PostgreSQL documentation.

2 MXKV

A video tutorial on this topic is available at YMatrix Data Modeling and Spatiotemporal Distribution Model

When using YMatrix for time-series data collection, you may encounter the following situations:

The number of metrics exceeds YMatrix’s maximum column limit of 1600.
Different device models report vastly different sets of metrics, resulting in many NULL values across columns.
The metric set is unpredictable, leading to frequent schema changes.

Traditional solutions include:

Serializing multiple columns into a variable-length field.
Using JSON or jsonb data types.

While these approaches address flexibility, they suffer from suboptimal performance.

MXKV was developed to solve these issues. MXKV is a high-performance, scalable storage type developed by YMatrix. It is used similarly to JSON but offers advantages: MXKV uses binary storage and integrates with YMatrix compression algorithms, resulting in reduced storage footprint and higher query efficiency.

Note!
This feature is available only in the Enterprise Edition.

2.1 Usage

2.1.1 Create Extension and Verify

The MXKV data type is included in the MatrixTS extension. First, create the extension:

CREATE EXTENSION matrixts;

Verify that MXKV is enabled:

mxadmin=# SELECT '{}'::mxkv_text;
 mxkv_text
-----------
 {}
(1 row)

As shown above, MXKV is now active.

2.1.2 Create Table with MXKV Column

Create a new table and define an MXKV column using the integer-valued type mxkv_int4 to store integer key-value data:

CREATE TABLE data(
    time timestamp with time zone,
    tag_id int,
    kv mxkv_int4
)
DISTRIBUTED BY (tag_id);

2.1.3 Define Keys

Before using MXKV, define the keys to optimize compression and query performance. This step pre-defines the set of keys expected in the data.

MXKV provides the UDF mxkv_import_keys to define keys.

Two methods are available:

Manual Definition

Provide a sample JSON object manually. MXKV extracts and registers the keys automatically.

The following SQL extracts and defines keys 'a' and 'b':

mxadmin=# SELECT mxkv_import_keys('{"a": 1, "b": 2}');
 mxkv_import_keys
------------------
 a
 b
(2 rows)

Definition from Existing Table Data

Suppose there is a table t_json with a JSON column c2:

mxadmin=# SELECT c2 FROM t_json ;
    c2
----------
 {"k1":1}
 {"k2":2}
(2 rows)

From the output, we see that c2 contains keys k1 and k2.

Use mxkv_import_keys to define keys from table data:

mxadmin=# SELECT mxkv_import_keys('t_json'::regclass, 'c2');
 mxkv_import_keys
------------------
 k1
 k2
(2 rows)

As shown, pass the table name and column name as parameters.
This method is more convenient and efficient, provided that representative sample data already exists in another table.

2.1.4 Insert Data

After defining the keys, insert KV data:

mxadmin=# INSERT INTO data VALUES(now(), 1, '{"a":1, "b":2}');
INSERT 0 1

The inserted KV data includes the previously defined keys 'a' and 'b'.

Note!
Keys must be defined before insertion. Otherwise, the following error occurs:
Key not found in MXKV key registry

2.1.5 Reading Key Values

Reading key values from MXKV is similar to JSON, using the -> operator:

mxadmin=# SELECT time, tag_id, kv->'a' as a, kv->'b' as b FROM data;
             time              | tag_id | a | b
-------------------------------+--------+---+---
 2021-07-02 13:23:30.405473+00 |      1 | 1 | 2
(1 row)

Extracted key values behave like regular columns and support comparisons and joins. For example, create a second table data1 with the same structure and insert identical data. Then perform a join:

mxadmin=# SELECT * FROM data, data1 WHERE data.kv->'a' = data1.kv->'a';
             time              | tag_id |        kv        |             time              | tag_id |        kv
-------------------------------+--------+------------------+-------------------------------+--------+------------------
 2021-07-02 13:23:30.405473+00 |      1 | {"a": 1, "b": 2} | 2021-07-02 13:26:31.828685+00 |      1 | {"a": 1, "b": 2}
(1 row)

Note!
In MXKV, -> and ->> have the same effect.

2.2 MXKV Data Types

The earlier example used mxkv_int4. Below is a detailed description of the MXKV data types supported:

mxkv_int4: Stores 32-bit integers (int4 / int).
mxkv_float4: Stores 32-bit floating-point numbers (float4 / real).
mxkv_float8: Stores 64-bit floating-point numbers (float8 / float / double precision).
mxkv_text: Stores text strings.

For mxkv_float4 and mxkv_float8, you can specify the number of decimal places. For example:

CREATE TABLE data(
    time timestamp with time zone,
    tag_id int,
    kv mxkv_float4(2)
)
DISTRIBUTED BY (tag_id);

With this definition, digits beyond two decimal places are rounded. Specifying scale converts the internal storage to integer form, improving compression and efficiency—but reduces the overall value range.

mxkv_float4(scale), scale >= 0
- mxkv_float4(0): [-2147483500, 2147483500]
- mxkv_float4(2): [-21474835.00, 21474835.00]
- mxkv_float4(4): [-214748.3500, 214748.3500]
- ......
mxkv_float8(scale), scale >= 0
- mxkv_float8(0): [-9223372036854775000, 9223372036854775000]
- mxkv_float8(2): [-92233720368547750.00, 92233720368547750.00]
- mxkv_float8(4): [-922337203685477.5000, 922337203685477.5000]
- ......

Note!
MXKV supports only flat key-value structures. Nested objects and arrays are not allowed.
For deeper technical details, see the blog post MXKV.

Version Release History

English Русский 简体中文