Extensible Data Type: mxkv

The instructional video for this course is included in MatrixDB Data Modeling and Spatiotemporal Distribution Model

When using MatrixDB time-series database to collect data, you may encounter the following scenarios:

  1. The number of metrics to collect exceeds PostgreSQL's maximum column limit of 1600.
  2. Different device models have significantly different sets of collected metrics, resulting in many NULL values when reporting data.
  3. The metric set is unpredictable, meaning the table schema may need frequent changes.

Traditional solutions to these issues include:

  1. Serializing multiple columns into a variable-length field.
  2. Using PostgreSQL's built-in json or jsonb types.

While these approaches reduce operational complexity, they deliver suboptimal performance.

To address this, mxkv was developed—a custom key-value (KV) data type designed specifically for MatrixDB. Similar to JSON, it uses column names as KV keys. It enables easy extensibility, improves query performance, and reduces storage space.

Note: This feature is available only in the Enterprise Edition.

Usage

Create Extension and Verify

The mxkv custom data type is included in the matrixts extension. First, create the extension:

create extension matrixts;

Verify that mxkv is enabled:

mxadmin=# select '{}'::mxkv_text;
 mxkv_text
-----------
 {}
(1 row)

As shown above, mxkv is now active.

Create Table with mxkv Column

Create a new table and define an mxkv_int4 column to store integer-valued KV data:

CREATE TABLE data(
    time timestamp with time zone,
    tag_id int,
    kv mxkv_int4
)
Distributed by (tag_id);

Define Keys

Before using the mxkv type, define the keys to optimize compression and query performance. Key definition establishes the set of keys expected in the data.

mxkv provides the UDF mxkv_import_keys() to perform key definition.

There are two ways to define keys:

Manual Definition

Provide a sample JSON object manually. mxkv will extract the keys automatically.

The following SQL extracts keys 'a' and 'b' from the JSON and defines them:

mxadmin=# select mxkv_import_keys('{"a": 1, "b": 2}');
 mxkv_import_keys
------------------
 a
 b
(2 rows)

Define from Existing Table Data

Suppose there is a table t_json with a json-typed column c2:

mxadmin=# select c2 from t_json ;
    c2
----------
 {"k1":1}
 {"k2":2}
(2 rows)

The column c2 contains two keys: k1 and k2.

Use mxkv_import_keys() to define keys from table data:

mxadmin=# select mxkv_import_keys('t_json'::regclass, 'c2');
 mxkv_import_keys
------------------
 k1
 k2
(2 rows)

As shown, pass the table name and column name as parameters. This method is more convenient and efficient than manual definition, provided sample data already exists in another table.

Insert Data

After defining the keys, insert KV data:

mxadmin=# insert into data values(now(), 1, '{"a":1, "b":2}');
INSERT 0 1

The inserted KV data includes the previously defined keys 'a' and 'b'.

Note: Keys must be defined before insertion. Otherwise, an error occurs:

mxadmin=# insert into data values(now(), 1, '{"c":1}');
psql: ERROR:  unknown key "c"
LINE 1: insert into data values(now(), 1, '{"c":1}');
                                          ^
DETAIL:  The key is not imported yet
HINT:  Import the keys with the mxkv_import_keys() function

Read Key Values

Reading key values from mxkv is similar to JSON, using the -> operator:

mxadmin=# select time, tag_id, kv->'a' as a, kv->'b' as b from data;
             time              | tag_id | a | b
-------------------------------+--------+---+---
 2021-07-02 13:23:30.405473+00 |      1 | 1 | 2
(1 row)

Extracted key values behave like regular columns and can be used in comparisons or joins. For example, create a table data1 with the same structure as data, insert identical data, and perform a join:

mxadmin=# select * from data, data1 where data.kv->'a' = data1.kv->'a';
             time              | tag_id |        kv        |             time              | tag_id |        kv
-------------------------------+--------+------------------+-------------------------------+--------+------------------
 2021-07-02 13:23:30.405473+00 |      1 | {"a": 1, "b": 2} | 2021-07-02 13:26:31.828685+00 |      1 | {"a": 1, "b": 2}
(1 row)

Note: In mxkv, the operators -> and ->> are functionally equivalent.

Data Types

The earlier example used mxkv_int4. mxkv supports the following types:

  1. mxkv_int4: Stores 32-bit integer values (int4/int).
  2. mxkv_float4: Stores 32-bit floating-point values (float4/real).
  3. mxkv_float8: Stores 64-bit floating-point values (float8/float/double precision).
  4. mxkv_text: Stores string values of type text.

For mxkv_float4 and mxkv_float8, you can specify the number of decimal places. For example:

CREATE TABLE data(
    time timestamp with time zone,
    tag_id int,
    kv mxkv_float4(2)
)
Distributed by (tag_id);

With this definition, digits beyond two decimal places are rounded. Specifying scale converts internal storage to integer format, improving compression and efficiency—but reduces the supported value range.

  • mxkv_float4(scale), scale >= 0
    • mxkv_float4(0): [-2147483500, 2147483500]
    • mxkv_float4(2): [-21474835.00, 21474835.00]
    • mxkv_float4(4): [-214748.3500, 214748.3500]
    • ......
  • mxkv_float8(scale), scale >= 0
    • mxkv_float8(0): [-9223372036854775000, 9223372036854775000]
    • mxkv_float8(2): [-92233720368547750.00, 92233720368547750.00]
    • mxkv_float8(4): [-922337203685477.5000, 922337203685477.5000]
    • ......

Limitation: mxkv supports only flat, single-level KV structures. Nested objects and arrays are not allowed.