Storage Engine

The storage engine is the storage base of the database system. The database uses the storage engine to create, query, update and delete data. Depending on your needs, different storage engines will provide different storage mechanisms and design in different dimensions such as physical layout, index type, and lock granularity.
Currently, the MatrixDB database system supports the use of two storage engines: HEAP based on PostgreSQL, and MARS2 developed by MatrixDB. HEAP tables are storage engines provided by traditional PostgreSQL databases, also known as heap tables. This type of table supports a large number of concurrent read and write, transaction, index and other features. The MARS2 table is a storage engine developed on MARS1. With its physically ordered merging method, the number of IO addressing is reduced, thereby improving the query performance of data in the table. At the same time, MARS2 supports compression, column storage, automatic archiving, pre-aggregation and other functions, which performs superiorly in timing scenarios. Updates and deletion are not currently supported.

Create tables under different storage engines

Depending on the different characteristics of the two storage engines, you can flexibly create different tables under different scenario requirements. We give the following example.

1 Create a HEAP table

The HEAP table is the default storage engine of MatrixDB. Therefore, if you do not specify the storage engine when creating the table, the created HEAP tables are all HEAP tables.

CREATE TABLE disk(
    time timestamp with time zone,
    tag_id int,
    read float,
    write float
)
DISTRIBUTED BY (tag_id);

2 Create a MARS2 table

MARS2 tables rely on the matrixts extension. Before building a table, you first need to create an extension in the database using the storage engine.

Notes!
matrixts is extended to the database level, and it can be created once in a database without repeated creation.

CREATE EXTENSION matrixts;

When creating tables, use USING mars2 to specify the storage engine:

CREATE TABLE vehicle_basic_data_mars2(
daq_time timestamp  encoding (minmax),
vin varchar(32)  COLLATE "C" encoding (minmax),
lng float encoding (minmax),
lat float encoding (minmax),
speed float ,
license_template varchar(16) ,
flag integer
)
USING mars2
WITH (compresstype=zstd, compresslevel=3)
DISTRIBUTED BY (vin)
PARTITION BY RANGE (daq_time)
( START ('2022-07-01 00:00:00') INCLUSIVE
END ('2022-08-01 00:00:00') EXCLUSIVE
EVERY (INTERVAL '1 day')
,default partition others);

We will explain the specific parameters and statements in the vehicle_basic_data_mars2 table:

  1. encoding (minmax) This option is used to improve computing power. If you want to perform aggregated queries such as min, max, avg, sum, count, etc. based on this field, or you need to add this option when filtering the WHERE condition. The addition of this option needs to be determined based on the business scenario. Do not add it to all fields without thinking, otherwise this operation will bring a lot of consumption of data written to the CPU and disk IO.
  2. COLLATE "C" This option is only added in the Device Encoding field, and no other fields need to be added. With this operation, you can improve the sorting and querying speed of text types.
  3. USING mars2 It uses a fixed writing method of MARS2 tables, and cannot change the content and uppercase and uppercase.
  4. WITH (compresstype=zstd, compresslevel=3) Specify the compression algorithm as zstd, and also rle_type, zlib, and lz4 compression algorithms. It is recommended to use zstd in this table, and the compression level is 3. For detailed compression performance parameters, please refer to the table below.
  5. DISTRIBUTED BY (vin): Use this statement to select the distribution key. We recommend using the device encoding field as the distribution key, so that the data of the same device can be queried and related calculations can be performed. This avoids performance losses caused by data redistribution between nodes.
  6. PARTITION BY RANGE (daq_time) Specifies the partition key for the partition table. We recommend using the time when the device collects data as the partition key. Usually a large number of queries are the time to filter data collection. For example, when you want to query the data within a day and perform related calculations, you need to add the filtering condition `WHERE daq_time >= CURRENT_DATE - INTERVAL '1 day', so that the database will quickly determine which partition subtable the data is on, so as to quickly locate the data and query the table.
  7. ( START ('2022-07-01 00:00:00') INCLUSIVE END ('2022-08-01 00:00:00') EXCLUSIVE EVERY (INTERVAL '1 days') ,default partition others); This SQL statement indicates that you will use the START...END statement and the INCLUSIVE and EXCLUSIVE keywords to create a subpartition table that starts at 2022-07-01 zero point to before 2022-08-01 zero point.
  8. EVERY (INTERVAL '1 day') The time span of the subpartition table is 1 day. In addition to using day, you can also use hour, month, year, etc. and determine it according to the size of the data volume. For example, in a day, your server receives up to 1 million to 10 million pieces of data, and using "1 day" as the time interval on that day is the best choice; if the data volume of a day is only tens of thousands or hundreds of thousands, then choosing "1 month" is very suitable; if the data volume per day is only tens of thousands, then the "1 year" subpartition table can be generated once.
  9. default partition others Indicates the default partition. If the corresponding subtable storage of data cannot be found in other partitions, it will be stored in this partition.

Notes!
We hope you will design ideas based on our suggestions, but we don’t want you to blindly apply them. The specific timing scenarios are ever-changing, and specific analysis of specific situations is still very necessary.

The MARS2 table compression performance related parameters mentioned above:

Parameter name Default value Min value Maximum value Description
compress_threshold 1200 1 8000 Up limit of tuple compression for the same group
compressiontype lz4 -- -- Compression algorithm, supported:
1. zstd
2. zlib
3. lz4
compresslevel 1 1 -- Compression level. Usually, the smaller the value, the smaller the compression rate, but the faster the compression; the larger the value, the higher the compression rate, but the slower the compression. Different algorithms have different valid values ​​ranges:
zstd: 1-19
zlib: 1-9
lz4: 1-20

Notes!
Generally speaking, the higher the compression level, the higher the compression rate and the lower the speed. But that's not absolute.

After successfully creating the MARS2 table, you must create an additional index of type mars2_btree so that you can read and write normally. The purpose of using index sorting is to make data of the same dimension or similar characteristics physically close as possible to reduce the number of IO addressing and improve query efficiency. Therefore, the selection of sorting keys needs to comply with the main business query characteristics. For example, the requirement is a single-device point query, then the sort key is the device number (vin) in the timing scenario. If the requirement is a detailed query, aggregation query or multi-device query within a certain time period, then the sort key is the device number and timestamp (vin,daq_time).

CREATE INDEX idx_mars2 ON vehicle_basic_data_mars2 
USING mars2_btree(vin, daq_time);

Notes!
When data from the same time point in the device is reported in batches, MARS2 can merge data from the same device (the value of vin in this table) at the same time (the value of daq_time in this table). The merge feature requires that uniquemode=true be manually specified when creating an index, because the default value of this option is false. For example, when you specify uniquemode=true, device 'A01' will send back 3 pieces of data at '2022-01-01 00:00:00', and finally overwrite the original two pieces of data based on the last data, only one piece of data is retained; but if you default uniquemode=false, then device 'A01' will send back 3 pieces of data at '2022-01-01 00:00:00', in the end, all of them will be retained without any processing.