Table design best practices

Depending on the different characteristics of the storage engine, you can flexibly create different tables under different scenario requirements. We give the following example.

MARS3
MARS2
HEAP
AO

1 MARS3 table

MARS3 tables depend on the matrixts extension. Before building a table, you first need to create an extension in the database using the storage engine.

Notes!
matrixts is extended to the database level, and it can be created once in a database without repeated creation.

=# CREATE EXTENSION matrixts;

When creating tables, use USING MARS3 to specify the storage engine, and use ORDER BY to specify the sort key. The basic representation is as follows:

=# CREATE TABLE mars3(
    time timestamp with time zone,
    tag_id int,
    i4 int4,
    i8 int8
)
USING MARS3 ORDER BY (tag_id, time);

We extend to the following examples of Internet of Vehicle timing scenarios for further explanation.

=# CREATE TABLE vehicle_basic_data_mars3(
  daq_time timestamp ,
  vin varchar(32)  COLLATE "C" ,
  lng float ,
  lat float ,
  speed float ,
  license_template varchar(16) ,
  flag integer
)
USING MARS3
WITH (compresstype=zstd, compresslevel=3,compress_threshold=1200,
        mars3options='rowstore_size=64')
DISTRIBUTED BY (vin)
ORDER BY (vin, daq_time)
PARTITION BY RANGE (daq_time)
( START ('2022-07-01 00:00:00') INCLUSIVE
  END ('2022-08-01 00:00:00') EXCLUSIVE
  EVERY (INTERVAL '1 day')
,DEFAULT PARTITION OTHERS);

After successfully creating the MARS3 table, you can choose to create a mars3_brin index on the sort key. mars3_brin is a sparse index with minimal impact on disk space and insertion performance. Combined with the orderly characteristics of sort keys, the ultimate query efficiency of sort keys can be achieved.

The mars3_brin index can also be created in the non-sorted key, but because there is no order, there will be some discounts in query performance and can be added as needed.

=# CREATE INDEX idx_mars3 ON vehicle_basic_data_mars3 USING mars3_brin(vin, daq_time);

Combining the specific parameters and statements in the above example table, make best practice instructions:

Notes!
We hope you will design ideas based on our suggestions, but we don’t want you to blindly apply them. The specific timing scenarios are ever-changing, and specific analysis of specific situations is still very necessary.

Select the sort key

When designing a table, you must consider the selection of sorting keys. The purpose of the sorting key is to make data of the same time dimension or similar attributes physically close as possible to reduce the number of I/O addressing and improve query efficiency. Therefore, the selection of the sort key needs to comply with the main business query characteristics. For example, the requirement is a single-device point query, then the sort key is the device number (vin) in the timing scenario. If the requirement is a detailed query, aggregation query or multi-device query within a certain time period, then the sort key is the device number and timestamp (vin,daq_time).

We recommend that you select the identification field of the data subject. If there is a time stamp, you can add it to the identification field together as the sort key. In the example table is (vin, daq_time).

COLLATE "C"

This option is only added in the Device Encoding field, and no other fields need to be added. With this operation, you can improve the sorting and querying speed of text types.

USING MARS3

It is a fixed writing method using MARS3 tables and cannot change the content.

WITH (compresstype=zstd, compresslevel=3, mars3options='compress_threshold=1200, rowstore_size=64')

In this table, zstd is recommended, and compression level is level 3. This example uses a general compression algorithm, see [Using Compression] (/doc/5.1/reference/storage/compression).

In this table, the L0 Run size is 64MB; the compression threshold is 1200.

DISTRIBUTED BY (vin)

Use this statement to select the distribution key. Distribution keys determine how data is distributed to each data node (segment), which has a great impact on performance. The main selection criteria are: (1) Try to avoid data skew, so that the data is evenly distributed, and queries can be more fully parallel; (2) When it comes to multi-table connections, try to keep the connection keys consistent with the distribution keys at the beginning of the design to avoid data movement. In the example, we use the device encoding field as the distribution key, so that we can query the data of the same device and perform related calculations. This avoids performance losses caused by data redistribution between nodes.

PARTITION BY RANGE (daq_time)

Specifies the partition key for the partition table. We recommend choosing time as the partition key. On the one hand, it can automatically eliminate data in conjunction with [automatic partition management] (/doc/5.1/datamodel/apm). On the other hand, it can use partition cutting capabilities to filter time partitions that are not interested. It is recommended that you control the size of the partition to around one million to ten million data lines. Too low will cause too many partitions to consume memory, too high filtering is not good, and it is not convenient for data elimination. In the example, we use the time when the device collects data as the partition key. Usually a large number of queries are the time to filter data collection. For example, when you want to query the data within a day and perform related calculations, you need to add the filtering condition `WHERE daq_time >= CURRENT_DATE - INTERVAL '1 day', so that the database will quickly determine which partition subtable the data is on, so as to quickly locate the data and query the table.

( START ('2022-07-01 00:00:00') INCLUSIVE END ('2022-08-01 00:00:00') EXCLUSIVE EVERY (INTERVAL '1 days') ,DEFAULT PARTITION OTHERS);

This SQL statement indicates that you will use the START...END statement and the INCLUSIVE, EXCLUSIVE keywords to create a subpartition table that starts at 2022-07-01 zero point to 2022-08-01 zero point.

EVERY (INTERVAL '1 day')

The time span of the subpartition table is 1 day. In addition to using day, you can also use hour, month, year, etc. and determine it according to the size of the data volume. For example, in a day, your server receives up to 1 million to 10 million pieces of data, and using "1 day" as the time interval on that day is the best choice; if the data volume of a day is only tens of thousands or hundreds of thousands, then choosing "1 month" is very suitable; if the data volume per day is only tens of thousands, then the "1 year" subpartition table can be generated once.

DEFAULT PARTITION OTHERS

Indicates the default partition. If the corresponding subtable storage of data cannot be found in other partitions, it will be stored in this partition.

Use VACUUM/VACUUM FULL to achieve ultimate query performance

The function of executing VACUUM/VACUUM FULL: it can convert unordered row storage runs into ordered column storage runs, and VACUUM FULL removes the above VACUUM and merge multiple runs into one run, thereby achieving more extreme query performance. But VACUUM FULL requires an exclusive lock on the table it works and therefore cannot be paralleled with other uses of this table.

Time to execute VACUUM/VACUUM FULL: If you continue to write data to the MARS3 table, you do not need to care about the cleaning action. When the data in the table has stabilized, it is necessary to perform a cleanup operation before performing a query operation; if there is a change in the entire database or a certain table, it is also recommended to perform a cleanup after each change is completed.

If it is a data batch scenario, you can choose to enable Unique Mode

When data from the same time point in the device is reported in batches, MARS3 can merge data from the same device (the value of vin in this table) at the same time (the value of daq_time in this table).

The merge feature requires that uniquemode=true be manually specified when creating the table, because the default value of this option is false. For example, when you specify uniquemode=true, device 'A01' will send back 3 data at '2022-01-01 00:00:00', and finally the original two data will be overwritten according to the last data, only one data will be retained; but if you default uniquemode=false, then device 'A01' will send back '2022-01-01 00:00:00', the 3 data will be retained in the end without any processing.

Notes!
If Unique Mode is enabled, the first field of the ORDER BY clause needs to be added to the NOT NULL constraint when defining. Unique Mode of MARS3 does not support deletion for the time being.

If enabled, the example table creation statement is as follows:

=# CREATE TABLE vehicle_basic_data_mars3_um(
  daq_time timestamp ,
  vin varchar(32)  COLLATE "C" NOT NULL,
  lng float ,
  lat float ,
  speed float ,
  license_template varchar(16) ,
  flag integer
)
USING MARS3
WITH (compresstype=zstd, compresslevel=3,compress_threshold=1200,uniquemode=true,
        mars3options='rowstore_size=64')
DISTRIBUTED BY (vin)
ORDER BY (vin, daq_time)
PARTITION BY RANGE (daq_time)
( START ('2022-07-01 00:00:00') INCLUSIVE
  END ('2022-08-01 00:00:00') EXCLUSIVE
  EVERY (INTERVAL '1 day')
,DEFAULT PARTITION OTHERS);

2 MARS2 table

MARS2 tables depend on the matrixts extension. Before building a table, you first need to create an extension in the database using the storage engine.

Notes!
matrixts is extended to the database level, and it can be created once in a database without repeated creation.

=# CREATE EXTENSION matrixts;

When creating tables, use USING MARS2 to specify the storage engine, and the basic construction representation is as follows.

=# CREATE TABLE mars2(
    time timestamp with time zone,
    tag_id int,
    i4 int4,
    i8 int8
)
USING MARS2;

We extend to the following examples of Internet of Vehicle timing scenarios for further explanation.

=# CREATE TABLE vehicle_basic_data_mars2(
  daq_time timestamp  encoding (minmax),
  vin varchar(32)  COLLATE "C" encoding (minmax),
  lng float encoding (minmax),
  lat float encoding (minmax),
  speed float ,
  license_template varchar(16) ,
  flag integer
)
USING MARS2
WITH (compresstype=zstd, compresslevel=3)
DISTRIBUTED BY (vin)
PARTITION BY RANGE (daq_time)
( START ('2022-07-01 00:00:00') INCLUSIVE
  END ('2022-08-01 00:00:00') EXCLUSIVE
  EVERY (INTERVAL '1 day')
,DEFAULT PARTITION OTHERS);

After successfully creating the MARS2 table, you must create an additional index of type mars2_btree so that you can read and write normally.

The purpose of using index sorting is to make data of the same dimension or similar characteristics physically close as possible to reduce the number of I/O addressing and improve query efficiency. Therefore, the selection of sorting keys needs to comply with the main business query characteristics. For example, the requirement is a single-device point query, then the sort key is the device number (vin) in the timing scenario. If the requirement is a detailed query, aggregation query or multi-device query within a certain time period, then the sort key is the device number and timestamp (vin,daq_time).

=# CREATE INDEX idx_mars2 ON vehicle_basic_data_mars2 
USING mars2_btree(vin, daq_time);

Combining the specific parameters and statements in the above example table, make best practice instructions:

Notes!
We hope you will design ideas based on our suggestions, but we don’t want you to blindly apply them. The specific timing scenarios are ever-changing, and specific analysis of specific situations is still very necessary.

Select the sort key

When designing a table, you must consider the selection of sorting keys. We recommend that you select the identification field of the data subject. If there is a time stamp, you can add it to the identification field together as the sort key. In the example table is (vin, daq_time).

The sort key is part of the index, so this type of query can be located to the storage location through the index, thereby avoiding full table scanning. This is to optimize queries based on the data subject and allow the data to be stored in the order of the data subject.
Data from the same subject is stored continuously, which can reduce the number of I/O and improve query speed.

encoding (minmax)

This option is used to improve computing power.

First of all, it is recommended to add the minmax option to all sort keys, because the sort keys are ordered, and minmax filtering can also have a good filtering effect when scanning the full table.

Secondly, if you want to perform aggregated queries such as min, max, avg, sum, count, etc. based on this field, or you need to add this option when filtering the WHERE condition.

The addition of this option needs to be determined based on the business scenario. Do not add it to all fields without thinking, otherwise this operation will bring a lot of consumption of CPU and disk I/O for data writing.

COLLATE "C"

Same as MARS3 example above.

USING MARS2

It is a fixed writing method using the MARS2 table and cannot change the content.

WITH (compresstype=zstd, compresslevel=3)

In this table, zstd is recommended, and compression level is level 3. This example uses a general compression algorithm, see [Using Compression] (/doc/5.1/reference/storage/compression).

DISTRIBUTED BY (vin)

Same as MARS3 example above.

PARTITION BY RANGE (daq_time)

Same as MARS3 example above.

( START ('2022-07-01 00:00:00') INCLUSIVE END ('2022-08-01 00:00:00') EXCLUSIVE EVERY (INTERVAL '1 days'),default partition others);

Same as MARS3 example above.

EVERY (INTERVAL '1 day')

Same as MARS3 example above.

DEFAULT PARTITION OTHERS

Same as MARS3 example above.

If it is a data batch scenario, you can choose to enable Unique Mode

When data from the same time point in the device is reported in batches, MARS2 can merge data from the same device (the value of vin in this table) at the same time (the value of daq_time in this table).

The merge feature requires that uniquemode=true be manually specified when creating an index, because the default value of this option is false. For example, when you specify uniquemode=true, device 'A01' will send back 3 data at '2022-01-01 00:00:00', and finally the original two data will be overwritten according to the last data, only one data will be retained; but if you default uniquemode=false, then device 'A01' will send back '2022-01-01 00:00:00', the 3 data will be retained in the end without any processing.

If enabled, the index statement created in this example is as follows:

=# CREATE INDEX idx_mars2_um ON vehicle_basic_data_mars2_um 
USING mars2_btree(vin, daq_time) WITH(uniquemode = true);

Notes!
A MARS2 table only supports one index and cannot be deleted after creation.

3 HEAP table / AO table

3.1 HEAP table

The HEAP table is the YMatrix default storage engine, so if you do not specify the storage engine when creating the table, all the created HEAP tables are HEAP tables.

=# CREATE TABLE disk_heap(
    time timestamp with time zone,
    tag_id int,
    read float,
    write float
)
DISTRIBUTED BY (tag_id);

3.2 AO table

3.2.1 AORO table

The AORO table is the AO table that is stored. Row storage is the default storage method for AO tables.
AORO supports table-level compression, not column-level compression.

=# CREATE TABLE disk_aoro(
    time timestamp with time zone,
    tag_id int,
    read float,
    write float
)
WITH(appendonly=true,orientation=row,compresstype=zlib,compresslevel=4)
DISTRIBUTED BY (tag_id);

3.2.2 AOCO table

The AOCO table is the column-stored AO table.
AOCO supports table-level compression and column-level compression.

=# CREATE TABLE disk_aoco(
    time timestamp with time zone,
    tag_id int ENCODING(compresstype=zlib,compresslevel=5),
    read float,
    write float
)
WITH(appendonly=true,orientation=column,compresstype=zlib,compresslevel=6)
DISTRIBUTED BY (tag_id);

3.3 Choose one best practice

The best practices to determine whether to use HEAP or AO tables are:

Use the HEAP storage engine for tables and partitions that frequently update, delete, and single insert operations
Use the HEAP storage engine for tables and partitions that will receive concurrent updates, deletes, and insert operations
Use the AO storage engine for tables and partitions that are rarely updated after initial loading and are subsequently only performed in batch operations. Never perform a single update, delete, or insert operation on the AO table. Concurrent batch insertion operations can be performed but concurrent batch updates or deletes are never performed.
The space occupied by updated and deleted rows in the AO table will not be effectively recycled and reused like the HEAP table, so the AO storage engine is not suitable for frequently updated tables. Its design goal is to be used for large tables that are loaded at once, rarely updated, and frequently analyzed query processing.

简体中文

Table design best practices

1 MARS3 table

2 MARS2 table

3 HEAP table / AO table

3.1 HEAP table

3.2 AO table

3.2.1 AORO table

3.2.2 AOCO table

3.3 Choose one best practice