Table Design Best Practices

Based on the characteristics of different storage engines, you can flexibly create tables suited to various scenarios. Below are recommended practices.

MARS3
MARS2
HEAP
AO

1 MARS3 Tables

MARS3 tables depend on the matrixts extension. Before creating a table, you must first install this extension in the database where the MARS3 storage engine will be used.

Note!
The matrixts extension is database-level. Create it once per database; do not recreate it repeatedly.

=# CREATE EXTENSION matrixts;

Use USING MARS3 to specify the storage engine when creating a table, and use ORDER BY to define the sort key. A basic example is shown below:

=# CREATE TABLE mars3(
    time timestamp with time zone,
    tag_id int,
    i4 int4,
    i8 int8
)
USING MARS3 ORDER BY (tag_id, time);

We extend this to the following vehicle telematics time-series example for further explanation.

=# CREATE TABLE vehicle_basic_data_mars3(
  daq_time timestamp ,
  vin varchar(32)  COLLATE "C" ,
  lng float ,
  lat float ,
  speed float ,
  license_template varchar(16) ,
  flag integer
)
USING MARS3
WITH (compresstype=zstd, compresslevel=3,compress_threshold=1200,
        mars3options='rowstore_size=64')
DISTRIBUTED BY (vin)
ORDER BY (vin, daq_time)
PARTITION BY RANGE (daq_time)
( START ('2022-07-01 00:00:00') INCLUSIVE
  END ('2022-08-01 00:00:00') EXCLUSIVE
  EVERY (INTERVAL '1 day')
,DEFAULT PARTITION OTHERS);

After successfully creating a MARS3 table, you may choose to create a mars3_brin index on the sort key. The mars3_brin index is a sparse index with minimal impact on disk space and insert performance. Combined with the ordered nature of the sort key, it enables optimal query efficiency on the sort key.

You can also create a mars3_brin index on non-sort-key columns. However, due to lack of ordering, query performance will be somewhat reduced. Add such indexes only as needed.

=# CREATE INDEX idx_mars3 ON vehicle_basic_data_mars3 USING mars3_brin(vin, daq_time);

Based on the parameters and statements in the example above, here are best practice guidelines:

Note!
We encourage you to follow our recommendations when designing tables, but do not blindly copy them. Time-series use cases vary widely; always analyze your specific scenario.

Choose the Sort Key

When designing a table, carefully select the sort key. The purpose of the sort key is to place data with similar time dimensions or attributes physically close together, reducing I/O seeks and improving query performance. Therefore, the sort key should align with your primary query patterns. For point queries on a single device, use the device identifier (e.g., vin). For detailed, aggregate, or multi-device queries over time ranges, use both the device ID and timestamp (e.g., (vin, daq_time)).

We recommend using the main entity identifier as the sort key. If a timestamp exists, include it after the identifier. In the example, the sort key is (vin, daq_time).
COLLATE "C"

Apply this option only to device ID fields; do not apply it to other fields. This improves sorting and query performance for text-type columns.
USING MARS3

This is a fixed syntax for MARS3 tables and must not be altered.
WITH (compresstype=zstd, compresslevel=3, mars3options='compress_threshold=1200, rowstore_size=64')

We recommend using zstd compression at level 3. This example uses general-purpose compression. For custom compression settings, see Using Compression.

In this table, the L0 Run size is 64MB and the compression threshold is 1200.
DISTRIBUTED BY (vin)

Use this clause to specify the distribution key.
The distribution key determines how data is distributed across segments and significantly impacts performance. Key considerations:
- Avoid data skew to ensure even distribution and maximize parallelism.
- For multi-table joins, align the join key with the distribution key early in design to avoid data movement.
In the example, vin is used as the distribution key so that all data for a single device resides on the same segment, enabling efficient local computation and avoiding inter-node data redistribution.
PARTITION BY RANGE (daq_time)

Specify the partitioning key for partitioned tables.
We recommend using time as the partition key. This allows automatic data expiration via Automatic Partition Management and leverages partition pruning to filter out irrelevant time partitions.

Aim for partitions containing between one million and ten million rows. Too many small partitions consume excessive memory; too few reduce pruning effectiveness and complicate data lifecycle management.

In the example, daq_time is the partition key. Most queries filter by data collection time. For example, adding a condition like WHERE daq_time >= CURRENT_DATE - INTERVAL '1 day' allows the database to quickly identify the relevant partition and retrieve data efficiently.
( START ('2022-07-01 00:00:00') INCLUSIVE END ('2022-08-01 00:00:00') EXCLUSIVE EVERY (INTERVAL '1 days') ,DEFAULT PARTITION OTHERS);

This SQL statement indicates that subpartitions will be created from 2022-07-01 00:00:00 to just before 2022-08-01 00:00:00, using the START...END clause and INCLUSIVE、EXCLUSIVE keyword.
EVERY (INTERVAL '1 day')

Subpartition intervals are set to 1天. You may use hour, month, or year instead of day, depending on data volume. For instance:
- If your system receives 1M–10M records per day, use 1 day.
- If daily volume is tens to hundreds of thousands, 1 month is appropriate.
- If daily volume is under 10K, 1 year is sufficient.
DEFAULT PARTITION OTHERS

Defines the default partition. Data with timestamps not matching any defined range is stored here.
Use VACUUM/VACUUM FULL for Optimal Query Performance

Running VACUUM/VACUUM FULL converts unordered row-store runs into ordered column-store runs.
VACUUM FULL performs the same conversion and additionally merges multiple runs into one, achieving even better query performance. However, VACUUM FULL requires an exclusive lock on the table, so it cannot run concurrently with other operations on the same table.

When to run VACUUM/VACUUM FULL:
- During continuous data ingestion, no cleanup is needed.
- When data becomes stable, perform cleanup before running queries.
- After large-scale data changes (bulk updates/inserts), run cleanup once the operation completes.
Enable Unique Mode for Batch Data Ingestion

When data from the same device arrives in batches at the same timestamp, MARS3 can merge duplicate entries (same daq_time and vin).

This feature must be explicitly enabled via uniquemode=true at table creation, as the default is false. For example:
- With uniquemode=true, three records from device 'A01' at '2022-01-01 00:00:00' will be merged, keeping only the last one.
- With uniquemode=false (default), all three records are retained.
Note!
To enable Unique Mode, the first column in the ORDER BY clause must have a NOT NULL constraint. MARS3 Unique Mode does not support deletion.

Example DDL with Unique Mode enabled:
```
 =# CREATE TABLE vehicle_basic_data_mars3_um(
   daq_time timestamp ,
   vin varchar(32)  COLLATE "C" NOT NULL,
   lng float ,
   lat float ,
   speed float ,
   license_template varchar(16) ,
   flag integer
 )
 USING MARS3
 WITH (compresstype=zstd, compresslevel=3,compress_threshold=1200,uniquemode=true,
         mars3options='rowstore_size=64')
 DISTRIBUTED BY (vin)
 ORDER BY (vin, daq_time)
 PARTITION BY RANGE (daq_time)
 ( START ('2022-07-01 00:00:00') INCLUSIVE
   END ('2022-08-01 00:00:00') EXCLUSIVE
   EVERY (INTERVAL '1 day')
 ,DEFAULT PARTITION OTHERS);
```

2 MARS2 Tables

MARS2 tables depend on the matrixts extension. Install the extension before creating tables.

Note!
The matrixts extension is database-level. Install it once per database.

=# CREATE EXTENSION matrixts;

Use USING MARS2 to specify the MARS2 storage engine. Example:

=# CREATE TABLE mars2(
    time timestamp with time zone,
    tag_id int,
    i4 int4,
    i8 int8
)
USING MARS2;

Extended example for vehicle telematics:

=# CREATE TABLE vehicle_basic_data_mars2(
  daq_time timestamp  encoding (minmax),
  vin varchar(32)  COLLATE "C" encoding (minmax),
  lng float encoding (minmax),
  lat float encoding (minmax),
  speed float ,
  license_template varchar(16) ,
  flag integer
)
USING MARS2
WITH (compresstype=zstd, compresslevel=3)
DISTRIBUTED BY (vin)
PARTITION BY RANGE (daq_time)
( START ('2022-07-01 00:00:00') INCLUSIVE
  END ('2022-08-01 00:00:00') EXCLUSIVE
  EVERY (INTERVAL '1 day')
,DEFAULT PARTITION OTHERS);

After creating a MARS2 table, you must create a mars2_btree index to enable normal read and write operations.

The index enforces physical data ordering, placing similar or related data close together to reduce I/O and improve query speed. Therefore, the sort key should reflect common query patterns. For single-device queries, use the device ID (vin). For time-range or multi-device queries, use (vin, daq_time).

=# CREATE INDEX idx_mars2 ON vehicle_basic_data_mars2 
USING mars2_btree(vin, daq_time);

Best practices based on the example:

Note!
Follow our guidance, but avoid blind replication. Analyze your specific use case.

Choose the Sort Key

Select the sort key during table design. We recommend using the primary entity identifier. If a timestamp is available, include it after the identifier. In the example, the sort key is (vin, daq_time).

Since the sort key is part of the index, queries on it can be efficiently located without full table scans. This optimizes queries starting from the entity, ensuring related data is stored contiguously, reducing I/O and speeding up access.
encoding (minmax)

This option enhances computational efficiency.
- Apply minmax encoding to all sort key columns. Since these are ordered, minmax filters are effective during full scans.
- Use minmax for columns involved in min, max, avg, sum, count, or WHERE condition filtering.
Apply this judiciously. Overuse increases CPU and I/O overhead during data ingestion.
COLLATE "C"

Same as in the MARS3 example.
USING MARS2

Fixed syntax for MARS2 tables; do not modify.
WITH (compresstype=zstd, compresslevel=3)

Use zstd compression at level 3. See Using Compression for custom configurations.
DISTRIBUTED BY (vin)

Same as in the MARS3 example.
PARTITION BY RANGE (daq_time)

Same as in the MARS3 example.
( START ('2022-07-01 00:00:00') INCLUSIVE END ('2022-08-01 00:00:00') EXCLUSIVE EVERY (INTERVAL '1 days'),default partition others);

Same as in the MARS3 example.
EVERY (INTERVAL '1 day')

Same as in the MARS3 example.
Enable Unique Mode for Batch Data

MARS2 supports merging duplicate records (same daq_time and vin) when data arrives in batches.

Enable this via uniquemode=true when creating the index, as the default is false. For example:
- With uniquemode=true, three records from device 'A01' at '2022-01-01 00:00:00' are merged into one (last record wins).
- With uniquemode=false (default), all three are kept.
Example index DDL with Unique Mode:
```
 =# CREATE INDEX idx_mars2_um ON vehicle_basic_data_mars2_um 
 USING mars2_btree(vin, daq_time) WITH(uniquemode = true);
```
Note!
A MARS2 table supports only one index, which cannot be dropped after creation.

3 HEAP Tables / AO Tables

3.1 HEAP Tables

HEAP is the default storage engine in YMatrix. If no storage engine is specified, the table is created as a HEAP table.

=# CREATE TABLE disk_heap(
    time timestamp with time zone,
    tag_id int,
    read float,
    write float
)
DISTRIBUTED BY (tag_id);

3.2 AO Tables

3.2.1 AORO Tables

AORO tables are row-oriented AO tables. Row orientation is the default for AO tables.
AORO supports table-level compression only, not column-level compression.

=# CREATE TABLE disk_aoro(
    time timestamp with time zone,
    tag_id int,
    read float,
    write float
)
WITH(appendonly=true,orientation=row,compresstype=zlib,compresslevel=4)
DISTRIBUTED BY (tag_id);

3.2.2 AOCO Tables

AOCO tables are column-oriented AO tables.
AOCO supports both table-level and column-level compression.

=# CREATE TABLE disk_aoco(
    time timestamp with time zone,
    tag_id int ENCODING(compresstype=zlib,compresslevel=5),
    read float,
    write float
)
WITH(appendonly=true,orientation=column,compresstype=zlib,compresslevel=6)
DISTRIBUTED BY (tag_id);

3.3 Choosing Between HEAP and AO

Best practices for selecting between HEAP and AO:

Use the HEAP storage engine for tables or partitions that undergo frequent updates, deletes, or individual inserts.
Use HEAP for tables or partitions subject to concurrent updates, deletes, or inserts.
Use the AO storage engine for tables or partitions that are initially loaded and rarely updated, with subsequent inserts performed in batches. Never perform single-row UPDATE, DELETE, or INSERT operations on AO tables.
Concurrent bulk inserts are allowed on AO tables, but concurrent bulk updates or deletes are not recommended.
Space occupied by deleted or updated rows in AO tables is not reclaimed as efficiently as in HEAP tables. Therefore, AO is unsuitable for frequently updated tables.
AO is designed for large tables that are loaded once, rarely modified, and frequently queried for analytical processing.

← Previous

Introduction

Compression

English Русский 简体中文