The MARS3 storage engine builds upon MARS2 with multiple optimizations. It adopts a hybrid row-column storage format, enabling both high-speed data ingestion and extreme compression.
MARS3 supports data updates and deletes via the UPDATE and DELETE clauses (except in Unique Mode).
MARS3 supports adding and dropping columns, as well as COPY and pg_dump operations.
Each MARS3 table internally uses an LSM Tree structure for storage. LSM Tree (Log-Structured Merge Tree) is a layered, ordered, disk-oriented data structure. Its core idea is to leverage disk performance through batched sequential writes, which significantly outperforms random writes.
Below is a diagram illustrating the internal structure of MARS3:
(Figure: MARS3 internal structure)
We will analyze the above diagram layer by layer.
- Sort Key: the columns listed in ORDER BY determine the physical order of data within each Run. If a sort key column is of a text type, specifying COLLATE C on that column can accelerate sorting.
- Run: newly written data is organized into Runs. The loading mode is controlled by the prefer_load_mode parameter (see Configuration below for details), and the size of L0 Runs (the rowstore_size parameter) can be tuned to optimize performance for different workloads.
- Merge: overlapping data ranges among Runs cause read amplification and degrade query performance. When the number of Runs on disk exceeds a threshold, MARS3 merges multiple Runs into one via external sorting. This process is called merge.
During merging, data remains readable and writable.
After a merge completes, the original Runs are marked as obsolete based on transaction ID and enter a reclaimable state.
- BRIN index: creating a BRIN index on a MARS3 table is supported, for example: CREATE INDEX brin_idx ON t1 USING mars3_brin(time, tag_id);
- Dead data: obsolete Runs left behind by merging are known as Dead data. You may also periodically run VACUUM to clean up Dead data.
- Data writes: data is written via INSERT, then flushed into L0 Runs.
- Data deletion: use DELETE for deletion. Deletions are recorded in the Delta file of the corresponding Run and physically removed during Run merging.
- Data updates: use UPDATE for updates. An update first deletes the old record, then inserts a new one.
- Unique Mode: updates do not use UPDATE clauses; simply using INSERT automatically performs the operation. To update a record with a specific Unique Key (i.e., the value of the sort key defined at table creation), insert a new record with the same Unique Key. For example: CREATE TABLE mars3_t(c1 int NOT NULL, c2 int) USING MARS3 WITH (uniquemode=true) ORDER BY (c1, c2);, where the Unique Key is (c1, c2).

Note!
When Unique Mode is enabled, the first field in the ORDER BY clause must include the NOT NULL constraint during definition.
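A minimal sketch of the insert-as-update behavior in Unique Mode, using a hypothetical table mars3_uq whose Unique Key is (dev_id, ts) and which carries a separate value column (names and values are illustrative):
=# CREATE TABLE mars3_uq (
dev_id bigint NOT NULL,
ts timestamp,
power float
) USING MARS3
WITH (uniquemode=true)
DISTRIBUTED BY (dev_id)
ORDER BY (dev_id, ts);
=# INSERT INTO mars3_uq VALUES (1, '2023-07-01 00:00:00', 3.5);
-- "Update" the record with Unique Key (1, '2023-07-01 00:00:00'):
-- inserting a row with the same Unique Key replaces the old record (power changes from 3.5 to 4.2).
=# INSERT INTO mars3_uq VALUES (1, '2023-07-01 00:00:00', 4.2);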
After creating the matrixts extension, the simplest way to create a MARS3 table is to add the USING and ORDER BY clauses to the CREATE TABLE statement. For extended examples, see Table Design Best Practices.
=# CREATE TABLE metrics (
ts timestamp,
dev_id bigint,
power float,
speed float,
message text
) USING MARS3
ORDER BY (dev_id,ts);
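Data is then written with ordinary INSERT statements; a minimal usage sketch against the metrics table above (values are illustrative):
=# INSERT INTO metrics VALUES
('2023-07-01 00:00:00', 1, 3.5, 60.0, 'ok'),
('2023-07-01 00:00:10', 1, 3.6, 61.2, 'ok');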
Note!
Creating a BRIN index on MARS3 tables is supported but not required.
Starting from version 6.3.0, the requirement to use the ORDER BY clause when creating MARS3 tables has been removed.
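Given the 6.3.0 change noted above, a MARS3 table can be created with just the USING clause; a minimal sketch (table name and columns are illustrative):
=# CREATE TABLE metrics_noorder (
ts timestamp,
dev_id bigint,
power float
) USING MARS3
DISTRIBUTED BY (dev_id);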
Note!
These parameters are table-level settings. They can only be specified at table creation using the WITH (mars3options='a=1,b=2,...') clause (except for compress_threshold, which is shared with MARS2 and can be set directly via WITH). Once set, they cannot be changed. These settings apply to a single table. For more information, see Database Table Parameters.
The following parameter controls the size of Runs in L0, indirectly affecting Run sizes in higher levels.
| Parameter | Unit | Default | Range | Description |
|---|---|---|---|---|
| rowstore_size | MB | 64 | 8 – 1024 | Controls when an L0 Run switches. A new Run is created when data exceeds this size. |
The following parameter sets the compression threshold, balancing compression efficiency and read performance. Too low a value reduces compression; too high increases memory usage.
| Parameter | Unit | Default | Range | Description |
|---|---|---|---|---|
| compress_threshold | Tuples | 1200 | 1 – 100000 | Compression threshold. Maximum number of tuples per column to compress in one batch. |
The following parameter sets the data loading mode in MARS3.
| Parameter | Unit | Default | Range | Description |
|---|---|---|---|---|
| prefer_load_mode | - | normal | normal / bulk | Data loading mode. normal means normal mode: new data is first written to row-format Runs in L0, then moved to columnar Runs in L1 after reaching rowstore_size. This involves one extra I/O compared to bulk, and column conversion is asynchronous. Suitable for high-frequency, small-batch writes where I/O capacity is sufficient and latency is critical. bulk means bulk load mode: used for low-frequency, large-batch inserts. Data is written directly to columnar Runs in L1, reducing one I/O compared to normal. Column conversion is synchronous. Suitable for scenarios with limited I/O capacity and low latency sensitivity. |
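As an illustration of the table-level syntax described in the note above, prefer_load_mode would be set through mars3options at table creation; a minimal sketch assuming bulk load mode is wanted (table name and columns are illustrative):
=# CREATE TABLE metrics_bulk (
ts timestamp,
dev_id bigint,
power float
) USING MARS3
WITH (mars3options='prefer_load_mode=bulk')
DISTRIBUTED BY (dev_id)
ORDER BY (dev_id, ts);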
The following parameter controls the level size amplification factor.
| Parameter | Unit | Default | Range | Description |
|---|---|---|---|---|
| level_size_amplifier | - | 8 | 1 – 1000 | Level size amplification factor. Threshold for triggering level merges: rowstore_size * (level_size_amplifier ^ level). A higher value slows reads but speeds up writes. Choose based on workload (write-heavy vs. read-heavy, compression ratio, etc.). Note: Avoid excessive Run counts per level, as this may degrade query performance or block new inserts. |
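Taking the formula and defaults above at face value (rowstore_size = 64 MB, level_size_amplifier = 8), the merge trigger threshold works out to 64 MB * 8^1 = 512 MB for L1 and 64 MB * 8^2 = 4096 MB for L2; raising level_size_amplifier pushes these thresholds up, so merges happen later.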
Configuration example:
=# CREATE TABLE metrics (
ts timestamp,
dev_id bigint,
power float,
speed float
) USING MARS3
WITH (compress_threshold=1200,mars3options='rowstore_size=64',compresstype=zstd, compresslevel=1)
DISTRIBUTED BY (dev_id)
ORDER BY (dev_id,ts)
PARTITION BY RANGE (ts)
( START ('2023-07-01 00:00:00') INCLUSIVE
END ('2023-08-01 00:00:00') EXCLUSIVE
EVERY (INTERVAL '1 day')
,DEFAULT PARTITION OTHERS);
The following views can be used to inspect the internal state of a MARS3 table:

- matrixts_internal.mars3_level_stats: view the status of each Level in a MARS3 table. Useful for assessing table health, such as whether Runs are merging as expected and whether Run counts are within the normal range.
- matrixts_internal.mars3_files: view the file status of a MARS3 table, including auxiliary and delta files (Data, Delta, Index files, etc.), to verify expected file states.
- matrixts_internal.mars3_info_brin: view the status of a specific BRIN index on a MARS3 table.

The MARS2 storage engine is designed for high-speed data loading and querying, using ordered storage to eliminate search overhead.
Like MARS3, MARS2 uses an LSM Tree structure for storage.
Below is a diagram of MARS2's internal structure:
(Figure: MARS2 internal structure)
We will analyze this diagram concept by concept.
- Sort Key: if a sort key column is of a text type, specifying COLLATE C on that column can speed up sorting.
- Level: when the number of L0 Runs reaches a threshold (mars2_automerge_threshold), L0 merge is triggered, combining all L0 Runs into one. If the merged Run exceeds a size threshold (level0_upgrade_size), it is promoted to L1. When L1 exceeds a threshold (level1_upgrade_size), L1 merge is triggered, combining all L1 Runs into one. If the merged Run exceeds that same threshold (level1_upgrade_size), it is promoted to L2.

First, create the matrixts extension:
=# CREATE EXTENSION matrixts;
Then explicitly declare min/max recording:
=# CREATE TABLE metrics (
ts timestamp ENCODING(minmax),
dev_id bigint ENCODING(minmax),
power float,
speed float,
message text
) USING MARS2;
Create the mars2_btree index:
=# CREATE INDEX ON metrics
USING mars2_btree (ts, dev_id);
Same as MARS3.
Data is written via INSERT, then processed into a Run. To remove data, you can TRUNCATE partitioned tables.

After creating the matrixts extension, the simplest way to create a MARS2 table is to use the USING clause in CREATE TABLE and create an index. For extended examples, see Table Design Best Practices.
=# CREATE TABLE metrics (
ts timestamp,
dev_id bigint,
power float,
speed float,
message text
) USING MARS2;
=# CREATE INDEX ON metrics
USING mars2_btree (ts, dev_id);
Note!
Table-level parameters can only be set at table creation using the WITH clause, apply to a single table, and cannot be changed afterward. Global parameters can be set at session or system level. System-level changes require mxstop -u to take effect. For more information, see Database Table Parameters.
The following parameters control merging behavior, as described in the Level section above.
| Merge Control Parameter | Unit | Default | Range | Description |
|---|---|---|---|---|
| mars2_automerge_threshold | run | 32 | 10 – 2048 | Controls merge trigger for all MARS2 tables: number of L0 Runs to trigger merge. Global setting. Per-table override: level0_merge_threshold |
| level0_merge_threshold | run | 32 | 1 – 2048 | Controls per-table L0 merge threshold. Table-level setting |
| level0_upgrade_size | MB | 25 | 1 – 10000 | Controls per-table L0 → L1 upgrade size. If merged Run exceeds this, promote to L1. Table-level setting |
| level1_upgrade_size | MB | 1000 | 1 – 10000 | Controls per-table L1 → L2 upgrade size. If merged Run exceeds this, promote to L2. Table-level setting |
- To reduce merge frequency, increase mars2_automerge_threshold / level0_merge_threshold.
- To control when merged Runs are promoted to higher levels, adjust level0_upgrade_size / level1_upgrade_size.

The following parameter controls compression behavior. Too low a value reduces compression; too high increases memory usage.
| Compression Control Parameter | Unit | Default | Range | Description |
|---|---|---|---|---|
| compress_threshold | tuples | 1200 | 1 – 100000 | Compression threshold. Maximum number of tuples to compress in one batch. Table-level setting. |
The following parameters control sort memory size during inserts. When inserting into multiple partitions, each partition gets at least mars2_sort_mem_core, and total usage does not exceed mars2_sort_mem.
| Sort Memory Parameter | Unit | Default | Range | Description |
|---|---|---|---|---|
| mars2_sort_mem | KB | 2097152KB (2GB) | 128KB – 2147483647KB (~2048GB) | Sort memory per insert. For partitioned tables, shared among partitions. Global setting. |
| mars2_sort_mem_core | KB | 16384KB (16MB) | 128KB – 2147483647KB (~2048GB) | Minimum sort memory per partitioned table. Global setting. |
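As a worked example of these two limits with the default values: an insert that touches 8 partitions is guaranteed at least 8 * 16384 KB = 128 MB of sort memory in total, while overall sort memory for the insert remains capped at mars2_sort_mem (2 GB by default).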
Table-level configuration example:
=# CREATE TABLE metrics (
ts timestamp,
dev_id bigint,
power float,
speed float,
message text
)
USING MARS2
WITH (compress_threshold=1200,level0_merge_threshold=32);
=# CREATE INDEX ON metrics
USING mars2_btree (ts, dev_id);
Session-level global parameter example:
=# SET mars2_sort_mem TO 2097152;
System-level global parameter example:
=# \q
$ gpconfig -c mars2_sort_mem -v 2097152
$ mxstop -u
HEAP is YMatrix's default storage engine, inherited from PostgreSQL. It supports only row storage, not column storage or compression. Based on MVCC, it is suitable for workloads with frequent updates and deletes.
Under MVCC, HEAP tables do not physically delete data during updates or deletes. Instead, old data is masked using version visibility metadata. As a result, frequent updates and deletes cause physical storage size to grow. You should periodically run VACUUM to reclaim space.
Use the following SQL to create a HEAP table in YMatrix:
=# CREATE TABLE disk_heap(
time timestamp with time zone,
tag_id int,
read float,
write float
)
DISTRIBUTED BY (tag_id);
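Because frequent updates and deletes leave dead row versions behind (as described above), periodic maintenance is worthwhile; a minimal sketch against the disk_heap table (the predicates are illustrative):
=# UPDATE disk_heap SET read = read + 1 WHERE tag_id = 1;
=# DELETE FROM disk_heap WHERE time < now() - interval '30 days';
-- reclaim the space occupied by the old row versions
=# VACUUM disk_heap;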
Tables using AOCO or AORO as storage engines are collectively called AO (Append-Optimized) tables. They support insert, update, and delete operations, as well as compression.
AORO supports row storage; AOCO supports column storage.
AO tables differ significantly from HEAP tables in both logical and physical structure. As described earlier, HEAP tables use MVCC to manage data visibility after updates and deletes. In contrast, AO tables use an auxiliary bitmap table to track visible rows.
For AO tables with frequent updates and deletes, regular cleanup is also required. However, VACUUM on an AO table must reset the bitmap and compact the physical files, which is typically slower than cleaning up a HEAP table.
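The section above does not show how an AO table is declared; a minimal sketch assuming the Greenplum-style appendonly / orientation storage options apply in YMatrix (table name, columns, and compression settings are illustrative):
=# CREATE TABLE disk_aoco (
time timestamp with time zone,
tag_id int,
read float,
write float
)
WITH (appendonly=true, orientation=column, compresstype=zlib, compresslevel=5)
DISTRIBUTED BY (tag_id);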
Note!
For detailed information on storage engines, usage, and best practices, see Table Design Best Practices.