The MARS3 storage engine uses a row-column hybrid storage method to achieve high-speed data writing and extreme compression.
MARS3 supports data updates and deletions through the UPDATE
and DELETE
clauses (except in Unique Mode).
MARS3 supports adding and deleting columns, as well as COPY
and pg_dump
operations.
For each MARS3 single table, the internal storage uses an LSM Tree structure. An LSM Tree (Log-Structured Merge Tree) is a hierarchical, ordered, disk-oriented data structure. Its core principle is to fully leverage disk performance for batch sequential write operations, achieving significantly higher performance than random writes.
The internal architecture of MARS3 is illustrated below:
We will analyze the above diagram layer by layer in a conceptual manner.
prefer_load_mode
parameter. For details, see the following configuration items.rowstore_size
parameter) can be flexibly adjusted to optimize performance across different scenarios.CREATE INDEX brin_idx ON t1 USING mars3_brin(time,tag_id);
Dead
data, though you can also schedule regular use of VACUUM
to clean up Dead
data.INSERT
, then written to Run in L0.DELETE
for deletion. Deletions are recorded in the Delta file corresponding to the Run, and the data is actually deleted during Run merging.UPDATE
for updates. Updates first delete the original data and then reinsert a new data record.UPDATE
clause; simply executing an INSERT
clause will automatically complete the operation. To update a specific data record associated with a Unique Key (i.e., the specific key value corresponding to the sort key specified when creating the table), simply insert a new data record with the same Unique Key. For example, CREATE TABLE mars3_t(c1 int NOT NULL, c2 int) USING MARS3 WITH (uniquemode=true) ORDER BY (c1, c2);
, where the Unique Key is (c1, c2)
.Notes!
If Unique Mode is enabled, the first field in theORDER BY
clause must have aNOT NULL
constraint added when it is defined.
Assuming that the matrixts
extension has already been created, the simplest way to create a table is to add the USING
clause to the CREATE TABLE
statement and append the ORDER BY
clause. For more examples, see Table Design Best Practices.
=# CREATE TABLE metrics (
ts timestamp,
dev_id bigint,
power float,
speed float,
message text
) USING MARS3
ORDER BY (dev_id,ts);
Notes!
The MARS3 table supports creating Brin indexes, but is not required; the MARS3 table must use theORDER BY
clause to create the sort key.
Notes!
This part of the configuration items is a table-level configuration item. You can only use theWITH(mars3options='a=1,b=2,...')
clause configuration when creating data tables (except the parametercompress_threshold
that is also supported by MARS2, and you can only use theWITH
clause configuration directly). It is suitable for a single table and cannot be modified once configured. For more information, see Data Table Configuration Parameters.
The following parameters are used to adjust the size of the L0 layer Run, and can also indirectly control the size of the Run above the L1 layer.
| Parameters | Units | Defaults | Value Range | Description | | -- | -- | -- | -- | -- | -- | -- | | rowstore_size | MB | 64 | 8 ~ 1024 | Used to control when L0 Run switches. When the data size exceeds this value, the next Run will be switched |
The following parameters are used to set the compression threshold, which can be used to adjust the compression effect and improve the reading efficiency. If the compression effect is not obvious when configured too low, the compression effect is consumed more memory when configured too high.
| Parameters | Units | Defaults | Value Range | Description | | -- | -- | -- | -- | -- | -- | -- | | compress_threshold | Tuple | 1200 | 1 ~ 100000 | Compression threshold. Used to control how many tuples (tuples) in each column of a single table are compressed at one time, which is the upper limit of the number of Tuples compressed in the same unit |
The following parameters are used to specify the loading mode of data in MARS3.
| Parameters | Units | Defaults | Value Range | Description |
| -- | -- | -- | -- | -- | -- | -- |
| prefer_load_mode | | normal | normal / bluk | Data loading mode. normal
represents the normal mode. The newly written data is first written to the row storage run in the L0 layer. After accumulating in the rowstore_size
, the column storage run falls to the L1 layer. Compared with the bulk
mode, the column storage conversion will change from synchronous to asynchronous, but is suitable for high-frequency small batch writing scenerios with sufficient I/O capabilities and delay-sensitive high-frequency small batch writing scenerios; bluk
represents the batch loading mode, suitable for low-frequency large batch writing scenerios, directly writing to the column storage run at the L1 layer. Compared with the normal
mode, one I/O is reduced. The column storage conversion will change from asynchronous to synchronous, and is suitable for low-frequency large batch writing scenarios with insufficient I/O capabilities and insensitive to delay |
The following parameters are used to specify the enlargement consistency of Level size.
| Parameters | Units | Defaults | Value Range | Description |
| -- | -- | -- | -- | -- | -- | -- |
| level_size_amplifier | | 8 | 1 ~ 1000 | Level Size Amplification Factor. Level The threshold for triggering a merge operation is calculated as: rowstore_size * (level_size_amplifier ^ level)
. The larger the value, the slower the reading speed and the faster the writing speed. The specific value can be determined based on the specific scenario information (more writes, less reads, less reads, compression rate, etc.). Note: Make sure that the number of runs per Level does not exceed the number of runs, otherwise it will affect query performance and even prevent new data from being inserted |
Configuration example:
=# CREATE TABLE metrics (
ts timestamp,
dev_id bigint,
power float,
speed float,
message text
) USING MARS3
WITH (compress_threshold=1200,mars3options='rowstore_size=64')
ORDER BY (dev_id,ts);
matrixts_internal.mars3_level_stats
: Check the status of each Level level of the MARS3 table, and based on this, you can judge the health of the MARS3 table, such as whether Run has merged as expected, whether its number meets expectations, etc.;matrixts_internal.mars3_files
: Check the file status of the MARS3 table, which can be used to see if the extended files and incremental files of the MARS3 table (Data files, Delta files, Index files, etc.) meet expectations;matrixts_internal.mars3_info_brin
: View the status of a Brin index in the MARS3 table.The MARS2 storage engine is mainly designed for high-speed data loading and querying, and it uses ordered storage to save search.
The MARS2 table is the same as MARS3 and also uses the LSM Tree structure and is stored. The internal schematic diagram of MARS2 is as follows: We will interpret the above picture in the form of conceptual analysis layer by layer.
COLLATE C
in this column can speed up sorting.mars2_automerge_threshold
), triggers L0 merge and merges all runs of L0 into one Run.level0_upgrade_size
), upgrade it to L1.level1_upgrade_size
) after upgrading to L1, a merge is triggered and all runs of L1 are merged into one Run.level1_upgrade_size
), upgrade it to L2.First create the matrixts
extension:
=# CREATE EXTENSION matrixts ;
An explicit declaration of record minimum maximum value.
=# CREATE TABLE metrics (
ts timestamp ENCODING(minmax),
dev_id bigint ENCODING(minmax),
power float,
speed float,
message text
) USING MARS2;
Creates the mars2_btree
index.
=# CREATE INDEX ON metrics
USING mars2_btree (ts, dev_id);
Same as MARS3.
INSERT
, and then goes through these processes to generate a Run.TRUNCATE
partition table.On the premise that the matrixts extension has been created, the simplest way to create tables is to add the USING
clause to the CREATE TABLE statement and create an index. For extended examples, see [Table Design Best Practices] (/doc/5.1/reference/storage/table_design).
=# CREATE TABLE metrics (
ts timestamp,
dev_id bigint,
power float,
speed float,
message text
) USING MARS2;
=# CREATE INDEX ON metrics
USING mars2_btree (ts, dev_id);
Notes!
The following contents table-level configuration item refer to configuration items that can only be configured using theWITH
clause when creating data tables. They are suitable for single tables and cannot be modified once configured. Global configuration item refer to configuration items that can be configured at the session or system level, and system-level modifications require execution ofmxstop -u
to take effect. For more information, see Data Table Configuration Parameters.
The following parameters are used to control merges (Merge), and the way they affect merges is shown in the Level section above.
Merge Control Parameters | Units | Default Value | Value Range | Description |
---|---|---|---|---|
mars2_automerge_threshold | run | 32 | 10 - 2048 | Used to control all *** MARS2 tables, how many Runs can trigger merges in L0, which is a global configuration item. If you want to specify individually for a form you can use the table option level0_merge_threshold |
level0_merge_threshold | run | 32 | 1 - 2048 | Used to control how many Runs to trigger merges to configure items for table-level configuration |
level0_upgrade_size | MB | 25 | 1 - 10000 | Control Single table L0 -> L1 upgraded size. When L0 merges the result Run exceeds this size, it will be upgraded to L1, and the table-level configuration item is configured. |
level1_upgrade_size | MB | 1000 | 1 - 10000 | ControlSingle tableL1 ->L2 upgrade size. When L1 merges the result Run exceeds this size, it will be upgraded to L2, and the table-level configuration item is configured. |
The following parameters are used for compression control and adjust the compression effect. If the compression effect is not obvious if the configuration is too low, the configuration is too high and consumes more memory.
Compression Control Parameters | Units | Default Value | Value Range | Description |
---|---|---|---|---|
compress_threshold | Tuple | 1200 | 1 - 100000 | Compression threshold. Used to control how many tuples (tuples) are compressed in a single table. It is the upper limit of the number of tuples compressed in the same unit and is a table-level configuration item. |
The following parameters are used for memory control. This part of the parameters controls the size of the sorting memory used when inserting data. When multiple partition tables are inserted, each partition table will allocate the memory configured in mars2_sort_mem_core
; if it is not enough, it will be expanded, but the total amount will not exceed mars2_sort_mem
.
Sorting memory parameters | Units | Default value | Value range | Description |
---|---|---|---|---|
mars2_sort_mem | KB | 2097152KB (2GB) | 128KB - 2147483647KB (~ 2048GB) | Controls the sorted memory size of each ** individual insert. If the insert target table is a partition table, they will share this size and configure the global item |
mars2_sort_mem_core | KB | 16384KB (16MB) | 128KB - 2147483647KB (~ 2048GB) | Control at least how much sorted memory is allocated for each single partition table, and configure the global configuration item |
Example table configuration items:
=# CREATE TABLE metrics (
ts timestamp,
dev_id bigint,
power float,
speed float,
message text
)
USING MARS2
WITH (compress_threshold=1200,level0_merge_threshold=32);
=# CREATE INDEX ON metrics
USING mars2_btree (ts, dev_id);
Example of global configuration items modifying values at the session level:
=# SET mars2_sort_mem TO 2097152;
Example of global configuration items modifying values at the system level:
=# gpconfig -c mars2_sort_mem -v 2097152
=# \q
$ mxstop -u
HEAP is YMatrix's default storage engine, also known as heap storage, inherited from PostgreSQL and only supports row storage, not column storage and compression. It is implemented based on the MVCC mechanism and is suitable for scenarios with a large number of updates and deletion requirements.
Under the influence of the MVCC mechanism, the HEAP table does not really delete the data when handling updates and deletion operations, but only relies on the data version information to block the old data (controls the visibility of the data). Therefore, if the HEAP table is updated or deleted in large quantities, the physical space it consumes will continue to increase, and you need to use VACUUM
regularly to clean up old data in a planned and regular manner.
You can use the following SQL statement to create a HEAP table in YMatrix.
=# CREATE TABLE disk_heap(
time timestamp with time zone,
tag_id int,
read float,
write float
)
DISTRIBUTED BY (tag_id);
Tables with AOCO and AORO as storage engines are collectively called AO (Append-Optimized) tables, also known as append optimization tables, which can support insertion, update and delete operations and support compression. AORO supports row storage, and AOCO supports column storage.
AO tables are very different from HEAP tables in terms of the logical structure and physical structure of the table. As mentioned in the previous section, the HEAP table uses the MVCC mechanism to control the visibility of data after update and deletion operations, while the AO table uses an additional bitmap table to implement it. The content of this table is to indicate which data in the AO table is visible.
For AO tables with a large number of updates and deletion operations, old data also needs to be cleaned regularly in schedule. However, in AO tables, the data cleaning tool vacuum
requires resetting bitmap and compressing physical files, so it is usually slower than HEAP.
Notes!
For storage engine details, usage and best practices, please see [Table Design Best Practices] (/doc/5.1/reference/storage/table_design).