This article introduces the internal principles, related functions and example usage of the mars3_brin index.
The mars3_brin index is a built-in [sparse index] (/doc/latest/reference/glossary#sparse index) in YMatrix, and its full name is MARS3 Block Range Index. It represents a block range-based index structure. The mars3_brin index groups data by blocks and maintains a digest value for each block range, rather than maintaining an index for each row. It is suitable for ordered data sets such as time series data or continuous range of data.
+-------------------+
| BRIN Index |
+-------------------+
| Block Range 1 |
| Block Range 2 |
| Block Range 3 |
| ... |
| Block Range N |
+-------------------+
The above figure shows the logical structure of the mars3_brin index. The index consists of multiple block ranges, one block range is an index unit, representing the range of a piece of data. When you perform a query operation, it can improve query efficiency by skipping the range of blocks that do not meet the query conditions (such as between the min
and max
values that are not part of the block range).
The mars3_brin index is usually established as the index key with the sort key specified when creating the MARS3 table, to greatly improve the query performance of the MARS3 table. The reason why using the mars3_brin index on the sort key is that when writing to the MARS3 table, the data has been sorted in order according to the sort key specified by the table. When creating the index, there is no need to reorder the data in the table. The sequence value in the memory and the corresponding rowid
(record the physical location of the data value in memory, which is the unique value) can be stored in the free index block.
The mars3_brin index records the statistical information of each range, such as the min/max
value and corresponding range ID
of each range in each index column (see Related Functions
for details). Index scanning first determines whether the tuple of the range meets the query conditions through the min/max
value. If it is satisfied or possible, obtain the data of the range for further judgment, otherwise scan the next range.
YMatrix uses the min/max
value to determine whether the tuple in a range meets the query conditions, as follows:
First, when you create an index, the index will automatically record the minimum value (min) and maximum value (max) of each index item so that the index is maintained when the data is updated.
Then, during the query process, YMatrix will compare with the minimum and maximum values of the columns involved in the query condition with the minimum and maximum values of the index term. Based on the comparison results, you can determine which index items may meet the query conditions:
YMatrix For index items that may meet the query conditions determined through the second step, access the corresponding data rows and check whether these data rows meet the complete query conditions.
Function Name | Syntax | Parameters | Description |
---|---|---|---|
mars3_info_brin | SELECT * FROM matrixts_internal.mars3_info_brin('<index name>'); |
Index name | Get statistical information of the mars3_brin index (see the return field below for details) |
The return field of this function:
Fields | Description |
---|---|
segid | Segment ID |
level | MARS3 storage hierarchy number (0~9) |
run | MARS3 The memory unit number of this storage layer |
range | ID (unique ID) of index unit range |
placeholder | Whether to reserve placeholders for row storage data (true /false ) |
attnum | Number of columns. Normal columns count from 1; system columns (such as ctid) are negative |
allnulls | Are they all null values (true /false ) |
hasnulls | Whether it contains null values (true /false ) |
value | The minimum and maximum value of this field in the range |
Create a matrixts
extension.
=# CREATE EXTENSION matrixts;
Create a MARS3 table t
containing columns c1
, c2
.
=# CREATE TABLE t(
c1 int,
c2 int)
USING MARS3
DISTRIBUTED BY (c1)
ORDER BY c1;
Write 10000
random data.
=# INSERT INTO t (c1, c2)
SELECT FLOOR(RANDOM() * 100), FLOOR(RANDOM() * 100)
FROM (
SELECT generate_series(1, 10000)
) AS dummy;
Create a mars3_brin index on the sort key c1
.
=# CREATE INDEX t_index ON t USING mars3_brin(c1);
Clean the t
table so that the row storage data of MARS3 enters the column storage level to create an index. Indexes can only be created in column storage.
=# VACUUM t;
View the details of the index t_index
.
=# SELECT * FROM matrixts_internal.mars3_info_brin('t_index');
segid | level | run | range | placeholder | attnum | allnulls | hasnulls | value
-------+------+-----+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
0 | 1 | 2 | 0 | f | 1 | f | f | {3 .. 99}
0 | 1 | 3 | 0 | f | 1 | f | f | {3 .. 51}
0 | 1 | 3 | 1 | f | 1 | f | f | {51 .. 93}
0 | 1 | 3 | 2 | f | 1 | f | f | {93 .. 99}
2 | 1 | 2 | 0 | f | 1 | f | f | {5 .. 73}
2 | 1 | 3 | 0 | f | 1 | f | f | {5 .. 58}
2 | 1 | 3 | 1 | f | 1 | f | f | {58 .. 73}
3 | 1 | 2 | 0 | f | 1 | f | f | {2 .. 96}
3 | 1 | 3 | 0 | f | 1 | f | f | {2 .. 37}
3 | 1 | 3 | 1 | f | 1 | f | f | {37 .. 92}
3 | 1 | 3 | 2 | f | 1 | f | f | {92 .. 96}
1 | 1 | 2 | 0 | f | 1 | f | f | {0 .. 98}
1 | 1 | 3 | 0 | f | 1 | f | f | {0 .. 49}
1 | 1 | 3 | 1 | f | 1 | f | f | {49 .. 83}
1 | 1 | 3 | 2 | f | 1 | f | f | {83 .. 98}
(15 rows)