mars3_brin Index

This document describes the internal mechanics, related functions, and example usage of the mars3_brin index.

Overview

The mars3_brin index is a built-in sparse index in YMatrix, fully named MARS3 Block Range Index. It is a block-range-based indexing structure. Instead of maintaining an index entry for every row, mars3_brin groups data by blocks and maintains summary information for each block range. This index is particularly effective for ordered datasets, such as time-series or continuously ranged data.

+-------------------+
|  BRIN Index       |
+-------------------+
|  Block Range 1    |
|  Block Range 2    |
|  Block Range 3    |
|       ...         |
|  Block Range N    |
+-------------------+

The diagram above illustrates the logical structure of a mars3_brin index. The index consists of multiple block ranges, each serving as an index unit representing a data range. During query execution, it improves performance by skipping block ranges that do not meet the query conditions (e.g., when the queried value does not fall between the min and max values of a block range).

The mars3_brin index is typically created on the sort key specified when creating a MARS3 table, significantly enhancing query performance on MARS3 tables. Using mars3_brin on the sort key is efficient because data is already physically ordered by the sort key upon insertion. When building the index, no additional sorting is required—YMatrix directly stores the in-memory sorted column values along with their corresponding rowid (which records the physical location in memory and is unique) into available index blocks.

Internal Mechanics

The mars3_brin index stores statistical information for each range, such as the min/max and range ID values for each indexed column within the range (see 相关函数). During index scan, YMatrix first uses the min/max values to determine whether tuples in a range may satisfy the query condition. If they do or might, the system retrieves the data in that range for further evaluation; otherwise, it skips to the next range.

YMatrix uses the min/max values to evaluate whether tuples in a given range satisfy the query condition, as follows:

  1. When you create the index, it automatically records the minimum (min) and maximum (max) values for each indexed column in every block range. These values are maintained during data updates.

  2. During query execution, YMatrix compares the min and max values from the query conditions against the min and max values stored in the index:

    • If the query's min value is greater than the index entry's max, or the query's max is less than the index entry's min, the block range is excluded.
    • If the query's min ≤ index max and query's max ≥ index min, the block range may contain matching rows and must be further examined.
  3. For block ranges that pass step 2, YMatrix accesses the actual data rows and evaluates them against the full query condition to confirm matches.

Related Functions

Function Name Syntax Parameters Description
mars3_info_brin SELECT * FROM matrixts_internal.mars3_info_brin('<索引名>'); Index name Retrieves statistical information about a mars3_brin index (see return fields below)

Return fields of the function:

Field Description
segid Segment ID
level Storage level number in MARS3 (0–9)
run Storage unit number at this level in MARS3
range ID of the index range unit (unique identifier)
placeholder Whether a placeholder is reserved for row-store data (true/false)
attnum Number of columns. Regular columns are numbered starting from 1; system columns (e.g., ctid) are negative
allnulls Whether all values are NULL (true/false)
hasnulls Whether any NULL values exist (true/false)
value Minimum and maximum values of the column within this range

Example Usage

Create the matrixts extension.

=# CREATE EXTENSION matrixts;

Create a MARS3 table t with columns c1 and c2.

=# CREATE TABLE t(
c1 int,
c2 int)
USING MARS3
DISTRIBUTED BY (c1)
ORDER BY c1;

Insert 10000 rows of random data.

=# INSERT INTO t (c1, c2)
SELECT FLOOR(RANDOM() * 100), FLOOR(RANDOM() * 100)
FROM (
    SELECT generate_series(1, 10000)
) AS dummy;

Create a mars3_brin index on the sort key c1.

=# CREATE INDEX t_index ON t USING mars3_brin(c1);

Vacuum the t table to move MARS3 row-store data into column-store levels, where indexes can be created.

=# VACUUM t;

View detailed information about index t_index.

=# SELECT * FROM matrixts_internal.mars3_info_brin('t_index');
 segid | level | run | range | placeholder | attnum | allnulls | hasnulls |   value
-------+-------+-----+-------+-------------+--------+----------+----------+------------
     0 |     1 |   2 |     0 | f           |      1 | f        | f        | {3 .. 99}
     0 |     1 |   3 |     0 | f           |      1 | f        | f        | {3 .. 51}
     0 |     1 |   3 |     1 | f           |      1 | f        | f        | {51 .. 93}
     0 |     1 |   3 |     2 | f           |      1 | f        | f        | {93 .. 99}
     2 |     1 |   2 |     0 | f           |      1 | f        | f        | {5 .. 73}
     2 |     1 |   3 |     0 | f           |      1 | f        | f        | {5 .. 58}
     2 |     1 |   3 |     1 | f           |      1 | f        | f        | {58 .. 73}
     3 |     1 |   2 |     0 | f           |      1 | f        | f        | {2 .. 96}
     3 |     1 |   3 |     0 | f           |      1 | f        | f        | {2 .. 37}
     3 |     1 |   3 |     1 | f           |      1 | f        | f        | {37 .. 92}
     3 |     1 |   3 |     2 | f           |      1 | f        | f        | {92 .. 96}
     1 |     1 |   2 |     0 | f           |      1 | f        | f        | {0 .. 98}
     1 |     1 |   3 |     0 | f           |      1 | f        | f        | {0 .. 49}
     1 |     1 |   3 |     1 | f           |      1 | f        | f        | {49 .. 83}
     1 |     1 |   3 |     2 | f           |      1 | f        | f        | {83 .. 98}
(15 rows)

See Also

MARS3 Storage Engine