Vectorization best practices

This section discusses how to use the vectorized execution engine efficiently.

1 Scenarios

First of all, the performance improvement a vectorized execution engine delivers over a traditional row-oriented execution engine varies by scenario. The overview section gives a basic picture of these differences.

For example, operators that process data in fully sequential order, such as general expression evaluation, selection, and aggregation, benefit the most. Operators that introduce randomness, such as sorting (Sort) and hashing (Hash), have limited room for improvement.

2 Data Types

2.1 Use primitive types whenever possible

Use [primitive types](https://zh.wikipedia.org/wiki/%E5%8E%9F%E5%A7%8B%E5%9E%8B%E5%88%A5), such as int32 and int64, as much as possible to improve the performance and efficiency of vectorization.

Enumeration-like fields are often used for filtering, such as a country field or a Chinese province field. They can be represented as strings, such as "China" and "USA", or "Beijing" and "Xinjiang", but if you preprocess them into smallint codes, filtering operations gain a huge performance improvement.
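The idea can be sketched outside the database. The following is a minimal Python illustration (the sample data and the `codes` mapping are hypothetical, not part of the product): each distinct string is assigned a small integer once, and filtering then compares fixed-width integers instead of variable-length strings.

```python
import numpy as np

# A raw enumeration column stored as strings (hypothetical sample data).
raw = ["China", "USA", "China", "China", "USA"] * 200_000

# Preprocess once: map each distinct string to a small integer code.
codes = {name: i for i, name in enumerate(sorted(set(raw)))}
encoded = np.array([codes[v] for v in raw], dtype=np.int16)

# Filtering now compares fixed-width int16 values instead of
# variable-length strings, which vectorizes well.
mask = encoded == codes["China"]
print(int(mask.sum()))  # number of "China" rows
```

The same preprocessing applies in SQL: store the smallint code in the table and keep the string-to-code mapping in a small dimension table.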

2.2 Use the smallest type that covers the data range

If an age field can be represented by smallint, there is no need to use int32, let alone int64.

For traditional execution engines, whatever the numeric type, each value in a tuple is ultimately represented as a Datum, which occupies 64 bits. A Datum can hold a single value of any SQL data type but itself carries no type information. The choice of type therefore has no effect on processing speed.

For vectorized execution engines, however, data is laid out contiguously and densely. The shorter the value type, the more CPU-cache-friendly the layout, and the more values fit in a SIMD register. When filtering or aggregating a column, int64 is therefore roughly four times slower than int16.
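The factor of four comes directly from the storage width, as this small Python sketch shows: the same million values occupy four times the bytes at int64 width, so a scan touches four times the cache lines and each SIMD register holds a quarter as many values.

```python
import numpy as np

# The same one million values stored at two widths (illustrative sketch).
a16 = np.ones(1_000_000, dtype=np.int16)   # 2 bytes per value
a64 = a16.astype(np.int64)                 # 8 bytes per value

# int64 occupies 4x the memory of int16 for identical logical content.
print(a64.nbytes // a16.nbytes)  # → 4
```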

3 Performance analysis and optimization

3.1 Use only the vectorized execution engine

Note!
Due to time and resource constraints, we have not implemented every data type, expression, and query required by the database standards. To preserve availability, when an unsupported case is encountered, the traditional execution engine is used partially or entirely for the operation, which shows up as little or no performance improvement.

You can manually turn off the matrix.enable_mxv_fallback_expression parameter (enabled by default) to determine whether a query falls back to the traditional execution engine.

Parameter description:

  • matrix.enable_mxv_fallback_expression: for expressions that the current vectorized execution engine cannot process, whether to fall back to the traditional execution engine; enabled by default.

=# SET matrix.enable_mxv_fallback_expression TO off;

When matrix.enable_mxv_fallback_expression is off and the following error occurs, the query contains data types, expressions, or query forms that the current version of the vectorized execution engine cannot handle.

$ not implemented yet

3.2 Hash Agg vs. Group Agg

Hash Agg and Group Agg are two implementations of the aggregation operator, each with its own applicable scenarios. Hash Agg aggregates via a hash table, while Group Agg aggregates data that is already grouped by the grouping keys.

Normally the optimizer selects the appropriate operator for a query based on statistics. However, if the statistics are inaccurate or the query is insufficiently optimized, a non-optimal path may be chosen. The following two GUCs help analyze and optimize such queries.

Parameter description:

  • matrix.enable_mxv_aggregate: controls whether Group Agg is enabled; enabled by default.
  • matrix.enable_mxv_hash_aggregate: controls whether Hash Agg is enabled; enabled by default.

There are four possible combinations:

  • Both off: the traditional execution engine's aggregation is used;
  • Both on: the optimizer selects the best path;
  • matrix.enable_mxv_aggregate on and matrix.enable_mxv_hash_aggregate off: Group Agg is forced;
  • matrix.enable_mxv_hash_aggregate on and matrix.enable_mxv_aggregate off: Hash Agg is forced.

3.3 Combining vectorization with the MARS2 and MARS3 storage engines

3.3.1 Selecting MARS2 and MARS3 sort keys

In MARS2 and MARS3, data is stored in sorted order. To maximize the performance gained from this order, choose fields that are frequently queried and filter effectively as sort keys (i.e., the fields that determine the sort order). For example, in a device monitoring table, the event timestamp and device ID can serve as sort keys.

The sort keys of MARS2 and MARS3 can only be specified once; they cannot be modified, added to, or removed afterwards.

The difference is that when creating a MARS2 table:

  • The sort order is specified by creating an index.

And when creating a MARS3 table:

  • The sort order is specified by listing the sort columns (more than one column is allowed).
  • If a sort key is of text type and byte-order sorting is acceptable, adding COLLATE C to that column can speed up sorting.
  • The SQL keyword that specifies the sort key is ORDER BY.

MARS2 can specify a set of sort keys through the following SQL:

=# CREATE TABLE t (
    tag int,
    ts timestamp,
    name text,
    region int
   ) USING MARS2;
=# CREATE INDEX ON t USING mars2_btree (ts, tag);

MARS3 can specify a set of sort keys through the following SQL:

=# CREATE TABLE t (
    tag int,
    ts timestamp,
    name text,
    region int
    ) USING MARS3
    ORDER BY (ts, tag);

For MARS2 and MARS3, the choice of sort keys is crucial to query performance. Ordered data not only accelerates scanning and block filtering in the storage engine, but also helps sorting and aggregation.

For example, if the sort key of a table is (c1, c2), its data is stored internally in (c1, c2) order. Now suppose we execute the following two queries, which involve only sorting:

=# SELECT * FROM t ORDER BY c1, c2;
=# SELECT * FROM t ORDER BY c1, c3;

The first query requires no sorting at all; the vectorized execution engine optimizes the sort operator away. For the second query, c1 is already ordered, so the vectorized execution engine applies a special optimization: it only sorts the c3 values within each run of equal c1.
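The partial-sort optimization described above can be sketched in a few lines of Python (the rows and column names are illustrative): because the input is already ordered on the prefix c1, only each run of equal c1 values needs to be sorted on c3, instead of sorting the whole input.

```python
from itertools import groupby
from operator import itemgetter

# Rows already ordered by c1 (the stored sort-key prefix); the second
# field stands in for c3 and is unordered within each c1 run.
rows = [(1, "b"), (1, "a"), (2, "d"), (2, "c"), (3, "e")]

# Sort only each run of equal c1, then concatenate the runs —
# the full (c1, c3) order falls out without a global sort.
out = []
for _, run in groupby(rows, key=itemgetter(0)):
    out.extend(sorted(run, key=itemgetter(1)))

print(out)  # fully ordered by (c1, c3)
```

Each run is typically far smaller than the whole input, so this costs much less than a full sort and can emit results incrementally.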

For queries that involve only aggregation:

=# SELECT sum(c1), sum(c2) FROM t GROUP BY c1, c2;
=# SELECT sum(c1), sum(c3) FROM t GROUP BY c1, c3;

With Group Agg, ordered data brings a large performance improvement, similar to the sorting case: the data only needs to be scanned sequentially while aggregating. For Hash Agg, whether the data is ordered has little effect on performance. As mentioned earlier, sequential operators let the vectorized execution engine reach its full potential, so Group Agg is often faster for the two queries above, especially the first.
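The contrast between the two strategies can be sketched in Python (sample rows are illustrative): on sorted input, Group Agg is a single sequential pass that finishes each group as soon as the key changes, while Hash Agg maintains a hash table and is indifferent to input order.

```python
from itertools import groupby
from operator import itemgetter

# (key, value) pairs already ordered on the key, as data ordered
# by a MARS sort key would be.
rows = [(1, 10), (1, 20), (2, 5), (3, 7), (3, 8)]

# Group Agg: one sequential pass; each group can be emitted the
# moment the key changes, with no hash table.
group_agg = [(k, sum(v for _, v in g))
             for k, g in groupby(rows, key=itemgetter(0))]

# Hash Agg: order-insensitive, but pays for hashing and random
# access into the table.
hash_agg = {}
for k, v in rows:
    hash_agg[k] = hash_agg.get(k, 0) + v

print(group_agg)  # [(1, 30), (2, 5), (3, 15)]
```

Both produce the same aggregates; on pre-sorted data the sequential variant does strictly less work, which is why forcing Group Agg can pay off there.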

In summary, choose the sort key by analyzing the business workload and the characteristics of its queries (the columns that are sorted, aggregated, and filtered).

3.3.2 The minmax index of MARS2 and the Brin index of MARS3

Because of the sort key, data stored in MARS2 and MARS3 is ordered, and a contiguous, ordered piece of data is called a Run. A Run's metadata stores the minimum and maximum values of its columns, which are used for filtering at query time. Because the filtering happens on metadata, the data itself need not be loaded; this has lower I/O overhead than a sequential scan (which reads the data first and then filters) and filters faster.

  • MARS2 enables this query-time filtering by declaring the minmax index at table creation. Minimum and maximum values are not recorded by default; they must be declared explicitly.
  • MARS3 implements this filtering by creating an additional Brin index.

The following are examples:

For MARS2 tables, the example DDL creates minmax indexes on all columns (in practice, create them only where needed):

=# CREATE TABLE t (
    tag int encoding (minmax),
    ts timestamp encoding (minmax),
    name text encoding (minmax),
    region int encoding (minmax)
    ) USING MARS2;
=# CREATE INDEX ON t USING mars2_btree(tag);

For MARS3 tables, an example DDL that creates a Brin index is as follows:

=# CREATE INDEX idx_mars3 ON t USING mars3_brin(ts, tag);

Suppose we want to query this year's data for a certain region from the MARS2 (or MARS3) table above:

=# SELECT * FROM t WHERE ts > '2022-01-01' AND region = 2;

In fact, the minmax index (Brin index) not only lets the storage engine perform block filtering, but also accelerates computation in the vectorized execution engine.

From the storage engine's perspective, if the maximum ts of a block (several rows of a column's data) is less than 2022-01-01, the storage engine does not read that block at all, saving I/O and reducing the engine's computation. The vectorized execution engine also uses the minmax (Brin) information to accelerate computation. For example, if both the minimum and maximum of region in a block are 2, the execution engine applies only the predicate ts > '2022-01-01' to filter the qualifying data, skipping all region-related work and improving performance.
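A toy Python model of this pruning logic (block metadata, dates, and the `plan` helper are all illustrative, not the engine's actual code): per-block min/max values let a block be skipped entirely, and let predicates that the metadata already proves true be dropped before scanning.

```python
# Each block keeps the (min, max) of its columns, as the minmax
# (Brin) metadata does. ISO date strings compare correctly as text.
blocks = [
    {"ts": ("2021-03-01", "2021-12-31"), "region": (1, 3)},
    {"ts": ("2021-12-01", "2022-06-30"), "region": (2, 2)},
    {"ts": ("2022-07-01", "2022-12-31"), "region": (1, 9)},
]

def plan(block, ts_low="2022-01-01", region_eq=2):
    ts_min, ts_max = block["ts"]
    r_min, r_max = block["region"]
    # Block filtering: no row can possibly qualify, never read it.
    if ts_max <= ts_low or region_eq < r_min or region_eq > r_max:
        return "skip"
    preds = []
    # Keep only predicates the metadata cannot already prove true.
    if ts_min <= ts_low:
        preds.append("ts > '2022-01-01'")
    if not (r_min == r_max == region_eq):
        preds.append("region = 2")
    return preds or "read all"

print([plan(b) for b in blocks])
```

For the sample metadata, the first block is skipped outright, the second keeps only the ts predicate (its region is constant 2), and the third keeps only the region predicate (its ts range is entirely after the cutoff).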

The minmax index (Brin index) and the sort keys are set independently: the former can cover all columns, while the latter usually selects only a few. Their effects on operators such as filtering and sorting differ but also overlap, so consider using both together to play to their respective strengths.

3.3.3 Disabling index scans

MARS2 and MARS3 both provide index scans, but the current version of the vectorized execution engine only supports the minmax index (Brin index) and sequential scans; it does not support index scans. Therefore, for some queries, even though the vectorized execution engine switch is on, the engine is not actually used, and Custom Scan (sortheapscan) appears in the plan. In that case, disabling index scans allows the vectorized execution engine to be used:

=# SET enable_indexscan TO off;

For more details about MARS2 and MARS3, see the Storage Engine section.