This section discusses how to use the vectorized execution engine efficiently.
First of all, the performance gain of the vectorized execution engine over a traditional row-oriented execution engine varies by scenario; the Overview section gives a general picture.
For example, operators that process data in strictly sequential order, such as general expression evaluation, selection, and aggregation, see the largest improvements. Operators that introduce random access, such as the sort operator (Sort) and hash operators (Hash), have limited room for improvement.
Use [primitive types](https://zh.wikipedia.org/wiki/%E5%8E%9F%E5%A7%8B%E5%9E%8B%E5%88%A5), such as int32 and int64, as much as possible to improve vectorization performance and efficiency.
Enumeration-like fields are often used in filter conditions, such as a country field or a Chinese province field. They can be represented by strings such as "China" and "USA", or "Beijing" and "Xinjiang", but if you preprocess them into smallint codes, filtering becomes dramatically faster.
Likewise, if smallint is wide enough to hold an age field, there is no need to use int32, let alone int64.
In a traditional execution engine, whatever the numeric type, each value in the final tuple is represented by a Datum, which occupies 64 bits. A Datum can hold a single value of any SQL data type but carries no type information itself, so the choice of type has no effect on processing speed.
In a vectorized execution engine, however, data is laid out contiguously and densely. The shorter the value type, the more CPU-cache-friendly the layout, and the more values fit in a SIMD register. Filtering or aggregating a column of int64 is therefore roughly four times slower than the same operation on int16.
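As an illustration of both points (a sketch; the table and column names are hypothetical, not from this document), enumeration-like fields can be pre-encoded as smallint codes:

```sql
-- Hypothetical dictionary table mapping codes to country names.
CREATE TABLE country_dict (
    code smallint PRIMARY KEY,   -- e.g. 1 might stand for 'China'
    name text
);

-- The fact table stores the 2-byte code instead of a text name,
-- and uses smallint for age, which smallint easily holds.
CREATE TABLE events (
    ts      timestamp,
    age     smallint,
    country smallint
);

-- Filtering now compares densely packed 2-byte integers
-- instead of variable-length strings.
SELECT count(*) FROM events WHERE country = 1;
```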
Notes!
Due to time and resource constraints, we have not implemented every data type, expression, and query form in the database standards. To keep queries usable, when an unsupported case is encountered, the engine partially or completely falls back to the traditional execution engine, which shows up as a small or absent performance improvement.
You can manually turn off the matrix.enable_mxv_fallback_expression parameter (enabled by default) to determine whether a query is using the traditional execution engine.
Parameter | Description
---|---
matrix.enable_mxv_fallback_expression | Whether expressions that the current vectorized execution engine cannot process fall back to the traditional execution engine. Enabled by default.
=# SET matrix.enable_mxv_fallback_expression TO off;
When matrix.enable_mxv_fallback_expression is off and the following error is raised, the query contains data types, expressions, or query forms that the current version of the vectorized engine cannot handle:
not implemented yet
Hash Agg and Group Agg are two concrete implementations of the aggregation operator, each with its own applicable scenarios. Hash Agg aggregates through a hash table; Group Agg aggregates data that is already grouped by sort order.
Normally the optimizer selects the appropriate operator for a query based on statistics. If the statistics are inaccurate or the query is hard to optimize, however, a non-optimal path may be chosen. The following two GUCs help analyze and tune such queries.
Parameter | Description
---|---
matrix.enable_mxv_aggregate | Controls whether Group Agg is enabled. Enabled by default.
matrix.enable_mxv_hash_aggregate | Controls whether Hash Agg is enabled. Enabled by default.
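For example (a sketch; the table t and columns c1, c2 are hypothetical), you can force one implementation for the current session to compare plans:

```sql
=# SET matrix.enable_mxv_hash_aggregate TO off;  -- leave matrix.enable_mxv_aggregate on to force Group Agg
=# EXPLAIN SELECT c1, sum(c2) FROM t GROUP BY c1;
=# RESET matrix.enable_mxv_hash_aggregate;       -- restore the default afterwards
```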
The two parameters combine into four possibilities. Turn on matrix.enable_mxv_aggregate and turn off matrix.enable_mxv_hash_aggregate to force the Group Agg implementation; turn on matrix.enable_mxv_hash_aggregate and turn off matrix.enable_mxv_aggregate to force the Hash Agg implementation.

In MARS2 and MARS3, data is stored in order. To get the most out of this ordering, choose columns that are frequently queried and filter well as the sort keys (the columns that define the sort order). For example, in a device monitoring table, the event timestamp and the device ID can serve as the sort keys.
The sort keys of a MARS2 or MARS3 table can only be specified once: they cannot be modified, added to, or deleted afterwards.
The difference lies in how the sort keys are declared at table creation time. In both cases, declaring COLLATE "C" on a text sort-key column can speed up sorting.

MARS2 specifies a set of sort keys through the following SQL:
=# CREATE TABLE t (
tag int,
ts timestamp,
name text,
region int
) USING MARS2;
=# CREATE INDEX ON t USING mars2_btree (ts, tag);
MARS3 can specify a set of sort keys through the following SQL:
=# CREATE TABLE t (
tag int,
ts timestamp,
name text,
region int
) USING MARS3
ORDER BY (ts, tag);
For MARS2 and MARS3, the choice of sort keys is crucial to query performance. Ordered data not only accelerates the storage engine's data scanning and block filtering but also helps sorting and aggregation.
For example, if we specify (c1, c2) as a table's sort key, its internal data is ordered by (c1, c2). Now suppose we execute the following two queries that involve only sorting:
=# SELECT * FROM t ORDER BY c1, c2;
=# SELECT * FROM t ORDER BY c1, c3;
The first query no longer needs a sort, and the vectorized execution engine optimizes the sort operator away. In the second query, c1 is already ordered, so the engine applies a special optimization for this case: it only sorts the c3 values within each group of equal c1.
And two queries that involve only aggregation:
=# SELECT sum(c1), sum(c2) FROM t GROUP BY c1, c2;
=# SELECT sum(c1), sum(c3) FROM t GROUP BY c1, c3;
With Group Agg, ordering brings a large improvement, much as with sorting: the aggregation only needs one sequential scan of the data. For Hash Agg, whether the data is ordered has little effect on performance. As noted earlier, sequential operators benefit the most from vectorization, so Group Agg is often faster for the two queries above, especially the first.
In summary, analyze the business workload and choose the sort key based on the characteristics of its queries (the columns that are sorted, aggregated, and filtered).
Data in MARS2 and MARS3 is stored in order of the sort key, and a contiguous, ordered stretch of data is called a Run. A Run's metadata stores its minimum and maximum values, which are used for filtering at query time. Because the filtering works on metadata, the data itself need not be loaded, so the I/O overhead is lower than a sequential scan (which reads the data first and filters afterwards) and filtering is faster.
The following are examples:
For MARS2 tables, the example DDL builds minmax indexes on all columns (in practice, build them only where needed):
=# CREATE TABLE t (
tag int encoding (minmax),
ts timestamp encoding (minmax),
name text encoding (minmax),
region int encoding (minmax)
) USING MARS2;
=# CREATE INDEX ON t USING mars2_btree(tag);
For MARS3 tables, an example DDL that creates a BRIN index is as follows:
=# CREATE INDEX idx_mars3 ON t USING mars3_brin(ts, tag);
Suppose we want to query this year's data for a certain region from the MARS2 (or MARS3) table above:
=# SELECT * FROM t WHERE ts > '2022-01-01' AND region = 2;
In fact, the minmax index (BRIN index) not only enables block filtering in the storage engine but also accelerates computation in the vectorized execution engine.
From the storage engine's perspective, if the maximum ts value of a block (several rows of a column's data) is less than 2022-01-01, the storage engine skips reading that block, saving I/O and reducing the engine's computational work.
The vectorized execution engine also uses the minmax (BRIN) information to accelerate computation. For example, if both the minimum and maximum of region in a block are 2, the execution engine only applies the predicate ts > '2022-01-01' to filter the qualifying rows and skips all region-related work, improving performance.
The minmax index (BRIN index) and the sort keys are configured independently: the former can cover all columns, while the latter is usually limited to a few. They affect operators such as filtering and sorting differently, with some overlap, so consider them together to play to their respective strengths.
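Combining the two on a MARS2 table might look like the following (a sketch; the table name and column choices are illustrative): the mars2_btree index defines the sort keys for ordering-sensitive operators, while minmax encoding on the other columns still supports block filtering on predicates such as region = 2:

```sql
=# CREATE TABLE metrics (
    tag int encoding (minmax),
    ts timestamp,
    name text,
    region int encoding (minmax)
) USING MARS2;
=# CREATE INDEX ON metrics USING mars2_btree (ts, tag);
```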
MARS2 and MARS3 both provide index scans, but the current version of the vectorized execution engine only integrates with the minmax index (BRIN index) and sequential scans, not with index scans. For some queries, then, even though the vectorized execution engine switch is on, the engine is not used, and Custom Scan (sortheapscan) appears in the plan.
In this case, disabling index scans lets the vectorized execution engine take over:
=# SET enable_indexscan TO off;
For more details on MARS2 and MARS3, see Storage Engine.
Notes!
Currently, the ORCA optimizer directly supports only HEAP and AO tables, not MARS2/MARS3; when working with vectorization, only AOCO tables are supported.
First, manually enable the ORCA optimizer. optimizer is a session-level parameter; if you start a new session or switch sessions, you need to set it again for it to take effect.
=# SET optimizer TO on;
After turning on the optimizer parameter and correctly collecting the table's statistics, vectorization automatically connects to the ORCA optimizer to generate the corresponding query plan, for example:
=# EXPLAIN SELECT * FROM t;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
Result (cost=0.00..0.00 rows=0 width=20)
One-Time Filter: false
Optimizer: Pivotal Optimizer (GPORCA)
(3 rows)
Notes!
For more ORCA parameters, see Greenplum ORCA Optimizer Configuration Parameters.