This document describes the deployment, table structure, usage, configuration, and disabling of the YMatrix query statistics feature. When you notice slow database queries or want to better plan database resource utilization, you can use this feature for optimization.
The mxstat query statistics feature includes:
After YMatrix cluster initialization, the matrixmgr database and matrixmgr extension are created by default, and this feature is enabled by default.
To manually enable the feature, run the following commands:
=# createdb matrixmgr
=# psql -d matrixmgr
matrixmgr=# CREATE EXTENSION matrixmgr CASCADE;
NOTICE: installing required extension "matrixts"
CREATE EXTENSION
After successful deployment, the following tables and views appear in the matrixmgr database under the matrixmgr_internal schema:
| View | Description |
|---|---|
| mxstat_execute | Provides execution details of SQL statements, such as the segment instance where the query ran, user, query text, minimum and maximum execution times, and timestamps for each phase of the slowest execution. Use this view to analyze execution times across different SQL statements and examine the slowest queries. |
| mxstat_usage | Provides statistics on compute resource consumption during SQL statement execution. |
| mx_query_execute_history | Historical data from the mxstat_execute view. |
| mx_query_usage_history | Historical data from the mxstat_usage view. |
| mx_querytext | Stores the mapping between queryid and query. Use this mapping to retrieve the original SQL text. |
| mx_query_execute_history_with_text | A joined view of mx_query_execute_history and mx_querytext using queryid. |
For detailed information about each view, see matrixgmr Extension.
The following three examples demonstrate how to progressively examine SQL statement execution and associated resource consumption.
=# SELECT pg_sleep(5);
=# SELECT pg_sleep(10);
=# SELECT pg_sleep(15);
mxstat_execute view for statistics. The results show:sample_start) and end (sample_exec_end) timestamps are recorded.$1.pg_sleep value.matrixmgr=# SELECT * FROM matrixmgr_internal.mxstat_execute WHERE query LIKE '%pg_sleep%';
seg | userid | dbid | queryid | nestlevel | query | calls_begin | calls_alive | calls_end | total_time | min_time
| max_time | mean_time | stddev_time | sample_planid | sample_start | sample_parse_done |
sample_plan_done | sample_exec_start | sample_exec_end
-----+--------+-------+----------------------+-----------+---------------------+-------------+-------------+-----------+------------+----------
+-----------+--------------------+-------------------+----------------------+-------------------------------+-------------------------------+--
-----------------------------+-------------------------------+-------------------------------
-1 | 10 | 16384 | -2007749946425010549 | 0 | select pg_sleep($1) | 3 | 0 | 3 | 30041 | 5009.054
| 15018.717 | 10013.666666666666 | 4086.427819588182 | -2693056513545111817 | 2022-03-25 13:58:58.503851-04 | 2022-03-25 13:58:58.503933-04 | 2
022-03-25 13:58:58.503994-04 | 2022-03-25 13:58:58.504008-04 | 2022-03-25 13:59:13.522725-04
(1 row)
queryid and query mxstat_usage to view resource usage. The result shows only one record with seg = -1, indicating the query executed only on the Master.matrixmgr=# SELECT * FROM matrixmgr_internal.mxstat_usage WHERE queryid = -2007749946425010549;
seg | userid | dbid | queryid | nestlevel | rows | shared_blks_hit | shared_blks_read | shared_blks_dirtied | shared_blks_writte
n | local_blks_hit | local_blks_read | local_blks_dirtied | local_blks_written | temp_blks_read | temp_blks_written | blk_read_time | blk_write
_time | ru_utime | ru_stime | ru_maxrss | ru_ixrss | ru_idrss | ru_isrss | ru_minflt | ru_majflt | ru_nswap | ru_inblock | ru_oublock | ru_msgs
nd | ru_msgrcv | ru_nsignals | ru_nvcsw | ru_nivcsw
-----+--------+-------+----------------------+-----------+------+-----------------+------------------+---------------------+-------------------
--+----------------+-----------------+--------------------+--------------------+----------------+-------------------+---------------+----------
------+----------+----------+-----------+----------+----------+----------+-----------+-----------+----------+------------+------------+--------
---+-----------+-------------+----------+-----------
-1 | 10 | 16384 | -2007749946425010549 | 0 | 3 | 0 | 0 | 0 |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
0 | 0.001297 | 0.000431 | 20568 | 0 | 0 | 0 | 6 | 0 | 0 | 0 | 0 |
0 | 0 | 0 | 122 | 0
(1 row)
test1 using the following command:=# CREATE TABLE test1(
c1 int,
c2 int
)
USING MARS3
DISTRIBUTED BY(c1)
ORDER BY(c1);
mxstat_execute:matrixmgr=# SELECT * FROM matrixmgr_internal.mxstat_execute WHERE query LIKE '%create table test1%';
seg | userid | dbid | queryid | nestlevel | query | calls_begin | calls_alive |
calls_end | total_time | min_time | max_time | mean_time | stddev_time | sample_planid | sample_start | sample_parse_done | sa
mple_plan_done | sample_exec_start | sample_exec_end
-----+--------+-------+----------------------+-----------+-------------------------------------------------------+-------------+-------------+-
----------+------------+----------+----------+-----------+-------------+---------------+-------------------------------+-------------------+---
---------------+-------------------------------+-------------------------------
-1 | 10 | 16384 | -6276724884379903029 | 0 | create table test1(c1 int, c2 int) distributed by(c1) | 1 | 0 |
1 | 46.221 | 46.221 | 46.221 | 46.221 | 0 | 0 | 2022-03-25 14:08:51.754458-04 | |
| 2022-03-25 14:08:51.754735-04 | 2022-03-25 14:08:51.800956-04
(1 row)
queryid and query mxstat_usage to view resource usage. Results show entries from all segments, indicating that data is distributed across multiple MXSegment instances.matrixmgr=# SELECT * FROM matrixmgr_internal.mxstat_usage WHERE queryid = -6276724884379903029;
seg | userid | dbid | queryid | nestlevel | rows | shared_blks_hit | shared_blks_read | shared_blks_dirtied | shared_blks_writte
n | local_blks_hit | local_blks_read | local_blks_dirtied | local_blks_written | temp_blks_read | temp_blks_written | blk_read_time | blk_write
_time | ru_utime | ru_stime | ru_maxrss | ru_ixrss | ru_idrss | ru_isrss | ru_minflt | ru_majflt | ru_nswap | ru_inblock | ru_oublock | ru_msgs
nd | ru_msgrcv | ru_nsignals | ru_nvcsw | ru_nivcsw
-----+--------+-------+----------------------+-----------+------+-----------------+------------------+---------------------+-------------------
--+----------------+-----------------+--------------------+--------------------+----------------+-------------------+---------------+----------
------+----------+----------+-----------+----------+----------+----------+-----------+-----------+----------+------------+------------+--------
---+-----------+-------------+----------+-----------
-1 | 10 | 16384 | -6276724884379903029 | 0 | 0 | 295 | 59 | 21 |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
0 | 0.004053 | 0 | 22744 | 0 | 0 | 0 | 429 | 0 | 0 | 0 | 0 |
0 | 0 | 0 | 6 | 0
1 | 10 | 16384 | -6276724884379903029 | 0 | 0 | 261 | 82 | 19 |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
0 | 0.001691 | 0.001558 | 19284 | 0 | 0 | 0 | 510 | 0 | 0 | 0 | 0 |
0 | 0 | 0 | 0 | 1
0 | 10 | 16384 | -6276724884379903029 | 0 | 0 | 314 | 34 | 19 |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
0 | 0.002537 | 0.000193 | 18508 | 0 | 0 | 0 | 574 | 0 | 0 | 0 | 0 |
0 | 0 | 0 | 1 | 1
2 | 10 | 16384 | -6276724884379903029 | 0 | 0 | 261 | 82 | 19 |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
0 | 0.003043 | 2.9e-05 | 19292 | 0 | 0 | 0 | 514 | 0 | 0 | 0 | 0 |
0 | 0 | 0 | 0 | 2
(4 rows)
In Examples 1 and 2, the nestlevel value in mxstat_execute is 0 because there was no nested execution. The following example demonstrates nested execution, typically achieved by calling a User Defined Function (UDF).
nest_query that defines the SQL statement SELECT 1;, then invoke it:=# CREATE OR REPLACE FUNCTION nest_query() RETURNS SETOF RECORD
AS $$ SELECT 1;$$
LANGUAGE SQL;
Note!
By default, mxstat does not record nested execution information. You must set the mxstat_statements.track parameter to all.
SELECT $1. The results show:SELECT $1, as constants like 1 are replaced with parameter placeholders.nestlevel value is 1, indicating one level of nesting. Deeper nesting increases this value; matrixmgr records up to level 31.matrixmgr=# SELECT * FROM matrixmgr_internal.mxstat_execute WHERE query LIKE '%SELECT $1%';
seg | userid | dbid | queryid | nestlevel | query | calls_begin | calls_alive | calls_end | total_time | min_time | max_time
| mean_time | stddev_time | sample_planid | sample_start | sample_parse_done | sample_plan_done
| sample_exec_start | sample_exec_end
-----+--------+-------+----------------------+-----------+-----------+-------------+-------------+-----------+------------+----------+---------
-+-----------+-------------+---------------------+-------------------------------+-------------------------------+-----------------------------
--+-------------------------------+-------------------------------
-1 | 10 | 16384 | -4554727679305370053 | 1 | SELECT $1 | 1 | 0 | 1 | 0.031 | 0.031 | 0.031
| 0.031 | 0 | -382705668420232707 | 2022-03-25 14:35:30.668124-04 | 2022-03-25 14:35:30.668373-04 | 2022-03-25 14:35:30.668403-0
4 | 2022-03-25 14:35:30.668408-04 | 2022-03-25 14:35:30.668439-04
(1 row)
Note!
Disabling query statistics does not require stopping or restarting the database.
$ psql -d matrixmgr -c "SELECT matrixmgr_internal.mxstat_stop_worker();"
$ gpconfig -s mxstat_statements.track
$ gpconfig -c mxstat_statements.track -v none
$ mxstop -u
Note!
These two steps temporarily disable the feature. It will be re-enabled by default upon cluster restart. To permanently disable it, run DROP EXTENSION after the above steps.
$ psql -d matrixmgr -c "SELECT matrixmgr_internal.mxstat_reset();"
$ psql -d matrixmgr -c "SELECT matrixmgr_internal.mxstat_start_worker();"
matrixmgr does not record every individual query. Instead, it aggregates similar queries and presents statistics based on parsed results.
Only queries differing in parameters are grouped together. For example, the following three INSERT statements all insert into the test1 table with different values. They are grouped into one class and share the same queryid.
=# INSERT INTO test1 VALUES(1);
=# INSERT INTO test1 VALUES(2);
=# INSERT INTO test1 VALUES(3);
The following queries differ only in conditional values and are also grouped:
=# SELECT * FROM test1 WHERE c1 = 1;
=# SELECT * FROM test1 WHERE c1 = 2;
=# SELECT * FROM test1 WHERE c1 = 3;
The following queries, although similar, access different tables and are not grouped:
=# SELECT * FROM test1 WHERE c1 = 1;
=# SELECT * FROM test2 WHERE c1 = 1;
=# SELECT * FROM test3 WHERE c1 = 1;
The query statistics feature includes the following parameters to control monitoring behavior.
| Parameter | Type | Description |
|---|---|---|
| mxstat_statements.max | Integer | Number of hash slots for storing query timing data. Represents the maximum number of queries stored in memory. Default is 1000. |
| mxstat_statements.usage_multiple | Integer | Multiplier for hash slots used to store resource usage data relative to timing slots. Default is 2. |
| mxstat_statements.track | String | top: Track only top-level queries, no nesting (default);all: Track all nested levels;none: Disable tracking. |
| mxstat_statements.track_utility | Boolean | Whether to track utility statement execution. Default is on. |
| mxstat_statements.save | Boolean | Whether to dump in-memory statistics to disk during cluster restart. Default is on. |
| mxstat_statements.harvest_interval | Integer | Interval (in minutes) at which statistics are harvested from shared memory into history tables. Default is 5 minutes. |
| mxstat_statements.harvest_usage | Boolean | Whether to collect resource usage statistics. Default is on. |
| mxstat_statements.ignored_databases | String | Comma-separated list of databases to ignore. Default is template0,template1. |
| mxstat_statements.start_reaper_on_start | Boolean | Whether to start the reaper process at cluster startup. This process periodically writes collected data into the mx_query_execute_history, mx_query_usage_history, mx_querytext, mx_query_execute_history_with_text, and related tables. |