This document describes the deployment, table structure, usage, configuration, and disabling of the YMatrix query statistics feature.
The mxstat query statistics module collects information about executed queries, including execution timing (call counts, durations, and per-phase timestamps) and resource consumption (I/O, CPU, and memory).
The module is included in the matrixmgr extension and is installed with the system by default. You can verify that it is preloaded by checking shared_preload_libraries:
$ gpconfig -s shared_preload_libraries
Values on all segments are consistent
GUC : shared_preload_libraries
Master value: matrixts,matrixmgr,matrixgate,telemetry,mars2
Segment value: matrixts,matrixmgr,matrixgate,telemetry,mars2
However, to view statistics, you must create the matrixmgr database and install the matrixmgr extension within it. (The YMatrix cluster creates this database and extension by default during initialization.)
$ createdb matrixmgr
$ psql -d matrixmgr
matrixmgr=# CREATE EXTENSION matrixmgr CASCADE;
NOTICE: installing required extension "matrixts"
CREATE EXTENSION
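Once connected, you can optionally confirm the installation. This check is not part of the deployment steps above; it is a sketch that uses the standard PostgreSQL system catalog:

```sql
-- Optional sanity check: both extensions should appear in pg_extension.
SELECT extname, extversion
FROM pg_extension
WHERE extname IN ('matrixmgr', 'matrixts');
```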
After successful deployment, the following tables and views appear in the matrixmgr_internal schema of the matrixmgr database:
The mxstat_execute view displays query execution statistics for queries executed since the last history collection (default interval: 5 minutes). It includes the following fields:
| Column | Type | Description |
|---|---|---|
| seg | integer | Segment ID where the plan was generated and dispatched |
| userid | oid | User OID |
| dbid | oid | Database OID |
| queryid | bigint | Query ID, generated by the extension to group similar queries |
| nestlevel | integer | Nesting depth |
| query | text | Query text |
| calls_begin | bigint | Number of times the query started |
| calls_alive | bigint | Number of queries running at last collection |
| calls_end | bigint | Number of queries that completed normally |
| total_time | double precision | Total execution time for this query class, in milliseconds |
| min_time | double precision | Minimum execution time for this query class, in milliseconds |
| max_time | double precision | Maximum execution time for this query class, in milliseconds |
| mean_time | double precision | Average execution time for this query class, in milliseconds |
| stddev_time | double precision | Standard deviation of execution time, in milliseconds |
| sample_planid | bigint | Execution plan ID |
| sample_start | timestamp with time zone | Timestamp when the slowest query started |
| sample_parse_done | timestamp with time zone | Timestamp when parsing completed for the slowest query |
| sample_plan_done | timestamp with time zone | Timestamp when planning completed for the slowest query |
| sample_exec_start | timestamp with time zone | Timestamp when execution started for the slowest query |
| sample_exec_end | timestamp with time zone | Timestamp when execution ended for the slowest query |
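As a sketch of how these fields are typically consumed (an illustrative ad-hoc query, not part of the module itself), the slowest query classes seen from the Master can be listed like this:

```sql
-- Ten query classes with the highest mean execution time on the Master (seg = -1).
SELECT queryid, query, calls_end, mean_time, max_time
FROM matrixmgr_internal.mxstat_execute
WHERE seg = -1
ORDER BY mean_time DESC
LIMIT 10;
```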
The mxstat_usage view displays resource usage statistics for queries executed since the last history collection (default interval: 5 minutes). It includes the following fields:
| Column | Type | Description |
|---|---|---|
| seg | integer | Segment ID where the query executed |
| userid | oid | User OID |
| dbid | oid | Database OID |
| queryid | bigint | Query ID, generated by the extension to group similar queries |
| nestlevel | integer | Nesting depth |
| rows | bigint | Total number of rows retrieved or affected by the statement |
| shared_blks_hit | bigint | Total number of shared buffer hits |
| shared_blks_read | bigint | Total number of shared blocks read |
| shared_blks_dirtied | bigint | Total number of shared blocks dirtied |
| shared_blks_written | bigint | Total number of shared blocks written |
| local_blks_hit | bigint | Total number of local buffer hits |
| local_blks_read | bigint | Total number of local blocks read |
| local_blks_dirtied | bigint | Total number of local blocks dirtied |
| local_blks_written | bigint | Total number of local blocks written |
| temp_blks_read | bigint | Total number of temporary blocks read |
| temp_blks_written | bigint | Total number of temporary blocks written |
| blk_read_time | double precision | Total time spent reading blocks, in milliseconds |
| blk_write_time | double precision | Total time spent writing blocks, in milliseconds |
| ru_utime | double precision | User CPU time, in seconds |
| ru_stime | double precision | System CPU time, in seconds |
| ru_maxrss | bigint | Maximum resident set size (physical memory used, including shared libraries), in KB |
| ru_ixrss | bigint | Integral shared memory size |
| ru_idrss | bigint | Integral unshared data size |
| ru_isrss | bigint | Integral unshared stack size |
| ru_minflt | bigint | Number of minor page faults (no I/O required) |
| ru_majflt | bigint | Number of major page faults (I/O required) |
| ru_nswap | bigint | Number of swaps |
| ru_inblock | bigint | Number of input operations initiated |
| ru_oublock | bigint | Number of output operations initiated |
| ru_msgsnd | bigint | Number of messages sent |
| ru_msgrcv | bigint | Number of messages received |
| ru_nsignals | bigint | Number of signals received |
| ru_nvcsw | bigint | Number of voluntary context switches |
| ru_nivcsw | bigint | Number of involuntary context switches |
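Timing and resource usage can be correlated by joining the two views on their shared identifying columns. The query below is only an illustrative sketch built from the columns documented above:

```sql
-- Correlate execution time with CPU time and row counts per query class.
SELECT e.query, e.mean_time, u.ru_utime, u.ru_stime, u.rows
FROM matrixmgr_internal.mxstat_execute AS e
JOIN matrixmgr_internal.mxstat_usage AS u
  USING (seg, userid, dbid, queryid, nestlevel)
ORDER BY e.mean_time DESC;
```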
This is a partitioned table that stores historical data from the mxstat_execute view, collected every 5 minutes by default. Its structure is identical to the mxstat_execute view, except it includes an additional ts_bucket column to record the collection timestamp.
This is a partitioned table that stores historical data from the mxstat_usage view, collected every 5 minutes by default. Its structure is identical to the mxstat_usage view, except it includes an additional ts_bucket column to record the collection timestamp.
Stores the mapping between queryid and querytext. Like other historical tables, it stores data periodically, enabling retrieval of SQL text for historical queries.
A view that joins mx_query_execute_history and mx_querytext using queryid. It allows simultaneous access to historical query statistics and SQL text.
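Historical rows can be filtered by their ts_bucket collection timestamp. The following is an illustrative sketch, assuming the history table lives in the same matrixmgr_internal schema as the views above:

```sql
-- Worst observed execution time per query class over the last day.
SELECT queryid, max(max_time) AS worst_ms
FROM matrixmgr_internal.mx_query_execute_history
WHERE ts_bucket >= now() - interval '1 day'
GROUP BY queryid
ORDER BY worst_ms DESC
LIMIT 10;
```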
mxstat does not record every individual query. Instead, it groups similar queries and aggregates their statistics. Queries are grouped based on their parsed structure.
For example, the following queries all insert data into the test1 table with different parameters. They generate the same queryid and are grouped into one category.
=# INSERT INTO test1 VALUES(1);
=# INSERT INTO test1 VALUES(2);
=# INSERT INTO test1 VALUES(3);
The following queries differ only in conditional parameters but are otherwise identical, so they are grouped together.
=# SELECT * FROM test1 WHERE c1 = 1;
=# SELECT * FROM test1 WHERE c1 = 2;
=# SELECT * FROM test1 WHERE c1 = 3;
The following queries, although similar, operate on different tables and are not grouped together.
=# SELECT * FROM test1 WHERE c1 = 1;
=# SELECT * FROM test2 WHERE c1 = 1;
=# SELECT * FROM test3 WHERE c1 = 1;
Example 1
Execute the following three SQL statements:
=# SELECT pg_sleep(5);
=# SELECT pg_sleep(10);
=# SELECT pg_sleep(15);
Then query the statistics from mxstat_execute:
matrixmgr=# SELECT * FROM matrixmgr_internal.mxstat_execute WHERE query LIKE '%pg_sleep%';
-[ RECORD 1 ]-----+------------------------------
seg               | -1
userid            | 10
dbid              | 16384
queryid           | -2007749946425010549
nestlevel         | 0
query             | select pg_sleep($1)
calls_begin       | 3
calls_alive       | 0
calls_end         | 3
total_time        | 30041
min_time          | 5009.054
max_time          | 15018.717
mean_time         | 10013.666666666666
stddev_time       | 4086.427819588182
sample_planid     | -2693056513545111817
sample_start      | 2022-03-25 13:58:58.503851-04
sample_parse_done | 2022-03-25 13:58:58.503933-04
sample_plan_done  | 2022-03-25 13:58:58.503994-04
sample_exec_start | 2022-03-25 13:58:58.504008-04
sample_exec_end   | 2022-03-25 13:59:13.522725-04
The result shows the query was called three times. The query text is normalized: the time parameter is replaced by a placeholder of the form $n (here, $1). The total, minimum, maximum, and average execution times match expectations. Timestamps for each phase of the slowest execution (the 15-second sleep) are recorded.
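As a cross-check, the aggregate timing columns above are mutually consistent. This is a small sketch, with two assumptions: the middle call's duration is inferred as total_time - min_time - max_time, and stddev_time is the population (not sample) standard deviation:

```python
import math

# Values taken from the sample output above, in milliseconds.
total_time, min_time, max_time = 30041.0, 5009.054, 15018.717

# Duration of the remaining (middle) call, inferred from the total.
mid_time = total_time - min_time - max_time
times = [min_time, mid_time, max_time]

mean = sum(times) / len(times)
# Population standard deviation, matching the reported stddev_time.
stddev = math.sqrt(sum((t - mean) ** 2 for t in times) / len(times))

print(mean)    # close to the reported mean_time, 10013.666666666666
print(stddev)  # close to the reported stddev_time, 4086.427819588182
```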
Now check resource usage for this query by filtering mxstat_usage on the queryid obtained from mxstat_execute:
matrixmgr=# SELECT * FROM matrixmgr_internal.mxstat_usage WHERE queryid = -2007749946425010549;
-[ RECORD 1 ]-------+---------------------
seg                 | -1
userid              | 10
dbid                | 16384
queryid             | -2007749946425010549
nestlevel           | 0
rows                | 3
shared_blks_hit     | 0
shared_blks_read    | 0
shared_blks_dirtied | 0
shared_blks_written | 0
local_blks_hit      | 0
local_blks_read     | 0
local_blks_dirtied  | 0
local_blks_written  | 0
temp_blks_read      | 0
temp_blks_written   | 0
blk_read_time       | 0
blk_write_time      | 0
ru_utime            | 0.001297
ru_stime            | 0.000431
ru_maxrss           | 20568
ru_ixrss            | 0
ru_idrss            | 0
ru_isrss            | 0
ru_minflt           | 6
ru_majflt           | 0
ru_nswap            | 0
ru_inblock          | 0
ru_oublock          | 0
ru_msgsnd           | 0
ru_msgrcv           | 0
ru_nsignals         | 0
ru_nvcsw            | 122
ru_nivcsw           | 0
The result shows this query ran only on the Master node, so only one seg = -1 record exists.
Example 2
Consider another example:
=# CREATE TABLE test1(
c1 int,
c2 int
)
USING MARS3
DISTRIBUTED BY(c1)
ORDER BY(c1);
Query statistics from mxstat_execute:
matrixmgr=# SELECT * FROM matrixmgr_internal.mxstat_execute WHERE query LIKE '%create table test1%';
-[ RECORD 1 ]-----+------------------------------------------------------
seg               | -1
userid            | 10
dbid              | 16384
queryid           | -6276724884379903029
nestlevel         | 0
query             | create table test1(c1 int, c2 int) distributed by(c1)
calls_begin       | 1
calls_alive       | 0
calls_end         | 1
total_time        | 46.221
min_time          | 46.221
max_time          | 46.221
mean_time         | 46.221
stddev_time       | 0
sample_planid     | 0
sample_start      | 2022-03-25 14:08:51.754458-04
sample_parse_done |
sample_plan_done  |
sample_exec_start | 2022-03-25 14:08:51.754735-04
sample_exec_end   | 2022-03-25 14:08:51.800956-04
Check resource usage:
matrixmgr=# SELECT * FROM matrixmgr_internal.mxstat_usage WHERE queryid = -6276724884379903029;
-[ RECORD 1 ]-------+---------------------
seg                 | -1
userid              | 10
dbid                | 16384
queryid             | -6276724884379903029
nestlevel           | 0
rows                | 0
shared_blks_hit     | 295
shared_blks_read    | 59
shared_blks_dirtied | 21
shared_blks_written | 0
local_blks_hit      | 0
local_blks_read     | 0
local_blks_dirtied  | 0
local_blks_written  | 0
temp_blks_read      | 0
temp_blks_written   | 0
blk_read_time       | 0
blk_write_time      | 0
ru_utime            | 0.004053
ru_stime            | 0
ru_maxrss           | 22744
ru_ixrss            | 0
ru_idrss            | 0
ru_isrss            | 0
ru_minflt           | 429
ru_majflt           | 0
ru_nswap            | 0
ru_inblock          | 0
ru_oublock          | 0
ru_msgsnd           | 0
ru_msgrcv           | 0
ru_nsignals         | 0
ru_nvcsw            | 6
ru_nivcsw           | 0
-[ RECORD 2 ]-------+---------------------
seg                 | 1
userid              | 10
dbid                | 16384
queryid             | -6276724884379903029
nestlevel           | 0
rows                | 0
shared_blks_hit     | 261
shared_blks_read    | 82
shared_blks_dirtied | 19
shared_blks_written | 0
local_blks_hit      | 0
local_blks_read     | 0
local_blks_dirtied  | 0
local_blks_written  | 0
temp_blks_read      | 0
temp_blks_written   | 0
blk_read_time       | 0
blk_write_time      | 0
ru_utime            | 0.001691
ru_stime            | 0.001558
ru_maxrss           | 19284
ru_ixrss            | 0
ru_idrss            | 0
ru_isrss            | 0
ru_minflt           | 510
ru_majflt           | 0
ru_nswap            | 0
ru_inblock          | 0
ru_oublock          | 0
ru_msgsnd           | 0
ru_msgrcv           | 0
ru_nsignals         | 0
ru_nvcsw            | 0
ru_nivcsw           | 1
-[ RECORD 3 ]-------+---------------------
seg                 | 0
userid              | 10
dbid                | 16384
queryid             | -6276724884379903029
nestlevel           | 0
rows                | 0
shared_blks_hit     | 314
shared_blks_read    | 34
shared_blks_dirtied | 19
shared_blks_written | 0
local_blks_hit      | 0
local_blks_read     | 0
local_blks_dirtied  | 0
local_blks_written  | 0
temp_blks_read      | 0
temp_blks_written   | 0
blk_read_time       | 0
blk_write_time      | 0
ru_utime            | 0.002537
ru_stime            | 0.000193
ru_maxrss           | 18508
ru_ixrss            | 0
ru_idrss            | 0
ru_isrss            | 0
ru_minflt           | 574
ru_majflt           | 0
ru_nswap            | 0
ru_inblock          | 0
ru_oublock          | 0
ru_msgsnd           | 0
ru_msgrcv           | 0
ru_nsignals         | 0
ru_nvcsw            | 1
ru_nivcsw           | 1
-[ RECORD 4 ]-------+---------------------
seg                 | 2
userid              | 10
dbid                | 16384
queryid             | -6276724884379903029
nestlevel           | 0
rows                | 0
shared_blks_hit     | 261
shared_blks_read    | 82
shared_blks_dirtied | 19
shared_blks_written | 0
local_blks_hit      | 0
local_blks_read     | 0
local_blks_dirtied  | 0
local_blks_written  | 0
temp_blks_read      | 0
temp_blks_written   | 0
blk_read_time       | 0
blk_write_time      | 0
ru_utime            | 0.003043
ru_stime            | 2.9e-05
ru_maxrss           | 19292
ru_ixrss            | 0
ru_idrss            | 0
ru_isrss            | 0
ru_minflt           | 514
ru_majflt           | 0
ru_nswap            | 0
ru_inblock          | 0
ru_oublock          | 0
ru_msgsnd           | 0
ru_msgrcv           | 0
ru_nsignals         | 0
ru_nvcsw            | 0
ru_nivcsw           | 2
The result shows that the query created a table on each node, so resource usage is recorded across all segments.
Example 3
In the above results, the nestlevel values in mxstat_execute are all 0, as there was no nesting. Now consider a nested execution example, typically achieved via UDF calls.
=# CREATE OR REPLACE FUNCTION nest_query() RETURNS SETOF RECORD
AS $$ SELECT 1;$$
LANGUAGE SQL;
Create a UDF named nest_query that internally executes the SQL: SELECT 1. Then invoke the UDF.
Note: mxstat does not track nested statements by default. You must set mxstat_statements.track to 'all'.
mxadmin=# SET mxstat_statements.track TO 'all';
SET
mxadmin=# select nest_query();
nest_query
------------
(1)
(1 row)
Now query the execution information for the inner statement SELECT 1. Since 1 is a constant, the normalized query uses $1 as a placeholder:
matrixmgr=# SELECT * FROM matrixmgr_internal.mxstat_execute WHERE query LIKE '%SELECT $1%';
-[ RECORD 1 ]-----+------------------------------
seg               | -1
userid            | 10
dbid              | 16384
queryid           | -4554727679305370053
nestlevel         | 1
query             | SELECT $1
calls_begin       | 1
calls_alive       | 0
calls_end         | 1
total_time        | 0.031
min_time          | 0.031
max_time          | 0.031
mean_time         | 0.031
stddev_time       | 0
sample_planid     | -382705668420232707
sample_start      | 2022-03-25 14:35:30.668124-04
sample_parse_done | 2022-03-25 14:35:30.668373-04
sample_plan_done  | 2022-03-25 14:35:30.668403-04
sample_exec_start | 2022-03-25 14:35:30.668408-04
sample_exec_end   | 2022-03-25 14:35:30.668439-04
Here, nestlevel is 1 because the call depth is 1. Deeper nesting increases this value. mxstat tracks up to 31 levels of nesting.
4 Configuration
mxstat provides the following GUC parameters to control monitoring behavior:
| Name | Type | Description |
|---|---|---|
| mxstat_statements.max | Integer | Number of hash slots for storing query timing information. Default is 5000 |
| mxstat_statements.usage_multiple | Integer | Multiplier for the number of hash slots used to store query resource usage, relative to query timing slots. A larger value ensures each query can find its corresponding resource consumption record. Default is 2 |
| mxstat_statements.track | String | Controls the tracking level: top (default) tracks only top-level queries, with no nested tracking; all tracks all queries, including nested ones; none disables tracking, effectively turning off the feature |
| mxstat_statements.track_utility | Boolean | Whether to track execution of utility commands. Default is true |
| mxstat_statements.save | Boolean | Whether to dump in-memory statistics from shared memory to disk files upon cluster restart. Default is true |
| mxstat_statements.harvest_interval | Integer | Statistics are kept in shared memory by default and periodically harvested into historical tables. This GUC sets the harvesting interval in minutes. Default is 5 |
| mxstat_statements.harvest_usage | Boolean | Whether to collect resource usage data. Default is true |
| mxstat_statements.ignored_databases | String | Comma-separated list of databases to ignore. Default is template0,template1,matrixmgr |
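For instance, the harvest interval can be lengthened cluster-wide. The snippet below is a sketch using the same gpconfig and mxstop -u workflow shown elsewhere in this document; verify the parameter name against your release:

```shell
$ gpconfig -c mxstat_statements.harvest_interval -v 10   # harvest every 10 minutes
$ mxstop -u                                              # reload configuration
```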
Note: the query statistics feature can be disabled without stopping or restarting the database.
First, stop the background worker process on the Master:
$ psql -d matrixmgr -c "SELECT matrixmgr_internal.mxstat_stop_worker();"
Next, disable the collection feature cluster-wide and reload configuration:
$ gpconfig -s mxstat_statements.track
$ gpconfig -c mxstat_statements.track -v none
$ mxstop -u
Finally, clear any already-collected data:
$ psql -d matrixmgr -c "SELECT matrixmgr_internal.mxstat_reset();"
To restart the background worker process later, run:
$ psql -d matrixmgr -c "SELECT matrixmgr_internal.mxstat_start_worker();"