Query statistics

This document mainly introduces the deployment, table structure, usage, configuration and stopping of YMatrix query statistics functions.

The mxstat query statistics module is used to count query information, including:

  • Query usage time
  • Query resource consumption

1 Deployment

This module is included in the matrixmgr extension and will be installed with the system by default.

[mxadmin@mdw ~]$ gpconfig -s shared_preload_libraries
Values on all segments are consistent
GUC          : shared_preload_libraries
Master  value: matrixts,matrixmgr,matrixgate,telemetry,mars2
Segment value: matrixts,matrixmgr,matrixgate,telemetry,mars2

However, to view statistics, you must create a matrixmgr database and create a matrixmgr extension in that database (the YMatrix cluster initializes the database and extensions by default).

createdb matrixmgr
psql -d matrixmgr
matrixmgr=# CREATE EXTENSION matrixmgr CASCADE;
NOTICE:  installing required extension "matrixts"
CREATE EXTENSION

2 Table Structure

After the deployment is successful, the following tables and views will appear in the matrixmgr_internal mode of the matrixmgr database:

  • mx_query_execute_history
  • mx_query_execute_history_with_text
  • mx_query_usage_history
  • mx_querytext
  • mxstat_execute
  • mxstat_usage

2.1 mxstat_execute

Query the execution information view.

This view shows the statistics of the system running the query after the last time the historical information was collected (5 minutes by default). Includes the following fields:

| Field Name | Type | Description | | --- | --- | |seg|integer|Query generates and issues the planned node number| |userid|oid|user OID| |dbid|oid|database OID| |queryid|bigint|query ID, generated by extension, is used to classify queries of the same type| |nestlevel|integer|Nest depth| |query|text|query text| |calls_begin|bigint|Query start times| |calls_alive|bigint|Number of queries in execution state when historical information was last collected| |calls_end|bigint|Number of query normal ending times| |total_time|double precision|Total execution time of this type of query, unit milliseconds| |min_time|double precision|The minimum execution time of this type of query, in milliseconds| |max_time|double precision|Maximum execution time of this type of query, unit milliseconds| |mean_time|double precision|Average execution time of this type of query, in milliseconds| |stddev_time|double precision|Standard deviation of this type of query execution time, unit milliseconds| |sample_planid|bigint|execution plan ID| |sample_start|timestamp with time zone|Slowest query startup timestamp| |sample_parse_done|timestamp with time zone|Slowest query completion parsing timestamp| |sample_plan_done|timestamp with time zone|Slowest query generates plan timestamp| |sample_exec_start|timestamp with time zone|Slowest query starts executing timestamp| |sample_exec_end|timestamp with time zone|Slowest query execution time stamp|

2.2 mxstat_usage

Query the resource consumption information view.

This view shows the resource consumption information that the system has run the query after the last time the historical information is collected (default 5 minutes). Includes the following fields:

| Field Name | Type | Description | | --- | --- | |seg|integer|Query execution node number| |userid|oid|user OID| |dbid|oid|database OID| |queryid|bigint|query ID, generated by extension, is used to classify queries of the same type| |nestlevel|integer|Nest depth| |rows|bigint|Total number of rows retrieved or affected by this statement| |shared_blks_hit|bigint|Total number of shared block buffer hits caused by this statement| |shared_blks_read|bigint|Total number of shared blocks read by this statement| |shared_blks_dirtied|bigint|Total number of shared blocks stained by this statement| |shared_blks_writeten|bigint|Total number of shared blocks written by this statement| |local_blks_hit|bigint|Total number of buffer hits caused by this statement| |local_blks_read|bigint|Total number of local parcels read by this statement| |local_blks_dirtied|bigint|Total number of local parcels stained by this statement| |local_blks_writeten|bigint|Total number of local parcels written to this statement| |temp_blks_read|bigint|Total number of temporary blocks read by this statement| |temp_blks_writeten|bigint|Total number of temporary blocks written by this statement| |blk_read_time|double precision|Total time this statement takes on reading blocks, in milliseconds| |blk_write_time|double precision|Total time this statement spends on writing to a block, in milliseconds| |ru_utime|double precision|user state CPU time| |ru_stime|double precision|system state CPU time| |ru_maxrss|bigint|actually used physical memory, including memory occupied by the shared library, unit KB| |ru_ixrss|bigint|integrated shared memory size| |ru_idrss|bigint|integrated non-shared data size| |ru_isrss|bigint|integrated non-shared stack size| |ru_minflt|bigint|Number of page-missing interrupts, and no I/O is required to process these interrupts| |ru_majflt|bigint|Number of page-missing interrupts, and I/O is required to process these interrupts| |ru_nswap|bigint|switch space| |ru_inblock|bigint|number of times the file system needs to perform input operations| |ru_oublock|bigint|number of times the file system needs to perform output operations| |ru_msgsnd|bigint|number of messages sent| |ru_msgrcv|bigint|number of message reception| |ru_signals|bigint|number of received signals| |ru_nvcsw|bigint|Number of context switching caused by the process voluntarily giving up the processor time slice| |ru_nivcsw|bigint|Number of context switching caused by process time slices being used or being intercepted by high priority processes|

2.3 mx_query_execute_history

This table is a partitioned table, which is a historical information collection of the mxstat_execute view. The default is 5 every minute. Its table structure is consistent with the mxstat_execute view, except that there is an additional ts_bucket field to record the collection time point.

2.4 mx_query_usage_history

This table is a partitioned table, which is a collection of historical information for the mxstat_usage view. The default is 5 every min. Its table structure is consistent with the mxstat_usage view, except that there is an additional ts_bucket field to record the collection time point.

2.5 mx_querytext

Store queryid mappings with querytext, like other historical information collection tables, periodically stores, with the purpose of making history queries get SQL text.

2.6 mx_query_execute_history_with_text

The view after associating mx_query_execute_history with mx_querytext through queryid. Read historical query statistics and SQL text at the same time.

3 Use

3.1 Query classification method

mxstat Statistics query information, not recording every query, but a unified summary of a type of query. Queries are classified according to the parsing results.

The following queries are all inserted into the test1 table, but the parameters are different. These queries will generate the same queryid and be the same category.

INSERT INTO test1 VALUES(1);
INSERT INTO test1 VALUES(2);
INSERT INTO test1 VALUES(3);

The following query is just different in the conditional parameters, and the others are the same, and they will be returned to the same category.

SELECT * FROM test1 WHERE c1 = 1;
SELECT * FROM test1 WHERE c1 = 2;
SELECT * FROM test1 WHERE c1 = 3;

Although the following queries are similar, they are querying different tables and cannot be relocated to the same category.

SELECT * FROM test1 WHERE c1 = 1;
SELECT * FROM test2 WHERE c1 = 1;
SELECT * FROM test3 WHERE c1 = 1;

3.2 Case

Here is an example of how to use mxstat to view query statistics.

Case 1

Execute the following 3 SQL statements:

SELECT pg_sleep(5);
SELECT pg_sleep(10);
SELECT pg_sleep(15);

Then query the statistics of mxstat_execute:

matrixmgr=# SELECT * FROM matrixmgr_internal.mxstat_execute WHERE query LIKE '%pg_sleep%';
 seg | userid | dbid  |       queryid        | nestlevel |        query        | calls_begin | calls_alive | calls_end | total_time | min_time
| max_time  |     mean_time      |    stddev_time    |    sample_planid     |         sample_start          |       sample_parse_done       |
     sample_plan_done        |       sample_exec_start       |        sample_exec_end
-----+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+-----------+--------------------+-------------------+----------------------+-------------------------------+-------------------------------+--
----------------------------------------------------------------------------------------------------------------------------------
  -1 |     10 | 16384 | -2007749946425010549 |         0 | select pg_sleep($1) |           3 |           0 |         3 |      30041 | 5009.054
| 15018.717 | 10013.666666666666 | 4086.427819588182 | -2693056513545111817 | 2022-03-25 13:58:58.503851-04 | 2022-03-25 13:58:58.503933-04 | 2
022-03-25 13:58:58.503994-04 | 2022-03-25 13:58:58.504008-04 | 2022-03-25 13:59:13.522725-04
(1 row)

As can be seen from the view results, the query was called 3 times, and the query text was also normalized, replacing the time parameter with $ + parameter number. The total time, maximum time, minimum time, and average time of query execution are consistent with expectations. And the time stamp of the slowest query execution stage is recorded, that is, select pg_sleep(15), a query that takes 15 seconds to rest.

Then check the use of the statement resource. This requires the use of mxstat_execute.queryid to associate:

matrixmgr=# SELECT * FROM matrixmgr_internal.mxstat_usage WHERE queryid = -2007749946425010549;
 seg | userid | dbid  |       queryid        | nestlevel | rows | shared_blks_hit | shared_blks_read | shared_blks_dirtied | shared_blks_writte
n | local_blks_hit | local_blks_read | local_blks_dirtied | local_blks_written | temp_blks_read | temp_blks_written | blk_read_time | blk_write
_time | ru_utime | ru_stime | ru_maxrss | ru_ixrss | ru_idrss | ru_isrss | ru_minflt | ru_majflt | ru_nswap | ru_inblock | ru_oublock | ru_msgs
nd | ru_msgrcv | ru_nsignals | ru_nvcsw | ru_nivcsw
-----+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
--+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-----+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
---+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  -1 |     10 | 16384 | -2007749946425010549 |         0 |    3 |               0 |                0 |                   0 |
0 |              0 |               0 |                  0 |                  0 |              0 |                 0 |             0 |
    0 | 0.001297 | 0.000431 |     20568 |        0 |        0 |        0 |         6 |         0 |        0 |          0 |          0 |
 0 |         0 |           0 |      122 |         0
(1 row)

From the query results, we can see that the query has only been executed on the Master node, so there is only one record of seg = -1.

Case 2

Let's take a look at another example:

CREATE TABLE test1(c1 int, c2 int) DISTRIBUTED BY(c1);

Then query the statistics of mxstat_execute:

matrixmgr=# SELECT * FROM matrixmgr_internal.mxstat_execute WHERE query LIKE '%create table test1%';
 seg | userid | dbid  |       queryid        | nestlevel |                         query                         | calls_begin | calls_alive |
calls_end | total_time | min_time | max_time | mean_time | stddev_time | sample_planid |         sample_start          | sample_parse_done | sa
mple_plan_done |       sample_exec_start       |        sample_exec_end
-----+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
----------+------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  -1 |     10 | 16384 | -6276724884379903029 |         0 | create table test1(c1 int, c2 int) distributed by(c1) |           1 |           0 |
        1 |     46.221 |   46.221 |   46.221 |    46.221 |           0 |             0 | 2022-03-25 14:08:51.754458-04 |                   |
               | 2022-03-25 14:08:51.754735-04 | 2022-03-25 14:08:51.800956-04
(1 row)

Check the usage of this statement resource:

matrixmgr=# SELECT * FROM matrixmgr_internal.mxstat_usage WHERE queryid = -6276724884379903029;
 seg | userid | dbid  |       queryid        | nestlevel | rows | shared_blks_hit | shared_blks_read | shared_blks_dirtied | shared_blks_writte
n | local_blks_hit | local_blks_read | local_blks_dirtied | local_blks_written | temp_blks_read | temp_blks_written | blk_read_time | blk_write
_time | ru_utime | ru_stime | ru_maxrss | ru_ixrss | ru_idrss | ru_isrss | ru_minflt | ru_majflt | ru_nswap | ru_inblock | ru_oublock | ru_msgs
nd | ru_msgrcv | ru_nsignals | ru_nvcsw | ru_nivcsw
-----+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
--+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-----+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
---+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  -1 |     10 | 16384 | -6276724884379903029 |         0 |    0 |             295 |               59 |                  21 |
0 |              0 |               0 |                  0 |                  0 |              0 |                 0 |             0 |
    0 | 0.004053 |        0 |     22744 |        0 |        0 |        0 |       429 |         0 |        0 |          0 |          0 |
 0 |         0 |           0 |        6 |         0
   1 |     10 | 16384 | -6276724884379903029 |         0 |    0 |             261 |               82 |                  19 |
0 |              0 |               0 |                  0 |                  0 |              0 |                 0 |             0 |
    0 | 0.001691 | 0.001558 |     19284 |        0 |        0 |        0 |       510 |         0 |        0 |          0 |          0 |
 0 |         0 |           0 |        0 |         1
   0 |     10 | 16384 | -6276724884379903029 |         0 |    0 |             314 |               34 |                  19 |
0 |              0 |               0 |                  0 |                  0 |              0 |                 0 |             0 |
    0 | 0.002537 | 0.000193 |     18508 |        0 |        0 |        0 |       574 |         0 |        0 |          0 |          0 |
 0 |         0 |           0 |        1 |         1
   2 |     10 | 16384 | -6276724884379903029 |         0 |    0 |             261 |               82 |                  19 |
0 |              0 |               0 |                  0 |                  0 |              0 |                 0 |             0 |
    0 | 0.003043 |  2.9e-05 |     19292 |        0 |        0 |        0 |       514 |         0 |        0 |          0 |          0 |
 0 |         0 |           0 |        0 |         2
(4 rows)

From the query results, we can see that the query needs to create a table on each node, so all nodes will record the resource usage status.

Case 3 From the above execution results, we can see that in mxstat_execute, the nestlevel values ​​are all 0 because the query execution is not nested. The following is an example of nested execution. Nested execution is usually achieved by calling UDFs.

CREATE OR REPLACE FUNCTION nest_query() RETURNS SETOF RECORD
AS $$ SELECT 1;$$
LANGUAGE SQL;

Create a UDF named nest_query, and SQL: SELECT 1 is called internally. Then call the UDF.

Note: mxstat will not record nested information by default, and mxstat_statements.track needs to be set to 'all'.

mxadmin=# SET mxstat_statements.track TO 'all';
SET
mxadmin=# select nest_query();
 nest_query
------------
 (1)
(1 row)

Then query the execution information of the internal statement SELECT 1. Because 1 is a constant, the normalized query will be replaced with $1:

matrixmgr=# SELECT * FROM matrixmgr_internal.mxstat_execute WHERE query LIKE '%SELECT $1%';
 seg | userid | dbid  |       queryid        | nestlevel |   query   | calls_begin | calls_alive | calls_end | total_time | min_time | max_time
 | mean_time | stddev_time |    sample_planid    |         sample_start          |       sample_parse_done       |       sample_plan_done
  |       sample_exec_start       |        sample_exec_end
-----+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-+-----------+-------------+---------------------+-------------------------------+-------------------------------+-----------------------------
--+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  -1 |     10 | 16384 | -4554727679305370053 |         1 | SELECT $1 |           1 |           0 |         1 |      0.031 |    0.031 |    0.031
 |     0.031 |           0 | -382705668420232707 | 2022-03-25 14:35:30.668124-04 | 2022-03-25 14:35:30.668373-04 | 2022-03-25 14:35:30.668403-0
4 | 2022-03-25 14:35:30.668408-04 | 2022-03-25 14:35:30.668439-04
(1 row)

As you can see, the value of nestlevel this time is 1 because the depth of nested calls is 1. If the nesting is deeper, the value of nestlevel will be larger, and mxstat will be recorded to level 31.

4 Configuration

mxstat contains the following GUCs to control monitoring behavior:

| Name | Type | Description | | --- | --- | | mxstat_statements.max | Integer | Hash slots that store query time. Default 5000 | | mxstat_statements.usage_multiple | Integer | The hash slot that stores query resource information multiples relative to the hash slot in the query time. The reason for a larger reason is that you hope that each query can find the corresponding resource consumption information. Default is 2 | | mxstat_statements.track | string | top: Only trace the first layer, no nested tracking (default).
all: Track all.
none: No tracking is equivalent to turning off the function | | mxstat_statements.track_utility | Boolean | Whether to track the execution of utility statements, default to true | | mxstat_statements.save | Boolean | Whether to dump information in shared memory to a file when the cluster restarts, the default is true | | mxstat_statements.harvest_interval | Integer | Query statistics are in shared memory by default and are collected periodically into the historical table. This GUC is used to set the acquisition cycle. Default is 5 minutes | | mxstat_statements.harvest_usage | Boolean | Whether to collect usage information, default to true | mxstat_statements.ignored_databases | string | ignored database, default value is template0,template1,matrixmgr

5 Stop

Notes!
Stop query statistics function without stopping or restarting the database.

First, stop the background collection process on Master.

$ psql -d matrixmgr -c "SELECT matrixmgr_internal.mxstat_stop_worker();"

Secondly, turn off the acquisition function globally and make it take effect.

$ gpconfig -s mxstat_statements.track 
$ gpconfig -c mxstat_statements.track -v none
$ mxstop -u

Finally, clear the collected data.

psql -d matrixmgr -c "SELECT matrixmgr_internal.mxstat_reset();"

If you need to restart the background acquisition process, execute the following command:

$ psql -d matrixmgr -c "SELECT matrixmgr_internal.mxstat_start_worker();"