Query statistics

This document mainly introduces the deployment, table structure, usage, configuration and stopping of YMatrix query statistics functions.

The mxstat query statistics module is used to count query information, including:

Query usage time
Query resource consumption

1 Deployment

This module is included in the matrixmgr extension and will be installed with the system by default.

[mxadmin@mdw ~]$ gpconfig -s shared_preload_libraries
Values on all segments are consistent
GUC          : shared_preload_libraries
Master  value: matrixts,matrixmgr,matrixgate,telemetry,mars2
Segment value: matrixts,matrixmgr,matrixgate,telemetry,mars2

However, to view statistics, you must create a matrixmgr database and create a matrixmgr extension in that database (the YMatrix cluster initializes the database and extensions by default).

createdb matrixmgr
psql -d matrixmgr
matrixmgr=# CREATE EXTENSION matrixmgr CASCADE;
NOTICE:  installing required extension "matrixts"
CREATE EXTENSION

2 Table Structure

After the deployment is successful, the following tables and views will appear in the matrixmgr_internal mode of the matrixmgr database:

mx_query_execute_history
mx_query_execute_history_with_text
mx_query_usage_history
mx_querytext
mxstat_execute
mxstat_usage

2.1 mxstat_execute

Query the execution information view.

This view shows the statistics of the system running the query after the last time the historical information was collected (5 minutes by default). Includes the following fields:

Field Name	Type	Description
seg	integer	Query generates and issues the planned node number
userid	oid	user OID
dbid	oid	database OID
queryid	bigint	query ID, generated by extension, is used to classify queries of the same type
nestlevel	integer	Nest depth
query	text	query text
calls_begin	bigint	Query start times
calls_alive	bigint	Number of queries in execution state when historical information was last collected
calls_end	bigint	Number of query normal ending times
total_time	double precision	Total execution time of this type of query, unit milliseconds
min_time	double precision	The minimum execution time of this type of query, in milliseconds
max_time	double precision	Maximum execution time of this type of query, unit milliseconds
mean_time	double precision	Average execution time of this type of query, in milliseconds
stddev_time	double precision	Standard deviation of this type of query execution time, unit milliseconds
sample_planid	bigint	execution plan ID
sample_start	timestamp with time zone	Slowest query startup timestamp
sample_parse_done	timestamp with time zone	Slowest query completion parsing timestamp
sample_plan_done	timestamp with time zone	Slowest query generates plan timestamp
sample_exec_start	timestamp with time zone	Slowest query starts executing timestamp
sample_exec_end	timestamp with time zone	Slowest query execution time stamp

2.2 mxstat_usage

Query the resource consumption information view.

This view shows the resource consumption information that the system has run the query after the last time the historical information is collected (default 5 minutes). Includes the following fields:

Field Name	Type	Description
seg	integer	Query execution node number
userid	oid	user OID
dbid	oid	database OID
queryid	bigint	query ID, generated by extension, is used to classify queries of the same type
nestlevel	integer	Nest depth
rows	bigint	Total number of rows retrieved or affected by this statement
shared_blks_hit	bigint	Total number of shared block buffer hits caused by this statement
shared_blks_read	bigint	Total number of shared blocks read by this statement
shared_blks_dirtied	bigint	Total number of shared blocks stained by this statement
shared_blks_writeten	bigint	Total number of shared blocks written by this statement
local_blks_hit	bigint	Total number of buffer hits caused by this statement
local_blks_read	bigint	Total number of local parcels read by this statement
local_blks_dirtied	bigint	Total number of local parcels stained by this statement
local_blks_writeten	bigint	Total number of local parcels written to this statement
temp_blks_read	bigint	Total number of temporary blocks read by this statement
temp_blks_writeten	bigint	Total number of temporary blocks written by this statement
blk_read_time	double precision	Total time this statement takes on reading blocks, in milliseconds
blk_write_time	double precision	Total time this statement spends on writing to a block, in milliseconds
ru_utime	double precision	user state CPU time
ru_stime	double precision	system state CPU time
ru_maxrss	bigint	actually used physical memory, including memory occupied by the shared library, unit KB
ru_ixrss	bigint	integrated shared memory size
ru_idrss	bigint	integrated non-shared data size
ru_isrss	bigint	integrated non-shared stack size
ru_minflt	bigint	Number of page-missing interrupts, and no I/O is required to process these interrupts
ru_majflt	bigint	Number of page-missing interrupts, and I/O is required to process these interrupts
ru_nswap	bigint	switch space
ru_inblock	bigint	number of times the file system needs to perform input operations
ru_oublock	bigint	number of times the file system needs to perform output operations
ru_msgsnd	bigint	number of messages sent
ru_msgrcv	bigint	number of message reception
ru_signals	bigint	number of received signals
ru_nvcsw	bigint	Number of context switching caused by the process voluntarily giving up the processor time slice
ru_nivcsw	bigint	Number of context switching caused by process time slices being used or being intercepted by high priority processes

2.3 mx_query_execute_history

This table is a partitioned table, which is a historical information collection of the mxstat_execute view. The default is 5 every minute. Its table structure is consistent with the mxstat_execute view, except that there is an additional ts_bucket field to record the collection time point.

2.4 mx_query_usage_history

This table is a partitioned table, which is a collection of historical information for the mxstat_usage view. The default is 5 every min. Its table structure is consistent with the mxstat_usage view, except that there is an additional ts_bucket field to record the collection time point.

2.5 mx_querytext

Store queryid mappings with querytext, like other historical information collection tables, periodically stores, with the purpose of making history queries get SQL text.

2.6 mx_query_execute_history_with_text

The view after associating mx_query_execute_history with mx_querytext through queryid. Read historical query statistics and SQL text at the same time.

3 Use

3.1 Query classification method

mxstat Statistics query information, not recording every query, but a unified summary of a type of query. Queries are classified according to the parsing results.

The following queries are all inserted into the test1 table, but the parameters are different. These queries will generate the same queryid and be the same category.

INSERT INTO test1 VALUES(1);
INSERT INTO test1 VALUES(2);
INSERT INTO test1 VALUES(3);

The following query is just different in the conditional parameters, and the others are the same, and they will be returned to the same category.

SELECT * FROM test1 WHERE c1 = 1;
SELECT * FROM test1 WHERE c1 = 2;
SELECT * FROM test1 WHERE c1 = 3;

Although the following queries are similar, they are querying different tables and cannot be relocated to the same category.

SELECT * FROM test1 WHERE c1 = 1;
SELECT * FROM test2 WHERE c1 = 1;
SELECT * FROM test3 WHERE c1 = 1;

3.2 Case

Here is an example of how to use mxstat to view query statistics.

Case 1

Execute the following 3 SQL statements:

SELECT pg_sleep(5);
SELECT pg_sleep(10);
SELECT pg_sleep(15);

Then query the statistics of mxstat_execute:

matrixmgr=# SELECT * FROM matrixmgr_internal.mxstat_execute WHERE query LIKE '%pg_sleep%';
 seg | userid | dbid  |       queryid        | nestlevel |        query        | calls_begin | calls_alive | calls_end | total_time | min_time
| max_time  |     mean_time      |    stddev_time    |    sample_planid     |         sample_start          |       sample_parse_done       |
     sample_plan_done        |       sample_exec_start       |        sample_exec_end
-----+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+-----------+--------------------+-------------------+----------------------+-------------------------------+-------------------------------+--
----------------------------------------------------------------------------------------------------------------------------------
  -1 |     10 | 16384 | -2007749946425010549 |         0 | select pg_sleep($1) |           3 |           0 |         3 |      30041 | 5009.054
| 15018.717 | 10013.666666666666 | 4086.427819588182 | -2693056513545111817 | 2022-03-25 13:58:58.503851-04 | 2022-03-25 13:58:58.503933-04 | 2
022-03-25 13:58:58.503994-04 | 2022-03-25 13:58:58.504008-04 | 2022-03-25 13:59:13.522725-04
(1 row)

As can be seen from the view results, the query was called 3 times, and the query text was also normalized, replacing the time parameter with $ + parameter number. The total time, maximum time, minimum time, and average time of query execution are consistent with expectations. And the time stamp of the slowest query execution stage is recorded, that is, select pg_sleep(15), a query that takes 15 seconds to rest.

Then check the use of the statement resource. This requires the use of mxstat_execute.queryid to associate:

matrixmgr=# SELECT * FROM matrixmgr_internal.mxstat_usage WHERE queryid = -2007749946425010549;
 seg | userid | dbid  |       queryid        | nestlevel | rows | shared_blks_hit | shared_blks_read | shared_blks_dirtied | shared_blks_writte
n | local_blks_hit | local_blks_read | local_blks_dirtied | local_blks_written | temp_blks_read | temp_blks_written | blk_read_time | blk_write
_time | ru_utime | ru_stime | ru_maxrss | ru_ixrss | ru_idrss | ru_isrss | ru_minflt | ru_majflt | ru_nswap | ru_inblock | ru_oublock | ru_msgs
nd | ru_msgrcv | ru_nsignals | ru_nvcsw | ru_nivcsw
-----+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
--+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-----+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
---+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  -1 |     10 | 16384 | -2007749946425010549 |         0 |    3 |               0 |                0 |                   0 |
0 |              0 |               0 |                  0 |                  0 |              0 |                 0 |             0 |
    0 | 0.001297 | 0.000431 |     20568 |        0 |        0 |        0 |         6 |         0 |        0 |          0 |          0 |
 0 |         0 |           0 |      122 |         0
(1 row)

From the query results, we can see that the query has only been executed on the Master node, so there is only one record of seg = -1.

Case 2

Let's take a look at another example:

CREATE TABLE test1(c1 int, c2 int) DISTRIBUTED BY(c1);

Then query the statistics of mxstat_execute:

matrixmgr=# SELECT * FROM matrixmgr_internal.mxstat_execute WHERE query LIKE '%create table test1%';
 seg | userid | dbid  |       queryid        | nestlevel |                         query                         | calls_begin | calls_alive |
calls_end | total_time | min_time | max_time | mean_time | stddev_time | sample_planid |         sample_start          | sample_parse_done | sa
mple_plan_done |       sample_exec_start       |        sample_exec_end
-----+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
----------+------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  -1 |     10 | 16384 | -6276724884379903029 |         0 | create table test1(c1 int, c2 int) distributed by(c1) |           1 |           0 |
        1 |     46.221 |   46.221 |   46.221 |    46.221 |           0 |             0 | 2022-03-25 14:08:51.754458-04 |                   |
               | 2022-03-25 14:08:51.754735-04 | 2022-03-25 14:08:51.800956-04
(1 row)

Check the usage of this statement resource:

matrixmgr=# SELECT * FROM matrixmgr_internal.mxstat_usage WHERE queryid = -6276724884379903029;
 seg | userid | dbid  |       queryid        | nestlevel | rows | shared_blks_hit | shared_blks_read | shared_blks_dirtied | shared_blks_writte
n | local_blks_hit | local_blks_read | local_blks_dirtied | local_blks_written | temp_blks_read | temp_blks_written | blk_read_time | blk_write
_time | ru_utime | ru_stime | ru_maxrss | ru_ixrss | ru_idrss | ru_isrss | ru_minflt | ru_majflt | ru_nswap | ru_inblock | ru_oublock | ru_msgs
nd | ru_msgrcv | ru_nsignals | ru_nvcsw | ru_nivcsw
-----+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
--+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-----+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
---+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  -1 |     10 | 16384 | -6276724884379903029 |         0 |    0 |             295 |               59 |                  21 |
0 |              0 |               0 |                  0 |                  0 |              0 |                 0 |             0 |
    0 | 0.004053 |        0 |     22744 |        0 |        0 |        0 |       429 |         0 |        0 |          0 |          0 |
 0 |         0 |           0 |        6 |         0
   1 |     10 | 16384 | -6276724884379903029 |         0 |    0 |             261 |               82 |                  19 |
0 |              0 |               0 |                  0 |                  0 |              0 |                 0 |             0 |
    0 | 0.001691 | 0.001558 |     19284 |        0 |        0 |        0 |       510 |         0 |        0 |          0 |          0 |
 0 |         0 |           0 |        0 |         1
   0 |     10 | 16384 | -6276724884379903029 |         0 |    0 |             314 |               34 |                  19 |
0 |              0 |               0 |                  0 |                  0 |              0 |                 0 |             0 |
    0 | 0.002537 | 0.000193 |     18508 |        0 |        0 |        0 |       574 |         0 |        0 |          0 |          0 |
 0 |         0 |           0 |        1 |         1
   2 |     10 | 16384 | -6276724884379903029 |         0 |    0 |             261 |               82 |                  19 |
0 |              0 |               0 |                  0 |                  0 |              0 |                 0 |             0 |
    0 | 0.003043 |  2.9e-05 |     19292 |        0 |        0 |        0 |       514 |         0 |        0 |          0 |          0 |
 0 |         0 |           0 |        0 |         2
(4 rows)

From the query results, we can see that the query needs to create a table on each node, so all nodes will record the resource usage status.

Case 3 From the above execution results, we can see that in mxstat_execute, the nestlevel values are all 0 because the query execution is not nested. The following is an example of nested execution. Nested execution is usually achieved by calling UDFs.

CREATE OR REPLACE FUNCTION nest_query() RETURNS SETOF RECORD
AS $$ SELECT 1;$$
LANGUAGE SQL;

Create a UDF named nest_query, and SQL: SELECT 1 is called internally. Then call the UDF.

Note: mxstat will not record nested information by default, and mxstat_statements.track needs to be set to 'all'.

mxadmin=# SET mxstat_statements.track TO 'all';
SET
mxadmin=# select nest_query();
 nest_query
------------
 (1)
(1 row)

Then query the execution information of the internal statement SELECT 1. Because 1 is a constant, the normalized query will be replaced with $1:

matrixmgr=# SELECT * FROM matrixmgr_internal.mxstat_execute WHERE query LIKE '%SELECT $1%';
 seg | userid | dbid  |       queryid        | nestlevel |   query   | calls_begin | calls_alive | calls_end | total_time | min_time | max_time
 | mean_time | stddev_time |    sample_planid    |         sample_start          |       sample_parse_done       |       sample_plan_done
  |       sample_exec_start       |        sample_exec_end
-----+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-+-----------+-------------+---------------------+-------------------------------+-------------------------------+-----------------------------
--+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  -1 |     10 | 16384 | -4554727679305370053 |         1 | SELECT $1 |           1 |           0 |         1 |      0.031 |    0.031 |    0.031
 |     0.031 |           0 | -382705668420232707 | 2022-03-25 14:35:30.668124-04 | 2022-03-25 14:35:30.668373-04 | 2022-03-25 14:35:30.668403-0
4 | 2022-03-25 14:35:30.668408-04 | 2022-03-25 14:35:30.668439-04
(1 row)

As you can see, the value of nestlevel this time is 1 because the depth of nested calls is 1. If the nesting is deeper, the value of nestlevel will be larger, and mxstat will be recorded to level 31.

4 Configuration

mxstat contains the following GUCs to control monitoring behavior:

Name	Type	Description
mxstat_statements.max	Integer	Hash slots that store query time. Default `5000`
mxstat_statements.usage_multiple	Integer	The hash slot that stores query resource information multiples relative to the hash slot in the query time. The reason for a larger reason is that you hope that each query can find the corresponding resource consumption information. Default is 2
mxstat_statements.track	string	top: Only trace the first layer, no nested tracking (default). all: Track all. none: No tracking is equivalent to turning off the function
mxstat_statements.track_utility	Boolean	Whether to track the execution of utility statements, default to `true`
mxstat_statements.save	Boolean	Whether to dump information in shared memory to a file when the cluster restarts, the default is `true`
mxstat_statements.harvest_interval	Integer	Query statistics are in shared memory by default and are collected periodically into the historical table. This GUC is used to set the acquisition cycle. Default is `5` minutes
mxstat_statements.harvest_usage	Boolean	Whether to collect usage information, default to `true`
mxstat_statements.ignored_databases	string	ignored database, default value is `template0,template1,matrixmgr`

5 Stop

Notes!
Stop query statistics function without stopping or restarting the database.

First, stop the background collection process on Master.

$ psql -d matrixmgr -c "SELECT matrixmgr_internal.mxstat_stop_worker();"

Secondly, turn off the acquisition function globally and make it take effect.

$ gpconfig -s mxstat_statements.track 
$ gpconfig -c mxstat_statements.track -v none
$ mxstop -u

Finally, clear the collected data.

psql -d matrixmgr -c "SELECT matrixmgr_internal.mxstat_reset();"

If you need to restart the background acquisition process, execute the following command:

$ psql -d matrixmgr -c "SELECT matrixmgr_internal.mxstat_start_worker();"

简体中文