Query statistics

This document mainly introduces the deployment, table structure, usage, configuration and stopping of YMatrix query statistics functions. When you find that database queries are becoming slow, or want to plan the use of database resources more reasonably, you can use this function to achieve optimization.

mxstat query statistics function includes the following contents:

SQL statement execution status, such as query time-consuming, etc.
Statistics of computing resource consumption.

Enable query statistics function

After the YMatrix cluster is initialized, the matrixmgr database and matrixmgr extension will be created by default, and this feature is enabled by default.

To enable it manually, refer to the following command.

=# createdb matrixmgr
=# psql -d matrixmgr
matrixmgr=# CREATE EXTENSION matrixmgr CASCADE;
NOTICE:  installing required extension "matrixts"
CREATE EXTENSION

Available views

After the deployment is successful, the following tables and views will appear in the matrixmgr_internal mode of the matrixmgr database:

View	Description
mxstat_execute	This view provides the execution of SQL statements, such as node instances of SQL statements, users, query text, the longest and shortest time to execute, and the time stamps of each stage of the slowest statement. You can use this table to see the execution time of SQL statements and analyze the slowest statements
mxstat_usage	This view provides the consumption of computing resources when SQL statements are executed
mx_query_execute_history	This view is the historical information of the `mxstat_execute` view
mx_query_usage_history	This view is the historical information of the `mxstat_usage` view
mx_querytext	This view stores a mapping of `queryid` and `querytext`, through which you can get SQL text
mx_query_execute_history_with_text	Views after associating `mx_query_execute_history` with `mx_querytext` via `queryid`

For details on each view, see matrixgmr extension.

Example

The following three examples can help you gradually view the execution of SQL statements and their resources consumed.

Example 1

Execute the following 3 SQL statements to pause the database server process for 5, 10 and 15 seconds respectively:
```
=# SELECT pg_sleep(5);
=# SELECT pg_sleep(10);
=# SELECT pg_sleep(15);
```
Query the statistics in the mxstat_execute view. From the query results, you can see:

The query start (calls_begin) and end (calls_end) are both 3 and the query is called 3 times.
The query text is normalized and the time parameter is replaced with the $ + parameter number.

According to the timestamp of each stage of the slowest query, the two queries were rested for 15 seconds, which corresponds to SELECT pg_sleep(15);.

matrixmgr=# SELECT * FROM matrixmgr_internal.mxstat_execute WHERE query LIKE '%pg_sleep%';
seg | userid | dbid  |       queryid        | nestlevel |        query        | calls_begin | calls_alive | calls_end | total_time | min_time
| max_time  |     mean_time      |    stddev_time    |    sample_planid     |         sample_start          |       sample_parse_done       |
 sample_plan_done        |       sample_exec_start       |        sample_exec_end
-----+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+-----------+--------------------+-------------------+----------------------+-------------------------------+-------------------------------+--
----------------------------------------------------------------------------------------------------------------------------------
-1 |     10 | 16384 | -2007749946425010549 |         0 | select pg_sleep($1) |           3 |           0 |         3 |      30041 | 5009.054
| 15018.717 | 10013.666666666666 | 4086.427819588182 | -2693056513545111817 | 2022-03-25 13:58:58.503851-04 | 2022-03-25 13:58:58.503933-04 | 2
022-03-25 13:58:58.503994-04 | 2022-03-25 13:58:58.504008-04 | 2022-03-25 13:59:13.522725-04
(1 row)

Use queryid to make associations and view the resource usage of the statement through mxstat_usage. From the query results, you can see that there is only one record with seg = -1, which means that the query has only been executed on the Master.

matrixmgr=# SELECT * FROM matrixmgr_internal.mxstat_usage WHERE queryid = -2007749946425010549;
seg | userid | dbid  |       queryid        | nestlevel | rows | shared_blks_hit | shared_blks_read | shared_blks_dirtied | shared_blks_writte
n | local_blks_hit | local_blks_read | local_blks_dirtied | local_blks_written | temp_blks_read | temp_blks_written | blk_read_time | blk_write
_time | ru_utime | ru_stime | ru_maxrss | ru_ixrss | ru_idrss | ru_isrss | ru_minflt | ru_majflt | ru_nswap | ru_inblock | ru_oublock | ru_msgs
nd | ru_msgrcv | ru_nsignals | ru_nvcsw | ru_nivcsw
-----+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
--+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-----+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
---+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-1 |     10 | 16384 | -2007749946425010549 |         0 |    3 |               0 |                0 |                   0 |
0 |              0 |               0 |                  0 |                  0 |              0 |                 0 |             0 |
 0 | 0.001297 | 0.000431 |     20568 |        0 |        0 |        0 |         6 |         0 |        0 |          0 |          0 |
0 |         0 |           0 |      122 |         0
(1 row)

Example 2

Create a test1 table with the following command.

=# CREATE TABLE test1(
c1 int,
c2 int
)
USING MARS3
DISTRIBUTED BY(c1)
ORDER BY(c1);

Query the statistics of mxstat_execute.

matrixmgr=# SELECT * FROM matrixmgr_internal.mxstat_execute WHERE query LIKE '%create table test1%';
seg | userid | dbid  |       queryid        | nestlevel |                         query                         | calls_begin | calls_alive |
calls_end | total_time | min_time | max_time | mean_time | stddev_time | sample_planid |         sample_start          | sample_parse_done | sa
mple_plan_done |       sample_exec_start       |        sample_exec_end
-----+--------+-------+----------------------+-----------+-------------------------------------------------------+-------------+-------------+-
----------+------------+----------+----------+-----------+-------------+---------------+-------------------------------+-------------------+---
---------------+-------------------------------+-------------------------------
-1 |     10 | 16384 | -6276724884379903029 |         0 | create table test1(c1 int, c2 int) distributed by(c1) |           1 |           0 |
     1 |     46.221 |   46.221 |   46.221 |    46.221 |           0 |             0 | 2022-03-25 14:08:51.754458-04 |                   |
            | 2022-03-25 14:08:51.754735-04 | 2022-03-25 14:08:51.800956-04
(1 row)

Use queryid to make associations and view the resource usage of the statement through mxstat_usage. From the query results, we can see that all nodes record resource usage status, which means that the data resource distribution is stored in different data node instances.

matrixmgr=# SELECT * FROM matrixmgr_internal.mxstat_usage WHERE queryid = -6276724884379903029;
seg | userid | dbid  |       queryid        | nestlevel | rows | shared_blks_hit | shared_blks_read | shared_blks_dirtied | shared_blks_writte
n | local_blks_hit | local_blks_read | local_blks_dirtied | local_blks_written | temp_blks_read | temp_blks_written | blk_read_time | blk_write
_time | ru_utime | ru_stime | ru_maxrss | ru_ixrss | ru_idrss | ru_isrss | ru_minflt | ru_majflt | ru_nswap | ru_inblock | ru_oublock | ru_msgs
nd | ru_msgrcv | ru_nsignals | ru_nvcsw | ru_nivcsw
-----+--------+-------+----------------------+-----------+------+-----------------+------------------+---------------------+-------------------
--+----------------+-----------------+--------------------+--------------------+----------------+-------------------+---------------+----------
------+----------+----------+-----------+----------+----------+----------+-----------+-----------+----------+------------+------------+--------
---+-----------+-------------+----------+-----------
-1 |     10 | 16384 | -6276724884379903029 |         0 |    0 |             295 |               59 |                  21 |
0 |              0 |               0 |                  0 |                  0 |              0 |                 0 |             0 |
 0 | 0.004053 |        0 |     22744 |        0 |        0 |        0 |       429 |         0 |        0 |          0 |          0 |
0 |         0 |           0 |        6 |         0
1 |     10 | 16384 | -6276724884379903029 |         0 |    0 |             261 |               82 |                  19 |
0 |              0 |               0 |                  0 |                  0 |              0 |                 0 |             0 |
 0 | 0.001691 | 0.001558 |     19284 |        0 |        0 |        0 |       510 |         0 |        0 |          0 |          0 |
0 |         0 |           0 |        0 |         1
0 |     10 | 16384 | -6276724884379903029 |         0 |    0 |             314 |               34 |                  19 |
0 |              0 |               0 |                  0 |                  0 |              0 |                 0 |             0 |
 0 | 0.002537 | 0.000193 |     18508 |        0 |        0 |        0 |       574 |         0 |        0 |          0 |          0 |
0 |         0 |           0 |        1 |         1
2 |     10 | 16384 | -6276724884379903029 |         0 |    0 |             261 |               82 |                  19 |
0 |              0 |               0 |                  0 |                  0 |              0 |                 0 |             0 |
 0 | 0.003043 |  2.9e-05 |     19292 |        0 |        0 |        0 |       514 |         0 |        0 |          0 |          0 |
0 |         0 |           0 |        0 |         2
(4 rows)

Example 3

In Examples 1 and 2, the nestlevel value in mxstat_execute is 0 because the query execution is not nested. The following is an example of nested execution. Nested execution is usually achieved by calling user-defined functions (UDF).

Create a UDF named nest_query, the function defines the SQL statement SELECT 1, and then calls the UDF.
```
=# CREATE OR REPLACE FUNCTION nest_query() RETURNS SETOF RECORD
AS $$ SELECT 1;$$
LANGUAGE SQL;
```
Notice! mxstat will not record nested information by default, and the mxstat_statements.track parameter needs to be set to 'all'.
Query the execution information of the internal statement SELECT 1. From the returned result you can see:

From the returned result, you can see that the query text is SELECT $, because 1 is a constant, and the normalized query will be replaced with $1.

The value of nestlevel is 1 because the depth of nested calls is 1. If the nesting is deeper, the value of nestlevel will be larger, and matrixmgr will be recorded to level 31.

matrixmgr=# SELECT * FROM matrixmgr_internal.mxstat_execute WHERE query LIKE '%SELECT $1%';
seg | userid | dbid  |       queryid        | nestlevel |   query   | calls_begin | calls_alive | calls_end | total_time | min_time | max_time
| mean_time | stddev_time |    sample_planid    |         sample_start          |       sample_parse_done       |       sample_plan_done
|       sample_exec_start       |        sample_exec_end
-----+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-+-----------+-------------+---------------------+-------------------------------+-------------------------------+-----------------------------
--+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-1 |     10 | 16384 | -4554727679305370053 |         1 | SELECT $1 |           1 |           0 |         1 |      0.031 |    0.031 |    0.031
|     0.031 |           0 | -382705668420232707 | 2022-03-25 14:35:30.668124-04 | 2022-03-25 14:35:30.668373-04 | 2022-03-25 14:35:30.668403-0
4 | 2022-03-25 14:35:30.668408-04 | 2022-03-25 14:35:30.668439-04
(1 row)

Disable query statistics function

Notes!
Stop query statistics function without stopping or restarting the database.

Stop the background collection process on the Master.

$ psql -d matrixmgr -c "SELECT matrixmgr_internal.mxstat_stop_worker();"

Turn off the acquisition function globally and make it take effect.
```
$ gpconfig -s mxstat_statements.track 
$ gpconfig -c mxstat_statements.track -v none
$ mxstop -u
```
Notice!
Using the above two steps only temporarily disables the query statistics function, and starting the cluster again will also be enabled by default. If you want to shut down permanently, you need to do gpconfig -c mxstat_statements.start_reaper_on_start -v false after executing mxstop -u.

Clear the collected data.

$ psql -d matrixmgr -c "SELECT matrixmgr_internal.mxstat_reset();"

If you need to restart the background acquisition process, execute the following command. (Optional)
```
$ psql -d matrixmgr -c "SELECT matrixmgr_internal.mxstat_start_worker();"
```
Appendix

Statistical results classification

When matrixmgr counts the execution of SQL statements, it does not record each query, but records a type of query uniformly, and classify the statistical information presented according to the parsing results.

Only SQL statements with different parameters are classified into the same category. As shown in the following example, the three INSERT statements are all inserted into the test1 table, but the parameters are different. These three statements will be classified into the same category and generate the same queryid.

=# INSERT INTO test1 VALUES(1);
=# INSERT INTO test1 VALUES(2);
=# INSERT INTO test1 VALUES(3);

The following query is just different in the conditional parameters, and the others are the same and will be classified into the same category.

=# SELECT * FROM test1 WHERE c1 = 1;
=# SELECT * FROM test1 WHERE c1 = 2;
=# SELECT * FROM test1 WHERE c1 = 3;

Although the following queries are similar, they are querying different tables and cannot be classified into the same category.

=# SELECT * FROM test1 WHERE c1 = 1;
=# SELECT * FROM test2 WHERE c1 = 1;
=# SELECT * FROM test3 WHERE c1 = 1;

Configuration

Query statistics include the following parameters to control monitoring behavior.

Parameters	Type	Description
mxstat_statements.max	Integer	Hash slots that store query time. This parameter represents the maximum number of queries that can be stored in memory, with the default value of `5000`
mxstat_statements.usage_multiple	Integer	Multiple of the hash slot that stores query resource information relative to the hash slot when query time. The default value is `2`
mxstat_statements.track	string	`top`: only track the first layer, no nested tracking (default); `all`: track all; `none`: do not track, which is equivalent to turning off the function
mxstat_statements.track_utility	Boolean	Whether to track the execution of utility statements. Default is `true`
mxstat_statements.save	Boolean	Whether to dump information in shared memory to a file when the cluster restarts. Default is `true`
mxstat_statements.harvest_interval	Integer	Query statistics are in shared memory by default and will be collected periodically into the historical table. This parameter is used to set the acquisition period. Default is `5` minutes
mxstat_statements.harvest_usage	Boolean	Whether to collect usage information. Default is `true`
mxstat_statements.ignored_databases	string	ignored database, default value is `template0,template1,matrixmgr`
mxstat_statements.start_reaper_on_start	Boolean	Decide whether to start the reaper process when starting the cluster. The process periodically stores the collected information into the `mxstat_execute`, `mxstat_usage, mx_query_execute_history`, `mx_query_usage_history`, `mx_querytext`, and `mx_query_execute_history_with_text` tables

简体中文