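The mxstat query statistics module collects information about query execution, including execution timing and resource consumption.
The module is included in the matrixmgr extension and is installed by default with the system. To confirm that matrixmgr is preloaded, check shared_preload_libraries: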
[mxadmin@mdw ~]$ gpconfig -s shared_preload_libraries
Values on all segments are consistent
GUC : shared_preload_libraries
Master value: matrixts,matrixmgr,matrixgate,telemetry,mars
Segment value: matrixts,matrixmgr,matrixgate,telemetry,mars
To view statistics, you must create the matrixmgr database and install the matrixmgr extension within it. (This database and extension are created by default after MatrixDB cluster initialization.)
createdb matrixmgr
psql -d matrixmgr
matrixmgr=# CREATE EXTENSION matrixmgr CASCADE;
NOTICE: installing required extension "matrixts"
CREATE EXTENSION
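One way to confirm the deployment is to check the system catalogs from inside the matrixmgr database. This is a minimal sketch that assumes the standard PostgreSQL catalogs (pg_extension, pg_class, pg_namespace) are available and that the extension places its objects in the matrixmgr_internal schema; child partitions of the history tables may also show up in the second result.

```sql
-- Confirm that the extensions are installed in the current database.
SELECT extname, extversion
FROM pg_extension
WHERE extname IN ('matrixmgr', 'matrixts');

-- List the tables and views found in the matrixmgr_internal schema.
-- relkind: 'r' = table, 'p' = partitioned table, 'v' = view.
SELECT c.relname, c.relkind
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE n.nspname = 'matrixmgr_internal'
  AND c.relkind IN ('r', 'p', 'v')
ORDER BY c.relname;
```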
After successful deployment, the following tables and views appear under the matrixmgr_internal schema in the matrixmgr database:
- mx_query_execute_history
- mx_query_execute_history_with_text
- mx_query_usage_history
- mx_querytext
- mxstat_execute
- mxstat_usage

mxstat_execute is the query execution information view.
This view displays statistics for queries executed since the last history collection (default interval: 5 minutes). It includes the following fields:
| Column | Type | Description |
|---|---|---|
| seg | integer | Segment ID where the query plan was generated and dispatched |
| userid | oid | User OID |
| dbid | oid | Database OID |
| queryid | bigint | Query ID, generated by the extension to group similar queries |
| nestlevel | integer | Nesting depth |
| query | text | Query text |
| calls_begin | bigint | Number of times the query started |
| calls_alive | bigint | Number of queries running at last collection |
| calls_end | bigint | Number of queries that completed normally |
| total_time | double precision | Total execution time for this query class, in milliseconds |
| min_time | double precision | Minimum execution time for this query class, in milliseconds |
| max_time | double precision | Maximum execution time for this query class, in milliseconds |
| mean_time | double precision | Average execution time for this query class, in milliseconds |
| stddev_time | double precision | Standard deviation of execution time, in milliseconds |
| sample_planid | bigint | Execution plan ID |
| sample_start | timestamp with time zone | Timestamp when the slowest query started |
| sample_parse_done | timestamp with time zone | Timestamp when parsing completed for the slowest query |
| sample_plan_done | timestamp with time zone | Timestamp when planning completed for the slowest query |
| sample_exec_start | timestamp with time zone | Timestamp when execution started for the slowest query |
| sample_exec_end | timestamp with time zone | Timestamp when execution ended for the slowest query |
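
As an illustration of how these columns can be combined, the following sketch lists the ten query classes with the highest mean execution time and breaks the slowest sampled run into phases. It uses only the columns documented above; the output aliases (sample_parse_time and so on) are names chosen for this example. For statements that skip parsing or planning, such as utility commands, the phase timestamps may be empty and the corresponding intervals will be NULL.

```sql
-- Slowest query classes since the last history collection,
-- with a phase breakdown of the slowest sampled execution.
SELECT queryid,
       query,
       calls_end,
       mean_time,
       max_time,
       sample_parse_done - sample_start      AS sample_parse_time,
       sample_plan_done  - sample_parse_done AS sample_plan_time,
       sample_exec_end   - sample_exec_start AS sample_exec_time
FROM matrixmgr_internal.mxstat_execute
WHERE calls_end > 0
ORDER BY mean_time DESC
LIMIT 10;
```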
mxstat_usage is the query resource consumption information view.
This view displays resource usage statistics for queries executed since the last history collection (default interval: 5 minutes). It includes the following fields:
| Column | Type | Description |
|---|---|---|
| seg | integer | Segment ID where the query executed |
| userid | oid | User OID |
| dbid | oid | Database OID |
| queryid | bigint | Query ID, generated by the extension to group similar queries |
| nestlevel | integer | Nesting depth |
| rows | bigint | Total number of rows retrieved or affected by the statement |
| shared_blks_hit | bigint | Total number of shared block buffer hits |
| shared_blks_read | bigint | Total number of shared blocks read |
| shared_blks_dirtied | bigint | Total number of shared blocks dirtied |
| shared_blks_written | bigint | Total number of shared blocks written |
| local_blks_hit | bigint | Total number of local block buffer hits |
| local_blks_read | bigint | Total number of local blocks read |
| local_blks_dirtied | bigint | Total number of local blocks dirtied |
| local_blks_written | bigint | Total number of local blocks written |
| temp_blks_read | bigint | Total number of temporary blocks read |
| temp_blks_written | bigint | Total number of temporary blocks written |
| blk_read_time | double precision | Total time spent reading blocks, in milliseconds |
| blk_write_time | double precision | Total time spent writing blocks, in milliseconds |
| ru_utime | double precision | User CPU time |
| ru_stime | double precision | System CPU time |
| ru_maxrss | bigint | Physical memory used, including shared libraries, in KB |
| ru_ixrss | bigint | Integrated shared memory size |
| ru_idrss | bigint | Integrated unshared data size |
| ru_isrss | bigint | Integrated unshared stack size |
| ru_minflt | bigint | Number of minor page faults (no I/O required) |
| ru_majflt | bigint | Number of major page faults (I/O required) |
| ru_nswap | bigint | Number of swaps |
| ru_inblock | bigint | Number of input operations initiated |
| ru_oublock | bigint | Number of output operations initiated |
| ru_msgsnd | bigint | Number of messages sent |
| ru_msgrcv | bigint | Number of messages received |
| ru_nsignals | bigint | Number of signals received |
| ru_nvcsw | bigint | Number of voluntary context switches |
| ru_nivcsw | bigint | Number of involuntary context switches |
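
Because mxstat_usage can hold one row per segment for the same query class, it is often useful to aggregate across segments. The sketch below sums block counters per queryid and derives a shared-buffer hit ratio; hit_ratio and the other aliases are names made up for this example, not columns of the view.

```sql
-- Per-query-class I/O summary, aggregated over all segments.
SELECT queryid,
       sum(rows)              AS total_rows,
       sum(shared_blks_hit)   AS blks_hit,
       sum(shared_blks_read)  AS blks_read,
       -- nullif avoids division by zero for classes that touched no shared blocks
       round(sum(shared_blks_hit)::numeric
             / nullif(sum(shared_blks_hit) + sum(shared_blks_read), 0), 4) AS hit_ratio,
       sum(temp_blks_written) AS temp_blks_written
FROM matrixmgr_internal.mxstat_usage
GROUP BY queryid
ORDER BY blks_read DESC
LIMIT 10;
```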
mx_query_execute_history is a partitioned table that stores historical data from the mxstat_execute view, collected every 5 minutes by default. Its structure matches mxstat_execute, with an additional ts_bucket column indicating the collection timestamp.
mx_query_usage_history is a partitioned table that stores historical data from the mxstat_usage view, collected every 5 minutes by default. Its structure matches mxstat_usage, with an additional ts_bucket column indicating the collection timestamp.
mx_querytext stores the mapping between queryid and query text. Like the other history tables, it is populated periodically so that the SQL text of historical queries can be retrieved.
mx_query_execute_history_with_text is a view that joins mx_query_execute_history and mx_querytext on queryid, allowing retrieval of both historical query statistics and SQL text.
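A typical use of the history tables is to look at how a query class behaved over time. The following sketch assumes that ts_bucket is a timestamp type and that the counters in each bucket cover one collection interval; it relies only on columns that mxstat_execute is documented to have, plus ts_bucket.

```sql
-- Per-bucket execution totals for the last hour, heaviest classes first
-- within each bucket.
SELECT ts_bucket,
       queryid,
       sum(calls_end)  AS calls_end,
       sum(total_time) AS total_time_ms
FROM matrixmgr_internal.mx_query_execute_history
WHERE ts_bucket >= now() - interval '1 hour'
GROUP BY ts_bucket, queryid
ORDER BY ts_bucket, total_time_ms DESC;
```

To see the SQL text alongside these numbers, the same query can be pointed at mx_query_execute_history_with_text; the name of its text column is not listed in this document, so it is left out of the sketch.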
The mxstat module does not record every individual query. Instead, it groups similar queries and aggregates their statistics. Queries are classified based on their parsed structure.
For example, the following queries all insert data into the test1 table with different parameters. They generate the same queryid and are grouped into one class:
INSERT INTO test1 VALUES(1);
INSERT INTO test1 VALUES(2);
INSERT INTO test1 VALUES(3);
The following queries differ only in conditional parameters but are otherwise identical, so they are also grouped into one class:
SELECT * FROM test1 WHERE c1 = 1;
SELECT * FROM test1 WHERE c1 = 2;
SELECT * FROM test1 WHERE c1 = 3;
The following queries, although similar, operate on different tables and are not grouped together:
SELECT * FROM test1 WHERE c1 = 1;
SELECT * FROM test2 WHERE c1 = 1;
SELECT * FROM test3 WHERE c1 = 1;
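To see the grouping in practice, one option is to run the statements above and then look up the resulting classes. This sketch assumes the tables test1, test2, and test3 exist and that the statements ran after the last history collection; statements that share a queryid appear as a single row whose counters reflect all of their executions.

```sql
-- One row per query class that mentions test1, test2, or test3.
SELECT queryid, query, calls_begin, calls_end
FROM matrixmgr_internal.mxstat_execute
WHERE query ILIKE '%test_%'
ORDER BY query, queryid;
```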
Below are examples demonstrating how to use mxstat to view query statistics.
Example 1
Execute the following three SQL statements:
select pg_sleep(5);
select pg_sleep(10);
select pg_sleep(15);
Then query the mxstat_execute view:
matrixmgr=# select * from matrixmgr_internal.mxstat_execute where query like '%pg_sleep%';
seg | userid | dbid | queryid | nestlevel | query | calls_begin | calls_alive | calls_end | total_time | min_time
| max_time | mean_time | stddev_time | sample_planid | sample_start | sample_parse_done |
sample_plan_done | sample_exec_start | sample_exec_end
-----+--------+-------+----------------------+-----------+---------------------+-------------+-------------+-----------+------------+----------
+-----------+--------------------+-------------------+----------------------+-------------------------------+-------------------------------+--
-----------------------------+-------------------------------+-------------------------------
-1 | 10 | 16384 | -2007749946425010549 | 0 | select pg_sleep($1) | 3 | 0 | 3 | 30041 | 5009.054
| 15018.717 | 10013.666666666666 | 4086.427819588182 | -2693056513545111817 | 2022-03-25 13:58:58.503851-04 | 2022-03-25 13:58:58.503933-04 | 2
022-03-25 13:58:58.503994-04 | 2022-03-25 13:58:58.504008-04 | 2022-03-25 13:59:13.522725-04
(1 row)
The result shows that the query was called three times (calls_begin and calls_end are both 3). The query text is normalized, with each parameter value replaced by $ followed by a parameter number. The timings match expectations: about 30 seconds in total, with a minimum of about 5 seconds, a maximum of about 15 seconds, and a mean of about 10 seconds. The sample_* timestamps record each phase of the slowest execution (that is, pg_sleep(15)).
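The aggregate columns can be cross-checked with plain SQL. The sketch below is self-contained and makes one assumption: the middle run is not reported individually, so its duration is inferred as 30041 - 5009.054 - 15018.717 = 10013.229 ms. With that value, the mean and the population standard deviation (stddev_pop) should come out close to the mean_time and stddev_time shown above, which suggests, based on this one output only, that stddev_time is a population rather than a sample standard deviation.

```sql
-- Recompute the aggregates from the three individual run times (in ms).
-- 10013.229 is inferred from total_time, min_time, and max_time.
SELECT sum(t)        AS total_time,
       min(t)        AS min_time,
       max(t)        AS max_time,
       avg(t)        AS mean_time,
       stddev_pop(t) AS stddev_time
FROM (VALUES (5009.054), (10013.229), (15018.717)) AS runs(t);
```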
Next, check resource usage using the queryid from mxstat_execute:
matrixmgr=# select * from matrixmgr_internal.mxstat_usage where queryid = -2007749946425010549;
seg | userid | dbid | queryid | nestlevel | rows | shared_blks_hit | shared_blks_read | shared_blks_dirtied | shared_blks_writte
n | local_blks_hit | local_blks_read | local_blks_dirtied | local_blks_written | temp_blks_read | temp_blks_written | blk_read_time | blk_write
_time | ru_utime | ru_stime | ru_maxrss | ru_ixrss | ru_idrss | ru_isrss | ru_minflt | ru_majflt | ru_nswap | ru_inblock | ru_oublock | ru_msgs
nd | ru_msgrcv | ru_nsignals | ru_nvcsw | ru_nivcsw
-----+--------+-------+----------------------+-----------+------+-----------------+------------------+---------------------+-------------------
--+----------------+-----------------+--------------------+--------------------+----------------+-------------------+---------------+----------
------+----------+----------+-----------+----------+----------+----------+-----------+-----------+----------+------------+------------+--------
---+-----------+-------------+----------+-----------
-1 | 10 | 16384 | -2007749946425010549 | 0 | 3 | 0 | 0 | 0 |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
0 | 0.001297 | 0.000431 | 20568 | 0 | 0 | 0 | 6 | 0 | 0 | 0 | 0 |
0 | 0 | 0 | 122 | 0
(1 row)
The result shows the query executed only on the master node (seg = -1).
Example 2
Consider another example, this time a utility statement that creates a distributed table:
create table test1(c1 int, c2 int) distributed by(c1);
Query mxstat_execute:
matrixmgr=# select * from matrixmgr_internal.mxstat_execute where query like '%create table test1%';
seg | userid | dbid | queryid | nestlevel | query | calls_begin | calls_alive |
calls_end | total_time | min_time | max_time | mean_time | stddev_time | sample_planid | sample_start | sample_parse_done | sa
mple_plan_done | sample_exec_start | sample_exec_end
-----+--------+-------+----------------------+-----------+-------------------------------------------------------+-------------+-------------+-
----------+------------+----------+----------+-----------+-------------+---------------+-------------------------------+-------------------+---
---------------+-------------------------------+-------------------------------
-1 | 10 | 16384 | -6276724884379903029 | 0 | create table test1(c1 int, c2 int) distributed by(c1) | 1 | 0 |
1 | 46.221 | 46.221 | 46.221 | 46.221 | 0 | 0 | 2022-03-25 14:08:51.754458-04 | |
| 2022-03-25 14:08:51.754735-04 | 2022-03-25 14:08:51.800956-04
(1 row)
Check resource usage:
matrixmgr=# select * from matrixmgr_internal.mxstat_usage where queryid = -6276724884379903029;
seg | userid | dbid | queryid | nestlevel | rows | shared_blks_hit | shared_blks_read | shared_blks_dirtied | shared_blks_writte
n | local_blks_hit | local_blks_read | local_blks_dirtied | local_blks_written | temp_blks_read | temp_blks_written | blk_read_time | blk_write
_time | ru_utime | ru_stime | ru_maxrss | ru_ixrss | ru_idrss | ru_isrss | ru_minflt | ru_majflt | ru_nswap | ru_inblock | ru_oublock | ru_msgs
nd | ru_msgrcv | ru_nsignals | ru_nvcsw | ru_nivcsw
-----+--------+-------+----------------------+-----------+------+-----------------+------------------+---------------------+-------------------
--+----------------+-----------------+--------------------+--------------------+----------------+-------------------+---------------+----------
------+----------+----------+-----------+----------+----------+----------+-----------+-----------+----------+------------+------------+--------
---+-----------+-------------+----------+-----------
-1 | 10 | 16384 | -6276724884379903029 | 0 | 0 | 295 | 59 | 21 |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
0 | 0.004053 | 0 | 22744 | 0 | 0 | 0 | 429 | 0 | 0 | 0 | 0 |
0 | 0 | 0 | 6 | 0
1 | 10 | 16384 | -6276724884379903029 | 0 | 0 | 261 | 82 | 19 |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
0 | 0.001691 | 0.001558 | 19284 | 0 | 0 | 0 | 510 | 0 | 0 | 0 | 0 |
0 | 0 | 0 | 0 | 1
0 | 10 | 16384 | -6276724884379903029 | 0 | 0 | 314 | 34 | 19 |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
0 | 0.002537 | 0.000193 | 18508 | 0 | 0 | 0 | 574 | 0 | 0 | 0 | 0 |
0 | 0 | 0 | 1 | 1
2 | 10 | 16384 | -6276724884379903029 | 0 | 0 | 261 | 82 | 19 |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
0 | 0.003043 | 2.9e-05 | 19292 | 0 | 0 | 0 | 514 | 0 | 0 | 0 | 0 |
0 | 0 | 0 | 0 | 2
(4 rows)
The result shows resource usage recorded on all segments, as the CREATE TABLE command executes on each segment.
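Since the full-width output is hard to read, it can help to select a few columns and combine both views in one query. The following sketch joins the execution row with the per-segment usage rows summed over all segments; the aliases segments and cpu_time are names chosen for this example, and cpu_time is simply ru_utime + ru_stime in whatever unit the view reports.

```sql
-- Execution timing plus resource usage summed across segments
-- for the CREATE TABLE statement from this example.
SELECT e.queryid,
       e.query,
       e.calls_end,
       e.total_time,
       u.segments,
       u.shared_blks_hit,
       u.shared_blks_read,
       u.cpu_time
FROM matrixmgr_internal.mxstat_execute AS e
JOIN (
    SELECT userid, dbid, queryid, nestlevel,
           count(DISTINCT seg)      AS segments,
           sum(shared_blks_hit)     AS shared_blks_hit,
           sum(shared_blks_read)    AS shared_blks_read,
           sum(ru_utime + ru_stime) AS cpu_time
    FROM matrixmgr_internal.mxstat_usage
    GROUP BY userid, dbid, queryid, nestlevel
) AS u USING (userid, dbid, queryid, nestlevel)
WHERE e.queryid = -6276724884379903029;
```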
Example 3
In previous examples, nestlevel is 0 because there was no nested execution. The following example demonstrates nested execution, typically achieved via UDFs.
CREATE OR REPLACE FUNCTION nest_query() RETURNS SETOF RECORD
AS $$ SELECT 1;$$
LANGUAGE SQL;
This creates a UDF named nest_query that internally executes SELECT 1.
Note: By default, mxstat does not track nested statements. To track them, set mxstat_statements.track to 'all' before calling the function:
mxadmin=# SET mxstat_statements.track TO 'all';
SET
mxadmin=# select nest_query();
nest_query
------------
(1)
(1 row)
Now query the internal statement execution information. Since 1 is a constant, the normalized query uses $1:
matrixmgr=# select * from matrixmgr_internal.mxstat_execute where query like '%SELECT $1%';
seg | userid | dbid | queryid | nestlevel | query | calls_begin | calls_alive | calls_end | total_time | min_time | max_time
| mean_time | stddev_time | sample_planid | sample_start | sample_parse_done | sample_plan_done
| sample_exec_start | sample_exec_end
-----+--------+-------+----------------------+-----------+-----------+-------------+-------------+-----------+------------+----------+---------
-+-----------+-------------+---------------------+-------------------------------+-------------------------------+-----------------------------
--+-------------------------------+-------------------------------
-1 | 10 | 16384 | -4554727679305370053 | 1 | SELECT $1 | 1 | 0 | 1 | 0.031 | 0.031 | 0.031
| 0.031 | 0 | -382705668420232707 | 2022-03-25 14:35:30.668124-04 | 2022-03-25 14:35:30.668373-04 | 2022-03-25 14:35:30.668403-0
4 | 2022-03-25 14:35:30.668408-04 | 2022-03-25 14:35:30.668439-04
(1 row)
Here, nestlevel is 1, indicating one level of nesting. Deeper nesting increases this value. mxstat tracks up to 31 nesting levels.
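A quick way to see how much nested activity is being recorded is to group the view by nestlevel. This is a sketch under the assumption that mxstat_statements.track behaves like an ordinary session-level setting, so that SET affects only the current session and RESET restores the configured default.

```sql
-- Show the tracking mode in effect for this session.
SHOW mxstat_statements.track;

-- Count query classes and completed calls at each nesting depth.
SELECT nestlevel,
       count(*)       AS query_classes,
       sum(calls_end) AS calls_end
FROM matrixmgr_internal.mxstat_execute
GROUP BY nestlevel
ORDER BY nestlevel;

-- Return to the configured default once nested tracking is no longer needed.
RESET mxstat_statements.track;
```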
mxstat provides the following GUC parameters to control monitoring behavior:
| Name | Type | Description |
|---|---|---|
| mxstat_statements.max | integer | Number of hash slots for storing query timing data. Default: 5000 |
| mxstat_statements.usage_multiple | integer | Multiplier for hash slots used to store resource usage data relative to timing data. A larger value ensures each query finds its resource record. Default: 2 |
| mxstat_statements.track | string | top: track only top-level queries (no nesting); this is the default. all: track all queries, including nested ones. none: disable tracking |
| mxstat_statements.track_utility | boolean | Whether to track utility statement execution. Default: true |
| mxstat_statements.save | boolean | Whether to dump in-memory statistics to disk on cluster restart. Default: true |
| mxstat_statements.harvest_interval | integer | Interval (in seconds) for harvesting statistics from shared memory to history tables. Default: 300 (5 minutes) |
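
The current values of these parameters can be inspected from any session. The sketch below assumes the parameters are exposed through the standard pg_settings view once the module is preloaded; the context column there indicates whether a change needs a restart, a reload, or only a SET. The second query reads the usage_multiple description as meaning that the number of resource usage slots is max multiplied by usage_multiple, which with the defaults would be 5000 * 2 = 10000.

```sql
-- List all mxstat parameters with their current values.
SELECT name, setting, context
FROM pg_settings
WHERE name LIKE 'mxstat_statements.%'
ORDER BY name;

-- Estimated number of hash slots for resource usage data
-- (assumption: usage slots = max * usage_multiple).
SELECT current_setting('mxstat_statements.max')::int
       * current_setting('mxstat_statements.usage_multiple')::int AS usage_slots;
```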