Shows the query plan for the statement.
EXPLAIN [ ( option [, ...] ) ] statement
EXPLAIN [ANALYZE] [VERBOSE] statement
Where option can be one of the following:
ANALYZE [ boolean ]
VERBOSE [ boolean ]
COSTS [ boolean ]
BUFFERS [ boolean ]
TIMING [ boolean ]
FORMAT { TEXT | XML | JSON | YAML }
EXPLAIN displays the query plan generated by the YMatrix or Postgres optimizer for the provided statement. A query plan is a query tree plan for a node. Each node in the plan represents a separate operation such as table scanning, joining, aggregation, or sorting.
The plan should be read from bottom to top, because each node sends a row to the node above it. The lowest-level node of the schedule is usually a table scan operation (sequential scan, index scan, or bitmap index scan). If the query requires connection, aggregation, or sorting (or other operations on the original row), there will be other nodes above the scan node to perform these operations. The top-level plan node is usually the YMatrix database motion node (redistribution, explicit redistribution, broadcast, or collect motion). These operations are responsible for moving rows between segment instances during query processing.
The output of EXPLAIN has a row for each node in the plan tree, showing the basic node type and the following cost estimates made by the planner to execute the plan node:
It is important to note that the cost of the superior node includes the cost of all its child nodes. The highest node of the plan has the estimated total execution cost of the plan. This is the number that plans want to minimize. In particular, the cost does not take into account the time it takes to transfer the result row to the client.
EXPLAIN ANALYZE causes the statement to be actually executed, not just to plan. The EXPLAIN ANALYZE optimizer displays the actual results and the planner's estimates. This is useful for seeing if the optimizer's estimate is close to reality. In addition to the information displayed in the EXPLAIN plan, EXPLAIN ANALYZE will also display the following additional information:
Work_mem used: 64K bytes avg, 64K bytes max (seg0). Work_mem wanted: 90K bytes avg, 90K bytes max (seg0) to abate workfile I/O affecting 2 workers. [seg0] pass 0: 488 groups made from 488 rows; 263 rows written to workfile [seg0] pass 1: 263 groups made from 263 rows
Note: Remember that this statement is actually executed when using ANALYZE. Although EXPLAIN ANALYZE will discard any output that SELECT will return, other side effects of this statement will occur as usual. If you want to use EXPLAIN ANALYZE on a DML statement without letting the command affect your data, use the following method:
BEGIN;
EXPLAIN ANALYZE ...;
ROLLBACK;
Only ANALYZE and VERBOSE options can be specified and can only be specified in that order, without including the option list in parentshes.
ANALYZE
VERBOSE
COSTS
BUFFERS
TIMING
FORMAT
boolean
statement
In order for the query optimizer to make reasonable decisions when optimizing queries, the ANALYZE statement should be run to record statistical information about the distribution of data in the table. If you haven't done this yet (or the statistical distribution of data in the table has changed significantly since the last time you ran ANALYZE), the estimated cost is unlikely to match the actual properties of the query, so you might choose a poor query plan.
SQL statements that are run during the execution of the EXPLAIN ANALYZE command are excluded from the YMatrix database resource queue.
To illustrate how to read an EXPLAIN query plan, consider an example of a very simple query:
EXPLAIN SELECT * FROM names WHERE name = 'Joelle';
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Gather Motion 3:1 (slice1; segments: 3) (cost=0.00..431.27 rows=1 width=58)
-> Seq Scan on names (cost=0.00..431.27 rows=1 width=58)
Filter: (name = 'Joelle'::text)
Optimizer: Pivotal Optimizer (GPORCA) version 3.23.0
(4 rows)
If we read the plan from the bottom up, the query optimizer will start with the sequential scan of the names table. Note that the WHERE clause is used as a filter condition. This means that the scan operation will check the conditions for each row it scans and output only the rows passing the conditions.
The result of the scan operation is passed to the gather motion operation. In the YMatrix database, gather motion is to send segment rows to the master. In this case, we have 3 segment instances sent to 1 master instance (3:1). This operation is performed on slice1 of the parallel query execution plan. In the YMatrix database, the query plan is divided into multiple slices so that the various parts of the query plan can be processed in parallel by These segments.
The estimated start-up cost for the program is 00.00 (no cost) and the total cost is 431.27. The optimizer estimates that this query will return a row.
This is the same query, but the cost estimate is suppressed:
EXPLAIN (COSTS FALSE) SELECT * FROM names WHERE name = 'Joelle';
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
Gather Motion 3:1 (slice1; segments: 3)
-> Seq Scan on names
Filter: (name = 'Joelle'::text)
Optimizer: Pivotal Optimizer (GPORCA) version 3.23.0
(4 rows)
Here is the same query using JSON format:
EXPLAIN (FORMAT JSON) SELECT * FROM names WHERE name = 'Joelle';
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
[ +
{ +
"Plan": { +
"Node Type": "Gather Motion", +
"Senders": 3, +
"Receivers": 1, +
"Slice": 1, +
"Segments": 3, +
"Gang Type": "primary reader", +
"Startup Cost": 0.00, +
"Total Cost": 431.27, +
"Plan Rows": 1, +
"Plan Width": 58, +
"Plans": [ +
{ +
"Node Type": "Seq Scan", +
"Parent Relationship": "Outer", +
"Slice": 1, +
"Segments": 3, +
"Gang Type": "primary reader", +
"Relation Name": "names", +
"Alias": "names", +
"Startup Cost": 0.00, +
"Total Cost": 431.27, +
"Plan Rows": 1, +
"Plan Width": 58, +
"Filter": "(name = 'Joelle'::text)"+
} +
] +
}, +
"Settings": { +
"Optimizer": "Pivotal Optimizer (GPORCA) version 3.23.0" +
} +
} +
]
(1 row)
If an index exists and we use a query with an indexable WHERE condition, then the EXPLAIN may show a different plan. This query uses YAML format to generate plans with index scans:
EXPLAIN (FORMAT YAML) SELECT * FROM NAMES WHERE LOCATION='Sydney, Australia';
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
- Plan: +
Node Type: "Gather Motion" +
Senders: 3 +
Receivers: 1 +
Slice: 1 +
Segments: 3 +
Gang Type: "primary reader" +
Startup Cost: 0.00 +
Total Cost: 10.81 +
Plan Rows: 10000 +
Plan Width: 70 +
Plans: +
- Node Type: "Index Scan" +
Parent Relationship: "Outer" +
Slice: 1 +
Segments: 3 +
Gang Type: "primary reader" +
Scan Direction: "Forward" +
Index Name: "names_idx_loc" +
Relation Name: "names" +
Alias: "names" +
Startup Cost: 0.00 +
Total Cost: 7.77 +
Plan Rows: 10000 +
Plan Width: 70 +
Index Cond: "(location = 'Sydney, Australia'::text)"+
Settings: +
Optimizer: "Pivotal Optimizer (GPORCA) version 3.23.0"
(1 row)
There is no EXPLAIN statement defined in the SQL standard.