This document describes the parameters related to query optimization in the system configuration parameters.
Note!
To ensure system stability and security, please be cautious when manually modifying the relevant parameters.
The query optimization parameters are classified as follows:
This section of configuration parameters provides the original method that influences the query optimizer's selection of query plans.
If the default plan selected by the optimizer for a specific query is not optimal, we can temporarily resolve this issue by using these configuration parameters to force the optimizer to select a better plan.
Better ways to improve the quality of the plan selected by the optimizer include: adjusting the cost parameters of the planner (see below), manually running ANALYZE
, increasing the value of the default_statistics_target
configuration parameter, and using the ALTER TABLE SET STATISTICS
statement to increase the amount of statistics collected for specific columns.
Enable or disable the Postgres query optimizer (planner) to use bitmap scan plan types.
Data Type | Default | Setting Category |
---|---|---|
boolean | on | master; session; reload |
Enable or disable the Postgres query optimizer (planner) to use group aggregation plan types.
Data type | Default value | Setting category |
---|---|---|
boolean | on | master; session; reload |
Enable or disable the Postgres query optimizer (planner) to use hash aggregation plan types.
Data type | Default value | Setting category |
---|---|---|
boolean | on | master; session; reload |
Enable or disable the Postgres query optimizer (planner) to use hash join plan types.
Data type | Default value | Setting category |
---|---|---|
boolean | on | master; session; reload |
Enable or disable the Postgres query optimizer (planner) only use index scan plan types.
Data Type | Default Value | Setting Category |
---|---|---|
boolean | on | master; session; reload |
Enable or disable the Postgres query optimizer (planner) to use index scan plan types.
Data type | Default value | Setting category |
---|---|---|
boolean | on | master; session; reload |
Whether to enable MARS2 aggscan.
Data type | Default value | Setting category |
---|---|---|
boolean | on | master; session; reload |
Enable or disable Postgres query optimizer (planner) materialization.
Data type | Default value | Setting category |
---|---|---|
boolean | on | master; session; reload |
Enable or disable the Postgres query optimizer (planner) to use merge join plan types.
Data Type | Default | Setting Category |
---|---|---|
boolean | on | master; session; reload |
Enable or disable the Postgres query optimizer (planner) from using nested loop join plans.
Data Type | Default Value | Setting Category |
---|---|---|
boolean | on | master; session; reload |
Enable or disable the Postgres query optimizer (planner) to use parallel append plan types.
Data type | Default value | Setting category |
---|---|---|
boolean | on | master; session; reload |
Enable or disable the Postgres query optimizer (planner) to use hash join plan types for parallel hashes.
Data type | Default value | Setting category |
---|---|---|
boolean | on | master; session; reload |
Enable or disable the Postgres query optimizer (planner) to reduce the ability to remove partitions from partitioned tables in the query plan.
Data Type | Default Value | Setting Category |
---|---|---|
boolean | on | master; session; reload |
Enable or disable the Postgres query optimizer (planner) to use partition-oriented grouping or aggregation.
GROUP BY
clause does not include the partition key, only partial aggregation can be executed on a per-partition basis. Finalization must be performed later.off
.Data Type | Default Value | Setting Category |
---|---|---|
boolean | off | master; session; reload |
Enable or disable the Postgres query optimizer (planner) to use partition-oriented joins.
off
.Data Type | Default Value | Setting Category |
---|---|---|
boolean | off | master; session; reload |
Enable or disable the Postgres query optimizer (planner) from using sequential scan plan types.
Data Type | Default Value | Setting Category |
---|---|---|
boolean | on | master;session;reload |
Enable or disable the Postgres query optimizer (planner) from using explicit sort steps.
Data Type | Default Value | Setting Category |
---|---|---|
boolean | on | master;session;reload |
Enable or disable the Postgres query optimizer (planner) to use tuple identifier (TID) scan plan types.
Data type | Default value | Setting category |
---|---|---|
boolean | on | master; session; reload |
Enable or disable two-stage aggregation to calculate a single distinct aggregation.
Data type | Default value | Setting category |
---|---|---|
boolean | on | master;session;reload |
Enable or disable three-stage aggregation and joining to calculate different qualified aggregates.
Data type | Default value | Setting category |
---|---|---|
boolean | on | master;session;reload |
Enable or disable query scheduling for queries targeting data on a single Segment.
Data Type | Default Value | Setting Category |
---|---|---|
boolean | on | master;system;restart |
When set to on
, the Postgres query optimizer (planner) plans single-row inserts so that they are sent directly to the correct Segment instance (without motion operations).
Data type | Default value | Setting classification |
---|---|---|
boolean | on | master;session;reload |
Enable or disable the use of two-phase or three-phase parallel aggregation plans in the Postgres query optimizer.
gp_enable_multiphase_agg
is disabled, then gp_enable_agg_distinct
and gp_enable_agg_distinct_pruning
are also disabled.Data type | Default value | Setting category |
---|---|---|
boolean | on | master;session;reload |
Enable two-phase deduplication for SELECT DISTINCT
queries (not SELECT COUNT(DISTINCT)
).
SORT DISTINCT
plan nodes before moving.SORT DISTINCT
is much less costly than sending rows through interconnects.Data type | Default value | Setting category |
---|---|---|
boolean | on | master;session;reload |
If the table does not have statistics, Greenplum GPORCA optimizer and Postgres query optimizer (planner) are allowed to use the estimated size of the table (pg_relation_size
function).
ANALYZE ROOTPARTITION
to collect statistics for the root partition. See ANALYZE for details.Data Type | Default Value | Setting Category |
---|---|---|
boolean | off | master;session;reload |
Enable removal of duplicates when sorting.
Data type | Default value | Setting category |
---|---|---|
boolean | on | master;session;reload |
Enable the LIMIT
operation when sorting. Sorting is more efficient when the plan requires at most the first limit_number
rows.
Data type | Default value | Setting category |
---|---|---|
boolean | on | master;session;reload |
Suppress selectivity in the Join clause.
Data type | Default value | Setting category |
---|---|---|
boolean | off | master;session;reload |
Suppress selectivity in clauses in the same base table.
Data type | Default value | Setting category |
---|---|---|
boolean | on | master;session;reload |
Set a minimum value (BLOCKS) for the amount of index data that must be scanned. Parallel scanning will only be considered when the amount of index data scanned exceeds this value.
Data Type | Default Value | Value Range | Setting Category |
---|---|---|---|
int | 512 | 0 ~ (INT_MAX/3) | master;session;reload |
Set a minimum value (BLOCKS) for the amount of table data that must be scanned. Parallel scanning will only be considered if the amount of table data scanned exceeds this value.
Data type | Default value | Value range | Setting category |
---|---|---|---|
int | 8192 | 0 ~ (INT_MAX/3) | master;session;reload |
Allow the leader process to execute the query plan under the Gather and Gather Merge nodes instead of waiting for the worker processes.
off
reduces the likelihood of worker processes being blocked due to the leader not reading tuples fast enough, but requires the leader to wait for worker processes to start before producing the first tuple.Data type | Default value | Setting category |
---|---|---|
boolean | on | master;system;restart |
Caution!
Do not adjust the parameters in this section. Changes to these parameters will affect the YMatrix database hardware configuration and typical workloads. All of these parameters are related, and changing the value of one without considering its associated effects will result in a loss of query performance.
Set the cost estimate for the Postgres optimizer (planner) to process each index item in a single index scan. This is measured as part of the sequential page replacement cost.
Data type | Default value | Value range | Setting classification |
---|---|---|---|
floating point | 0.005 | 0 ~ DBL_MAX | master;session;reload |
Set the Postgres optimizer (planner) to estimate the cost of each operator in a WHERE
clause.
Data type | Default value | Value range | Setting category |
---|---|---|---|
floating point | 0.0025 | 0 ~ DBL_MAX | master;session;reload |
Set the Postgres optimizer (planner) to estimate the cost of processing each row (tuple) in a query. This is measured as part of the sequential page replacement cost.
Data type | Default value | Value range | Setting category |
---|---|---|---|
floating point | 0.01 | 0 ~ DBL_MAX | master;session;reload |
Set the Postgres query optimizer (planner) estimate of the proportion of rows in a cursor that will be retrieved.
1.0
, at which the cursor will be planned exactly like a regular query, considering only the total estimated time and not how quickly the first few rows are returned.Data type | Default value | Value range | Setting category |
---|---|---|---|
floating point | 0.1 | 0.0 ~ 1.0 | master;session;reload |
Set the Postgres optimizer's (planner's) assumption about the effective disk buffer size (BLOCKS) available for a single query.
gpconfig
tool and SHOW
command display valid cache size values in “MB” or “kB” units.Data type | Default value | Value range | Setting category |
---|---|---|---|
int | 524288 | 1 ~ INT_MAX | master;session;reload |
Set the optimizer's estimate of the cost of non-sequential disk page fetches from the AO table.
Data type | Default value | Value range | Setting category |
---|---|---|---|
floating point | 40.0 | 0 ~ DBL_MAX | master;session;reload |
Set the optimizer's estimate of the cost of non-sequential access to disk pages from compressed AO tables.
Data type | Default value | Value range | Setting category |
---|---|---|---|
floating point | 400.0 | 0 ~ DBL_MAX | master;session;reload |
Set the Postgres query optimizer (planner) cost estimate for the Motion operator to transfer rows from one segment to another, measured in decimal parts of the cost of a contiguous page extraction.
0
, the value used is twice the cpu_tuple_cost
value.Data Type | Default Value | Value Range | Setting Category |
---|---|---|---|
floating point | 0.0 | 0 ~ DBL_MAX | master;session;reload |
Set the number of Primary instances assumed by the Postgres query optimizer (planner) in its cost and size estimates.
0
, the actual number of Primary instances is used.Data Type | Default Value | Value Range | Setting Category |
---|---|---|---|
int | 0 | 0 ~ INT_MAX | master;session;reload |
Set the cost estimate for the Postgres optimizer (planner) when starting parallel worker processes.
Data type | Default value | Value range | Setting category |
---|---|---|---|
floating point | 1000.0 | 0 ~ INT_MAX | master;session;reload |
Set the Postgres optimizer (planner) cost estimate for passing a tuple from one parallel worker process to another.
Data type | Default value | Value range | Setting category |
---|---|---|---|
floating point | 0.1 | 0 ~ DBL_MAX | master;session;reload |
Set the cost estimate for non-sequential disk page reads by the Postgres query optimizer (planner).
seq_page_cost
) will cause the system to favor sequential scans, while lower values will cause the system to favor index scans. You can adjust both parameters together to change the relative importance of disk I/O cost compared to CPU cost.random_page_cost
to better reflect the true cost of random storage reads. Conversely, if your data can be entirely cached (e.g., when the database is smaller than the server's total memory), reducing random_page_cost
may be appropriate. For storage with very low random read costs (e.g., solid-state drives), using a lower random_page_cost
value may be more appropriate.Note!
Although you can setrandom_page_cost
to be smaller thanseq_page_cost
, this does not affect the physical reality. When all databases are in memory, setting both values to be equal is reasonable because, in this case, out-of-order fetching does not incur greater overhead than sequential fetching. Similarly, on databases with high cache hit rates, you should reduce both values relative to CPU overhead because fetching pages from memory incurs significantly less overhead than under typical conditions. If the entire database is cached in RAM, setting them to equal values makes sense, as out-of-order page access incurs no additional overhead in this scenario. Similarly, on a highly cached database, you should reduce both values relative to CPU parameters, as retrieving a page already in RAM incurs significantly lower overhead than under typical conditions.
Data Type | Default Value | Value Range | Setting Category |
---|---|---|---|
floating point | 4.0 | 0 ~ DBL_MAX | master;session;reload |
Set the cost of a sequential disk page fetch calculated by the Postgres query optimizer (planner).
Data Type | Default Value | Value Range | Set Classification |
---|---|---|---|
floating point | 1.0 | 0 ~ DBL_MAX | master; session; reload |
GEQO is an algorithm that uses an exploratory search algorithm for query planning. It can reduce the planning time for complex queries (queries that require multi-table joins), but at the cost of the plans it generates will sometimes be worse than those found using exhaustive search algorithms.
Genetic query optimization is allowed or prohibited.
geqo_threshold
variable provides a finer grained null value for GEQO.Data Type | Default | Set Category |
---|---|---|
boolean | off | master; session; reload |
Controls the balance between planning time and query planning effectiveness in GEQO.
geqo_efffort
does not actually do anything directly; it is simply used to calculate the default values of other variables that affect GEQO behavior.Data Type | Default Value | Value Range | Set Classification |
---|---|---|---|
int | 0 | 0 ~ INT_MAX | master; session; reload |
Controls the number of child borne used by GEQO.
1
, with the useful range of values and pool sizes the same.0
(default value), YMatrix will select the appropriate value based on geqo_pool_size
.Data Type | Default Value | Value Range | Set Classification |
---|---|---|---|
int | 0 | 0 ~ INT_MAX | master; session; reload |
Controls the pool size used by GEQO, which is the number of individuals in the genetic population.
geqo_efffort
and the number of tables in the query.Data Type | Default Value | Value Range | Set Classification |
---|---|---|---|
int | 0 | 0 ~ INT_MAX | master; session; reload |
Controls the initial value of the random number generator used by GEQO, which is used to select a random path in the connection order search space.
0
(default) to 1
. Changing this value changes the set of connection paths being explored and can result in finding a better or worse path.Data Type | Default Value | Value Range | Set Classification |
---|---|---|---|
floating point | 0.0 | 0.0 ~ 1.0 | master; session; reload |
Controls the selection preferences for GEQO usage.
Data Type | Default Value | Value Range | Set Classification |
---|---|---|---|
floating point | 1.0 | 0.0 ~ 100.0 | master; session; reload |
Genetic query optimization is only used when the number of FROM
items involved has at least this parameter value (note that a FULL OUTER JOIN
is counted as only one FROM
item).
Data Type | Default Value | Value Range | Set Classification |
---|---|---|---|
int | 12 | 2 ~ INT_MAX | master; session; reload |
This part of the parameters is used to adjust the amount of data sampled by the ANALYZE
operation. Adjusting these parameters will affect the collection of statistical data in the entire system. You can use the ALTER TABLE SET STATISTICS
clause to configure statistics collection for certain specific tables and columns.
Set default statistical targets for table columns that do not set columns with ALTER TABLE SET STATISTICS
.
ANALYZE
but improve the estimated quality of the Postgres query optimizer.Data Type | Default Value | Value Range | Set Classification |
---|---|---|---|
int | 100 | 0 ~ 10000 | master; session; reload |
Enable the LIMIT
operation when sorting. Sort is more efficient when the first limit_number
of the line is required at most, sorting is more efficient.
Data Type | Default | Set Category |
---|---|---|
boolean | on | master; session; reload |
Enable or disable two-stage aggregation to compute a single, distinctly qualified aggregation.
Data Type | Default | Set Category |
---|---|---|
boolean | on | master; session; reload |
Enable or disable three-stage aggregation and connection to calculate differently qualified aggregations.
Data Type | Default | Set Category |
---|---|---|
boolean | on | master; session; reload |
Enable or disable the Postgres query optimizer using the two- or three-stage parallel aggregation plan.
gp_enable_multiphase_agg
is off, gp_enable_agg_distinct
and gp_enable_agg_distinct_pruning
are disabled.Data Type | Default | Set Category |
---|---|---|
boolean | on | master; session; reload |
Enable two-stage duplicate deletion for the SELECT DISTINCT
query (not SELECT COUNT(DISTINCT)
).
SORT DISTINCT
scheduled nodes before moving.SORT DISTINCT
is much cheaper than sending rows over interconnects, where different operations greatly reduce the number of rows.Data Type | Default | Set Category |
---|---|---|
boolean | on | master; session; reload |
Specifies whether temporary files created when a hash aggregate or hash connection operation overflows to disk.
Data Type | Default | Set Category |
---|---|---|
boolean | off | master; session; reload |
Extract the projected fields and evaluate agg to a separate result node.
Data Type | Default | Set Category |
---|---|---|
boolean | off | segments; session; reload |
If the resulting list does not exceed so many items, the Postgres optimizer (planner) will rewrite the explicit JOIN
(except FULL JOIN
) structure into the FROM
item list.
from_collapse_limit
and is suitable for most cases.1
prevents any reordering of the internal JOINs. Therefore, the explicit join order specified in the query is the actual order in which the relationship is connected.1
and from_collapse_limit
may help trade off planning time with the quality of the selected plan (a higher value will result in a better plan).geqo_threshold
or greater may trigger the use of the GEQO planner, resulting in a non-optimal plan.Data Type | Default Value | Value Range | Set Classification |
---|---|---|---|
int | 20 | 1 ~ INT_MAX | master; session; reload |
Enable the selection of NULL testing on external connections.
Data Type | Default | Set Category |
---|---|---|
boolean | on | master; session; reload |
Sets the target density of the hash table used by the HashJoin operation.
Data Type | Default Value | Value Range | Set Classification |
---|---|---|---|
int | 5 | 1 ~ 25 | master; session; reload |
Specifies whether temporary files created when a hash aggregate or hash connection operation overflows to disk.
Data Type | Default | Set Category |
---|---|---|
boolean | off | master; session; reload |
Controls query The use of table constraints by the Postgres optimizer (planner) to optimize queries.
on
(check constraints on all tables), off
(never check constraints) and partition
(check constraints on inherited subtables and UNION ALL
subquery check constraints). partition
is the default setting. It is often used with traditional inheritance trees to improve performance.=# CREATE TABLE parent(key integer, ...);
=# CREATE TABLE child1000(check (key between 1000 and 1999)) INHERITS(parent);
=# CREATE TABLE child2000(check (key between 2000 and 2999)) INHERITS(parent);
...
=# SELECT * FROM parent WHERE key = 2400;
When constraint exclusion is enabled, this SELECT
will not scan child1000
at all, thereby improving performance.
enable_partition_pruning
))Data type | Default value | Value range | Setting category |
---|---|---|---|
enum | on | true / false / yes / no / 1 / 0 / on / off / partition | master;session;reload |
Parallel queries are permitted for testing purposes, even if performance benefits are not expected.
off
(use parallel mode only when performance improvements are expected), on
(force parallel queries whenever the query is deemed safe), and regress
(similar to on
, but with additional behavior changes as explained below).on
adds a Gather node at the top of any query plan that is safe for parallel queries, so that the query runs within a parallel worker. Even if a parallel worker is unavailable or cannot be used, operations such as starting a transaction that are prohibited in a parallel query context will also be prohibited, unless the optimizer believes that doing so would cause the query to fail (in which case they will not be prohibited). When this option is set, if failures or unexpected results occur, certain functions used by the query may need to be marked as PARALLEL UNSAFE
(or PARALLEL RESTRICTED
).regress
has the same effect as setting it to on
, plus additional effects that aid in automated regression testing. Generally, messages from a parallel worker will include a prompt line indicating this, but setting it to regress
removes this line, making the output identical to non-parallel execution. Similarly, Gather nodes added to the plan by this setting will be hidden in the EXPLAIN
output, resulting in output that matches that produced when set to off
.Data Type | Default Value | Value Range | Setting Category |
---|---|---|---|
enum | off | on / off / regress | master;session;reload |
If the generated FROM
list does not exceed this parameter value, the Postgres query optimizer (planner) will merge the subquery into the parent query.
geqo_threshold
or higher may trigger the use of the GEQO planner, potentially leading to non-optimal plans.Data type | Default value | Valid range | Setting category |
---|---|---|---|
int | 20 | 1 ~ INT_MAX | master;session;reload |
When enabled, the Postgres query optimizer (planner) applies query predicates to two table expressions when the tables are joined on their distribution key columns.
Data type | Default value | Setting category |
---|---|---|
boolean | on | master;session;reload |
Specify the maximum uncompressed total size of the query execution plan multiplied by the number of Motion operators (slices) in the plan.
0
indicates that the plan size is not monitored.200
is 200kB
. The value 1GB
is equivalent to 1024MB
or 1048576kB
.Data type | Default value | Value range | Setting category |
---|---|---|---|
int | 0 | 0 ~ (INT_MAX/1024) | master;session;superuser |
When the Postgres query optimizer (planner) plans a query on the parent table, it allows the use of statistics from the child table.
Data type | Default value | Setting category |
---|---|---|
boolean | off | master;session;reload |
Once enabled, the Postgres query optimizer (planner) can use foreign key information stored in the system catalog to optimize joins between foreign keys and primary keys.
Data type | Default value | Setting category |
---|---|---|
boolean | off | master;session;reload |
Specifies the maximum number of slices that can be generated by a query (a part of the query plan executed on a Segment instance).
UNION
or UNION ALL
operators across multiple complex views may generate a large number of slices. You can run EXPLAIN ANALYZE
on the query to view the query's slice statistics.Data type | Default value | Value range | Setting category |
---|---|---|---|
int | 0 | 0 ~ INT_MAX | master;session;reload |
Prepared statements (explicitly prepared or implicitly generated, such as PL/pgSQL) can be executed using custom or generic plans.
Data type | Default value | Value range | Setting category |
---|---|---|---|
enum | auto | auto / force_custom_plan / force_generic_plan | master;session;reload |
Does YMatrix allow the use of JIT compilation?
Data type | Default value | Setting category |
---|---|---|
boolean | on | master;session;reload |
Set the query cost threshold for enabling JIT (Just-in-Time) compilation. If the query cost exceeds this value, JIT compilation will be enabled [https://www.postgresql.org/docs/12/jit.html].
-1
disables JIT compilation.Data type | Default value | Value range | Setting category |
---|---|---|---|
floating point | 100000 | -1 ~ DBL_MAX | master;session;reload |
If LLVM has the required functionality, register the generated functions with GDB. This makes debugging easier.
Data type | Default value | Setting category |
---|---|---|
boolean | off | master;session;reload;superuser |
Write the generated LLVM IR to the file system, in the data_directory
.
SET
permissions can change this setting.Data type | Default value | Setting category |
---|---|---|
boolean | off | master;session;reload;superuser |
When JIT compilation is enabled, expressions can be JIT compiled.
Data type | Default value | Setting category |
---|---|---|
boolean | on | master;session;reload |
Set the query cost threshold for JIT compilation attempts to inline functions and operators.
jit_above_cost
is meaningless.-1
disables inlining.Data type | Default value | Value range | Setting category |
---|---|---|---|
floating point | 500000 | -1 ~ DBL_MAX | master;session;reload |
Set the query cost threshold for applying JIT compilation optimizations.
jit_above_cost
is meaningless, and setting it to a value greater than jit_inline_above_cost
may not be beneficial.-1
disables more expensive optimizations.Data type | Default value | Value range | Setting category |
---|---|---|---|
floating point | 500000 | -1 ~ DBL_MAX | master;session;reload |
If LLVM has the required functionality, it emits the data needed for the perf
command to analyze JIT-generated functions.
~/.debug/jit/
. If you have set and loaded the environment variable JITDUMPDIR
, it will instead write to JITDUMPDIR/debug/jit
.Data type | Default value | Setting category |
---|---|---|
boolean | off | master;session;reload;superuser |
When JIT compilation is enabled, allow JIT to compile tuple transformations.
Data type | Default value | Setting category |
---|---|---|
boolean | on | master;session;reload |
If the table does not have statistics, the ORCA optimizer and Postgres query optimizer are allowed to use the estimated size of the table (pg_relation_size
function).
ANALYZE ROOTPARTITION
to collect statistics for the root partition. See ANALYZE for details.Data Type | Default Value | Setting Category |
---|---|---|
boolean | off | master;session;reload |
Enable or disable GPORCA when running SQL queries.
on
. If GPORCA is disabled, the YMatrix database will only use the Postgres query optimizer.optimizer
parameter can be set for the database system, a single database, a session, or a query.Data type | Default value | Setting category |
---|---|---|
boolean | on | master;session;reload |
For partitioned tables, controls whether the ROOTPARTITION
keyword is required when running the ANALYZE
command on the table to collect root partition statistics.
on
, meaning that the ANALYZE
command can collect root partition statistics without using the ROOTPARTITION
keyword. Root partition statistics are collected when running ANALYZE
on the root partition, or when running ANALYZE
on a leaf partition of a partitioned table and other leaf partitions have statistics.off
, you must run ANALYZE ROOTPARTITION
to collect root partition statistics.optimizer
is set to on
(default), this parameter should also be set to on
.Data type | Default value | Setting category |
---|---|---|
boolean | on | master;system;restart |
When enabling GPORCA (default) and processing queries containing predicates with constant arrays, the optimizer_array_expansion_threshold
parameter limits the optimization process based on the number of constants in the array.
IN
clause with more than 100 elements, GPORCA will not convert the predicate to its disjunctive normal form during query optimization to reduce optimization time and memory consumption. The difference in query processing can be observed in the filter conditions of the IN
clause in the query EXPLAIN
plan.Data type | Default value | Value range | Setting category |
---|---|---|---|
int | 100 | 0 ~ INT_MAX | master;session;reload |
Control whether the SET
, RESET
commands or the YMatrix database utility gpconfig
can be used to change the server configuration parameter optimizer
.
optimizer_control
parameter value is enabled, users can set optimizer parameters.optimizer_control
parameter value is off
, optimizer parameters cannot be changed.Data Type | Default Value | Setting Category |
---|---|---|
boolean | on | master;system;restart;superuser |
When GPORCA is enabled (default), this parameter controls the cost model that GPORCA selects for bitmap scans, which are used in conjunction with bitmap indexes or B-tree indexes on AO tables.
legacy
- Retains the calibrated
cost model used by GPORCA in Greenplum Database versions 6.13 and earliercalibrated
- Improves cost estimation for indexesexperimental
- Reserved for future experimental cost models; currently equivalent to the calibrated
modelcalibrated
is more likely to choose faster bitmap indexes with nested loop joins rather than hash joins.Data type | Default value | Value range | Setting category |
---|---|---|---|
enum | calibrated | legacy / calibrated / experimental | master;session;reload |
When GPORCA (default) is enabled, this parameter controls the amount of inlining performed for common table expression (CTE) queries (queries containing a WHERE
clause). - The default value 0
disables inlining.
Data type | Default value | Value range | Setting category |
---|---|---|---|
int | 0 | 0 ~ INT_MAX | master;session;reload |
When the Greenplum ORCA optimizer is enabled (default) and this parameter is set to on
(default), GPORCA derives statistics that enable it to more accurately estimate the number of rows to scan during dynamic partition elimination.
Data type | Default value | Setting category |
---|---|---|
boolean | on | master;session;reload |
When GPORCA (default setting) is enabled, this parameter controls whether to enable join association transformation during query optimization.
off
, only the GPORCA dynamic programming algorithm used to analyze the join order is enabled.on
, GPORCA can use association transformations during query optimization.Data type | Default value | Setting category |
---|---|---|
boolean | off | master;session;reload |
When GPORCA is enabled (default) and this parameter is set to on
(default), GPORCA attempts to execute DML commands such as INSERT
, UPDATE
, and DELETE
.
off
, the Greenplum database always falls back to the Postgres optimizer when executing DML commands.Data Type | Default Value | Setting Category |
---|---|---|
boolean | on | master;session;reload |
When GPORCA is enabled (default) and this parameter is set to on (default), GPORCA can generate an index-only scan plan type for B-tree indexes.
off
), GPORCA does not generate index-only scan plan types.Data Type | Default Value | Setting Category |
---|---|---|
boolean | on | master;session;reload |
When GPORCA is enabled (default), this parameter allows GPORCA to support multiple different qualified aggregates, such as SELECT count(DISTINCT a),sum(DISTINCT b) FROM foo
.
Data type | Default value | Setting category |
---|---|---|
boolean | off | master;session;reload |
When GPORCA is enabled (default), this parameter affects the query plan alternatives that GPORCA considers when generating a 3-stage aggregation plan.
on
, by default, GPORCA only considers 3-stage aggregation plans, where intermediate aggregation uses GROUP BY
and DISTINCT
columns for distribution to reduce the impact of processing skew.off
, GPORCA can also consider plans that distribute using GROUP BY
columns.Data type | Default value | Setting category |
---|---|---|
boolean | on | master;session;reload |
For the default settings, when GPORCA is enabled and this parameter is set to on, GPORCA selects a 3-stage aggregation plan for scalar-different qualified aggregates when generating such plan alternatives.
Data type | Default value | Setting category |
---|---|---|
boolean | off | master;session;reload |
For the default setting, GPORCA is enabled and this parameter is set to on, GPORCA will select plans with multi-level aggregation when generating such plan alternatives.
Data Type | Default Value | Setting Category |
---|---|---|
boolean | on | master;session;reload |
This value is an optimization hint used to limit the number of join associativity and join commutativity transformations explored during query optimization.
18
is an optimization hint that GPORCA stops exploring join associativity and join commutativity transformations when the n-ary join operator has more than 18 child nodes during optimization.optimizer_join_order
parameter is set to query
or greedy
.Data type | Default value | Value range | Setting category |
---|---|---|---|
int | 18 | 0 ~ INT_MAX | segments;system;reload |
When GPORCA is enabled, this parameter sets the optimization level for join ordering during query optimization by specifying the type of join ordering alternatives to evaluate.
query
- Uses the join order specified in the query.greedy
- Calculates the join order specified in the query and alternative schemes based on the minimum cardinality of the joins.exhaustive
- Applies transformation rules to find and evaluate all alternative join orders.exhaustive
. Setting this parameter to query
or greedy
can generate suboptimal query plans. However, if the administrator is confident that using the query
or greedy
setting generates satisfactory plans, query optimization time can be improved by setting the parameter to a lower optimization level.query
or greedy
overrides the optimizer_join_order_threshold
and optimizer_join_arity_for_associativity_commutativity
parameters.Data type | Default value | Value range | Setting category |
---|---|---|---|
enum | exhaustive | exhaustive / greedy / query | master;session;reload |
When GPORCA is enabled (default setting), this parameter sets the maximum number of join subgraphs that GPORCA will use based on a dynamically programmed join sorting algorithm.
optimizer_join_query
parameter is set to query
or greedy
.Data Type | Default Value | Value Range | Set Classification |
---|---|---|---|
int | 10 | 0 ~ 12 | master; session; reload |
Sets the maximum amount of memory (KB) used by GPORCA on the YMatrix database master to cache query metadata (optimized data) during query optimization.
optimizer_metadata_caching
.16384 (16MB)
. This is the best value determined by performance analysis.KB
, MB
or GB
. The default unit is KB
. For example, the value 16384
is 16384KB
. The value 1GB
is the same as 1024MB
or 1048576KB
.0
, the cache size is not limited.Data Type | Default Value | Value Range | Set Classification |
---|---|---|---|
int | 16384 | 0 ~ INT_MAX | master; session; reload |
When GPORCA is enabled (default value), this parameter specifies whether GPORCA caches query metadata (optimized data) in memory on the YMatrix database master during query optimization.
off
, GPORCA does not cache metadata during query optimization.optimizer_mdcache_size
controls the size of the query metadata cache.Data Type | Default | Set Category |
---|---|---|
boolean | on | master; session; reload |
When GPORCA is enabled (default value), the optimizer_parallel_union
controls the number of parallelizations of queries containing the UNION
or UNION ALL
clause.
off
, the default GPORCA generates a query plan where each child of the APPEND (UNION) operator is in the same slice as the
APPEND` operator.APPEND (UNION)
operator. During query execution, the child and the parent APPEND
operator are on different slices, allowing children of the `APPEND(UNION) operator to be executed in parallel on the Segment instance.Data Type | Default | Set Category |
---|---|---|
boolean | off | master; session; reload |
When GPORCA is enabled (default), GPORCA punishes the cost of attempting to broadcast plans that exceed the value specified by this parameter during query optimization.
100K
row (default), any broadcasts over 100K rows will be severely penalized.0
, GPORCA sets this broadcast threshold to unlimited and never punishes broadcast actions.Data Type | Default Value | Value Range | Set Classification |
---|---|---|---|
int | 100000 | 0 ~ INT_MAX | master; session; reload |
When GPORCA is enabled (default), this parameter allows GPORCA to penalize the Segment cost of HashJoin, using the skewed Redisttribute Motion as a child node to support Broadcast Motion during query optimization.
Data Type | Default | Set Category |
---|---|---|
boolean | on | master; session; reload |
When GPORCA is enabled (default setting), this parameter can record GPORCA query optimization statistics for various optimization phases of the query.
off
, please do not record optimization statistics.on
and the parameter client_min_messages
must be set to log
.set optimizer_print_optimization_stats = on;
set client_min_messages = 'log';
EXPLAIN
or EXPLAIN ANALYZE
command.Data Type | Default | Set Category |
---|---|---|
boolean | off | master; session; reload |
When GPORCA is enabled (default value), optimizer_sort_factor
controls the cost factor applied to the sorting operation during query optimization.
1
Specifies the default sorting cost factor.2.0
sets the cost factor to twice the default value and the value 0.5
sets the factor to half the default value.| Data Type | Default Value | Value Range | Set Classification | | --- | --- | | floating point | 1.0 | 0.0 ~ DBL_MAX | master; session; reload |
Whether to enable Runtime Filter.
Data Type | Default | Set Category |
---|---|---|
boolean | on | master; session; reload |
When the ratio of the number of external table rows to internal table rows is less than this value, the Runtime Filter is not triggered.
Data Type | Default Value | Value Range | Set Classification |
---|---|---|---|
int | 10 | 0 ~ (INT_MAX/2) | master; session; reload |
The maximum number of times Runtime Filter is used in each query.
Data Type | Default Value | Value Range | Set Classification |
---|---|---|---|
int | 4 | 0 ~ (INT_MAX/2) | master; session; reload |
When the number of rows in the join table is greater than this value, the Runtime Filter is not triggered.
Data Type | Default Value | Value Range | Set Classification |
---|---|---|---|
int | 2000000 | 0 ~ (INT_MAX/2) | master; session; reload |
Runtime Filter is not triggered when the number of rows of the joining table is less than this value.
Data Type | Default Value | Value Range | Set Classification |
---|---|---|---|
int | 3 | 0 ~ (INT_MAX/2) | master; session; reload |
Runtime Filter is not triggered when the number of rows of the joining table is less than this value.
Data Type | Default Value | Value Range | Set Classification |
---|---|---|---|
int | 500000 | 0 ~ (INT_MAX/2) | master; session; reload |
Runtime Filter is not triggered when the ratio of the number of scanned rows in the join table to the number of rows that meet the join conditions is less than this value. | Data Type | Default Value | Value Range | Set Classification | | --- | --- | --- | --- | | int | 25 | 0 ~ (INT_MAX/2) | master; session; reload |
Whether to enable vectorized execution engine.
Data Type | Default | Set Category |
---|---|---|
boolean | true | master; session; reload |
Whether to enable vectorized execution engines to use group aggregation plan.
Data Type | Default | Set Category |
---|---|---|
boolean | true | master; session; reload |
Whether to enable the vectorized execution engine to use hash aggregation plan.
Data Type | Default | Set Category |
---|---|---|
boolean | true | master; session; reload |
Whether to enable vectorized execution engines to use append plan types.
Data Type | Default | Set Category |
---|---|---|
boolean | true | master; session; reload |
Whether to enable the vectorized execution engine to use the Motion plan type.
Data Type | Default | Set Category |
---|---|---|
boolean | true | master; session; reload |
Whether to enable vectorized execution engines to use aggregation plan type with explicit pre-sorting.
Data Type | Default | Set Category |
---|---|---|
boolean | true | master; session; reload |
Whether to enable the root executor of the vectorized execution engine.
Data Type | Default | Set Category |
---|---|---|
boolean | true | master; session; reload |
Whether to enable vectorized execution engines to use sorting plan types.
Data Type | Default | Set Category |
---|---|---|
boolean | true | master; session; reload |
Whether to enable the vectorized execution engine to use strict mode.
Data Type | Default | Set Category |
---|---|---|
boolean | false | master; session; reload |
Whether to enable vectorized execution engines to use fallback expressions.
Data Type | Default | Set Category |
---|---|---|
boolean | true | master; session; reload |
Whether to enable vectorized execution engines to use restricted plan types.
Data Type | Default | Set Category |
---|---|---|
boolean | true | master; session; reload |
Whether to enable vectorized execution engines to use window aggregation plan type.
Data Type | Default | Set Category |
---|---|---|
boolean | false | master; session; reload |
Whether to enable the vectorized execution engine to use subquery to scan plan types.
Data Type | Default | Set Category |
---|---|---|
boolean | true | master; session; reload |
Whether to enable vectorized execution engines to use hash connection plan types.
Data Type | Default | Set Category |
---|---|---|
boolean | true | master; session; reload |
Whether to enable advanced hash graphs for vectorized hash joins.
Data Type | Default | Set Category |
---|---|---|
boolean | true | master; session; reload |
Whether to enable vectorized execution engines to use the aggregated scan plan type.
Data Type | Default | Set Category |
---|---|---|
boolean | true | master; session; reload |
Whether to enable vectorized execution engines to use index scanning.
Data Type | Default | Set Category |
---|---|---|
boolean | true | master; session; reload |