Query Tuning Parameters

This document describes the system configuration parameters in the Query Tuning category.

Note!
To ensure system stability and security, manually modifying these parameters should be done with caution.

Query tuning parameters are categorized as follows:


Postgres Query Optimizer Operator Control Parameters

These parameters provide basic control over the query optimizer's choice of query plans.

If the default plan chosen by the optimizer for a specific query is suboptimal, you can use these parameters to force the optimizer to select a better plan as a temporary workaround.

A better long-term approach to improve plan quality includes: adjusting the planner's cost parameters (see below), manually running ANALYZE, increasing the value of the default_statistics_target configuration parameter, and using the ALTER TABLE SET STATISTICS statement to increase the amount of statistics collected for specific columns.

enable_bitmapscan


Enables or disables the use of bitmap scan plans by the Postgres query optimizer (planner).

  • Note: This is different from a bitmap index scan. A bitmap scan means the index is dynamically converted into a bitmap in memory when appropriate, improving index performance for complex queries on very large tables. It is used when there are multiple predicates on different indexed columns. Bitmaps for each column can be compared to create the final list of selected tuples.
Data Type Default Value Setting Scope
boolean on master; session; reload

enable_groupagg


Enables or disables the use of group aggregate plans by the Postgres query optimizer (planner).

Data Type Default Value Setting Scope
boolean on master; session; reload

enable_hashagg


Enables or disables the use of hash aggregate plans by the Postgres query optimizer (planner).

Data Type Default Value Setting Scope
boolean on master; session; reload

enable_hashjoin


Enables or disables the use of hash join plans by the Postgres query optimizer (planner).

Data Type Default Value Setting Scope
boolean on master; session; reload

enable_indexonlyscan


Enables or disables the use of index-only scan plans by the Postgres query optimizer (planner).

Data Type Default Value Setting Scope
boolean on master; session; reload

enable_indexscan


Enables or disables the use of index scan plans by the Postgres query optimizer (planner).

Data Type Default Value Setting Scope
boolean on master; session; reload

enable_mars2aggscan


Enables or disables MARS2 aggscan.

Data Type Default Value Setting Scope
boolean on master; session; reload

enable_material


Enables or disables the use of materialization by the Postgres query optimizer (planner).

  • Materialization cannot be completely disabled. However, turning off this parameter prevents the planner from inserting materialization nodes unless required for correctness.
Data Type Default Value Setting Scope
boolean on master; session; reload

enable_mergejoin


Enables or disables the use of merge join plans by the Postgres query optimizer (planner).

  • Merge join is based on sorting both left and right tables and scanning them in parallel. Therefore, both data types must be fully sortable, and the join operator must only match values that are in the same position in the sort order. In practice, this means the join operator must behave like equality.
Data Type Default Value Setting Scope
boolean on master; session; reload

enable_nestloop


Enables or disables the use of nested loop join plans by the Postgres query optimizer (planner).

  • Nested loop joins cannot be completely suppressed. However, if other methods are available, disabling this parameter prevents the Postgres optimizer from using them.
Data Type Default Value Setting Scope
boolean on master; session; reload

enable_parallel_append


Enables or disables the use of parallel append plans by the Postgres query optimizer (planner).

Data Type Default Value Setting Scope
boolean on master; session; reload

enable_parallel_hash


Enables or disables the use of hash join plans with parallel hash by the Postgres query optimizer (planner).

  • This parameter has no effect if hash join plans are not enabled.
Data Type Default Value Setting Scope
boolean on master; session; reload

enable_partition_pruning


Enables or disables the ability of the Postgres query optimizer (planner) to reduce partitions in partitioned tables from the query plan.

  • This also controls the planner's ability to generate query plans that allow the execution engine to remove (ignore) partitions during query execution. See Partition Pruning.
Data Type Default Value Setting Scope
boolean on master; session; reload

enable_partitionwise_aggregate


Enables or disables partition-wise grouping or aggregation by the Postgres query optimizer (planner).

  • This allows grouping or aggregation on partitioned tables to be performed separately on each partition.
  • If the GROUP BY clause does not include the partition key, only partial aggregation can be executed per partition. Finalization must occur later.
  • Because partition-wise grouping or aggregation consumes significant CPU and memory during planning, the default is off.
Data Type Default Value Setting Scope
boolean off master; session; reload

enable_partitionwise_join


Enables or disables partition-wise joins by the Postgres query optimizer (planner).

  • This allows joins between partitioned tables to be executed by joining matching partitions.
  • Partition-wise joins are currently only applicable when join conditions include all partition keys, the data types are identical, and the subpartition sets exactly match.
  • Because partition-wise join planning consumes significant CPU and memory during planning, the default is off.
Data Type Default Value Setting Scope
boolean off master; session; reload

enable_seqscan


Enables or disables the use of sequential scan plans by the Postgres query optimizer (planner).

  • Sequential scans cannot be completely suppressed. However, if other methods are available, disabling this parameter prevents the Postgres optimizer from using them.
Data Type Default Value Setting Scope
boolean on master; session; reload

enable_sort


Enables or disables explicit sort steps in plans generated by the Postgres query optimizer (planner).

  • It is not possible to completely prevent explicit sorting. However, if other methods are available, disabling this parameter prevents the Postgres optimizer from using them.
Data Type Default Value Setting Scope
boolean on master; session; reload

enable_tidscan


Enables or disables the use of tuple identifier (TID) scan plans by the Postgres query optimizer (planner).

Data Type Default Value Setting Scope
boolean on master; session; reload

gp_enable_agg_distinct


Enables or disables two-stage aggregation to compute single DISTINCT-aggregate functions.

  • This applies only to subqueries containing a single DISTINCT-aggregate function.
Data Type Default Value Setting Scope
boolean on master; session; reload

gp_enable_agg_distinct_pruning


Enables or disables three-stage aggregation and join to compute DISTINCT-aggregate functions.

  • This applies only to subqueries containing one or more DISTINCT-aggregate functions.
Data Type Default Value Setting Scope
boolean on master; session; reload

gp_enable_direct_dispatch


Enables or disables direct dispatch of query plans to target segments for queries accessing data on a single segment.

  • When enabled, queries targeting rows on a single segment dispatch the plan only to that segment (not all segments). This significantly reduces response time for such queries by avoiding interconnect overhead.
  • Direct dispatch increases CPU usage on the Master.
Data Type Default Value Setting Scope
boolean on master; system; restart

gp_enable_fast_sri


When set to on, the Postgres query optimizer (planner) plans single-row inserts to be sent directly to the correct segment instance (without motion operations).

  • This significantly improves performance for single-row INSERT statements.
Data Type Default Value Setting Scope
boolean on master; session; reload

gp_enable_multiphase_agg


Enables or disables the use of two-stage or three-stage parallel aggregation plans by the Postgres query optimizer.

  • This applies to any subquery with aggregation. If gp_enable_multiphase_agg is off, gp_enable_agg_distinct and gp_enable_agg_distinct_pruning are disabled.
Data Type Default Value Setting Scope
boolean on master; session; reload

gp_enable_preunique


Enables two-stage deduplication (not SELECT COUNT(DISTINCT)) for SELECT DISTINCT queries.

  • When enabled, it adds an extra set of SORT DISTINCT plan nodes before motion.
  • When DISTINCT significantly reduces the number of rows, the cost of this extra SORT DISTINCT is much lower than sending rows over the interconnect.
Data Type Default Value Setting Scope
boolean on master; session; reload

gp_enable_relsize_collection


Allows the Greenplum GPORCA optimizer and Postgres query optimizer (planner) to use estimated table size (pg_relation_size function) when table statistics are unavailable.

  • By default, if statistics are not available, both the GPORCA optimizer and planner use default values to estimate row count.
  • The default behavior may improve query optimization time and reduce resource queue usage under heavy workloads, but could lead to suboptimal plans.
  • This parameter is ignored for root partitions of partitioned tables.
  • When this optimizer is enabled and the root partition lacks statistics, default values are always used. You can use ANALZYE ROOTPARTITION to collect statistics on the root partition. See ANALYZE.
Data Type Default Value Setting Scope
boolean off master; session; reload

gp_enable_sort_distinct


Enables duplicate removal during sorting.

Data Type Default Value Setting Scope
boolean on master; session; reload

gp_enable_sort_limit


Enables LIMIT operation during sorting. Sorting is more efficient when the plan only needs the first limit_number rows.

Data Type Default Value Setting Scope
boolean on master; session; reload

gp_selectivity_damping_for_joins


Dampens selectivity estimates in join clauses.

Data Type Default Value Setting Scope
boolean off master; session; reload

gp_selectivity_damping_for_scans


Dampens selectivity estimates in clauses on the same base table.

Data Type Default Value Setting Scope
boolean on master; session; reload

min_parallel_index_scan_size


Sets the minimum amount of index data (in BLOCKS) that must be scanned before a parallel index scan is considered.

  • Note: A parallel index scan does not typically scan the entire index. It refers to the number of relevant pages the optimizer expects to access.
Data Type Default Value Valid Range Setting Scope
int 512 0 ~ (INT_MAX/3) master; session; reload

min_parallel_table_scan_size


Sets the minimum amount of table data (in BLOCKS) that must be scanned before a parallel table scan is considered.

  • For parallel sequential scans, the scanned table size always equals the table size. However, when using an index, the scanned size is usually smaller.
Data Type Default Value Valid Range Setting Scope
int 8192 0 ~ (INT_MAX/3) master; session; reload

parallel_leader_participation


Allows the leader process to execute parts of the query plan under Gather and Gather Merge nodes instead of waiting for worker processes.

  • Setting this to off reduces the chance that workers are blocked waiting for the leader to read tuples, but requires the leader to wait for workers to start before producing the first tuple.
  • The degree to which the leader helps or hinders performance depends on the plan type, number of workers, and query duration.
Data Type Default Value Setting Scope
boolean on master; system; restart


Postgres Query Optimizer Cost Estimation Parameters

Note!
Do not adjust these parameters. Changes affect YMatrix database hardware configuration and typical workloads. All these parameters are interrelated; changing one without considering side effects may degrade query performance.

cpu_index_tuple_cost


Sets the optimizer's (planner's) estimated cost of processing each index entry during an index scan. Measured as a fraction of a sequential page fetch.

Data Type Default Value Valid Range Setting Scope
floating point 0.005 0 ~ DBL_MAX master; session; reload

cpu_operator_cost


Sets the optimizer's (planner's) estimated cost of processing each operator in a WHERE statement.

Data Type Default Value Valid Range Setting Scope
floating point 0.0025 0 ~ DBL_MAX master; session; reload

cpu_tuple_cost


Sets the optimizer's (planner's) estimated cost of processing each row (tuple) in a query. Measured as a fraction of a sequential page fetch.

Data Type Default Value Valid Range Setting Scope
floating point 0.01 0 ~ DBL_MAX master; session; reload

cursor_tuple_fraction


Sets the optimizer's (planner's) estimate of the fraction of rows that will be retrieved by a cursor.

  • Smaller values make the planner favor "fast start" plans that return initial rows quickly but may take longer to retrieve all rows. Larger values emphasize total estimated time.
  • At the maximum setting of 1.0, cursors are planned like regular queries, considering only total time, not initial row return speed.
Data Type Default Value Valid Range Setting Scope
floating point 0.1 0.0 ~ 1.0 master; session; reload

effective_cache_size


Sets the optimizer's (planner's) assumption about the effective disk buffer size (in BLOCKS) available for a single query.

  • This is considered in index scan cost estimation. Higher values make index scans more likely; lower values make sequential scans more likely.
  • Consider YMatrix shared buffers and kernel disk buffers used for YMatrix data files, though some data may exist in both.
  • Also consider the expected number of concurrent queries sharing the space.
  • This parameter does not affect actual shared memory allocation or reserve kernel disk buffers. It is only for estimation. The system does not assume data remains in disk buffers between queries.
  • gpconfig tools and SHOW commands display effective cache size in "MB" or "kB".
Data Type Default Value Valid Range Setting Scope
int 524288 1 ~ INT_MAX master; session; reload

gp_appendonly_random_page_cost


Sets the optimizer's estimated cost of non-sequential disk page fetches from AO tables.

Data Type Default Value Valid Range Setting Scope
floating point 40.0 0 ~ DBL_MAX master; session; reload

gp_compressed_random_page_cost


Sets the optimizer's estimated cost of non-sequential disk page fetches from compressed AO tables.

Data Type Default Value Valid Range Setting Scope
floating point 400.0 0 ~ DBL_MAX master; session; reload

gp_motion_cost_per_row


Sets the cost estimate for the Motion operator in the Postgres query optimizer (planner), per row transferred between segments, measured as a fraction of sequential page fetch cost.

  • If set to 0, the value used is twice the cpu_tuple_cost value.
Data Type Default Value Valid Range Setting Scope
floating point 0.0 0 ~ DBL_MAX master; session; reload

gp_segments_for_planner


Sets the number of primary instances the Postgres query optimizer (planner) assumes for cost and size estimation.

  • If set to 0, the actual number of primaries is used.
  • This variable affects the optimizer's estimate of the number of rows processed per sender and receiver in Motion operators.
Data Type Default Value Valid Range Setting Scope
int 0 0 ~ INT_MAX master; session; reload

parallel_setup_cost


Sets the optimizer's (planner's) estimated cost of launching parallel worker processes.

Data Type Default Value Valid Range Setting Scope
floating point 1000.0 0 ~ INT_MAX master; session; reload

parallel_tuple_cost


Sets the optimizer's (planner's) estimated cost of passing a tuple from one parallel worker process to another.

Data Type Default Value Valid Range Setting Scope
floating point 0.1 0 ~ DBL_MAX master; session; reload

random_page_cost


Sets the optimizer's (planner's) estimated cost of a non-sequential disk page fetch.

  • This value can be overridden for a specific tablespace by setting the same-named parameter in that tablespace.
  • Higher values (relative to seq_page_cost) make the system favor sequential scans; lower values favor index scans. You can adjust both parameters together to change the relative importance of disk I/O vs CPU cost.
  • Random disk access is typically more than four times costlier than sequential access. However, since most random access (e.g., indexed reads) is assumed to be cached, a lower default (4) is used. This default models random access as 40 times slower than sequential, assuming 90% of random reads are cached.
  • If you believe 90% cache hit rate is incorrect for your workload, increase random_page_cost to better reflect true random read cost. Conversely, if your data fits entirely in cache (e.g., database smaller than total server memory), lowering random_page_cost may be appropriate. For storage with low random read cost (e.g., SSDs), a lower random_page_cost value may be better.

Note!
Although you can set random_page_cost lower than seq_page_cost, physical reality remains unchanged. When all data is in memory, setting them equal is reasonable, as random access incurs no extra cost. Similarly, in highly cached databases, both values should be reduced relative to CPU costs, as fetching pages from RAM is much cheaper. If the entire database is cached in RAM, setting them equal makes sense—there is no additional cost for non-sequential access. In a highly cached database, reduce both values relative to CPU parameters, as fetching a page already in RAM is far cheaper than typical.

Data Type Default Value Valid Range Setting Scope
floating point 4.0 0 ~ DBL_MAX master; session; reload

seq_page_cost


Sets the optimizer's (planner's) estimated cost of a sequential disk page fetch.

  • This value can be overridden for a specific tablespace by setting the same-named parameter in that tablespace.
Data Type Default Value Valid Range Setting Scope
floating point 1.0 0 ~ DBL_MAX master; session; reload


Postgres Genetic Query Optimization Configuration Parameters

GEQO is an algorithm that uses heuristic search for query planning. It reduces planning time for complex queries (those involving multiple table joins), at the cost of potentially suboptimal plans compared to exhaustive search.

geqo


Enables or disables Genetic Query Optimization.

  • It is generally not advisable to disable this in production.
  • The geqo_threshold variable provides finer control over GEQO.
Data Type Default Value Setting Scope
boolean off master; session; reload

geqo_effort


Controls the balance between planning time and effectiveness of query plans in GEQO.

  • Higher values increase planning time but improve the chance of finding a better plan.
  • geqo_effort does not directly affect behavior; it is used to compute default values for other GEQO-related variables.
Data Type Default Value Valid Range Setting Scope
int 0 0 ~ INT_MAX master; session; reload

geqo_generations


Controls the number of generations used by GEQO.

  • A generation means one iteration of the algorithm. It must be at least 1; useful values are in the same range as pool size.
  • If set to 0 (default), YMatrix selects an appropriate value based on geqo_pool_size.
Data Type Default Value Valid Range Setting Scope
int 0 0 ~ INT_MAX master; session; reload

geqo_pool_size


Controls the pool size used by GEQO, i.e., the number of individuals in the genetic population.

  • If set to 0 (default), a suitable value is chosen based on geqo_effort and the number of tables in the query.
Data Type Default Value Valid Range Setting Scope
int 0 0 ~ INT_MAX master; session; reload

geqo_seed


Controls the initial seed for the random number generator used by GEQO to select random paths in the join order search space.

  • The value ranges from 0 (default) to 1. Changing it alters the set of explored join paths and may result in better or worse plans.
Data Type Default Value Valid Range Setting Scope
floating point 0.0 0.0 ~ 1.0 master; session; reload

geqo_selection_bias


Controls the selection bias used by GEQO.

  • Selection bias represents selection pressure within the population. Values range from 1.5 to 2.0, with 2.0 being the default.
Data Type Default Value Valid Range Setting Scope
floating point 1.0 0.0 ~ 100.0 master; session; reload

geqo_threshold


Genetic Query Optimization is used only when the number of FROM items involved is at least this value (note: a FULL OUTER JOIN counts as one FROM item).

  • For simpler queries, the standard exhaustive search planner is used. For queries with many tables, exhaustive search can take too long—longer than the cost of executing a suboptimal plan. Thus, a threshold on query size is a convenient way to manage GEQO usage.
Data Type Default Value Valid Range Setting Scope
int 12 2 ~ INT_MAX master; session; reload


Database Statistics Sampling Parameters

These parameters control the amount of data sampled during ANALYZE operations. Adjusting them affects statistics collection system-wide. Use the ALTER TABLE SET STATISTICS clause to configure statistics collection for specific tables and columns.

default_statistics_target


Sets the default statistics target for table columns that have not had a column-specific target set via ALTER TABLE SET STATISTICS.

  • Larger values increase ANALYZE time but improve the quality of estimates made by the Postgres query optimizer.
Data Type Default Value Valid Range Setting Scope
int 100 0 ~ 10000 master; session; reload


Sort Operator Configuration Parameters

gp_enable_sort_limit


Enables LIMIT operation during sorting. Sorting is more efficient when the plan only needs the first limit_number rows.

Data Type Default Value Setting Scope
boolean on master; session; reload


Aggregate Operator Configuration Parameters

gp_enable_agg_distinct


Enables or disables two-stage aggregation to compute single DISTINCT-aggregate functions.

  • This applies only to subqueries containing a single DISTINCT-aggregate function.
Data Type Default Value Setting Scope
boolean on master; session; reload

--- SPLIT ---

gp_enable_agg_distinct_pruning


Enables or disables three-stage aggregation and joins to compute distinct qualified aggregates.

  • This applies only to subqueries containing one or more DISTINCT-aggregate functions.
Data Type Default Value Scope
boolean on master; session; reload

gp_enable_multiphase_agg


Enables or disables the use of two-stage or three-stage parallel aggregation plans by the Postgres query optimizer.

  • This applies to any subquery with aggregation. If gp_enable_multiphase_agg is off, then gp_enable_agg_distinct and gp_enable_agg_distinct_pruning are disabled.
Data Type Default Value Scope
boolean on master; session; reload

gp_enable_preunique


Enables two-stage deduplication for SELECT DISTINCT queries (not SELECT COUNT(DISTINCT)).

  • When enabled, it adds an extra set of SORT DISTINCT plan nodes before data movement.
  • When DISTINCT operations significantly reduce the number of rows, the cost of this additional SORT DISTINCT is much lower than sending rows over the interconnect.
Data Type Default Value Scope
boolean on master; session; reload

gp_workfile_compression


Specifies whether temporary files created when hash aggregation or hash join operations spill to disk are compressed.

  • If your database installation uses Serial ATA (SATA) disk drives, enabling compression may help avoid overloading the disk subsystem with I/O operations.
Data Type Default Value Scope
boolean off master; session; reload

mx_pullup_agg_proj_and_qual


Pulls up projection fields and moves aggregate evaluations into a separate result node.

Data Type Default Value Scope
boolean off segments; session; reload


Join Operator Configuration Parameters

join_collapse_limit


If the resulting list has no more than this number of items, the Postgres optimizer (planner) rewrites explicit JOIN (except FULL JOIN) constructs into a list of FROM items.

  • Smaller values reduce planning time but may result in suboptimal query plans.
  • By default, this variable is set equal to from_collapse_limit, which is suitable for most cases.
  • Setting it to 1 prevents any reordering of inner joins. Thus, the join order specified in the query becomes the actual execution order.
  • Setting this variable to a value between 1 and from_collapse_limit may help balance planning time against plan quality (higher values may yield better plans).
  • Setting this value to geqo_threshold or higher may trigger the use of the GEQO planner, potentially producing suboptimal plans.
Data Type Default Value Valid Range Scope
int 20 1 ~ INT_MAX master; session; reload

gp_adjust_selectivity_for_outerjoins


Enables selectivity adjustment for NULL tests in outer joins.

Data Type Default Value Scope
boolean on master; session; reload

gp_hashjoin_tuples_per_bucket


Sets the target density of the hash table used by HashJoin operations.

  • Smaller values tend to produce larger hash tables, which can improve join performance.
Data Type Default Value Valid Range Scope
int 5 1 ~ 25 master; session; reload

gp_workfile_compression


Specifies whether temporary files created when hash aggregation or hash join operations spill to disk are compressed.

  • If your database installation uses Serial ATA (SATA) disk drives, enabling compression may help avoid overloading the disk subsystem with I/O operations.
Data Type Default Value Scope
boolean off master; session; reload


Other Postgres Query Optimizer Configuration Parameters

constraint_exclusion


Controls the use of table constraints by the Postgres query optimizer (planner) to optimize queries.

  • Supported values: on (check constraints for all tables), off (never check constraints), and partition (check constraints only for inherited child tables and UNION ALL subqueries). The default is partition. This setting is typically used with traditional inheritance trees to improve performance.
  • When enabled for a specific table, the optimizer compares query conditions with the table's CHECK constraints and skips scanning tables where the conditions contradict the constraints. For example:
    =# CREATE TABLE parent(key integer, ...);
    =# CREATE TABLE child1000(check (key between 1000 and 1999)) INHERITS(parent);
    =# CREATE TABLE child2000(check (key between 2000 and 2999)) INHERITS(parent);
    ...
    =# SELECT * FROM parent WHERE key = 2400;

    With constraint exclusion enabled, this SELECT will skip scanning child1000, improving performance.

  • Currently, constraint exclusion is enabled by default only for table partitioning via inheritance. Enabling it for all tables adds planning overhead, especially for simple queries, without benefit. If no tables are partitioned using inheritance, it is best to disable it entirely. (Note: The equivalent feature for partitioned tables is controlled by a separate parameter, enable_partition_pruning.)
Data Type Default Value Valid Range Scope
enum on true / false / yes / no / 1 / 0 / on / off / partition master; session; reload

force_parallel_mode


Allows parallel queries to be used for testing purposes, even when no performance benefit is expected.

  • Supported values: off (use parallel mode only when performance improvement is expected), on (force parallel query whenever the query is considered safe), and regress (similar to on, but with additional behavioral changes as explained below).
  • Specifically, setting this to on adds a Gather node at the top of any query plan deemed safe for parallel execution, so the query runs inside a parallel worker. Even if no worker is available or usable, operations forbidden in parallel contexts (e.g., starting subtransactions) are prohibited unless the optimizer believes skipping them would cause query failure. When this option is set, functions causing failures or unexpected results may need to be marked PARALLEL UNSAFE (or PARALLEL RESTRICTED).
  • Setting this to regress has all the effects of on, plus additional effects useful for regression testing. Messages from parallel workers normally include a hint line indicating their origin; setting this to regress removes that line, making output identical to non-parallel execution. Similarly, Gather nodes added by this setting are hidden in EXPLAIN output, matching the output when set to off.
Data Type Default Value Valid Range Scope
enum off on / off / regress master; session; reload

from_collapse_limit


If the resulting FROM list does not exceed this value, the Postgres query optimizer (planner) flattens subqueries into the upper-level query.

  • Smaller values reduce planning time but may produce suboptimal query plans.
  • Setting this value to geqo_threshold or higher may trigger the use of the GEQO planner, potentially resulting in suboptimal plans.
Data Type Default Value Valid Range Scope
int 20 1 ~ INT_MAX master; session; reload

gp_enable_predicate_propagation


When enabled, the Postgres query optimizer (planner) applies query predicates to both table expressions when tables are joined on their distribution key columns.

  • Filtering both tables before joining (if possible) is more efficient.
Data Type Default Value Scope
boolean on master; session; reload

gp_max_plan_size


Specifies the maximum uncompressed total size of a query execution plan multiplied by the number of Motion operators (slices) in the plan.

  • If a query plan exceeds this size, the query is canceled and an error is returned.
  • A value of 0 means plan size is not monitored.
  • You can specify values in kB, MB, or GB. The default unit is kB. For example, 200 means 200kB. The value 1GB is equivalent to 1024MB or 1048576kB.
Data Type Default Value Valid Range Scope
int 0 0 ~ (INT_MAX/1024) master; session; superuser

gp_statistics_pullup_from_child_partition


Allows the Postgres query optimizer (planner) to use statistics from child partitions when planning queries on a parent table.

Data Type Default Value Scope
boolean off master; session; reload

gp_statistics_use_fkeys


When enabled, allows the Postgres query optimizer (planner) to use foreign key information stored in the system catalog to optimize joins between foreign and primary keys.

Data Type Default Value Scope
boolean off master; session; reload


Query Plan Execution

gp_max_slices


Specifies the maximum number of slices (parts of a query plan executed on Segment instances) that a query can generate.

  • If a query generates more slices than the specified number, the YMatrix database returns an error and does not execute the query.
  • The default value is 0, meaning no limit.
  • Executing queries that generate many slices may impact database performance. For example, queries with UNION or UNION ALL operators on multiple complex views may generate many slices. You can run EXPLAIN ANALYZE on a query to view its slice statistics.
Data Type Default Value Valid Range Scope
int 0 0 ~ INT_MAX master; session; reload

plan_cache_mode


Prepared statements (explicitly prepared or implicitly generated, e.g., in PL/pgSQL) can be executed using either custom or generic plans.

  • A custom plan is regenerated for each execution using its specific parameter values. A generic plan is independent of parameter values and can be reused across executions. Using generic plans saves planning time, but may be inefficient if the optimal plan depends heavily on parameter values.
  • This setting is considered at execution time when using a cached plan, not at plan preparation time.
Data Type Default Value Valid Range Scope
enum auto auto / force_custom_plan / force_generic_plan master; session; reload


JIT (Just-In-Time) Configuration Parameters

jit


Determines whether YMatrix allows JIT compilation.

Data Type Default Value Scope
boolean on master; session; reload

jit_above_cost


Sets the query cost threshold for activating JIT (Just-In-Time) compilation. If the query cost exceeds this value, JIT compilation is activated.

  • JIT consumes planning time but can accelerate query execution.
  • Setting this to -1 disables JIT compilation.
Data Type Default Value Valid Range Scope
floating point 100000 -1 ~ DBL_MAX master; session; reload

jit_debugging_support


If LLVM has the required capability, registers generated functions with GDB. This makes debugging easier.

  • This parameter can only be set at server startup or client connection startup.
Data Type Default Value Scope
boolean off master; session; reload; superuser

jit_dump_bitcode


Writes the generated LLVM IR to the file system under data_directory.

  • This is useful only for internal JIT implementation debugging.
  • Only superusers and users with the appropriate SET privilege can change this setting.
Data Type Default Value Scope
boolean off master; session; reload; superuser

jit_expressions


When JIT compilation is activated, allows JIT compilation of expressions.

Data Type Default Value Scope
boolean on master; session; reload

jit_inline_above_cost


Sets the query cost threshold at which JIT compilation attempts to inline functions and operators.

  • If the query cost exceeds this value, JIT attempts inlining. Inlining increases planning time but may improve execution speed.
  • Setting this parameter below jit_above_cost is ineffective.
  • Setting it to -1 disables inlining.
Data Type Default Value Valid Range Scope
floating point 500000 -1 ~ DBL_MAX master; session; reload

jit_optimize_above_cost


Sets the query cost threshold at which JIT compilation applies expensive optimizations.

  • If the query cost exceeds this value, JIT applies costly optimizations. These increase planning time but may significantly improve execution speed.
  • Setting this below jit_above_cost is ineffective, and setting it above jit_inline_above_cost may not be beneficial.
  • Setting it to -1 disables expensive optimizations.
Data Type Default Value Valid Range Scope
floating point 500000 -1 ~ DBL_MAX master; session; reload

jit_profiling_support


If LLVM has the required capability, emits data needed by the perf command to profile JIT-generated functions.

  • This writes files to ~/.debug/jit/. If the environment variable JITDUMPDIR is set and loaded, files are written to JITDUMPDIR/debug/jit instead.
  • You are responsible for cleanup when needed.
  • This parameter can only be set at server startup or client connection startup.
Data Type Default Value Scope
boolean off master; session; reload; superuser

jit_tuple_deforming


When JIT compilation is activated, allows JIT compilation of tuple deforming.

Data Type Default Value Scope
boolean on master; session; reload


Greenplum ORCA Optimizer Configuration Parameters

gp_enable_relsize_collection


If a table lacks statistics, allows the ORCA optimizer and Postgres query optimizer to use the table's estimated size (pg_relation_size function).

  • By default, if statistics are unavailable, ORCA and the planner use default values to estimate row counts.
  • The default behavior may improve query optimization time and reduce resource queue usage under heavy workloads, but could lead to suboptimal plans.
  • This parameter is ignored for root partitions of partitioned tables.
  • When ORCA is enabled and the root partition lacks statistics, ORCA always uses default values. Use ANALZYE ROOTPARTITION to collect statistics on the root partition. See ANALYZE.
Data Type Default Value Scope
boolean off master; session; reload

optimizer


Enables or disables GPORCA when running SQL queries.

  • Default is on. If GPORCA is disabled, the YMatrix database uses only the Postgres query optimizer.
  • GPORCA coexists with the Postgres query optimizer. When enabled, YMatrix uses GPORCA to generate execution plans whenever possible. If GPORCA cannot be used, the Postgres optimizer is used.
  • The optimizer parameter can be set at the database system, individual database, session, or query level.
Data Type Default Value Scope
boolean on master; session; reload

optimizer_analyze_root_partition


For partitioned tables, controls whether the ANALYZE command requires the ROOTPARTITION keyword to collect statistics on the root partition.

  • GPORCA uses root partition statistics when generating query plans. The Postgres query optimizer does not use them.
  • The default setting is on, meaning ANALYZE can collect root partition statistics without the ROOTPARTITION keyword. Statistics are collected when running ANALYZE on the root partition, or when running ANALYZE on a child leaf partition if other leaf partitions have statistics.
  • When set to off, you must run ANALZYE ROOTPARTITION to collect root partition statistics.
  • When the server configuration parameter optimizer is set to on (default), this parameter should also be set to on.
  • For more information on collecting statistics for partitioned tables, see ANALYZE.
Data Type Default Value Scope
boolean on master; system; restart

optimizer_array_expansion_threshold


When GPORCA is enabled (default) and processing queries with predicates containing constant arrays, the optimizer_array_expansion_threshold parameter limits optimization based on the number of constants in the array.

  • If an array in a query predicate contains more elements than specified, GPORCA disables predicate expansion into disjunctive normal form during query optimization. For example, when GPORCA executes a query with a IN clause containing more than 100 elements, it does not convert the predicate during optimization to reduce planning time and memory usage. The difference in query processing can be seen in the filter conditions of the EXPLAIN clause in the query IN plan.
  • Changing this parameter adjusts the trade-off between shorter optimization time and lower memory usage versus potential benefits from constraint derivation during optimization, such as conflict detection and partition elimination.
  • This parameter can be set at the database system, individual database, session, or query level.
Data Type Default Value Valid Range Scope
int 100 0 ~ INT_MAX master; session; reload

optimizer_control


Controls whether users can change the server configuration parameter optimizer using SET, RESET commands, or the YMatrix utility gpconfig.

  • If the optimizer_control parameter is enabled, users can set optimizer parameters.
  • If the optimizer_control parameter is set to off, optimizer parameters cannot be changed.
Data Type Default Value Scope
boolean on master; system; restart; superuser

optimizer_cost_model


When GPORCA is enabled (default), this parameter controls the cost model GPORCA uses for bitmap scans, which are used with bitmap indexes or btree indexes on AO tables.

  • legacy - Retains the calibrated cost model used by GPORCA in Greenplum Database versions 6.13 and earlier.
  • calibrated - Improves cost estimation for indexes.
  • experimental - Reserved for future experimental cost models; currently equivalent to calibrated.
  • The default calibrated cost model is more likely to choose faster bitmap index scans with nested loop joins instead of hash joins.
Data Type Default Value Valid Range Scope
enum calibrated legacy / calibrated / experimental master; session; reload

optimizer_cte_inlining_bound


When GPORCA is enabled (default), this parameter controls the amount of inlining performed for common table expression (CTE) queries (queries with WHERE clauses).

  • The default value 0 disables inlining.
  • This parameter can be set at the database system, individual database, session, or query level.
Data Type Default Value Valid Range Scope
int 0 0 ~ INT_MAX master; session; reload

optimizer_dpe_stats


When the Greenplum ORCA optimizer is enabled (default) and this parameter is on (default), GPORCA derives statistics to more accurately estimate the number of rows scanned during dynamic partition elimination.

  • This parameter can be set at the database system, individual database, session, or query level.
Data Type Default Value Scope
boolean on master; session; reload

optimizer_enable_associativity


When GPORCA is enabled (default), this parameter controls whether join associativity transformations are enabled during query optimization.

  • This transformation analyzes join order. For the default value off, only the GPORCA dynamic programming algorithm for join order analysis is enabled.
  • Join associativity transformations largely duplicate the functionality of the newer dynamic programming algorithm.
  • If set to on, GPORCA can use associativity transformations during query optimization.
  • This parameter can be set at the database system, individual database, session, or query level.
Data Type Default Value Scope
boolean off master; session; reload

optimizer_enable_dml


When GPORCA is enabled (default) and this parameter is on (default), GPORCA attempts to process DML commands such as INSERT, UPDATE, and DELETE.

  • If GPORCA cannot process the command, the YMatrix database falls back to the Postgres optimizer.
  • When set to off, Greenplum always falls back to the Postgres optimizer for DML commands.
  • This parameter can be set at the database system, individual database, session, or query level.
Data Type Default Value Scope
boolean on master; session; reload

optimizer_enable_indexonlyscan


When GPORCA is enabled (default) and this parameter is on (default), GPORCA can generate index-only scan plans for B-tree indexes.

  • GPORCA accesses only index values, not the table's data blocks.
  • This improves query execution performance, especially when the table has been vacuumed, has wide columns, and GPORCA does not need to fetch any data blocks (e.g., visibility is confirmed).
  • When disabled (off), GPORCA does not generate index-only scan plans.
  • This parameter can be set at the database system, individual database, session, or query level.
Data Type Default Value Scope
boolean on master; session; reload

optimizer_enable_multiple_distinct_aggs


When GPORCA is enabled (default), this parameter allows GPORCA to support multiple distinct qualified aggregates, such as SELECT count(DISTINCT a),sum(DISTINCT b) FROM foo.

  • This parameter is disabled by default because the resulting plans are often suboptimal compared to those generated by the Postgres optimizer.
  • This parameter can be set at the database system, individual database, session, or query level.
Data Type Default Value Scope
boolean off master; session; reload

optimizer_force_agg_skew_avoidance


When GPORCA is enabled (default), this parameter affects the query plan alternatives GPORCA considers when generating three-stage aggregation plans.

  • When set to on, GPORCA considers only three-stage aggregation plans where intermediate aggregation uses GROUP BY and DISTINCT columns for distribution to reduce skew impact.
  • If set to off, GPORCA may also consider plans that distribute using the GROUP BY column.
  • Such plans may perform poorly when skew exists.
Data Type Default Value Scope
boolean on master; session; reload

optimizer_force_multistage_agg


With the default setting, when GPORCA is enabled and this parameter is on, GPORCA selects three-stage aggregation plans for scalar distinct qualified aggregates when such plan alternatives are available.

  • When set to off, GPORCA chooses based on cost rather than heuristics.
  • This parameter can be set at the database system, individual database, session, or query level.
Data Type Default Value Scope
boolean off master; session; reload

optimizer_force_three_stage_scalar_dqa


With the default setting, GPORCA is enabled and this parameter is on, GPORCA selects multi-stage aggregation plans when such alternatives are available.

  • When set to off, GPORCA chooses based on cost rather than heuristics.
  • This parameter can be set at the database system, individual database, session, or query level.
Data Type Default Value Scope
boolean on master; session; reload

optimizer_join_arity_for_associativity_commutativity


This value is an optimization hint that limits the number of join associativity and commutativity transformations explored during query optimization.

  • This limit controls the number of alternative plans GPORCA considers. For example, the default value 18 tells GPORCA to stop exploring associativity and commutativity transformations when an n-ary join operator has more than 18 children.
  • For queries with many joins, specifying a lower value may improve query performance by limiting the number of alternative plans evaluated.
  • However, setting the value too low may result in suboptimal execution plans.
  • This parameter has no effect when the optimizer_join_order parameter is set to query or greedy.
  • This parameter can be set at the database system or session level.
Data Type Default Value Valid Range Scope
int 18 0 ~ INT_MAX segments; system; reload

--- SPLIT ---

optimizer_join_order


When GPORCA is enabled, this parameter sets the optimization level for join ordering during query optimization by specifying the type of join order alternatives to evaluate.

  • query - Use the join order specified in the query.
  • greedy - Compute the join order specified in the query and alternatives based on the smallest cardinality of relations in the joins.
  • exhaustive - Apply transformation rules to find and evaluate all possible join order alternatives.
  • The default is exhaustive. Setting this parameter to query or greedy may generate suboptimal query plans. However, if the administrator is confident that satisfactory plans are produced with query or greedy, lowering the optimization level can improve query optimization time.
  • Setting this parameter to query or greedy overrides the optimizer_join_order_threshold and optimizer_join_arity_for_associativity_commutativity parameters.
  • This parameter can be set at the database, session, or query level.
Data Type Default Value Value Range Setting Classification
enum exhaustive exhaustive / greedy / query master; session; reload

optimizer_join_order_threshold


When GPORCA is enabled (default), this parameter sets the maximum number of join inputs for which GPORCA uses a dynamic programming-based join ordering algorithm.

  • You can set this value for an individual query or for the entire session.
  • This parameter has no effect when the optimizer_join_query parameter is set to query or greedy.
Data Type Default Value Value Range Setting Classification
int 10 0 ~ 12 master; session; reload

optimizer_mdcache_size


Sets the maximum amount of memory (in KB) on the YMatrix database Master that GPORCA uses to cache query metadata (optimization data) during query optimization.

  • The cache is session-based.
  • GPORCA caches query metadata during optimization by default when GPORCA is enabled and optimizer_metadata_caching is enabled.
  • The default value is 16384(16MB), which is the optimal value determined through performance analysis.
  • You can specify values in KB, MB, or GB. The default unit is KB. For example, a value of 16384 means 16384KB. A value of 1GB is equivalent to 1024MB or 1048576KB.
  • If the value is 0, the cache size is unlimited.
  • This parameter can be set at the system, database, session, or query level.
Data Type Default Value Value Range Setting Classification
int 16384 0 ~ INT_MAX master; session; reload

optimizer_metadata_caching


When GPORCA is enabled (default), this parameter specifies whether GPORCA caches query metadata (optimization data) in memory on the YMatrix database Master during query optimization.

  • The cache is session-based. It is released when the session ends. If the amount of metadata exceeds the cache size, older unused metadata is evicted.
  • If the value is off, GPORCA does not cache metadata during query optimization.
  • This parameter can be set at the system, database, session, or query level.
  • The server configuration parameter optimizer_mdcache_size controls the size of the query metadata cache.
Data Type Default Value Setting Classification
boolean on master; session; reload

optimizer_parallel_union


When GPORCA is enabled (default), optimizer_parallel_union controls the degree of parallelization for queries containing UNION or UNION ALL clauses.

  • When the value is off, GPORCA generates a query plan where each child of the APPEND(UNION) operator resides in the same slice as the APPEND operator.
  • During query execution, child nodes are executed sequentially.
  • When this value is on, GPORCA generates a query plan with a redistribute Motion node under the APPEND(UNION) operator. During execution, the children and parent APPEND operators reside on different slices, allowing the children of the APPEND(UNION) operator to execute in parallel across Segment instances.
  • This parameter can be set at the system, database, session, or query level.
Data Type Default Value Setting Classification
boolean off master; session; reload

optimizer_penalize_broadcast_threshold


When GPORCA is enabled (default), during query optimization, GPORCA penalizes the cost of plans that attempt to broadcast relations exceeding the number of rows specified by this parameter.

  • For example, if this parameter is set to 100K rows (default), any broadcast exceeding 100K rows is heavily penalized.
  • When this parameter is set to 0, GPORCA sets the broadcast threshold to unlimited and never penalizes broadcast operations.
Data Type Default Value Value Range Setting Classification
int 100000 0 ~ INT_MAX master; session; reload

optimizer_penalize_skew


When GPORCA is enabled (default), this parameter allows GPORCA to penalize the segment cost of HashJoin operators that have a skewed Redistribute Motion as a child, favoring Broadcast Motion during query optimization.

  • GPORCA detects skew in a redistribute motion when the NDV (number of distinct values) is less than the number of segments.
  • This parameter can be set at the system, database, session, or query level.
Data Type Default Value Setting Classification
boolean on master; session; reload

optimizer_print_optimization_stats


When GPORCA is enabled (default), this parameter enables logging of GPORCA query optimization statistics for various optimization phases.

  • The default value is off; optimization statistics are not logged.
  • To log optimization statistics, this parameter must be set to on, and the parameter client_min_messages must be set to log.
    • set optimizer_print_optimization_stats = on;
    • set client_min_messages = 'log';
  • Information is logged during query execution or when using the EXPLAIN or EXPLAIN ANALYZE commands.
  • This parameter can be set at the system, database, session, or query level.
Data Type Default Value Setting Classification
boolean off master; session; reload

optimizer_sort_factor


When GPORCA is enabled (default), optimizer_sort_factor controls the cost factor applied to sort operations during query optimization.

  • The default value 1 specifies the default sort cost factor.
  • The value represents a multiplier that increases or decreases the default factor. For example, a value of 2.0 sets the cost factor to twice the default, and a value of 0.5 sets it to half the default.
  • This parameter can be set at the system, database, session, or query level.
Data Type Default Value Value Range Setting Classification
floating point 1.0 0.0 ~ DBL_MAX master; session; reload


Runtime Filter Optimization Configuration Parameters

mx_enable_runtime_filter


Enables or disables Runtime Filter.

Data Type Default Value Setting Classification
boolean on master; session; reload

mx_runtime_join_ratio


Runtime Filter is not triggered when the ratio of outer table rows to inner table rows in a join is less than this value.

Data Type Default Value Value Range Setting Classification
int 10 0 ~ (INT_MAX/2) master; session; reload

mx_runtime_max_filter_per_query


Maximum number of Runtime Filters allowed per query.

Data Type Default Value Value Range Setting Classification
int 4 0 ~ (INT_MAX/2) master; session; reload

mx_runtime_max_inner_rows


Runtime Filter is not triggered when the number of rows in the inner table of a join exceeds this value.

Data Type Default Value Value Range Setting Classification
int 2000000 0 ~ (INT_MAX/2) master; session; reload

mx_runtime_max_intent_level


Runtime Filter is not triggered when the number of rows in the outer table of a join is less than this value.

Data Type Default Value Value Range Setting Classification
int 3 0 ~ (INT_MAX/2) master; session; reload

mx_runtime_min_outer_rows


Runtime Filter is not triggered when the number of rows in the outer table of a join is less than this value.

Data Type Default Value Value Range Setting Classification
int 500000 0 ~ (INT_MAX/2) master; session; reload

mx_runtime_scan_ratio


Runtime Filter is not triggered when the ratio of scanned rows in the inner table to rows satisfying the join condition is less than this value.

Data Type Default Value Value Range Setting Classification
int 25 0 ~ (INT_MAX/2) master; session; reload


Vectorized Execution Engine Configuration Parameters

matrix.enable_mxvector


Enables or disables the vectorized execution engine.

Data Type Default Value Setting Classification
boolean true master; session; reload

matrix.enable_mxv_aggregate


Enables the vectorized execution engine to use grouped aggregate plans.

Data Type Default Value Setting Classification
boolean true master; session; reload

matrix.enable_mxv_hash_aggregate


Enables the vectorized execution engine to use hash aggregate plans.

Data Type Default Value Setting Classification
boolean true master; session; reload

matrix.enable_mxv_append


Enables the vectorized execution engine to use append plan types.

Data Type Default Value Setting Classification
boolean true master; session; reload

matrix.enable_mxv_motion


Enables the vectorized execution engine to use Motion plan types.

Data Type Default Value Setting Classification
boolean true master; session; reload

matrix.enable_mxv_presort_aggregate


Enables the vectorized execution engine to use aggregate plan types with explicit pre-sorting.

Data Type Default Value Setting Classification
boolean true master; session; reload

matrix.enable_mxv_root


Enables the root executor of the vectorized execution engine.

Data Type Default Value Setting Classification
boolean true master; session; reload

matrix.enable_mxv_sort


Enables the vectorized execution engine to use sort plan types.

Data Type Default Value Setting Classification
boolean true master; session; reload

matrix.enable_mxv_strict_mode


Enables strict mode for the vectorized execution engine.

  • If an expression that cannot be handled by vectorization is encountered, an exception is thrown immediately, and the server stops processing the current query.
Data Type Default Value Setting Classification
boolean false master; session; reload

matrix.enable_mxv_fallback_expression


Enables fallback expressions for the vectorized execution engine.

  • If this parameter is disabled, encountering an expression that the vectorized executor cannot handle causes the entire query to fall back to a scalar plan (with no vectorized nodes).
Data Type Default Value Setting Classification
boolean true master; session; reload

matrix.enable_mxv_limit


Enables the vectorized execution engine to use limit plan types.

Data Type Default Value Setting Classification
boolean true master; session; reload

matrix.enable_mxv_window


Enables the vectorized execution engine to use window aggregate plan types.

Data Type Default Value Setting Classification
boolean false master; session; reload

matrix.enable_mxv_subqueryscan


Enables the vectorized execution engine to use subquery scan plan types.

Data Type Default Value Setting Classification
boolean true master; session; reload

matrix.enable_mxv_hash_join


Enables the vectorized execution engine to use hash join plan types.

Data Type Default Value Setting Classification
boolean true master; session; reload

matrix.enable_mxv_hash_join_advanced_hashmap


Enables advanced hash map for vectorized hash joins.

Data Type Default Value Setting Classification
boolean true master; session; reload

matrix.enable_mxv_aggscan


Enables the vectorized execution engine to use aggregate scan plan types.

Data Type Default Value Setting Classification
boolean true master; session; reload

matrix.enable_mxv_indexscan


Enables the vectorized execution engine to use index scans.

Data Type Default Value Setting Classification
boolean true master; session; reload