ORCA Optimizer

ORCA is a new modular query optimizer in YMatrix that replaces the traditional PostgreSQL planner. It delivers significant performance improvements for certain OLAP query workloads. By default, the ORCA optimizer is disabled in YMatrix. To use ORCA, you must explicitly enable the optimizer configuration parameter. When enabled, YMatrix attempts to use ORCA to generate execution plans for queries. If ORCA cannot handle a query, the system falls back to the PostgreSQL-based planner.

ORCA Optimizer Architecture

The system boundary of ORCA is shown in the figure below. The ORCA optimizer resides at the bottom, and the database system sits above it.

ORCA accepts a query (Query) and metadata (MetaData, or MD) from the database system as input and outputs a physical execution plan (Plan). All three—Query, MD, and Plan—are represented in DLX format.

Image

The internal structure of ORCA is illustrated in the following diagram, which includes components such as Job Execution, Job Scheduler, Memo, and OS.

Image

  • Memo:

    • Stores all possible alternative plans.
    • Each Memo Group holds logically equivalent expressions.
    • Physical plans are derived directly from the Memo, and their costs are computed during optimization.
  • Search: Represents three types of jobs:

    • Exploration: Logically expands the Memo to explore alternative logical plans.
    • Implementation: Converts logical plans into physical plans.
    • Optimization: Adds Enforcer expressions based on required output properties and computes plan costs.
  • Job Scheduler: Orchestrates and executes optimization jobs.

  • Transformations: Rules that transform expressions in the Memo. There are two main categories:

    • Logical plan → Logical plan (used by exploration jobs).
    • Logical plan → Physical plan (used by implementation jobs).
  • Property Enforcement: Enhances data properties.

    • Describes required output properties for each operator.
    • If an operator’s output does not satisfy the required properties, an Enforcer is inserted to enforce them.
  • Metadata Cache: Caches metadata accessed during optimization.

  • OS: Low-level library handling I/O, exceptions, memory management, concurrency, and other system services.

Enabling and Disabling ORCA

When optimizer = on, YMatrix uses ORCA instead of the legacy planner. You can control ORCA behavior using the optimizer server configuration parameter at the system, database, session, or query level. To change the default setting, follow one of the procedures below.

Enable ORCA System-Wide

Set the optimizer server configuration parameter for the entire YMatrix system:

  1. Log in to the YMatrix Master host as mxadmin (the YMatrix database administrator).
  2. Use the gpconfig utility to set the parameter to on:
    gpconfig -c optimizer -v on
  3. Reload the configuration on both Master and Segments without restarting the cluster:
    mxstop -u

Enable ORCA for a Specific Database

Use the ALTER DATABASE command to enable ORCA for a single database. For example, to enable ORCA for database test_db:

ALTER DATABASE test_db SET OPTIMIZER = ON;

Enable ORCA for a Session or Query

Use the SET command to enable ORCA for the current session. For example, after connecting via psql:

SET optimizer = on;

To enable ORCA for a specific query, include the SET command immediately before running the query.

Note!
For more information about ORCA-related configuration parameters, see ORCA Optimizer Configuration Parameters.

How to Determine Whether ORCA Is Used

When ORCA is enabled (it is off by default), you can verify whether YMatrix used ORCA or fell back to the legacy planner by examining the output of EXPLAIN.

  • If ORCA generated the plan, the plan output ends with a line indicating the optimizer and its version. For example:

    Optimizer: ORCA
  • If the optimizer parameter is off, or if ORCA was enabled but the query fell back to the legacy planner, the output shows:

    Optimizer: Postgres-based planner

Note!
When YMatrix falls back to the legacy planner, the query plan is influenced by the legacy optimizer’s configuration parameters. For details, see Operator Control Parameters.

Unsupported SQL Features with ORCA

The following SQL features are not supported when ORCA is enabled:

  • Index expressions (indexes defined on expressions over one or more table columns).
  • GIN index access method. ORCA supports only B-tree, bitmap, and GiST indexes. Indexes created with unsupported methods are ignored.
  • External parameters.
  • The following types of partitioned tables:
    • Non-uniform partitioned tables.
    • Partitioned tables modified to use an external table as a leaf child partition.
  • Sort-Merge Join (SMJ).
  • Ordered aggregates.
  • Analytic extensions:
    • CUBE
    • Multiple grouping sets
  • Scalar operators:
    • ROW
    • ROWCOMPARE
    • FIELDSELECT
  • Aggregate functions that take set-returning operators as input arguments.
  • *`percentile_` window functions** (YMatrix does not support ordered-set aggregate functions).
  • Inverse distribution functions.
  • Queries that execute functions defined with the ON MASTER or ON ALL SEGMENTS attributes.
  • Queries containing Unicode characters in metadata names (e.g., table names) that are incompatible with the host system’s locale.
  • SELECT, UPDATE, and DELETE commands that use the ONLY keyword to reference a table name.