Data table category parameters

This document introduces the relevant parameters of the data table category in the system configuration parameters.

Notes!
To ensure the stability and security of the system, please be sure to manually modify the relevant parameters**.


HEAP table/append optimization table (AO) configuration parameters

default_table_access_method


When the CREATE TABLE command does not explicitly specify the access method, set the default table access method.

Data Type Default Value Value Range Set Classification
enum heap heap / ao_row / ao_column / mars2 / mars3 master; session

gp_create_table_random_default_distribution


Controls the creation of the table when using CREATE TABLE or CREATE TABLE AS that does not contain the DISTRIBUTED BY clause.

  • For CREATE TABLE, if this parameter value is off (default) and the Create Table command does not contain the DISTRIBUTED BY clause, the YMatrix database will select the table distribution key according to the following command:
    • If the LIKE or INHERITS clause is specified, YMatrix will copy the distribution key from the source table or the parent table.
    • If the PRIMARY KEY or UNIQUE constraint is specified, YMatrix selects the largest subset of all key columns as the distributed key.
    • If neither the constraint nor the LIKE or INHERITS clause is specified, YMatrix selects the first appropriate column as the distribution key. (Columns with geometric or user-defined data types do not meet the criteria for YMatrix distribution key columns.)
  • If the value of the parameter is set to on, the YMatrix database follows these rules to create the table when the DISTRIBUTED BY clause is not specified:
    • If the PRIMARY KEY or UNIQUE column is not specified, the distribution of the table is random (DISTRIBUTED RANDOMLY). Table distribution is random even if the table creation command contains LIKE or INHERITS clause.
    • If the PRIMARY KEY or UNIQUE column is specified, the DISTRIBUTED BY clause must also be specified. If the DISTRIBUTED BY clause is not specified as part of the table creation command, the command will fail.
  • For the CREATE TABLE AS command that does not contain a distribution clause:
    • If the Postgres query optimizer creates a table and the parameter has a value of off, the table distribution policy is determined based on the command.
    • If the Postgres query optimizer creates a table and the value of the parameter is on, the table distribution policy is random.
    • If Greenplum ORCA creates a table, the table distribution policy is random. The parameter value has no effect.

| Data Type | Default | Set Category | | --- | --- | --- | --- | | boolean | off | master; system; reload |

gp_default_storage_options


When creating a table using the CREATE TABLE command, set the default values ​​​​​for the following table storage options.- The supported items are as follows, specifying multiple storage option values ​​​​​as comma-separated lists:

  • appendoptimized. Note: You can use the appendoptimized=value syntax to specify the appendoptimized table storage type. appendoptimized is a thin alias for appendonly traditional storage options. The YMatrix database stores appendonly in the catalog and displays the same content when listing the storage options for appending optimization tables.
  • blocksize
  • checksum
  • compressiontype
  • compresslevel
  • orientation
  • Instead of specifying table storage options in the WITH of the CREATE TABLE command, you can set storage options using this parameter. The table storage option specified with the CREATE TABLE command overrides the value specified by this parameter.
  • Not all combinations of storage option value are valid. If the specified storage option is invalid, an error is returned. For information about table storage options, see the CREATE TABLE command.
  • Default values ​​​​​can be set for databases and users. If the server configuration parameters are set at different levels, this is the priority for the table to store values ​​​​​from highest to lowest when the user logs in to the database and creates the table:
      1. The value specified in the CREATE TABLE command using the WITH clause or the ENCODING clause
      1. Use the ALTER ROLE...SET command to set the value of gp_default_storage_options for the user
      1. Use the ALTER DATABASE...SET command to set the value of gp_default_storage_options for the database
      1. Use the gpconfig tool to set the value of gp_default_storage_options for the YMatrix database system
  • Parameter values ​​​​​are not cumulative. For example, if the parameter specifies the appendoptimized and compression options for the database and the user logs in and sets the parameters to specify the values ​​​​of the orientation option, the appendoptimized and compression types ​​​set at the database level are ignored.
  • This example ALTER DATABASE command sets the default orientation and compression table storage options for the database mytest: ALTER DATABASE mytest SET gp_default_storage_options = 'orientation=column, compressiontype=rle_type'. Create appended tables in the mytest database using column-oriented tables and RLE compression. You need to specify only appendoptimized=TRUE in the WITH clause.
  • This example gpconfig tool command sets the default storage options for the YMatrix database system. If you set a default value for multiple table storage options, the value must be enclosed in single quotes: gpconfig -c 'gp_default_storage_options' -v 'appendoptimized=true, orientation=column'.
  • This example gpconfig tool command displays the value of the parameter. Parameter values ​​​​​must be consistent between the Master of the YMatrix database and all Segments: gpconfig -s 'gp_default_storage_options'.
Default value Value range Set classification
appendoptimized=FALSE
blocksize=32768
checksum=TRUE
compresstype=none
compresslevel=0
orientation=ROW
appendoptimized= TRUE / FALSE
blocksize= integer between 8192 and 2097152
checksum= TRUE / FALSE
compresstype= ZLIB / ZSTD / QUICKLZ2 / RLE_TYPE / NONE
compresslevel= integer between 0 and 19
orientation= ROW / COLUMN
Use the gpconfig tool to specify: master; session; reload
Specify: table

gp_appendonly_compaction


Enable compressed Segment files during the VACUUM command.

  • When disabled, VACUUM only truncates the Segment file to the EOF value, and the current behavior is the same.
  • Administrators may want to disable compression in high I/O load situations or low space conditions.

| Data Type | Default | Set Category | | --- | --- | | boolean | on | master; session; reload; superuser |

gp_appendonly_compaction_threshold

Specifies the threshold ratio (percentage) of hidden rows when running VACUUM without the FULL option (lazy VACUUM).

  • If the ratio of hidden lines in the Segment file on the Segment is less than this threshold, the Segment file is not compressed and a log message is issued.
Data Type Default Value Value Range Set Classification
int 10 0 ~ 100 master; session; reload

gp_safefswritesize


Specifies the minimum size for safe write operations to AO tables in an immature file system.- When specifying a number of bytes greater than 0, the appended optimized writer adds the fill data to that number to prevent data corruption due to file system errors.

  • Each immature file system has a known secure write size and must be specified here when using a YMatrix database with that type of file system.
  • This is usually set to a multiple of the range size of the file system; for example, Linux ext3 is 4096 bytes, so the value 32768 is usually used.
Data Type Default Value Value Range Set Classification
int 0 0 ~ INT_MAX master; system; restart


MARS2 table configuration parameters

compress_threshold


Compression threshold. Used to control how many tuples (tuples) are compressed at one time, and is the upper limit of the number of Tuples compressed in the same unit.

Data Type Default Value Value Range Set Classification
int 1000 1 ~ 10000 table

mars2_automerge_threshold


Used to control all MARS2 tables, how many Runs can trigger merges when L0 reaches.

  • If you want to specify individually for a form you can use the table option level0_merge_threshold.
  • If this parameter is set at the system level, reload is required to take effect; if it is only modified at the session level, it will take effect directly without reloading the configuration.
Data Type Default Value Value Range Set Classification
int 32 10 ~ 2048 master; session/system; reload

level0_merge_threshold


Used to control how many Runs to trigger merges in a single table L0.

Data Type Default Value Value Range Set Classification
int 32 1 ~ 2048 table

level0_upgrade_size


Controls the size of L0 -> L1 upgrade (MB). When L0 merges the result Run exceeds this size, it will be upgraded to L1.

Data Type Default Value Value Range Set Classification
int 25 1 ~ 10000 table

level1_upgrade_size


Controls the size of L1 -> L2 upgrade (MB) in single table. When L1 merges, the result Run exceeds this size will be upgraded to L2.

Data Type Default Value Value Range Set Classification
int 1000 1 ~ 10000 table

mars2_sort_mem


Controls the sorted memory size (KB) of a single insert, which will share if the insert target table is a partitioned table.

  • The default value is 2097152KB, i.e. 2GB. The value range is 128KB ~ 2147483647KB (i.e. 2048GB).
  • If this parameter is set at the system level, reload is required to take effect; if it is only modified at the session level, it will take effect directly without reloading the configuration.
Data Type Default Value Value Range Set Classification
int 2097152 128 ~ 2147483647 master; session/system; reload

mars2_sort_mem_core


Controls at least how much sorted memory is allocated for a single partition table.

  • The default value is 16384KB, i.e. 16MB. The value range is 128KB ~ 2147483647KB (i.e. 2048GB).
  • If this parameter is set at the system level, reload is required to take effect; if it is only modified at the session level, it will take effect directly without reloading the configuration.
Data Type Default Value Value Range Set Classification
int 16384 128 ~ 2147483647 master; session/system; reload


MARS3 table configuration parameters

Notes!
This part of the parameters must be configured using the WITH clause when creating the table. It is suitable for a single table and cannot be modified once it is configured.

compress_threshold


Compression threshold. Used to control how many tuples (tuples) are compressed at one time, and is the upper limit of the number of Tuples compressed in the same unit.

  • Used to adjust the Range size, it is also the basic unit of compression. It can be used to adjust the compression effect and improve the reading efficiency. If the compression effect is too low, the compression effect is not obvious, and the configuration is too high to consume more memory.
Data Type Default Value Value Range Set Classification
int 1200 1 ~ 100000 table

level_size_amplifier


Specifies the magnification factor for the Level size.

  • Level The threshold for triggering a merge operation, calculated as: rowstore_size * (level_size_amplifier ^ level). The larger the value, the slower the reading speed and the faster the writing speed.
  • The specific value can be determined based on the specific scene information (write more read less/read more read less, compression rate, etc.).
  • Note: Make sure that the number of runs per Level does not exceed the number of runs, otherwise it will affect query performance and even prevent new data from being inserted.
Data Type Default Value Value Range Set Classification
int 8 1 ~ 1000 table

prefer_load_mode


Specifies the data loading mode inside MARS3.

  • normal represents the normal mode. The newly written data is first written to the row storage run of L0 layer, and then accumulated to the column storage run of L1 layer. Compared with the bulk mode, the column storage conversion will change from synchronous to asynchronous, but it is suitable for high-frequency small batch write scenarios with sufficient I/O capabilities and delay-sensitive.
  • bluk represents a batch loading mode, suitable for low-frequency large-batch writing scenarios, directly write to the column storage run of L1 layer. Compared with normal mode, one I/O is reduced. The column storage conversion has changed from asynchronous to synchronous, suitable for low-frequency large-batch writing of data with insufficient I/O capabilities and insensitive to delay.
  • For more information about the principles, please see MARS3 principle overview.
Data Type Default Value Value Range Set Classification
enum normal normal / bluk table


rowstore_size


Used to control when L0 Run switches. When the data size exceeds the parameter value (in MB), the next Run will be switched.

  • It is proportional to read and inversely proportional to write and amplification. The specific value can be determined based on the specific scenario (more writes, less reads, more reads, less reads).
Data Type Default Value Value Range Set Classification
int 64 8 ~ 1024 table


MARS3 downgrade storage configuration parameters

matrixts.enable_object_cache


Turn on/off file system cache for object storage.

Data Type Default Set Category
boolean off master; session; reload

mars3.degrade_credential_update_interval


Sets how often to update the access authorization for the session.

  • In order to improve the security of object storage, keys are updated regularly in some businesses. This parameter can control how long the session (unit is s) to re-collect authorization information from the system directory of the tablespace to ensure that the latest authorization configuration is used when accessing object storage.
Data Type Default Value Value Range Set Classification
int 0 0 ~ INT_MAX master; system; reload

mars3.degrade_probe_interval


The database background task regularly detects the time interval (unit s) of whether the data dump requirements meets the time interval (unit s).

  • After modification, you must restart the loading configuration to take effect.
Data Type Default Value Value Range Set Classification
int 3600 1 ~ INT_MAX master; system; reload

Turn on/off the memory cache of object storage.

Data Type Default Set Category
boolean off master; session; reload

mars3.enable_objectstore


Turn on/off the automatic dump function.

  • After modification, you must restart the loading configuration to take effect.
Data Type Default Set Category
boolean off master; system; reload


External table configuration parameters

gp_external_enable_exec


Enable or disable the use of external tables that execute operating system commands or scripts on the Segment host (CREATE EXTERNAL TABLE EXECUTE syntax).

  • If you use the Command Center or MapReduce feature, Must enable.
Data Type Default Set Category
boolean on master; system; restart

gp_external_enable_filter_pushdown


Enable filter pushdown when reading data from an external table.

  • If the pushdown fails, a query is executed without pushing the filter to an external data source (in contrast, the YMatrix database applies the same constraints to the result).
Data Type Default Set Category
boolean on master; session; reload

gp_external_max_segs


Set the number of segments to scan external table data during external table operations, with the goal of preventing the system from overloading the scan data and taking away resources from other concurrent operations.

  • This is only applicable to external tables that use the gpfdist:// protocol to access external table data.
Data Type Default Value Value Range Set Classification
int 64 1 ~ INT_MAX master; session; reload

gp_initial_bad_row_limit


Assuming this parameter value is n, the YMatrix database stops processing the input row if the previous n row contains a erroneous format when using the COPY command or importing data from an external table.

  • If a valid row is processed in the previous n row, the YMatrix database continues to process the input row.
  • Setting the value to 0 will disable this limit.
  • You can also specify the SEGMENT REJECT LIMIT clause for the COPY command or external table definition to limit the number of rejected rows.
  • INT_MAX is the maximum value that can be stored as an integer in the system.
Data Type Default Value Value Range Set Classification
int 1000 0 ~ INT_MAX master; session; reload

gp_reject_percent_threshold


For single-row error handling on COPY and external table SELECT, set the number of rows processed before SEGMENT REJECT LIMIT n PERCENT begins.

Data Type Default Value Value Range Set Classification
int 300 0 ~ INT_MAX master; session; reload

mx_discard_gpfdist_data


Discard packets from the gpfdist v1 protocol.

Data Type Default Set Category
boolean off master; session; reload

readable_external_table_timeout


When an SQL query is read from an external table, the parameter value specifies the time (in seconds) the YMatrix database waits before canceling the query when the data stops returning from the external table.- Default value 0, specifying no timeout. The YMatrix database does not cancel the query.

  • If a query using gpfdist runs for a long time and returns an error Intermittent network connection problem, you can set the value for readable_external_table_timeout. If gpfdist does not return any data within the specified time, the YMatrix database cancels the query.
Data Type Default Value Value Range Set Classification
int 0 0 ~ INT_MAX master; system; reload

writable_external_table_bufsize


The buffer size (KB) of YMatrix databases used for network communication, such as the gpfdist utility and external web tables (using HTTP).

  • YMatrix database stores data in a buffer before writing it.
  • The value range is 32KB ~ 131072KB (i.e. 128MB).
Data Type Default Value Value Range Set Classification
int 64 32 ~ 131072 segments; session; reload

verify_gpfdists_cert


This parameter controls whether SSL certificate authentication is enabled when using the gpfdists protocol to define external tables of the YMatrix database to use SSL security.

  • The default setting is on, and SSL authentication is enabled when the YMatrix database communicates with the gpfdist utility to read data from or write data to an external data source.
  • The value off will disable SSL certificate authentication. Ignore these SSL exceptions:
    • YMatrix database does not trust the self-signed SSL certificate used by gpfdist.
    • The host name contained in the SSL certificate does not match the host name running gpfdist.
  • You can set the value to off to disable authentication when testing communication between an external table of the YMatrix database and the gpfdist utility that provides external data.
  • Note: Disabling SSL certificate authentication by not verifying the gpfdists SSL certificate will expose security risks.
Data Type Default Set Category
boolean on master; session; reload