DDL Best Practices

This document describes best practices for creating tables using YMatrix.

0 Naming Conventions

Specification | Required
Database, table, column, function, view, and sequence names, aliases, etc. must use only lowercase letters, digits, and underscores; the first character must be a lowercase letter; names must not start with pg; and object names must not exceed 63 characters. | Yes
Field (column) names must not use reserved names or keywords, such as oid, xmin, ctid, select, insert, update. | Yes
Primary key indexes are named pk_column_name, unique indexes uk_column_name, and regular indexes idx_column_name. | Yes
Child tables of a partitioned table use the parent table name as the prefix and the partition rule as the suffix, e.g. parent table order with child tables order_p_202010 and order_p_202009. | Yes
When a field means "whether xx", it is recommended to name it is_xxx and use the boolean type, e.g. is_deleted. | No
Keep the database name consistent with the name of the application service. | No
Table names should be clear and meaningful; avoid ambiguous abbreviations and overly long names. It is recommended to follow the format business name_table purpose, to give views a distinguishing prefix, and to prefix temporary tables with tmp. Recommended: trade_config, payment_task. Avoid: tradeconfig, paymenttask. | No
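As an illustration of these conventions (the table and column names below are made up), a table, its primary key, and a regular index might be declared as follows:

    CREATE TABLE trade_config (
        id          int8 NOT NULL,
        config_name text,
        is_deleted  boolean DEFAULT false,
        CONSTRAINT pk_id PRIMARY KEY (id)   -- primary key index named pk_column_name
    );

    -- regular index named idx_column_name
    CREATE INDEX idx_config_name ON trade_config (config_name);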

1 Row Storage and Column Storage

Storage mode | Applicable to
Row storage | Data is frequently updated or inserted; queries need to access many fields of the table at the same time.
Column storage | Tables with many fields where most queries access only a few of them; aggregation over a small number of fields; individual fields are updated periodically without modifying other fields.

For most OLAP and time-series scenarios, it is recommended to prioritize column storage.
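As a sketch (table and column names are hypothetical), an AO table can be declared row-oriented or column-oriented through the classic Greenplum-style orientation storage parameter:

    -- row storage: frequent inserts, queries touch most fields
    CREATE TABLE sensor_event_row (
        vin      text,
        daq_time timestamp,
        speed    float4,
        soc      float4
    ) WITH (appendonly=true, orientation=row)
    DISTRIBUTED BY (vin);

    -- column storage: wide table, queries and aggregations touch only a few fields
    CREATE TABLE sensor_event_col (
        vin      text,
        daq_time timestamp,
        speed    float4,
        soc      float4
    ) WITH (appendonly=true, orientation=column)
    DISTRIBUTED BY (vin);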

2 Storage Engines

YMatrix offers three storage engines: HEAP, AO, and MARS3. The appropriate engine is typically selected based on data scale and write-performance requirements.

Storage engine | Supported storage modes | Applicable to
HEAP | Row storage | Frequently updated tables; row-store tables accessed mainly through SELECT queries
AO | Row storage / column storage | Tables and partitioned tables that are rarely updated after the initial data load and receive data only through batch inserts
MARS3 | Row-column hybrid storage / column storage | Time-series scenarios; OLTP scenarios with high-frequency, small-batch writes; OLAP scenarios with low-frequency, large-volume writes

Note!
For more information about storage engines, see Storage Engine Overview.
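A minimal sketch of choosing an engine at table creation time (names are made up; the USING clause for MARS3 is an assumption here, and MARS3 tables may require additional options, so check the Storage Engine Overview for the exact syntax in your version):

    -- HEAP (the default): row storage, suits frequently updated tables
    CREATE TABLE user_profile (
        user_id  int8,
        nickname text
    ) DISTRIBUTED BY (user_id);

    -- MARS3: time-series and batch-write scenarios
    CREATE TABLE device_metrics (
        vin      text,
        daq_time timestamp,
        speed    float4
    ) USING mars3
    DISTRIBUTED BY (vin);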

3 Data Types

Choosing precise data types helps reduce storage space and memory usage and improves query efficiency. Consider the following when selecting types:

  • Minimize. For integer or floating-point types, select the smallest type that can represent the specific range. For example, for an age field, use int2 instead of int4/int8.
  • If using a vectorized execution engine, prioritize types that better leverage vectorized performance potential while meeting business requirements, such as primitive types like int and float.
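For example (column names are illustrative), a sketch that picks the narrowest type that still covers each field's value range:

    CREATE TABLE passenger_info (
        passenger_id int8,      -- large surrogate-key range needs 8 bytes
        age          int2,      -- ages fit easily in a 2-byte integer
        temperature  float4,    -- single precision is enough for sensor readings
        is_vip       boolean    -- "whether xx" field stored as boolean
    ) DISTRIBUTED BY (passenger_id);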

4 Distribution Key

YMatrix supports distributing data across different nodes to leverage distributed parallel processing capabilities and improve performance. The distribution key primarily considers the following factors:

  • Data uniformity. Evenly distributing data can effectively avoid the bottleneck effect and balance the load across different nodes. For example, a single column that can uniformly distribute data across all segments can be used as the distribution key. The column serving as the distribution key should have a large and uniform range of values; otherwise, data skew may occur. In special cases, random distribution can be considered to ensure absolutely uniform data distribution. The query statement for checking data distribution is as follows: SELECT gp_segment_id, count(*) FROM table_name GROUP BY gp_segment_id;.
  • JOIN convenience. For large tables that require joins, use the join fields as distribution keys whenever possible. This significantly improves performance through local joins and avoids data movement across nodes.
  • Discrete fields. Choose the field with the most discrete values in the table data as the distribution key. The larger the value of count(distinct field_name) (i.e., the number of unique values in this field), the more suitable the field is as a distribution key.
  • Avoid using columns from the WHERE clause of a query as distribution keys; instead, use them as partition keys.
  • Validate data after initial and incremental data loading to reduce data skew.
  • Do not use dates or timestamps as distribution keys.
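A sketch that puts these points together (table and column names are hypothetical): hash distribution on a join key, random distribution as a fallback, and the skew check quoted in the list above:

    -- hash distribution on the column used to join with other large tables
    CREATE TABLE order_detail (
        order_id int8,
        user_id  int8,
        amount   numeric(12,2)
    ) DISTRIBUTED BY (order_id);

    -- random distribution when no single column spreads the data evenly
    CREATE TABLE event_log (
        event_time timestamp,
        payload    text
    ) DISTRIBUTED RANDOMLY;

    -- check how rows are spread across segments
    SELECT gp_segment_id, count(*) FROM order_detail GROUP BY gp_segment_id;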

5 Partition Settings

YMatrix supports table partitioning, which splits a large table into child partitions according to the volume of data. There are several key points to consider when setting up partitions:

  • Partition size. For time series scenarios, it is generally recommended to partition by year or month, rather than by day. In practice, overly detailed partition planning is inconvenient to maintain and detrimental to performance: with column storage, too many physical files can actually hinder query optimization.
  • Partition reservation. It is not necessary to reserve too many partitions in advance. You can create them before use. For example, if you partition by month, you can create 12 new partitions for the next year at the end of each year.
  • Partition maintenance cost. Time-series scenarios often use time as the partition key. As time passes, you can back up and migrate data by partition. For historical partitions that are confirmed to have no data or no valid data, clean them up in a timely manner. See Routine Vacuuming for details.
  • Use partitions only when partition elimination (partition pruning) can be achieved based on query conditions. Prioritize range partitions over list partitions, and avoid distributing and partitioning a table on the same column.
  • Reduce the use of default partitions, do not use multi-level partitions, and prioritize creating fewer partitions so that each partition contains more data.
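A sketch of monthly range partitioning in the classic Greenplum-style syntax (table, columns, and date range are made up); note that the distribution key and the partition key are different columns, as recommended above:

    CREATE TABLE order_fact (
        order_id   int8,
        order_time timestamp,
        amount     numeric(12,2)
    ) DISTRIBUTED BY (order_id)
    PARTITION BY RANGE (order_time)
    (
        START (date '2024-01-01') INCLUSIVE
        END   (date '2025-01-01') EXCLUSIVE
        EVERY (INTERVAL '1 month')
    );

    -- add next year's partitions shortly before they are needed
    ALTER TABLE order_fact ADD PARTITION
        START (date '2025-01-01') INCLUSIVE
        END   (date '2025-02-01') EXCLUSIVE;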

Notes on partition maintenance operations for MARS2/MARS3 tables:

  • Performing ALTER operations on partitions requires locking the entire table, which causes queries to block one another.
  • Querying too many partitions at once significantly slows queries down. Therefore, for MARS2/MARS3 tables it is generally appropriate to keep a single partition on a single segment at around 5 GB.

Notes on performing partition maintenance operations on HEAP/AOCO tables:

  • For HEAP tables, it is generally appropriate to keep a single partition on a single segment at around 5 GB. Also pay attention to controlling the index size.
  • For AOCO tables, it is generally appropriate to keep a single partition on a single segment at around 3 GB. Also pay attention to controlling the index size.

Note!
The above recommended partition size values are applicable to YMatrix 5.X.
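As a rough check of whether a partition stays within these bounds (the child-table name below is hypothetical), take the partition's total size and divide it by the number of primary segments, assuming data is evenly distributed:

    -- total size of one child partition across the whole cluster
    SELECT pg_size_pretty(pg_total_relation_size('order_fact_1_prt_2'));

    -- number of primary segments to divide by
    SELECT count(*) FROM gp_segment_configuration WHERE role = 'p' AND content >= 0;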

6 Block Size

You can configure the compression threshold by specifying compress_threshold at the MARS3 table level. Increasing this value appropriately can reduce the number of I/O operations, but it may also lower the block filtering rate.
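A minimal sketch, assuming compress_threshold is passed as a table-level storage parameter when the MARS3 table is created; the exact clause and a sensible value depend on your version, so consult the MARS3 reference:

    CREATE TABLE vehicle_signal (
        vin      text,
        daq_time timestamp,
        speed    float4
    ) USING mars3
    WITH (compress_threshold = 1200)   -- parameter name from this section; value is illustrative
    DISTRIBUTED BY (vin);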

7 Compression and Non-Compression

Compressed storage is generally recommended, especially for column-store tables, where it is most effective. Compression trades idle CPU cycles for reduced read/write I/O, alleviating I/O pressure and improving query efficiency.

If you use the MARS2 or MARS3 storage engine to create tables:

  • For compression types, YMatrix recommends chained compression. It reduces the amount of data stored on disk and the amount of data that must be read during queries.
  • Compression also incurs decompression overhead, so query-oriented tuning should favor algorithms with faster decompression. YMatrix's proprietary algorithms, with the exception of Gorilla, generally outperform LZ4 in decompression speed and can be selected based on data characteristics.

HEAP does not support explicit compression.
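For an AO column-store table, compression is declared through storage parameters such as compresstype and compresslevel (the values below are illustrative). The chained compression and per-column algorithms mentioned above for MARS2/MARS3 are configured with engine-specific options; see the storage engine documentation for the exact spelling:

    CREATE TABLE order_archive (
        order_id   int8,
        order_time timestamp,
        amount     numeric(12,2)
    ) WITH (appendonly=true, orientation=column, compresstype=zstd, compresslevel=3)
    DISTRIBUTED BY (order_id);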

8 Indexes

Indexes are a technique for accelerating queries. Index data is typically stored separately from table data, so it is discussed in its own section.

For HEAP tables and AO tables, there are several index methods such as B-tree, BRIN, Bitmap, and Hash; for text and spatial data, there are additional methods such as GIN, GiST, and SP-GiST.

MARS2 currently supports primary key indexes and BRIN indexes. The former is the sort key, i.e. the index specified with mars2_btree, stored in a separate index file; the latter is a min/max index created on specific columns, stored in a separate metadata file. Notes:

  • If the sort key is a set of columns, decide their order carefully. The further back a column sits in the sort key, the weaker its filtering effect, although local sorting still helps compression. For example, in a time-series scenario, if the workload is point queries on a single device, the sort key is the device number (vin); if the workload is detail queries, aggregate queries over a time window for a single device, or multi-device queries, the sort key is usually the device number plus the timestamp (vin, daq_time).
  • The benefit of a BRIN index is closely tied to how the data is sorted. If the sorted column contains only a small number of distinct tuples, sorting it may bring no filtering benefit and can even add overhead, so such columns should be excluded.

MARS3 currently supports Brin indexes. You can optionally create one or more mars3_brin indexes to improve query efficiency (similar to the minmax index for MARS2 tables).
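A sketch of the index DDL implied above, reusing the vin/daq_time example (table names are hypothetical; the access-method names come from this section, but verify the exact syntax against the index documentation for your version):

    -- MARS2: the sort key is the index created with mars2_btree
    CREATE INDEX idx_vin_daq_time ON device_metrics_mars2
        USING mars2_btree (vin, daq_time);

    -- MARS3: an optional min/max style index created with mars3_brin
    CREATE INDEX idx_daq_time ON device_metrics
        USING mars3_brin (daq_time);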

9 Replicated Tables

  • The data of a replicated table is stored on every segment, which avoids cross-segment movement and redistribution of small-table data.
  • Large tables are not suited to the replicated distribution policy, because copying a large table's data to every node carries a high storage and maintenance cost.
    • For example, a scene detail table may hold billions of rows while a user-permission table holds only a few hundred. If the detail table is joined to the permission table directly, large amounts of data must be moved between segments; replicating the small user-permission table to every segment avoids that cost.
  • A replicated table is declared with the DISTRIBUTED REPLICATED clause, as sketched below.
  • Besides the above, using this policy also lifts the restrictions on operations that user-defined functions executed on segments are allowed to perform.
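A minimal sketch of the scenario above (names are hypothetical): the small permission table is replicated to every segment, so joining it with the large detail table needs no data motion:

    -- small permission table replicated to every segment
    CREATE TABLE user_permission (
        user_id    int8,
        permission text
    ) DISTRIBUTED REPLICATED;

    -- large detail table stays hash-distributed
    CREATE TABLE scene_detail (
        record_id int8,
        user_id   int8,
        payload   text
    ) DISTRIBUTED BY (record_id);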