This document describes best practices for creating tables in YMatrix.
| Rule | Required |
|---|---|
Database names, table names, column names, function names, view names, sequence names, and aliases must use lowercase letters, underscores, and digits only. The name must start with a lowercase letter, must not begin with pg, and must not exceed 63 characters in length |
Yes |
Column names must not use reserved keywords or SQL keywords such as oid, xmin, ctid, select, insert, update |
Yes |
Primary key index names should follow the pattern pk_字段名; unique index names should follow uk_字段名; regular index names should follow idx_字段名 |
Yes |
Partitioned tables should be named using the parent table name as prefix and partitioning rule as suffix. For example: parent table order, child tables order_p_202010, order_p_202009 |
Yes |
When a field represents 是否xx, it is recommended to name it using the format is_xxx and use BOOLEAN type, e.g., is_deleted |
No |
| It is best practice for database names to match application service names | No |
Table names should be clear and self-explanatory. Avoid ambiguous abbreviations and overly long names. Follow the pattern 业务名称_表的作用. Use a prefix for views and use tmp as prefix for temporary tables. Recommended: trade_config, payment_task. Avoid: tradeconfig, paymenttask |
No |
| Storage Mode | Suitable For |
|---|---|
| Row-oriented | Frequent updates or inserts Queries accessing multiple columns of a table simultaneously |
| Column-oriented | Tables with many columns but queries access only a few Aggregation operations on a small number of columns Frequent updates to individual columns without modifying others Recommended for most OLAP and time-series workloads |
YMatrix provides four storage engines: HEAP, AO, and MARS3. Choose the appropriate engine based on data volume and write performance requirements.
| Storage Engine | Supports | Suitable For | Unique Features |
|---|---|---|---|
| HEAP | Row storage | Tables requiring frequent updates Tables using row storage |
|
| AO | Row or column storage | Tables and partitioned tables with infrequent updates after initial load, where data is inserted in batch mode | Integration with vectorized execution engine |
| MARS3 | Hybrid row-column or column storage | Time-series scenarios OLTP scenarios with high-frequency, small-batch writes OLAP scenarios with low-frequency, large-batch writes |
Integration with vectorized execution engine Encoding chain compression Automatic storage tiering |
Note!
For more information about storage engines, see Storage Engine Overview.
Choosing precise data types helps reduce storage space, memory usage, and improve query efficiency. Consider the following:
int2 instead of int4 or int8 for an age column.int, float) while meeting business needs.YMatrix distributes data across segments to leverage parallel processing. Consider the following when selecting a distribution key:
SELECT gp_segment_id,count(*) FROM table_name GROUP BY gp_segment_id;.count(distinct 字段名) value, the better the column suits as a distribution key.WHERE clauses as distribution keys; consider using them as partition keys instead.YMatrix is a distributed database. Create partitions based on data volume. Key considerations:
Special notes for MARS3 table partition maintenance:
ALTER require full table locks and may block concurrent queries.Special notes for HEAP/AOCO table partition maintenance:
Note!
The above partition size recommendations apply to YMatrix 6.X.
For MARS3 tables, configure the compress_threshold parameter at the table level. Increasing this value reduces I/O frequency but may lower block-level filtering efficiency.
Compression is generally recommended, especially for column-oriented tables. It trades idle CPU resources for reduced I/O time, alleviating I/O pressure and improving query performance.
If using the MARS3 storage engine:
HEAP does not support explicit compression.
Indexes accelerate queries and are stored separately. This section covers indexing specifics.
For HEAP and AO tables, supported index methods include B-tree, BRIN, Bitmap, and Hash. Specialized indexes for text, spatial, and other data types include GIN, GiST, ODIN, SP-GiST, and TOAST.
MARS3 currently supports BRIN indexes. You may optionally create one or more mars3_brin indexes to improve query performance.
DISTRIBUTED REPLICATED parameter to define replicated tables.Master-only tables store data exclusively on the master node (MXMaster), not distributed to compute nodes (MXSegments).
DISTRIBUTED MASTERONLY clause. See CREATE TABLE for details.mxstart -am command to start the database in master-only mode.