This document describes how YMatrix automatically tiers data to object storage (such as AWS S3), enabling fully automated hot and cold data separation.
Note!
This feature is only supported for MARS3 tables.
IoT scenarios generate massive amounts of data. While YMatrix emphasizes the importance of hot data (real-time data), it also focuses on optimizing storage performance and usability for cold data (historical data). For example, transaction orders on e-commerce platforms are frequently accessed within 7 days, but access frequency drops significantly after the return/refund period expires. However, these expired orders still consume substantial storage space. In such cases, reducing storage costs becomes a critical concern.
To address cold data storage, YMatrix has developed a data degradation feature. It enables storing hot and cold data on different media: hot data remains on high-performance storage such as standard storage, local SSDs, or HDDs, while cold data is automatically moved to cost-effective capacity-tier storage like object storage. Capacity-tier storage costs only 20% of standard storage, significantly lowering overall storage expenses. Once migrated, this data seamlessly integrates with YMatrix's powerful Massively Parallel Processing (MPP) architecture, achieving efficient data tiering and storage optimization.
Beyond direct storage cost savings, the data degradation process requires no manual intervention, minimizing operational and management overhead.
Most existing tiered storage solutions rely on FDW (Foreign Data Wrapper). In contrast, YMatrix treats object storage as an equivalent storage medium to local file systems and integrates it directly into the MARS3 storage engine. Capabilities available for regular file systems are equally applicable to degraded data.
This means MARS3 tiered storage offers greater flexibility:
| Feature | YMatrix (MARS3 Tiered Storage) | FDW-based Solutions |
|---|---|---|
| Transaction Support | Yes | No |
| Automatic Cold Data Migration | Yes | No |
| Management Complexity | No manual setup required | Requires complex operations |
This section walks you through a complete workflow to help you quickly experience the process and results of data tiering.
We use AWS S3 as an example.
First, log in to AWS, click your username at the top right, select Security credentials, then scroll down to the Access keys section and choose Create access key.
Since we need to migrate cold data from YMatrix to an AWS S3 bucket, select Applications running outside of Amazon Web Services.
After the access key is created successfully, record the access key ID and secret access key; they will be used later as oss_accessid and oss_secretkey.
Next, create an S3 bucket. After creating the bucket, collect the following information:
- Endpoint, in the form https://<your_bucket>.s3.<your_region>.amazonaws.com.cn
- Region: <your_region>
- Bucket name: <your_bucket>

Create a tablespace directory on both the Master and Segment servers of your test cluster.
$ mkdir -p <your_tablespace_dir>    # e.g. /home/mxadmin/test
Use vi quick.sql to create a file named quick.sql with the following content:
-- Create matrixts extension
CREATE EXTENSION IF NOT EXISTS matrixts;
-- Configure system parameters
\!gpconfig -c mars3.enable_objectstore -v ON --skipvalidation
\!gpconfig -c mars3.degrade_min_run_size -v 0 --skipvalidation
\!gpconfig -c mars3.degrade_probe_interval -v 300 --skipvalidation
-- Create tablespace your_tbs
DROP TABLESPACE IF EXISTS your_tbs;
CREATE TABLESPACE your_tbs LOCATION :tablespace_dir
WITH(
oss_accessid=:oss_accessid,
oss_secretkey=:oss_secretkey,
oss_endpoint=:oss_endpoint,
oss_region=:oss_region,
oss_bucket=:oss_bucket
);
Replace configuration values in < > (including < >) and execute:
$ psql -be -d <your_database> \
-v tablespace_dir="'<your_tablespace_dir>'" \
-v oss_accessid="'<your_oss_accessid>'" \
-v oss_secretkey="'<your_oss_secretkey>'" \
-v oss_endpoint="'<your_oss_endpoint>'" \
-v oss_region="'<your_oss_region>'" \
-v oss_bucket="'<your_oss_bucket>'" \
-f quick.sql
Reload configuration:
$ mxstop -u
Use vi quick2.sql to create quick2.sql containing:
\set tname t1
\set tnamestr '''t1'''
-- Create test table; configure migration of data older than 7 days
DROP TABLE IF EXISTS :tname;
CREATE TABLE IF NOT EXISTS
:tname(
key int,
value int,
c3 date)
USING mars3
WITH(ttl_interval='7 d', ttl_space='your_tbs')
DISTRIBUTED BY (key)
ORDER BY (key, value);
-- Insert test data
INSERT INTO :tname SELECT i, i, current_date FROM generate_series(1, 12000000) i;
Run quick2.sql:
$ psql -be -d <your_database> -f quick2.sql
Use the built-in function to manually migrate all data to object storage:
$ psql <your_database>
=# SELECT matrixts_internal.mars3_degrade('t1'::regclass, -1);
After executing the above command, all data from t1 should be stored in object storage. The cluster retains only a "shell" for accessing the data. Verify access using:
=# \set tname t1
SELECT * FROM :tname LIMIT 10;
DELETE FROM :tname WHERE key < 50;
SELECT * FROM :tname WHERE key < 50;
You can also check the object storage console to confirm data presence. Try deleting the migrated objects and re-accessing them:
=# SELECT * FROM :tname WHERE key < 50;
The database will return a key not exist error, confirming that the accessed data now resides entirely in object storage.
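For the deletion step above, if you prefer the command line to the object storage console, a minimal sketch using the AWS CLI is shown below; it assumes the CLI is installed and configured, and it removes every object in the bucket, so only run it against a disposable test bucket.

$ aws s3 rm s3://<your_bucket> --recursive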
| Feature | Description | Configuration Parameters |
|---|---|---|
| Authentication Modes | Supports two authentication methods: static AccessId/SecretKey, or dynamic AccessId/SecretKey via Security Token Service (STS) | Tablespace creation parameters; system parameter mars3.degrade_credential_update_interval |
| Migration Policy Management | Controls migration policies, including the criteria for identifying historical data and the interval of background checks | ttl_interval / ttl_space / mars3.enable_objectstore / mars3.degrade_probe_interval |
| Query Caching | Improves I/O efficiency and query performance by caching data retrieved from object storage | matrixts.enable_object_cache / mars3.enable_object_prefetch / oss_cache_size |
| Operations Tools | Link verification tool: checks whether the environment meets prerequisites. Garbage scanner: identifies orphaned objects and allows manual cleanup | osscheck / matrixts_internal.oss_junkobjects() |
YMatrix supports two authentication modes—choose one:
The quick start uses static credentials (AWS S3):
=# CREATE TABLESPACE <your_tablespace> LOCATION '<your_tablespace_dir>'
WITH(
oss_accessid='<your_accessid>',
oss_secretkey='<your_secretkey>',
oss_endpoint='<your_address>',
oss_bucket='<your_bucket>'
);
For STS mode, you must also provide a script that refreshes the credentials periodically:
-- Create tablespace
CREATE TABLESPACE <your_tablespace> LOCATION '<your_tablespace_dir>'
WITH(
oss_endpoint='<your_address>',
oss_bucket='<your_bucket>'
);
#!/bin/bash
# Script to update oss_sessiontoken/oss_accessid/oss_secretkey periodically
DB="postgres"
SPACE="tbs"
RES=$(aws sts get-session-token)
if [ $? != 0 ]
then
echo "get credentials fail!"
exit 1
fi
ak=$(echo "${RES}"|jq -r .Credentials.AccessKeyId)
sk=$(echo "${RES}"|jq -r .Credentials.SecretAccessKey)
token=$(echo "${RES}"|jq -r .Credentials.SessionToken)
if [ "${ak}" == "" ]
then
echo "AccessKeyId is empty!"
exit 1
fi
if [ "${sk}" == "" ]
then
echo "SecretAccessKey is empty!"
exit 1
fi
if [ "${token}" == "" ]
then
echo "SessionToken is empty!"
exit 1
fi
echo $ak
echo $sk
echo $token
psql -d "${DB}" -c "ALTER TABLESPACE ${SPACE} SET (oss_accessid='${ak}', oss_secretkey='${sk}', oss_sessiontoken='${token}');"
if [ $? != 0 ]
then
echo "alter tablespace fail!"
exit 1
fi
To schedule periodic credential updates, set the system parameter mars3.degrade_credential_update_interval.
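For example, following the same gpconfig pattern used elsewhere in this document (the value below is illustrative; confirm the unit your version expects):

$ gpconfig -c mars3.degrade_credential_update_interval -v 3600 --skipvalidation
$ mxstop -u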
Note!
See CREATE TABLESPACE for details on tablespace parameters.
First enable data migration with the system parameter mars3.enable_objectstore; the interval of the background check is controlled by mars3.degrade_probe_interval.
Example:
$ gpconfig -c mars3.enable_objectstore -v ON --skipvalidation
$ gpconfig -c mars3.degrade_probe_interval -v 300 --skipvalidation
$ mxstop -u
Set migration policy during table creation using ttl_interval and ttl_space:
=# CREATE TABLE <your_table> (
...
)
...
WITH (ttl_interval='<your_interval>', ttl_space='<your_tablespace>')
...
;
Modify policy on existing tables:
=# ALTER TABLE <your_table> SET (ttl_interval='14d');
Caching reduces I/O overhead when querying cold data from object storage, improving performance. It is controlled by matrixts.enable_object_cache, mars3.enable_object_prefetch, and the tablespace parameter oss_cache_size.
Example: Set 10,000MB cache per tablespace:
=# CREATE TABLESPACE <your_tablespace> LOCATION '<your_tablespace_dir>'
WITH(
...
oss_cache_size=10000,
...
);
Enable caching in session:
=# SET matrixts.enable_object_cache TO ON;
=# SET mars3.enable_object_prefetch TO ON;
Note!
Each backend process requires 64MB memory for file cache and 64MB for memory cache. A query with 10 segments may require up to 1280MB extra memory. For optimal performance, enable both caches.
Before enabling tiered storage on a production cluster, use the link checker to verify environmental compatibility.
The osscheck tool is located in the $GPHOME/bin directory.
Steps:
Generate config template:
$ osscheck -g conf
Edit the configuration file:
- Required fields: oss_accessid, oss_secretkey, oss_endpoint, oss_bucket
- For STS mode: also include oss_sessiontoken
- Optional prefix: oss_keyprefix
Template (replace values in < >, including < >):
oss_accessid='<your_oss_accessid>'
oss_secretkey='<your_oss_secretkey>'
oss_sessiontoken='<your_oss_sessiontoken>'
oss_endpoint='<your_oss_endpoint>'
oss_region='<your_oss_region>'
oss_bucket='<your_oss_bucket>'
oss_keyprefix='<your_oss_keyprefix>'
oss_retry_mode=default
oss_upload_partsize=5
oss_upload_parallelnum=3
oss_retry_times=3
oss_timeout=3000
Run the check:
$ osscheck -t conf
put object success
get object success
list object success
delete object success
On success, each item is reported as success (shown in green). On failure, it is reported as fail (shown in red):
$ osscheck -t conf.template
ERROR: InvalidAccessKeyId: The Access Key Id you provided does not exist in our records.
put object fail
The garbage scanner identifies unexpected orphaned objects in object storage.
Causes of garbage data:
Notes:
- The scan relies on matrixts_internal.oss_junkobjects() to query the catalog
- The scan covers objects under the oss_bucket/oss_keyprefix/ path

Usage:
=# CREATE EXTENSION IF NOT EXISTS matrixts;
=# SELECT * FROM matrixts_internal.oss_junkobjects('<your_tablespace>');
key | lastmodify | size
-------------------------------------+------------------------+----------
49948/2/52751/4/0 | 2023-10-19 16:00:14+08 | 38964
49948/2/53021/1/0 | 2023-10-19 16:04:55+08 | 38964
16384/2/158581/3/0 | 2023-09-03 13:16:26+08 | 4080
16384/2/158581/p2 | 2023-09-03 15:18:36+08 | 5
16384/2/202151/3/0 | 2023-09-21 16:22:09+08 | 80
49948/1/52751/4/0 | 2023-10-19 16:00:14+08 | 40620
49948/1/52861/1/0 | 2023-10-19 16:01:34+08 | 240000
49948/1/53021/1/0 | 2023-10-19 16:04:55+08 | 40620
49948/1/53038/1/0 | 2023-10-19 16:05:10+08 | 120000
16384/1/158581/3/0 | 2023-09-03 13:16:26+08 | 3864
16384/1/202151/3/0 | 2023-09-21 16:22:09+08 | 20
49948/0/52751/4/0 | 2023-10-19 16:00:14+08 | 40416
49948/0/53021/1/0 | 2023-10-19 16:04:55+08 | 40416
......
Fields:
- key: the object's key in the bucket
- lastmodify: the object's last modification time
- size: the object's size
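Once an object has been confirmed as orphaned, one hedged way to remove it manually is the AWS CLI, using a key taken from the output above; whether the key is relative to the oss_keyprefix path is an assumption you should verify against your bucket.

$ aws s3 rm s3://<your_bucket>/<your_oss_keyprefix>/49948/2/52751/4/0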
Yes. The time column must be part of the partition key and be of type date or timestamp. Time is used to determine cold data eligibility. Currently, tables without a time-based partition key cannot use tiering.
Note!
Tables with multiple time columns in the partition key are not supported.
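To check which columns make up a table's partition key, one option is psql's \d+ meta-command, which prints the partition key of a partitioned table (a sketch; the table name is a placeholder):

$ psql -d <your_database> -c "\d+ <your_table>"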
By default, the background check runs every hour; the interval can be adjusted via mars3.degrade_probe_interval.
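For example, reusing the gpconfig pattern from the quick start (3600 is illustrative and assumes the unit is seconds; verify for your version):

$ gpconfig -c mars3.degrade_probe_interval -v 3600 --skipvalidation
$ mxstop -u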
To disable automatic data tiering, set mars3.enable_objectstore to off and reload:
$ gpconfig -c mars3.enable_objectstore -v off --skipvalidation
$ mxstop -u
Theoretically there is no limit, but cache performance degrades when multiple object storage tablespaces are used. We recommend one object storage tablespace per cluster.
Yes, it is shared.
Yes, either can be used independently. However, for best performance, enable both.
No. Child partitions inherit settings only upon creation or addition. Existing parent tables altered with ALTER TABLE do not propagate changes to existing children.
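To change the policy of partitions that already exist, a hedged option is therefore to alter each child partition directly, reusing the ALTER TABLE form shown earlier; the child partition name is a placeholder, and you should verify that your child partitions accept these options:

$ psql -d <your_database> -c "ALTER TABLE <your_child_partition> SET (ttl_interval='14d');"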
"Additional memory" refers to: base memory + auto-tiering enabled + concurrent queries + both caches enabled.
Note!
This estimates a theoretical worst-case scenario.
Formula:
Additional memory =
Concurrent client queries ×
Segments per node ×
Average tables scanned per query ×
(Parallel Workers + 1) ×
(64MB + 64MB + 10MB) +
Number of partitioned tables ×
Segments per node ×
64MB
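For example, with illustrative values of 5 concurrent client queries, 4 segments per node, 2 tables scanned per query on average, 2 parallel workers, and 10 partitioned tables:
Additional memory = 5 × 4 × 2 × (2 + 1) × (64MB + 64MB + 10MB) + 10 × 4 × 64MB = 16,560MB + 2,560MB = 19,120MB, i.e. roughly 19GB in this worst-case estimate.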