Automated partition management

I mentioned earlier that hot and cold storage is divided into hot and cold storage. In a production environment, data partitioning is done according to time. MatrixDB provides some UDFs used to manage partitions, and users can manually perform partition maintenance according to their needs. For example: one partition every day, one partition every 8 hours, etc.

But time is an infinitely extended dimension. When creating partition tables, you usually just create partitions in the future period in advance, and then add new time period partitions regularly. This undoubtedly increases operation and maintenance costs.

Automated partition management is to automate the work of manually maintaining partitions. include:

  1. Automatic partition creation and deletion
  2. Partition data is hot to cold
  3. Default partitioning automatically

The automated partition management feature is included in the matrixts extension, so first create the matrixts extension:

CREATE EXTENSION matrixts;

By calling set_policy, setting a policy for the target table, partition maintenance automation is achieved:

Currently, there are three policies as follows:

  1. auto_partitioning
  2. mars_time_series
  3. auto_splitting

1. auto_partitioning

The auto_partitioning strategy is applied to the automated management of ordinary partition tables, and implements the following two automation functions:

  1. New partition creation
  2. Expired partition cleaning

Example, create test partition table metrics, as follows:

CREATE TABLE metrics(
    time timestamp with time zone,
    tag_id int,
    read float,
    write float
)
Distributed by (tag_id)
Partition by range(time);

Create an auto_partitioning automatic partitioning policy for metrics table:

SELECT set_policy(
    'metrics',
    'auto_partitioning'
);

If you need to carry schema information, just pass the parameter schema_name.table_name.

After setting the policy, call list_policy to view the policy:

SELECT * FROM list_policy('metrics');

 reloid | relname | class_id |    class_name     |   action    | seq | disabled |               check_func               |      check_args      |                    act_func                    |       act_args        | version
--------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  17798 | metrics |        2 | auto_partitioning | retention   |   1 | t        | matrixts_internal.apm_generic_expired  | {"after": "1 year"}  | matrixts_internal.apm_generic_drop_partition   | {}                    | 1.0
  17798 | metrics |        2 | auto_partitioning | auto_create |   2 | f        | matrixts_internal.apm_generic_incoming | {"before": "3 days"} | matrixts_internal.apm_generic_append_partition | {"period": "8 hours"} | 1.0
(2 rows)

As you can see, auto_partitioning contains two operations in total:

  1. Retention: Delete expired partitions
  2. auto_create: new partition creation

The detection parameter of retention is {"after": "1 year"}, which means that the partition has been more than one year and then delete it. The detection parameter of auto_create is {"before": "3 days"}, which means to check whether a new partition needs to be created 3 days in advance; the execution parameter is {"period": "8 hours"}, which means to create a new partition for 8 hours. This is the default setting for the auto_partitioning policy.

If you want to adjust these parameters, you need to call set_policy_action.

Example, adjust the retention to automatically delete it after 3 months; adjust the auto_create to 2 days in advance, with a time span of 1 day:

SELECT set_policy_action(
    'metrics',
    'retention',
    '3 months'
);

SELECT set_policy_action(
    'metrics',
    'auto_create',
    '{
        "before": "2 days",
        "period": "1 day"
    }'
);

Note: For an action with only one parameter, just write the target value string directly, such as retention. For actions containing multiple parameters, the target value needs to be completed through json splicing.

2. mars_time_series

The mars_time_series strategy is applied to the automated management of the timing partition table of the Heap+Mars1 storage solution, and realizes the following three automation functions:

  1. New partition creation
  2. Expired partition cleaning
  3. To cool the partition

Before creating mars_time_series for partitions, first create a mars extension and create a template for the partition table:

CREATE EXTENSION mars;
SELECT mars.build_timeseries_table('metrics','tagkey="tag_id", timekey="time", timebucket="1 day"');

Create mars_time_series automatic partitioning policy for metrics table:

SELECT set_policy(
    'metrics',
    'mars_time_series'
);

After setting the policy, call list_policy to view the policy:

SELECT * FROM list_policy('metrics');
 reloid | relname | class_id |    class_name    |   action    | seq | disabled |               check_func               |                  check_args                   |   act_func                                    |       act_args        | version
--------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  17798 | metrics |        3 | mars_time_series | retention   |   1 | t        | matrixts_internal.apm_generic_expired  | {"after": "1 year"}                           | matrixts_internal.apm_generic_drop_partition  | {}                    | 1.0
  17798 | metrics |        3 | mars_time_series | auto_create |   2 | f        | matrixts_internal.apm_generic_incoming | {"before": "3 days"}                          | matrixts_internal.apm_mars_append_partition   | {"period": "8 hours"} | 1.0
  17798 | metrics |        3 | mars_time_series | compress    |   3 | f        | matrixts_internal.apm_expired_with_am  | {"after": "7 days", "storage_is_not": "mars"} | matrixts_internal.apm_mars_compress_partition | {}                    | 1.0
(3 rows)

As you can see, mars_time_series has an additional operation based on auto_partitioning: compress. Used to heat and cool partitions. The detection parameters are {"after": "7 days", "storage_is_not": "mars"}, which means that the partitions that are not mars are converted after 7 days.

The parameters need to be adjusted, and the `set_policyaction is also called.

Example, adjust compress to convert after 3 days:

SELECT set_policy_action(
    'metrics',
    'compress',
    '{
        "after": "3 days",
        "storage_is_not": "mars"
    }'
);

3. auto_splitting

The auto_splitting policy implements the default partition cutting operation for partition tables containing default partitions.

Example, create a partition table and create a default partition for it:

CREATE TABLE metrics(
    time timestamp with time zone,
    tag_id int,
    read float,
    write float
)
Distributed by (tag_id)
Partition by range(time);

CREATE TABLE metrics_others PARTITION OF metrics DEFAULT;

Set the auto_splitting policy:

SELECT set_policy(
    'metrics',
    'auto_splitting'
);

View the policy:

SELECT * FROM list_policy('metrics');
 reloid | relname | class_id |   class_name   |   action   | seq | disabled |                check_func                |        check_args        |                       act_func                        |       act_args        | version
--------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  17798 | metrics |        1 | auto_splitting | retention  |   1 | t        | matrixts_internal.apm_generic_expired    | {"after": "1 year"}      | matrixts_internal.apm_generic_drop_partition          | {}                    | 1.0
  17798 | metrics |        1 | auto_splitting | auto_split |   2 | f        | matrixts_internal.apm_generic_older_than | {"age": "1 month 2days"} | matrixts_internal.apm_generic_split_default_partition | {"period": "1 month"} | 1.0
(2 rows)

As you can see, the auto_splitting policy contains 2 operations, deletion of expired partitions and automatic cutting of auto_split of default partitions.

Among them, {"age": "1 month 2days"} means that the oldest data in the default partition is earlier than January 2 days ago, and the current time is cut. {"period": "1 month"} means that the partition granularity after cutting is 1 month.

4. Operation status settings

From the above list_policy result you will notice that it contains a column called disabled. This column is used to set the operation status. No matter which strategy currently exists, the status of the retention operation is disabled.

To enable this operation, call enable_policy_action:

SELECT enable_policy_action('metrics', 'retention');

To disable an action, call disable_policy_action:

SELECT disable_policy_action('metrics', 'auto_split');

5. Delete Policy

drop_policy is the partition table deletion policy:

SELECT drop_policy('metrics');

Note: Each partition table can only set one partition policy, so the deletion policy only provides partition table name.