mxbackup

This document describes the cluster parallel backup utility mxbackup.

1 Description

  • mxbackup performs parallel backups of a YMatrix cluster. The backed-up data must be restored using mxrestore.
  • mxbackup supports the S3 object storage plugin, allowing data to be streamed directly to S3 without additional I/O overhead. To use S3 for backup and restore, you must prepare the required account and bucket permissions, and create a properly configured YAML file. See below for configuration parameters and usage instructions.

2 Parameter Information

2.1 mxbackup Command-Line Options

Parameter Description
--backup-dir *directory* Absolute path where backup files are written
--compression-level *level* Compression level (1–9), default is 1
--data-only Back up only data, not schema
--dbname *db* Database to back up
--debug Output debug-level log messages
--exclude-schema *schema* Schema to exclude from backup; can be specified multiple times
--exclude-schema-file *file* File containing a list of schemas to exclude
--exclude-table *table* Table to exclude from backup; can be specified multiple times
--exclude-table-file *file* File containing a list of tables to exclude
--from-timestamp *timestamp* Start timestamp for incremental backup; must be used with --incremental
--help Show help information
--history Display historical timestamps in the current backup directory
--include-schema *schema* Schema to include in backup; can be specified multiple times
--include-schema-file *file* File containing a list of schemas to include
--include-table *table* Table to include in backup; can be specified multiple times
--include-table-file *file* File containing a list of tables to include
--incremental Perform incremental backup (for AO tables only); requires --from-timestamp
--jobs *num* Number of concurrent connections during backup; default is 1
--leaf-partition-data Create separate data files for each leaf partition in partitioned tables
--metadata-only Back up only metadata, not table data
--no-compression Do not compress table data
--plugin-config *file* Specify the location of the plugin configuration file
--quiet Suppress non-warning, non-error log messages
--single-data-file Write all data to a single file instead of one file per table
--verbose Print detailed log messages
--version Print tool version and exit
--with-stats Include statistics in the backup
--without-globals Exclude global metadata from backup
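
The list-file options above (--include-table-file, --exclude-table-file, and the schema equivalents) expect a plain-text file with one entry per line; for tables, each entry is a fully qualified schema.table name. A minimal sketch (the file path and table names are hypothetical):

```shell
# Hypothetical table list for --include-table-file / --exclude-table-file:
# one fully qualified schema.table name per line.
cat > /tmp/mx_tables.txt <<'EOF'
twitter.tweets
twitter.users
EOF

# Pass the file to mxbackup (illustrative invocation):
# mxbackup --dbname demo --include-table-file /tmp/mx_tables.txt
```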

2.2 S3 Object Storage Plugin Configuration File Parameters

Parameter Description Required
executablepath Absolute path to the S3 storage plugin Yes
region Cloud platform region; ignored if endpoint is set Yes
aws_access_key_id S3 access key ID for the cloud platform Yes
aws_secret_access_key Secret key for the S3 access key ID Yes
bucket S3 bucket used to store mxbackup data files Yes
endpoint S3 endpoint URL No
encryption Enable SSL encryption for S3. Valid values: on, off; default is on No
http_proxy HTTP proxy server URL for connecting to S3 No
backup_max_concurrent_requests Maximum concurrent requests for mxbackup; default is 6 No
backup_multipart_chunksize Maximum buffer/chunk size for mxbackup; default is 500MB No
restore_max_concurrent_requests Maximum concurrent requests for mxrestore; default is 6 No
restore_multipart_chunksize Maximum buffer/chunk size for mxrestore; default is 500MB No

Below is a sample configuration file template. Keep only the parameters you need, and replace the content inside < > or [ ] (including the brackets) with actual values.

executablepath: <absolute-path-to-gpbackup_s3_plugin>
options: 
  region: <cloud-platform-region>
  endpoint: <S3-endpoint>
  aws_access_key_id: <access-key-ID>
  aws_secret_access_key: <secret-access-key>
  bucket: <S3-bucket>
  folder: <folder-on-S3-for-backup-data>
  encryption: [on|off]
  backup_max_concurrent_requests: [int]
  backup_multipart_chunksize: [string]
  restore_max_concurrent_requests: [int]
  restore_multipart_chunksize: [string]
  http_proxy:
    <http://<username>:<secret-key>@proxy.<domain>.com:<port>>
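
Before handing a configuration file to mxbackup, a quick grep-based sanity check for the keys the table above marks as required can catch an incomplete file early. A minimal sketch, assuming the file lives at /tmp/s3-config-file.yaml and using placeholder values:

```shell
cfg=/tmp/s3-config-file.yaml   # hypothetical location

# Write a minimal configuration (placeholder values).
cat > "$cfg" <<'EOF'
executablepath: $GPHOME/bin/gpbackup_s3_plugin
options: 
  region: us-west-2
  aws_access_key_id: test-s3-user
  aws_secret_access_key: asdf1234asdf
  bucket: matrixdb-backup
EOF

# Every key the table above marks as required should be present.
for key in executablepath region aws_access_key_id aws_secret_access_key bucket; do
  grep -q "^ *${key}:" "$cfg" || echo "missing required key: $key"
done
```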

3 Examples

3.1 Basic mxbackup Usage

In the following examples, the database name is demo, and the schema name is twitter.

Back up the entire database:

$ mxbackup --dbname demo

Back up the demo database, excluding the twitter schema:

$ mxbackup --dbname demo --exclude-schema twitter

Back up only the twitter schema in the demo database:

$ mxbackup --dbname demo --include-schema twitter

Back up the demo database and store backup files in the /home/mxadmin/backup directory:

$ mxbackup --dbname demo --backup-dir /home/mxadmin/backup
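
Timestamps accepted by --from-timestamp use the same YYYYMMDDHHMMSS format that mxbackup stamps on its backup directories. A sketch of an incremental run (the timestamp value and database name are illustrative; in practice the timestamp comes from an earlier full backup, which --history can list):

```shell
# A backup timestamp in mxbackup's YYYYMMDDHHMMSS format.
ts=20221208185654
echo "$ts"

# Incremental backup of AO tables changed since that timestamp (illustrative):
# mxbackup --dbname demo --incremental --from-timestamp "$ts"
```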

3.2 Using the S3 Object Storage Plugin

3.2.1 Prerequisites

Before using S3 for backup and restore, ensure you have the necessary cloud account and bucket permissions. These include, but are not limited to:

  • Upload and delete files in S3.
  • Open, browse, and download files from S3.

3.2.2 Usage Example

First, prepare the S3 plugin configuration file s3-config-file.yaml. This example includes common parameters. For a full list, refer to section 2.2 above.

executablepath: $GPHOME/bin/mxbackup_s3_plugin  # Absolute path to the S3 plugin
options: 
  region: us-west-2                      # Cloud platform region
  aws_access_key_id: test-s3-user        # S3 access key ID
  aws_secret_access_key: asdf1234asdf    # Secret key for the access key ID
  bucket: matrixdb-backup                # S3 bucket
  folder: backup3                        # Folder in S3 object storage

Then, use mxbackup to perform a parallel backup of the demo database:

$ mxbackup --dbname demo --plugin-config /tmp/s3-config-file.yaml

After a successful backup, mxbackup creates a timestamped directory in the S3 object storage. You can use mxrestore to restore the data from S3.

Example S3 backup path:

backup3/backups/20221208/20221208185654

Note!
The mxbackup S3 plugin writes its log to <gpadmin_home>/gpAdminLogs/gpbackup_s3_plugin_timestamp.log, where timestamp is in the format YYYYMMDDHHMMSS.

Note!
For more information about backup and restore operations in YMatrix, see Backup and Restore. For details about the restore tool, see mxrestore.