This document describes the cluster parallel backup utility mxbackup.
mxbackup performs parallel backups of a YMatrix cluster. The backed-up data must be restored using mxrestore. mxbackup supports the S3 object storage plugin, allowing data to be streamed directly to S3 without additional I/O overhead. To use S3 for backup and restore, you must prepare the required account and bucket permissions and create a properly configured YAML file. See below for configuration parameters and usage instructions.

| Parameter | Description |
|---|---|
| --backup-dir *directory* | Absolute path where backup files are written |
| --compression-level *level* | Compression level (1–9); default is 1 |
| --data-only | Back up only data, not schema |
| --dbname *db* | Database to back up |
| --debug | Output debug-level log messages |
| --exclude-schema *schema* | Schema to exclude from the backup; can be specified multiple times |
| --exclude-schema-file *file* | File containing a list of schemas to exclude |
| --exclude-table *table* | Table to exclude from the backup; can be specified multiple times |
| --exclude-table-file *file* | File containing a list of tables to exclude |
| --from-timestamp *timestamp* | Start timestamp for an incremental backup; must be used with --incremental |
| --help | Show help information |
| --history | Display historical timestamps in the current backup directory |
| --include-schema *schema* | Schema to include in the backup; can be specified multiple times |
| --include-schema-file *file* | File containing a list of schemas to include |
| --include-table *table* | Table to include in the backup; can be specified multiple times |
| --include-table-file *file* | File containing a list of tables to include |
| --incremental | Perform an incremental backup (AO tables only); requires --from-timestamp |
| --jobs *num* | Number of concurrent connections used during the backup; default is 1 |
| --leaf-partition-data | Create separate data files for each leaf partition of partitioned tables |
| --metadata-only | Back up only metadata, not table data |
| --no-compression | Do not compress table data |
| --plugin-config *file* | Location of the plugin configuration file |
| --quiet | Suppress all log messages except warnings and errors |
| --single-data-file | Write all data to a single file instead of one file per table |
| --verbose | Print detailed log messages |
| --version | Print the tool version and exit |
| --with-stats | Include statistics in the backup |
| --without-globals | Exclude global metadata from the backup |
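These flags can be combined in a single invocation. The following sketch backs up one database into a specific directory with four parallel connections, a higher compression level, and per-leaf-partition data files (the database name, path, and values are illustrative):

$ mxbackup --dbname demo --backup-dir /home/mxadmin/backup --jobs 4 --compression-level 5 --leaf-partition-data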
| Parameter | Description | Required |
|---|---|---|
| executablepath | Absolute path to the S3 storage plugin | Yes |
| region | Cloud platform region; ignored if endpoint is set | Yes |
| aws_access_key_id | S3 access key ID for the cloud platform | Yes |
| aws_secret_access_key | Secret key for the S3 access key ID | Yes |
| bucket | S3 bucket used to store mxbackup data files | Yes |
| endpoint | S3 endpoint URL | No |
| encryption | Enable SSL encryption for S3 connections; valid values are on and off, default is on | No |
| http_proxy | HTTP proxy server URL for connecting to S3 | No |
| backup_max_concurrent_requests | Maximum concurrent requests for mxbackup; default is 6 | No |
| backup_multipart_chunksize | Maximum buffer/chunk size for mxbackup; default is 500MB | No |
| restore_max_concurrent_requests | Maximum concurrent requests for mxrestore; default is 6 | No |
| restore_multipart_chunksize | Maximum buffer/chunk size for mxrestore; default is 500MB | No |
Below is a sample configuration file template. Select the required parameters and replace the content within < > or [ ] (including the symbols) with actual values.
executablepath: <absolute-path-to-gpbackup_s3_plugin>
options:
  region: <cloud-platform-region>
  endpoint: <S3-endpoint>
  aws_access_key_id: <access-key-ID>
  aws_secret_access_key: <secret-access-key>
  bucket: <S3-bucket>
  folder: <folder-on-S3-for-backup-data>
  encryption: [on|off]
  backup_max_concurrent_requests: [int]
  backup_multipart_chunksize: [string]
  restore_max_concurrent_requests: [int]
  restore_multipart_chunksize: [string]
  http_proxy: <http://<username>:<secret-key>@proxy.<domain>.com:<port>>
In the following examples, the database name is demo, and the schema name is twitter.
Back up the entire database:
$ mxbackup --dbname demo
Back up the demo database, excluding the twitter schema:
$ mxbackup --dbname demo --exclude-schema twitter
Back up only the twitter schema in the demo database:
$ mxbackup --dbname demo --include-schema twitter
Back up the demo database and store backup files in the /home/mxadmin/backup directory:
$ mxbackup --dbname demo --backup-dir /home/mxadmin/backup
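An incremental workflow combines --history, --from-timestamp, and --incremental. As a sketch (the timestamp value is illustrative), first list the historical timestamps in the backup directory, then take an incremental backup of AO tables starting from one of them:

$ mxbackup --dbname demo --backup-dir /home/mxadmin/backup --history
$ mxbackup --dbname demo --backup-dir /home/mxadmin/backup --incremental --from-timestamp 20221208185654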
Before using S3 for backup and restore, ensure you have the necessary cloud account and bucket permissions, including (but not limited to) permission to list the bucket and to read and write objects in it.
First, prepare the S3 plugin configuration file s3-config-file.yaml. The example below includes the common parameters; for the full list, refer to the S3 plugin parameter table above.
executablepath: $GPHOME/bin/mxbackup_s3_plugin  # Absolute path to the S3 plugin
options:
  region: us-west-2                             # Cloud platform region
  aws_access_key_id: test-s3-user               # S3 access key ID
  aws_secret_access_key: asdf1234asdf           # S3 secret access key
  bucket: matrixdb-backup                       # S3 bucket
  folder: backup3                               # Directory name in S3 object storage
Then, use mxbackup to perform a parallel backup of the demo database:
$ mxbackup --dbname demo --plugin-config /tmp/s3-config-file.yaml
After a successful backup, mxbackup creates a timestamped directory in S3 object storage. You can then use mxrestore to restore the data from S3.
Example S3 backup path:
backup3/backups/20221208/20221208185654
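A corresponding restore might look like the following sketch, assuming mxrestore accepts --timestamp and --plugin-config as described in its own documentation (the timestamp matches the backup directory above):

$ mxrestore --timestamp 20221208185654 --plugin-config /tmp/s3-config-file.yaml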
Note!
The log file for mxbackup is located at <gpadmin_home>/gpAdminLogs/gpbackup_s3_plugin_timestamp.log, with timestamps in the format YYYYMMDDHHMMSS.
Note!
For more information about backup and restore operations in YMatrix, see Backup and Restore. For details about the restore tool, see mxrestore.