mxbackup

This document describes the cluster parallel backup utility mxbackup.

1 Description

  • mxbackup performs parallel backups of a YMatrix cluster. Backed-up data must be restored using mxrestore.
  • mxbackup supports the S3 object storage plugin, allowing data to be streamed directly to S3 without additional I/O overhead. To use S3 for backup and restore, prepare your account credentials and bucket permissions, and create a properly configured YAML file. See below for configuration details and usage instructions.

2 Parameter Information

2.1 mxbackup Command-Line Options

| Parameter | Description |
| --- | --- |
| --backup-dir *directory* | Absolute path where backup files are written |
| --compression-level *level* | Compression level (1–9); default is 1 |
| --data-only | Back up only table data, not schema definitions |
| --dbname *db* | Database to back up |
| --debug | Output debug-level log messages |
| --exclude-schema *schema* | Exclude the specified schema from the backup; can be used multiple times |
| --exclude-schema-file *file* | File containing a list of schemas to exclude |
| --exclude-table *table* | Exclude the specified table from the backup; can be used multiple times |
| --exclude-table-file *file* | File containing a list of tables to exclude |
| --from-timestamp *timestamp* | Start timestamp for an incremental backup; must be used with --incremental |
| --help | Show help message |
| --history | List historical timestamps in the current backup directory |
| --include-schema *schema* | Include only the specified schema; can be used multiple times |
| --include-schema-file *file* | File containing a list of schemas to include |
| --include-table *table* | Include only the specified table; can be used multiple times |
| --include-table-file *file* | File containing a list of tables to include |
| --incremental | Perform an incremental backup (AO tables only); requires --from-timestamp |
| --jobs *num* | Number of concurrent connections during the backup; default is 1 |
| --leaf-partition-data | Create separate data files for each leaf partition of a partitioned table |
| --metadata-only | Back up only metadata, not table data |
| --no-compression | Do not compress table data |
| --plugin-config *file* | Path to the plugin configuration file |
| --quiet | Suppress non-warning, non-error log messages |
| --single-data-file | Write all data to a single file instead of one file per table |
| --verbose | Print detailed log messages |
| --version | Print version number and exit |
| --with-stats | Include statistics in the backup |
| --without-globals | Do not back up global metadata (e.g., roles, tablespaces) |
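As an illustration of how `--incremental`, `--from-timestamp`, and `--history` combine, the sketch below assumes a database named `demo`; the timestamp is a placeholder of the kind reported by `--history`, and passing `--leaf-partition-data` on the base backup is an assumption (some backup tools require it for incremental sets), not something this manual confirms.

```shell
# Take a full backup that can serve as the base of an incremental set
# (--leaf-partition-data here is an assumption, not confirmed by this manual)
$ mxbackup --dbname demo --leaf-partition-data

# List the timestamps recorded in the current backup directory
$ mxbackup --history

# Back up only AO-table data changed since the chosen base backup;
# 20221208185654 is a placeholder timestamp taken from --history output
$ mxbackup --dbname demo --leaf-partition-data --incremental --from-timestamp 20221208185654
```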

2.2 S3 Object Storage Plugin Configuration File Parameters

| Parameter | Description | Required |
| --- | --- | --- |
| executablepath | Absolute path to the S3 storage plugin executable | Yes |
| region | Cloud platform region; ignored if endpoint is set | Yes |
| aws_access_key_id | S3 access key ID for authenticating to the bucket | Yes |
| aws_secret_access_key | Secret key for the S3 access key ID | Yes |
| bucket | S3 bucket used to store mxbackup files | Yes |
| folder | Folder path in the S3 bucket where backup files are stored | No |
| endpoint | Custom S3 endpoint URL | No |
| encryption | Enable SSL encryption for S3 communication. Valid values: on, off. Default: on | No |
| http_proxy | HTTP proxy server URL for connecting to S3 | No |
| backup_max_concurrent_requests | Maximum number of concurrent backup requests. Default: 6 | No |
| backup_multipart_chunksize | Maximum buffer/chunk size for backup uploads. Default: 500MB | No |
| restore_max_concurrent_requests | Maximum number of concurrent restore requests. Default: 6 | No |
| restore_multipart_chunksize | Maximum buffer/chunk size for restore downloads. Default: 500MB | No |

Below is a sample configuration template. Select required parameters and replace content within "<>" or "[]" (including the brackets) with actual values.

```yaml
executablepath: <absolute-path-to-gpbackup_s3_plugin>
options:
  region: <cloud-region>
  endpoint: <S3-endpoint>
  aws_access_key_id: <user-ID>
  aws_secret_access_key: <user-key>
  bucket: <S3-bucket-name>
  folder: <folder-path-on-S3>
  encryption: [on|off]
  backup_max_concurrent_requests: [int]
  backup_multipart_chunksize: [string]
  restore_max_concurrent_requests: [int]
  restore_multipart_chunksize: [string]
  http_proxy: <http://<username>:<password>@proxy.<domain>.com:port>
```

3 Examples

3.1 Basic mxbackup Usage

Assume the database name is demo and the schema name is twitter.

Back up the entire database:

$ mxbackup --dbname demo

Back up the demo database excluding the twitter schema:

$ mxbackup --dbname demo --exclude-schema twitter

Back up only the twitter schema in the demo database:

$ mxbackup --dbname demo --include-schema twitter

Back up the demo database to a specific directory, `/home/mxadmin/backup`:

$ mxbackup --dbname demo --backup-dir /home/mxadmin/backup
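For the `*-file` options, the list file is assumed here to contain one name per line, fully qualified as `schema.table` for table lists; the file path and table names below are hypothetical.

```
twitter.tweets
twitter.mentions
```

With such a file saved as `/home/mxadmin/exclude-tables.txt`, the backup would be run as:

$ mxbackup --dbname demo --exclude-table-file /home/mxadmin/exclude-tables.txt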


3.2 Using the S3 Object Storage Plugin

3.2.1 Prerequisites

Before using S3 for backup and restore, ensure you have proper cloud account access and bucket permissions, including:
  • Upload and delete objects in the S3 bucket.
  • List, open, and download files from the S3 bucket.
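Expressed as an AWS-style IAM policy, the bucket permissions above correspond roughly to the sketch below; the bucket name `matrixdb-backup` is only an example, and the exact actions and resource syntax should be adapted to your cloud platform.

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:PutObject",
        "s3:DeleteObject",
        "s3:GetObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::matrixdb-backup",
        "arn:aws:s3:::matrixdb-backup/*"
      ]
    }
  ]
}
```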

3.2.2 Example Usage

First, prepare the S3 plugin configuration file, e.g., `s3-config-file.yaml`. This example includes common settings. See section 2.2 for full parameter descriptions.

```yaml
executablepath: $GPHOME/bin/mxbackup_s3_plugin   # Absolute path to S3 plugin
options:
  region: us-west-2                              # Cloud region
  aws_access_key_id: test-s3-user                # S3 login ID
  aws_secret_access_key: asdf1234asdf            # S3 login key
  bucket: matrixdb-backup                        # S3 bucket name
  folder: backup3                                # Directory name in S3
```


Then, run `mxbackup` to back up the `demo` database using the S3 plugin:

$ mxbackup --dbname demo --plugin-config /tmp/s3-config-file.yaml


After a successful backup, `mxbackup` creates a timestamped directory in S3. You can later restore this data using [mxrestore](/en/doc/4.8/tools/mxrestore).
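On the restore side, the same plugin configuration file is passed to `mxrestore`. Assuming it accepts `--timestamp` and `--plugin-config` options (check `mxrestore --help` or the linked page to confirm), restoring this backup would look like:

$ mxrestore --timestamp 20221208185654 --plugin-config /tmp/s3-config-file.yaml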

Example S3 path after backup:

backup3/backups/20221208/20221208185654



> ***Note!***
> The `mxbackup` S3 plugin log file is located at `<gpadmin_home>/gpAdminLogs/gpbackup_s3_plugin_timestamp.log`, where the timestamp uses the format `YYYYMMDDHHMMSS`.

> ***Note!***
> For more information about backup and recovery in YMatrix, see [Backup and Restore](/en/doc/4.8/maintain/backup_restore). For details on restoration tools, refer to [mxrestore](/en/doc/4.8/tools/mxrestore).