This document describes the data migration tool mxshift.
The mxshift tool currently supports the following features:
full data migration, incremental migration using WHERE conditions, replicated tables (REPLICATED), and Master-only tables.

This section describes the parameters of the mxshift tool.
Using mxshift requires preparing a TOML configuration file. Below are detailed parameter descriptions:
| Parameter | Description | Required |
|---|---|---|
| [database.source] category | ||
| --db-host | Hostname or IP address of the source cluster's Master node. The source cluster is the database cluster from which data is migrated. | Yes |
| --db-port | Port number of the source database Master. | Yes |
| --db-database | Name of the source database. | Yes |
| --db-user | Username for the source database. | Yes |
| --db-password | Password for the source database user. | Yes |
| --install-dir | Database installation directory (available in v5.0.1 and later; v5.0.0 requires the gphome configuration instead). | Yes |
| --hostname-to-ip | Maps hostnames to IP addresses. Use this when source and target clusters share hostnames to ensure accurate routing (supported starting from v5.0.1). | No |
| [database.target] category | ||
| --db-host | Hostname or IP address of the target cluster's Master node. The target cluster is where data is migrated into. | Yes |
| --db-port | Port number of the target database Master. | Yes |
| --db-database | Name of the target database. | Yes |
| --db-user | Username for the target database. | Yes |
| --db-password | Password for the target database user. | Yes |
| [scope] category | ||
| --mode | Operation mode. Allowed values: normal, dryrun, fetch, motion. normal: default value, performs a normal data migration; dryrun: verifies connectivity between the source and target databases using simple DDL operations, without reading or writing actual data; fetch: builds on dryrun by reading data from the source and sending it to the target Segments, where it is discarded, to test source read performance and network throughput; motion: extends fetch by parsing data on the target and redistributing it to the correct Segments, to test target-side data redistribution speed. | No |
| --compress-method | Compression method: gzip / zstd / lz4 / 0. Default is 0 (no compression). When any compression is enabled, the mxadmin user on each target Segment must be able to SSH passwordlessly to each source Segment; it is recommended to test with dryrun after setting this up. | No |
| --gphome | Absolute path to GPHOME on the source database (available in v5.0.0; use install-dir in v5.0.1 and later). | Yes |
| --table-list | List of source table schemas and names. For partitioned tables, specify the parent partitioned table. To migrate the entire database, leave this empty; mxshift will migrate all non-zero-sized tables in descending order of size. | No |
| --exclude-table-list | List of schema and table names to exclude from migration. | No |
| --schema-list | Source database schemas to migrate. Priority: if table-list specifies tables, only those tables are migrated; otherwise, if schema-list specifies schemas, all tables under those schemas are migrated; otherwise, all tables in the database specified by db-database under [database.source] are migrated. | No |
| --exclude-schema-list | Schemas to exclude from migration. Tables under these schemas will not be migrated. If a schema appears in both schema-list and exclude-schema-list, it is excluded. | No |
| [log] category | ||
| --log-level | Log level. Allowed values: debug, verbose, info. Default is info, which prints progress, error, and warning messages. Set to debug for additional debug output (useful for development and troubleshooting). | No |
| --no-color | Default is false (colored log output). Set to true to disable colored log output. | No |
| [controller] category | ||
| --both-way | Default is false: migration starts from the largest table so that the full migration completes as quickly as possible. If set to true, migration proceeds simultaneously from both the largest and the smallest tables. | No |
| --concurrency | Number of tables migrated in parallel. Default is 5, maximum is 10. | No |
| [transfer] category | ||
| --verify | Verifies row count consistency before and after migration. Default is true. | No |
| --with-index | Whether to migrate indexes. Default is false (indexes are not migrated). When disabled, mxshift saves the SQL statements for creating the indexes in a file under the current execution directory for manual creation later. Note: for tables created with the MARS2 storage engine, indexes must be created for the table to function properly; therefore, DDL migration for MARS2 tables automatically includes index migration regardless of this setting. | No |
| [ddl] category | ||
| --enabled | Whether to migrate table DDL. Default is false. When enabled, mxshift migrates DDL at the database level before migrating data. Supported DDL migration paths: Greenplum 6 → YMatrix 4 / YMatrix 5; YMatrix 4 → YMatrix 4 / YMatrix 5; YMatrix 5 → YMatrix 5. | No |
| --only-ddl | Whether to migrate only DDL. Default is false. If set to false, data migration follows DDL migration. If set to true, the tool exits after DDL migration without migrating table data. | No |
| --skip-resource-queue-and-group | Whether to skip resource queues and resource groups during DDL migration. Default is true. When true, resource groups and queues are skipped. | No |
| --skip-table-space | Whether to skip tablespaces during DDL migration. Default is true. When true, tablespaces are skipped. | No |
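For example, before running a real migration you can set mode to dryrun under [scope] to verify connectivity between the source and target clusters. A minimal sketch, reusing the connection values from the example templates below (replace them with your own):

[database]
[database.source]
db-database="testdb"
db-host="sdw3"
db-port=54322
db-user="gpadmin"
db-password="xxxx"
[database.target]
db-database="destdb"
db-host="172.16.100.32"
db-port=5432
db-user="mxadmin"
db-password="yyyy"
[scope]
## dryrun only runs simple DDL to check connectivity; no table data is read or written
mode="dryrun"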
To perform incremental data migration using WHERE conditions, use the following parameters:
| Category | Sub-category | Sub-parameter | Sub-sub-category | Sub-sub-parameter | Description | Required |
|---|---|---|---|---|---|---|
| [transfer] category | [transfer.table-data-where-sql] category | --enabled | | | If set to true, filters source data using a WHERE condition so that only matching rows are migrated, enabling incremental migration. This feature is off by default and must be enabled manually by setting it to true. | No |
| | | --global | | | Specifies a global SQL WHERE expression applied to all tables in the migration. If the expression does not apply to a table (for example, the referenced columns do not exist), the condition is skipped for that table automatically. | No |
| | | | [[transfer.table-data-where-sql.override]] category | --where | Table-specific WHERE expression. Takes precedence over the global WHERE condition and applies filtering at the individual table level. | No |
| | | | [transfer.table-data-where-sql.override.table] category | --schema | Schema name of the table to migrate. | No |
| | | | | --name | Name of the table to migrate. | No |
See examples below for usage.
Command-line options for mxshift:
| Parameter | Description |
|---|---|
| --config | Prints default content of the configuration file |
| -a or --all | Proceeds with migration without confirmation. Skips already completed and empty tables by default |
| -c or --config-path | Path to the global configuration file for mxshift |
| -v or --version | Displays version of the mxshift tool |
| -l or --show-progress | Shows progress of the most recent migration. If no migration is ongoing, displays the last result; if one is active, shows current progress. STATUS values: Not start: not in migration scope; Skip: skipped (already migrated or empty); Processing/Interrupted: in progress or interrupted; Complete: migration completed; Failed: migration failed |
| -R or --redo | Re-migrates tables even if previously migrated (clears existing data in target table before re-migration) |
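For example (assuming mxshift is on your PATH and config.toml is a placeholder file name), an unattended run and a forced re-migration might look like this:

$ mxshift -c config.toml -a     ## migrate without confirmation, skipping completed and empty tables
$ mxshift -c config.toml -R     ## re-migrate tables, clearing their existing data in the target first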
A complete template file is named config.toml. You can generate this template by running `mxshift --config`, then run `vim config.toml` to begin editing: copy the full content into your file and modify the relevant settings. Alternatively, use one of the example templates below based on your use case.
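For example, one possible workflow is to capture the printed template into a file and open it for editing (the output redirect and file name are just one way of doing this, not a tool requirement):

$ mxshift --config > config.toml
$ vim config.toml

A complete template looks like the following: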
[database]
[database.source]
## Name of database
db-database= "testdb"
## Hostname of database master
db-host="sdw3"
## password of database
db-password="xxxx"
## Port of database master
db-port=54322
## user name of database
db-user="mxadmin"
## The installation directory of database(v5.0.1 and above)
# install-dir="/usr/local/greenplum-db-6.7.1"
[database.target]
## Name of database
db-database="destdb"
## Hostname of database master
db-host="172.16.100.32"
## password of database
db-password="yyyy"
## Port of database master
db-port=5432
## user name of database
db-user="mxadmin"
[scope]
## The compress method for transferring data, methods restricted to 0/gzip/lz4/zstd
compress-method="lz4"
## The installation directory of database(v5.0.0)
# gphome="/usr/local/greenplum-db-6.7.1"
mode="normal"
[[scope.table-list]]
schema="test_schema_1"
name="table_001"
[[scope.table-list]]
schema="test_schema_2"
name="table_002"
[[scope.exclude-table-list]]
schema="test_schema_3"
name="table_003"
schema-list=["test_schema_1", "test_schema_2"]
exclude-schema-list=["test_schema_5", "test_schema_8"]
[log]
log-level="info"
[controller]
both-way=true
concurrency=5
[transfer]
verify=true
[transfer.table-data-where-sql]
enabled=false
global="txdate >= '2022-10-01' AND batchnum >= 100000000"
[[transfer.table-data-where-sql.override]]
where="abc > 10"
[transfer.table-data-where-sql.override.table]
schema="test_schema_1"
name="table_001"
[[transfer.table-data-where-sql.override]]
where="tag != 'aabbcc' AND ts > '2022-01-01'"
[transfer.table-data-where-sql.override.table]
schema="test_schema_2"
name="another_table"
[ddl]
enabled=true
only-ddl=false
skip-resource-queue-and-group=true
skip-table-space=true
[[ddl.replace]]
category="role"
[[ddl.replace.pairs]]
old="gpadmin"
new="mxadmin"
After preparing the TOML file, run the following command to start migration:
$ mxshift -c config_path.toml
The following example configuration performs a full data migration:
[database]
[database.source]
## Name of database
db-database= "testdb"
## Hostname of database master
db-host="sdw3"
## password of database
db-password="xxxx"
## Port of database master
db-port=54322
## user name of database
db-user="gpadmin"
## The installation directory of database(v5.0.1 and above)
# install-dir="/usr/local/greenplum-db-6.7.1"
[database.target]
## Name of database
db-database="destdb"
## Hostname of database master
db-host="172.16.100.32"
## password of database
db-password="yyyy"
## Port of database master
db-port=5432
## user name of database
db-user="mxadmin"
[scope]
## The compress method for transferring data, methods restricted to 0/gzip/lz4/zstd
compress-method="lz4"
## The installation directory of database(v5.0.0)
# gphome="/usr/local/greenplum-db-6.7.1"
## mode for transferring data from source to target database, and value restricted to normal/dryrun/fetch/motion.
##dryrun for only executing ddl, not transferring data
##fetch for fetching data from source and abandon
##motion for fetching data from source, redistributing and finally abandon
mode="normal"
## Sql for select segment information from source database
# select-source-segment-sql="SELECT dbid, content, port, hostname FROM gp_segment_configuration WHERE status = 'u' AND role = 'p' ORDER BY CONTENT;"
## Sql for select segment information from target database
# select-target-segment-sql="SELECT dbid, content, port, hostname FROM gp_segment_configuration WHERE status = 'u' AND role = 'p' ORDER BY CONTENT;"
[[scope.table-list]]
schema="test_schema_1"
name="table_001"
[[scope.table-list]]
schema="test_schema_2"
name="table_002"
[[scope.exclude-table-list]]
schema="test_schema_3"
name="table_003"
schema-list=["test_schema_1", "test_schema_2"]
exclude-schema-list=["test_schema_5", "test_schema_8"]
[log]
## The log level, value restricted to: debug/verbose/info.
log-level="info"
## Print log without color.
# no-color=false
[controller]
## By default, transfer work will start from the largest table. If both-way is set to true, it will start from both the largest and the smallest table
both-way=true
## The number of table transferred at the same time
concurrency=5
[transfer]
## Verify the number of records of every table
verify=true
with-index=true
[ddl]
enabled=true
only-ddl=false
## During the DDL transfer, whether to skip the transfer of resource queue or group, by default, it is true.
# skip-resource-queue-and-group=true
## During the DDL transfer, whether to skip the transfer of tablespace, by default, it is true.
# skip-table-space=true
[[ddl.replace]]
## Only applicable for the case of migration from Greenplum to YMatrix
category="role"
[[ddl.replace.pairs]]
old="gpadmin"
new="mxadmin"
After preparing the TOML file, run the following command to start full migration:
$ mxshift -c config_path_full.toml
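While the migration is running (or after it completes), you can check its status from another session with the progress option described in the command-line table above:

$ mxshift -l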
Perform incremental data migration using WHERE conditions:
[database]
[database.source]
## Name of database
db-database= "testdb"
## Hostname of database master
db-host="sdw3"
## password of database
db-password="xxxx"
## Port of database master
db-port=54322
## user name of database
db-user="gpadmin"
## The installation directory of database(v5.0.1 and above)
# install-dir="/usr/local/greenplum-db-6.7.1"
[database.target]
## Name of database
db-database="destdb"
## Hostname of database master
db-host="172.16.100.32"
## password of database
db-password="yyyy"
## Port of database master
db-port=5432
## user name of database
db-user="mxadmin"
[scope]
## The compress method for transferring data, methods restricted to 0/gzip/lz4/zstd
compress-method="lz4"
## The installation directory of database(v5.0.0)
# gphome="/usr/local/greenplum-db-6.7.1"
## mode for transferring data from source to target database, and value restricted to normal/dryrun/fetch/motion.
##dryrun for only executing ddl, not transferring data
##fetch for fetching data from source and abandon
##motion for fetching data from source, redistributing and finally abandon
mode="normal"
## Sql for select segment information from source database
# select-source-segment-sql="SELECT dbid, content, port, hostname FROM gp_segment_configuration WHERE status = 'u' AND role = 'p' ORDER BY CONTENT;"
## Sql for select segment information from target database
# select-target-segment-sql="SELECT dbid, content, port, hostname FROM gp_segment_configuration WHERE status = 'u' AND role = 'p' ORDER BY CONTENT;"
[[scope.table-list]]
schema=......
name=......
[[scope.table-list]]
......
[[scope.exclude-table-list]]
......
schema-list=["test_schema_1", "test_schema_2"]
exclude-schema-list=["test_schema_5", "test_schema_8"]
[log]
## The log level, value restricted to: debug/verbose/info.
log-level="info"
## Print log without color.
# no-color=false
[controller]
## By default, transfer work will start from the largest table. If both-way is set to true, it will start from both the largest and the smallest table
both-way=true
## The number of table transferred at the same time
concurrency=5
[transfer]
## Verify the number of records of every table
verify=true
[transfer.table-data-where-sql]
enabled=true
global="txdate >= '2022-10-01' AND batchnum >= 100000000"
[[transfer.table-data-where-sql.override]]
where="abc > 10"
[transfer.table-data-where-sql.override.table]
schema="test_schema_1"
name="table_001"
[[transfer.table-data-where-sql.override]]
where="tag != 'aabbcc' AND ts > '2022-01-01'"
[transfer.table-data-where-sql.override.table]
schema="test_schema_2"
name="another_table"
After preparing the TOML file, run the following command to start incremental migration:
$ mxshift -c config_path_incremental.toml
3.3 Migrate Replicated Tables
When migrating an entire database or all replicated tables under a specific schema, you do not need to add extra configuration in config_path.toml: mxshift automatically identifies which tables are replicated and applies the appropriate migration strategy to prevent data duplication. You only need to configure schema and name under [[scope.table-list]]:
[[scope.table-list]]
schema="public1"
name="table_replicated1"
[[scope.table-list]]
schema="public2"
name="table_replicated2"
After preparing the TOML file, run the following command to start the replicated table migration:
$ mxshift -c config_path_replicated.toml
Note!
For a detailed explanation of the data migration approach and complete steps, see Data Migration.