The mxshift tool currently supports the following features:
This section describes how to write the configuration file config.json required to start the mxshift tool, including its parameters and command-line options.
To use mxshift, you must first prepare a JSON configuration file named config.json. An example is shown below:
{
"mode": "normal",
"verify": true,
"bothway": true,
"concurrency": 5,
"compress_method": "lz4",
"log_level": "info",
"gphome": "/usr/local/greenplum-db-6.7.1",
"src_host": "sdw3",
"src_db": "testdb",
"src_user": "gpadmin",
"src_password": "xxxx",
"src_port": 54322,
"target_host": "172.16.100.32",
"target_db": "destdb",
"target_user": "mxadmin",
"target_password": "yyyy",
"target_port": 5432,
"where": {
"enabled": false,
"global": "txdate >= '2022-10-01' AND batchnum >= 100000000",
"override": [
{
"schema": "test_schema_1",
"name": "table_001",
"where": "abc > 10"
},
{
"schema": "test_schema_2",
"name": "another_table",
"where": "tag != 'aabbcc' AND ts > '2022-01-01'"
}
]
},
"schema_list": [
"test_schema_1",
"test_schema_2"
],
"exclude_schema_list": [
"test_schema_5",
"test_schema_8"
],
"table_list": [
{
"schema": "test_schema_1",
"name": "table_001"
},
{
"schema": "test_schema_2",
"name": "table_002"
}
],
"exclude_table_list":[
{
"schema": "test_schema_3",
"name": "table_003"
}
]
}
Below is a detailed description of each parameter in the configuration file:
| Parameter | Description |
|---|---|
mode |
Operation mode. Allowed values: normal, dryrun, fetch, motion. normal: Default value; performs normal data migration.dryrun: Checks connectivity to both databases and validates with simple DDL operations, but does not read or write data.fetch: Based on dryrun, reads data from the source and sends it to target segments, but discards it without processing. Used to test source read performance and network transfer speed.motion: Extends fetch by parsing data on the target and redistributing it across proper target segments. Used to test target-side data redistribution rate. |
verify |
Default is false. If enabled, verifies that row counts match before and after migration. |
bothway |
By default, mxshift migrates the largest tables first to complete full migration as quickly as possible. If set to true, mxshift processes both the largest and smallest tables simultaneously from both ends. |
concurrency |
Maximum number of tables migrated in parallel. Default is 5; maximum allowed is 10. |
compress_method |
Compression method: gzip, zstd, lz4, or 0 (no compression). Default is 0. When any compression is enabled, the mxadmin user on each target Segment must be able to SSH passwordlessly to each source Segment. It is recommended to test this setup using dryrun. |
deduplicate (Experimental) |
If a previous mxshift migration was interrupted, enabling this option allows mxshift to resume and complete only the remaining tables, skipping those already successfully migrated. This feature is experimental. |
log_level |
Logging level. Default is info, which outputs progress, error, and warning messages. Set to debug for more detailed logs, useful for development and troubleshooting. |
gphome |
Absolute path to the GPHOME directory on the source database. |
src_host |
Hostname or IP address of the Master node of the source GP cluster. |
src_db |
Source GP database from which data will be migrated. |
src_user |
Username for connecting to the source GP database. |
src_password |
Password for the source GP database user. |
src_port |
Port number of the source GP database. |
target_host |
Hostname or IP address of the Master node of the target MatrixDB cluster. |
target_db |
Target MatrixDB database to which data will be migrated. |
target_user |
Username for connecting to the target MatrixDB database. |
target_password |
Password for the target MatrixDB database user. |
target_port |
Port number of the target MatrixDB database. |
table_list |
List of source GP tables to migrate. For partitioned tables, specify the parent partitioned table. To migrate an entire database, leave this empty and configure schema_list instead. mxshift will automatically migrate all non-empty tables in descending order of size. |
exclude_table_list |
List of schema and table names to exclude from migration. |
schema_list |
List of schemas in the source GP database to include. Priority order: if table_list is specified, only those tables are migrated; otherwise, if schema_list is defined, all tables under the listed schemas are migrated; if neither is defined, all tables in the database are considered. |
exclude_schema_list |
List of schemas to exclude from migration. Tables under these schemas will not be migrated. If a schema appears in both schema_list and exclude_schema_list, it will be excluded. |
no-color |
Disables colored output in logs. |
If you want to perform incremental migration using WHERE conditions, use the following additional parameters:
| Parameter | Description |
|---|---|
enabled |
Disabled by default. Set to true to enable WHERE-based filtering. |
global |
A SQL WHERE expression applied to all tables in the migration. If a table does not contain the columns referenced in the WHERE clause, the filter is skipped for that table, and mxshift will mark it as SKIPPED. |
override |
An array of objects, each containing schema, name, and a custom where condition. These override the global WHERE condition for specific tables. |
The mxshift command-line options are as follows:
| Option | Description |
|---|---|
-c or --config_path |
Path to the mxshift configuration file (e.g., config.json). |
-n or --no-color |
Disable colored log output. |
-v or --version |
Display the version of the mxshift tool. |
Example configuration file config.json for full migration:
{
"mode": "normal",
"verify": false,
"bothway": true,
"concurrency": 5,
"compress_method": "",
"log_level": "info",
"gphome": "/usr/local/greenplum-db-6.7.1",
"src_host": "sdw3",
"src_db": "testdb",
"src_user": "gpadmin",
"src_password": "greenplum@2020",
"src_port": 54322,
"target_host": "172.16.100.32",
"target_db": "ms",
"target_user": "mxadmin",
"target_password": "xxxx",
"target_port": 5432,
"schema_list":["test_schema_1", "test_schema_2"],
"exclude_schema_list":["test_schema_5", "test_schema_8"],
"table_list":[
{
"schema":"test_schema_1",
"name":"table_001"
},
{
"schema":"test_schema_2",
"name":"table_002"
}
],
"exclude_table_list":[
{
"name":"test_schema_3",
"schema":"table_003"
}
]
}
After preparing the JSON file, run the following command to start the full migration:
mxshift -c config_path.json
Use WHERE conditions to perform incremental migration:
{
"mode": "normal",
"verify": false,
"bothway": true,
"concurrency": 5,
"compress_method": "",
"log_level": "info",
"gphome": "/usr/local/greenplum-db-6.7.1",
"src_host": "sdw3",
"src_db": "testdb",
"src_user": "gpadmin",
"src_password": "greenplum@2020",
"src_port": 54322,
"target_host": "172.16.100.32",
"target_db": "ms",
"target_user": "mxadmin",
"target_password": "xxxx",
"target_port": 5432,
"where": {
"enabled": true,
"global": "txdate >= '2022-10-01' AND batchnum >= 100000000",
"override": [
{
"schema":"test_schema_1",
"name":"table_001",
"where": "abc > 10"
}
]
},
"schema_list":["test_schema_1", "test_schema_2"],
"exclude_schema_list":["test_schema_5", "test_schema_8"],
"table_list":[
...
],
"exclude_table_list":[
...
]
}
After preparing the JSON file, run the following command to start the incremental migration:
mxshift -c config_path.json
Note: Ensure
"enabled"is set totrueto activate the WHERE filter.
mxshift automatically identifies replicated tables and applies appropriate migration logic to prevent data duplication.table_list is a replicated table, you must explicitly declare it by adding the policyType field with value "r".Example configuration snippet:
"table_list":[
{
"schema":"public",
"name":"tbl_test_replica",
"policyType":"r"
}
],
After preparing the JSON file, run the following command to start the replicated table migration:
mxshift -c config_path.json
Note!
For a complete overview of data migration strategies and procedures, see Data Migration.