Single-Table Migration Tool - mxgate

Starting from version 4.3, MatrixGate supports migration mode, enabling high-speed data migration from Greenplum 5, Greenplum 6, and YMatrix clusters into the current YMatrix cluster.

1 High-Speed Migration Principle

High-speed migration allows data to bypass the Master node. Data is read directly from the source cluster's Segments, forwarded through mxgate, and sent directly to the target cluster's Segments. This eliminates the bottleneck of a single Master node.

The number of Segments in the source and target clusters can differ.

2 Usage

2.1 Single-Table Migration

Single-table migration transfers all data from a specified table. Example usage:

mxgate --source transfer \
         --src-host 172.31.41.7 \
         --src-port 5432 \
         --src-db postgres \
         --src-user ec2-user \
         --src-password abc \
         --src-schema public \
         --src-table trans_ao \
         --compress "gzip" \
         --port-base 9129 \
         --local-ip 172.31.33.128 \
         --db-database ttt \
         --target public.trans_ao \
         --format text \
         --time-format raw \
         --use-auto-increment=false

Parameters:

Parameter Description
--source Functional entry point; must be set to 'transfer'
--src-host IP address of the source cluster's master
--src-port Port number of the source cluster's master
--src-user Username for connecting to the source database (superuser is recommended)
--src-password Connection password
--src-schema Schema name of the source table
--src-table Name of the source table
--compress Data transfer method from source segment hosts to this machine:
Empty string "": no compression, plain text transfer
gzip: use gzip compression; requires the gzip Linux command installed on source segment hosts
lz4: use lz4 compression; requires the lz4 Linux command installed on source segment hosts
Recommended order: lz4 > gzip > no compression
--port-base Range of ports used during transfer starts from 9129
--local-ip Must be an IP address reachable from the source cluster
--db-database Target database name where the destination table resides
--target Target table name in the format <schema>.<table>; if schema is omitted, defaults to public
--format text or csv. Use csv only when data contains complex strings (e.g., newlines, quotes, delimiters). Otherwise, prefer text
--time-format Must be raw in transfer mode
--use-auto-increment By default, mxgate skips serial auto-increment columns in the target table. Use this option to disable that behavior

2.2 Export to File

Another use of migration mode is to quickly export data to files:

mxgate --source transfer \
         --src-host 172.31.41.7 \
         --src-port 5432 \
         --src-db postgres \
         --src-user ec2-user \
         --src-schema public \
         --src-table trans_ao_1 \
         --compress "lz4" \
         --port-base 9129 \
         --local-ip 172.31.33.128 \
         --save-to-dir /tmp/receive/ \
         --db-database ttt \
         --transform nil \
         --writer nil \
         --target trans_ao

Use the --save-to-dir parameter to specify the file storage path.

Note: Even when exporting to files, you must provide --db-database and --target to specify the target database and table, and the target database and table must exist.

2.3 Filtered Migration

Filtered migration uses the --src-sql parameter to specify a SQL query that filters the data to be migrated. This works for both table-to-table and table-to-file migrations:

mxgate --source transfer \
         --src-host 172.31.41.7 \
         --src-port 5432 \
         --src-db postgres \
         --src-user ec2-user \
         --src-sql "select * from demo where c1 = 'xxxx'" \
         --compress "lz4" \
         --port-base 9129 \
         --local-ip 172.31.33.128 \
         --save-to-dir /tmp/receive/ \
         --db-database ttt \
         --transform nil \
         --writer nil \
         --target trans_ao

For a complete list of mxgate migration tool parameters, see mxgate.