For distributed databases with massive storage, as the amount of data increases, insufficient cluster capacity and computing power is inevitable, so capacity expansion operations are indispensable. MatrixDB can achieve online expansion, including:
Online capacity expansion is divided into two steps:
This link adds a new segment node to the cluster and starts to use it, but the original table data is still stored on the old node.
To add a new node, you must first provide a deployment plan configuration file. The file can be written manually or automatically generated using the gpexpand
tool. It is recommended to use the automatic generation method.
Execute gpexpand
:
[mxadmin@mdw ~]$ gpexpand
......
Please refer to the Admin Guide for more information.
Would you like to initiate a new System Expansion Yy|Nn (default=N):
Enter y
to enter:
> y
Then ask to enter the new node host name and split it with comma:
Enter a comma separated list of new hosts you want
to add to your array. Do not include interface hostnames.
**Enter a blank line to only add segments to existing hosts**[]:
> sdw3,sdw4
Confirm that the new host name is connected to the network and that mutual trust has been established with the original node. And make sure
rsync
is installed
How many additional machines are added to each node? The value entered here is how many segments are added based on the number of segments deployed by the original node. If the number of nodes deployed by each host is the same as the original host, just enter 0 directly.
How many new primary segments per host do you want to add? (default=0):
> 0
Generating configuration file...
20211102:14:36:24:024562 gpexpand:mdw:mxadmin-[INFO]:-Generating input file...
Input configuration file was written to 'gpexpand_inputfile_20211102_143624'.
Please review the file and make sure that it is correct then re-run
with: gpexpand -i gpexpand_inputfile_20211102_143624
20211102:14:36:24:024562 gpexpand:mdw:mxadmin-[INFO]:-Exiting...
After the execution, the gpexpand_inputfile_20211102_143624
configuration file will be generated in the current directory:
[mxadmin@mdw ~]$ cat gpexpand_inputfile_20211102_143624
sdw3|sdw3|7002|/home/mxadmin/gpdemo/datadirs/dbfast1/demoDataDir3|5|3|p
sdw3|sdw3|7003|/home/mxadmin/gpdemo/datadirs/dbfast2/demoDataDir4|6|4|p
sdw3|sdw3|7004|/home/mxadmin/gpdemo/datadirs/dbfast3/demoDataDir5|7|5|p
sdw4|sdw4|7002|/home/mxadmin/gpdemo/datadirs/dbfast1/demoDataDir6|8|6|p
sdw4|sdw4|7003|/home/mxadmin/gpdemo/datadirs/dbfast2/demoDataDir7|9|7|p
sdw4|sdw4|7004|/home/mxadmin/gpdemo/datadirs/dbfast3/demoDataDir8|10|8|p
The configuration file contains new host information, as well as new segment port number, data directory, dbid, role and other information. This information corresponds to the gp_segment_configuration
table. The configuration file can also be written manually in this format.
After preparing the configuration file, perform the following operation to add a new node:
[mxadmin@mdw ~]$ gpexpand -i gpexpand_inputfile_20211102_143624
......
20211102:14:54:03:025371 gpexpand:mdw:mxadmin-[INFO]:-Unlocking catalog
20211102:14:54:03:025371 gpexpand:mdw:mxadmin-[INFO]:-Unlocked catalog
20211102:14:54:03:025371 gpexpand:mdw:mxadmin-[INFO]:-Creating expansion schema
20211102:14:54:03:025371 gpexpand:mdw:mxadmin-[INFO]:-Populating gpexpand.status_detail with data from database postgres
20211102:14:54:03:025371 gpexpand:mdw:mxadmin-[INFO]:-Populating gpexpand.status_detail with data from database template1
20211102:14:54:03:025371 gpexpand:mdw:mxadmin-[INFO]:-Populating gpexpand.status_detail with data from database mxadmin
20211102:14:54:03:025371 gpexpand:mdw:mxadmin-[INFO]:-************************************************
20211102:14:54:03:025371 gpexpand:mdw:mxadmin-[INFO]:-Initialization of the system expansion complete.
20211102:14:54:03:025371 gpexpand:mdw:mxadmin-[INFO]:-To begin table expansion onto the new segments
20211102:14:54:03:025371 gpexpand:mdw:mxadmin-[INFO]:-rerun gpexpand
20211102:14:54:03:025371 gpexpand:mdw:mxadmin-[INFO]:-************************************************
20211102:14:54:03:025371 gpexpand:mdw:mxadmin-[INFO]:-Exiting...
After successful execution, query the gp_segment_configuration
table and you will see that the new node has been joined.
The second step of online expansion is to redistribute the table data to the new node, because there is no data on the newly added segment node.
MatrixDB supports three distribution methods: random, hash, and replication, and they must be synchronized to the new segment node according to the rules. Among them, the replicated table must still be stored in the newly added segment node; the hash table must be redistributed according to the rules; the random table will not be migrated because of the irregular data distribution, but the newly inserted data will fall on the new segment node according to the probability.
After the first phase is completed, a schema: gpexpand
will be created in the postgres
database, including the expansion status information:
postgres=# \d
List of relations
Schema | Name | Type | Owner | Storage
----------+-----------------------------------------------------------------------------------------------------------------------
gpexpand | expansion_progress | view | mxadmin |
gpexpand | status | table | mxadmin | heap
gpexpand | status_detail | table | mxadmin | heap
(3 rows)
Among them, the status_detail
table records all tables that need to be redistributed and redistributed status:
postgres=# select * from gpexpand.status_detail ;
dbname | fq_name | table_oid | root_partition_name | rank | external_writable | status | expansion_started | expansion_finished | source_bytes
---------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
mxadmin | public.t2 | 16388 | | 2 | f | NOT STARTED | | | 8192
mxadmin | public.t1 | 16385 | | 2 | f | NOT STARTED | | | 16384
mxadmin | public.disk | 16391 | public.disk | 2 | f | NOT STARTED | | | 0
(3 rows)
Run gpexpand
directly to perform table redistribution:
[mxadmin@mdw ~]$ gpexpand
......
gpexpand:mdw:mxadmin-[INFO]:-Querying gpexpand schema for current expansion state
20211102:15:18:27:026291 gpexpand:mdw:mxadmin-[INFO]:-Expanding mxadmin.public.t2
20211102:15:18:27:026291 gpexpand:mdw:mxadmin-[INFO]:-Finished expanding mxadmin.public.t2
20211102:15:18:27:026291 gpexpand:mdw:mxadmin-[INFO]:-Expanding mxadmin.public.t1
20211102:15:18:27:026291 gpexpand:mdw:mxadmin-[INFO]:-Finished expanding mxadmin.public.t1
20211102:15:18:27:026291 gpexpand:mdw:mxadmin-[INFO]:-Expanding mxadmin.public.disk
20211102:15:18:27:026291 gpexpand:mdw:mxadmin-[INFO]:-Finished expanding mxadmin.public.disk
20211102:15:18:32:026291 gpexpand:mdw:mxadmin-[INFO]:-EXPANSION COMPLETED SUCCESSFULLY
20211102:15:18:32:026291 gpexpand:mdw:mxadmin-[INFO]:-Exiting...
After the table redistribution is completed, query the status_detail
table and you will see that the status has become COMPLETED
:
postgres=# select * from gpexpand.status_detail ;
dbname | fq_name | table_oid | root_partition_name | rank | external_writable | status | expansion_started | expansion_finished | source_bytes
---------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
mxadmin | public.t2 | 16388 | | 2 | f | COMPLETED | 2021-11-02 15:18:27.326247 | 2021-11-02 15:18:27.408379 | 8192
mxadmin | public.t1 | 16385 | | 2 | f | COMPLETED | 2021-11-02 15:18:27.431481 | 2021-11-02 15:18:27.507591 | 16384
mxadmin | public.disk | 16391 | public.disk | 2 | f | COMPLETED | 2021-11-02 15:18:27.531727 | 2021-11-02 15:18:27.570559 | 0
(3 rows)
If you want to increase the redistribution concurrency, you can add the -B
parameter when executing the gpexpand
command. The default value is 16 and the upper limit is 128, such as:
[mxadmin@mdw ~]$ gpexpand -B 32
After the redistribution is complete, run gpexpand -c
to clean up the intermediate table of the expansion process:
[mxadmin@mdw ~]$ gpexpand -c
20211102:15:24:41:026524 gpexpand:mdw:mxadmin-[INFO]:-local Greenplum Version: 'postgres (MatrixDB) 5.0.0-enterprise~alpha (Greenplum Database) 7.0.0 build dev'
20211102:15:24:41:026524 gpexpand:mdw:mxadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 12 (MatrixDB 5.0.0-enterprise~alpha) (Greenplum Database 7.0.0 build dev) on x86_64-pc-linux-gnu, compiled by gcc (GCC) 7.3.1 20180303 (Red Hat 7.3.1-5), 64-bit compiled on Oct 25 2021 15:24:16'
20211102:15:24:41:026524 gpexpand:mdw:mxadmin-[INFO]:-Querying gpexpand schema for current expansion state
Do you want to dump the gpexpand.status_detail table to file? Yy|Nn (default=Y):
> n
20211102:15:24:46:026524 gpexpand:mdw:mxadmin-[INFO]:-Removing gpexpand schema
20211102:15:24:46:026524 gpexpand:mdw:mxadmin-[INFO]:-Cleanup Finished. exiting...
At this point, the expansion was successful.
For detailed usage methods of gpexpand, please refer to Document