Command-Line Online Expansion

If you prefer to perform capacity expansion from the command line, YMatrix also provides the corresponding tools and SQL statements for flexible, smooth expansion. In general, cluster expansion consists of two main steps: adding new nodes and redistributing data. Adding new nodes is a quick operation; in YMatrix it can be completed without shutting down the cluster. Data redistribution is more involved. Unless configured otherwise, any newly created table is distributed across all nodes by default, and a table or partition being redistributed is locked against reads and writes, blocking access for the duration. Moreover, in some business scenarios users do not need to redistribute the data of every table; it is enough to define a new distribution only for certain old or new tables.

To meet this need, YMatrix provides the **smooth expansion** feature. By creating a SEGMENT_SET object, you can define a collection of data nodes independently and specify that a table's data is distributed over that object (that is, only over that specific collection of data nodes).
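As a minimal sketch of the idea (the names and content IDs here are hypothetical; the full syntax appears in section 5.2 below):

=# -- Hypothetical: ss_new is a set covering only some segment content IDs.
=# CREATE SEGMENT_SET ss_new SEGMENTS (4, 5);
=# -- A new table whose rows live only on the segments in ss_new.
=# CREATE TABLE t_new (a int, b int) USING MARS3 DISTRIBUTED BY (a) SEGMENT_SET ss_new ORDER BY (a);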

YMatrix also supports redistributing table data in parallel, joining tables that have been redistributed with tables that have not, and so on.

1 System Planning

Before a formal expansion operation, it is essential to plan the expansion systematically. The following task list covers a complete expansion process.

| Phase | No. | Task |
| --- | --- | --- |
| Preparation | 1 | Install the same version of YMatrix software on the new hosts |
| | 2 | Check cluster status |
| | 3 | View software version |
| | 4 | Add host names to the system file |
| Expansion | 5 | Execute capacity expansion |
| Post-expansion checks | 6 | View the new cluster |
| | 7 | View data distribution |

After the expansion is complete, plan the data redistribution method before actually performing it. For example, time-series data partitioned by time usually shows clear hot/cold characteristics (hot and cold are relative notions tied mainly to how frequently the data is queried: high-frequency data is relatively hot). As time passes, hot data gradually turns cold and is queried less, so there is no need to devote more resources to serving it, and redistributing it is unnecessary. In that case it is better to set the distribution of new data in advance (at table creation time) and use YMatrix's smooth expansion feature to realize the plan. Conversely, if the database mostly contains non-partitioned tables, or if each partition of a partitioned table holds fairly uniform data and continues to receive new data, redistributing all data is more appropriate.

If you want to redistribute all data, refer to the steps in Table 1.

Table 1

| Phase | No. | Task |
| --- | --- | --- |
| Data redistribution | 1 | Execute the full data redistribution command |
| Post-redistribution tasks | 2 | View the current redistribution status |
| | 3 | View data distribution |

If you want to perform smooth expansion, refer to Table 2 below. The table lists four different ways to achieve smooth expansion; **choose one of them**.

Table 2

| Method | Description |
| --- | --- |
| (1) | Create the required data node collection (CREATE SEGMENT_SET) and specify it when creating a new table or partition |
| (2) | When creating a table, directly specify a data node collection that already existed before the expansion |
| (3) | Do not specify a data node collection for the new table; distribution then follows the mx_default_segment_set parameter, whose default is global distribution |
| (4) | Besides new data, some old data also needs its distribution re-planned: handle old and new data separately, redistributing the old data with ALTER TABLE SET SEGMENT_SET and applying one of methods (1), (2), or (3) to the new data |

2 Preparation

A concrete walkthrough follows. Suppose you currently have three servers, mdw, sdw1, and sdw2, and plan to add two new machines, sdw3 and sdw4. The sample cluster contains the following test table test, holding 100 rows of test data.

=# CREATE EXTENSION matrixts;
=# CREATE TABLE test (a int,b int) USING MARS3 DISTRIBUTED BY (a) ORDER BY (a);
=# INSERT INTO test SELECT i,i+10 FROM generate_series(1, 100) AS i;
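Before expanding, it is worth recording how these rows are spread across the existing segments so you can compare later; the hidden gp_segment_id column used again in section 4.2 works here too:

=# SELECT gp_segment_id, count(*) FROM test GROUP BY gp_segment_id ORDER BY 1;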

2.1 Install the same version of YMatrix software on the new hosts

First, install the same version of the YMatrix software on the two new hosts as on the existing hosts in the cluster. For the specific deployment steps, see [Online Cluster Deployment, Sections 1-3](/doc/5.1/install/mx5_cluster/mx5_cluster).

2.2 Check cluster status

Next, query the gp_segment_configuration table to check the cluster status and make sure all node instances are healthy.

=# SELECT * FROM gp_segment_configuration ORDER BY 1 ASC;

 dbid | content | role | preferred_role | mode | status | port | hostname | address |           datadir
------+---------+------+----------------+------+--------+------+----------+---------+------------------------------
    1 |      -1 | p    | p              | n    | u      | 5432 | mdw      | mdw     | /data/mxdata/master/mxseg-1
    2 |       0 | p    | p              | n    | u      | 6000 | sdw1     | sdw1    | /data/mxdata/primary/mxseg0
    3 |       1 | p    | p              | n    | u      | 6001 | sdw1     | sdw1    | /data/mxdata/primary/mxseg1
    4 |       2 | p    | p              | n    | u      | 6000 | sdw2     | sdw2    | /data/mxdata/primary/mxseg2
    5 |       3 | p    | p              | n    | u      | 6001 | sdw2     | sdw2    | /data/mxdata/primary/mxseg3
(5 rows)
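On a larger cluster, a scripted check that no instance reports a status other than u (up) is quicker than reading the full listing; a result of 0 means all instances are healthy:

=# -- Count instances that are not up; 0 means the cluster is healthy.
=# SELECT count(*) AS down_instances FROM gp_segment_configuration WHERE status <> 'u';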

2.3 View software version

Then use the version() function to confirm that the YMatrix version matches the existing one.

=# SELECT version();
                                   version
---------------------------------------------------------------------------------------------------------------
 PostgreSQL 12 (MatrixDB 5.0.0-enterprise) (Greenplum Database 7.0.0+dev.17976.g99b559cbeb build commit:99b559cbebca8c47dfb2de58fb10b6e2a04f91a0) on x86_64-pc-linux-gnu, compiled by gcc (GCC) 7.3.1 20180303 (Red Hat 7.3.1-5), 64-bit compiled on Aug 16 2022 16:18:55
(1 row)

2.4 Add host names to the system file

After confirming the cluster status and software version, edit the /etc/hosts system file on every node with root privileges and add the new host names. Keep the network available throughout the process.

[<regular-user>@mdw ~]$ sudo vi /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.247.128 mdw
192.168.247.129 sdw1
192.168.247.130 sdw2
192.168.247.131 sdw3
192.168.247.132 sdw4
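After editing the file on every node, a quick reachability check of all host names from the Master can catch typos early (a convenience sketch, not part of the official procedure):

$ # Ping each host once; report any name that does not resolve or respond.
$ for h in mdw sdw1 sdw2 sdw3 sdw4; do ping -c 1 -W 2 $h > /dev/null && echo "$h ok" || echo "$h unreachable"; done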

3 Execute capacity expansion

Note!
All of the following expansion operations are performed on the Master node.

3.1 Collect existing cluster information

$ sudo /opt/ymatrix/matrixdb5/bin/mxctl expand init > /tmp/init_output

The existing cluster information is written to the file /tmp/init_output as structured text.
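Since the output is ordinary text, you can inspect what was collected before moving on:

$ head /tmp/init_output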

3.2 Add the new hosts

$ cat /tmp/init_output | sudo /opt/ymatrix/matrixdb5/bin/mxctl expand add --host sdw3 > /tmp/add.1
$ cat /tmp/add.1 | sudo /opt/ymatrix/matrixdb5/bin/mxctl expand add --host sdw4 > /tmp/add.2

Run the command once for each host you add, using each output as the input of the next invocation.
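If you are adding many hosts, the chaining can be scripted; a sketch for the two hosts in this walkthrough, following the file-naming convention above:

$ # Start the chain from the collected cluster information, then add each host in turn.
$ cp /tmp/init_output /tmp/add.0
$ i=0; for h in sdw3 sdw4; do cat /tmp/add.$i | sudo /opt/ymatrix/matrixdb5/bin/mxctl expand add --host $h > /tmp/add.$((i+1)); i=$((i+1)); done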

3.3 Run a network connectivity test

$ cat /tmp/add.2 | sudo /opt/ymatrix/matrixdb5/bin/mxctl expand netcheck > /tmp/ncheck

If the exit code (check it with echo $?) is greater than 0, the test failed.
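Because failure is signalled through the exit code, the pipeline can also be guarded directly, for example:

$ # Warn immediately if the network check does not pass.
$ cat /tmp/add.2 | sudo /opt/ymatrix/matrixdb5/bin/mxctl expand netcheck > /tmp/ncheck || echo "network check failed, fix connectivity before continuing"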

3.4 Generate an expansion plan

$ cat /tmp/ncheck | sudo /opt/ymatrix/matrixdb5/bin/mxbox deployer expand > /tmp/expandplan

The generated /tmp/expandplan file contains the data directory paths, the number of segments, the mirror distribution strategy, and other details. Advanced users can edit this file directly.

3.5 Execute the expansion plan

$ cat /tmp/expandplan | sudo /opt/ymatrix/matrixdb5/bin/mxbox deployer exec

Congratulations, the expansion is now complete.

4 Post-expansion checks

4.1 View new cluster

=# SELECT * FROM gp_segment_configuration ORDER BY 1;
 dbid | content | role | preferred_role | mode | status | port | hostname | address |                datadir
------+---------+------+----------------+------+--------+------+----------+---------+----------------------------------------
    1 |      -1 | p    | p              | n    | u      | 5432 | mdw      | mdw     | /mxdata_20220331173619/master/mxseg-1
    2 |       0 | p    | p              | n    | u      | 6000 | sdw1     | sdw1    | /mxdata_20220331173619/primary/mxseg0
    3 |       1 | p    | p              | n    | u      | 6001 | sdw1     | sdw1    | /mxdata_20220331173619/primary/mxseg1
    4 |       2 | p    | p              | n    | u      | 6000 | sdw2     | sdw2    | /mxdata_20220331173619/primary/mxseg2
    5 |       3 | p    | p              | n    | u      | 6001 | sdw2     | sdw2    | /mxdata_20220331173619/primary/mxseg3
    6 |       4 | p    | p              | n    | u      | 6000 | sdw3     | sdw3    | /mxdata_20220331173619/primary/mxseg4
    7 |       5 | p    | p              | n    | u      | 6001 | sdw3     | sdw3    | /mxdata_20220331173619/primary/mxseg5
    8 |       6 | p    | p              | n    | u      | 6000 | sdw4     | sdw4    | /mxdata_20220331173619/primary/mxseg6
    9 |       7 | p    | p              | n    | u      | 6001 | sdw4     | sdw4    | /mxdata_20220331173619/primary/mxseg7
(9 rows)
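As an extra sanity check, you can confirm that each host now carries the expected number of primary segments (two per host in this layout):

=# -- Count primary segments per host; the Master (content = -1) is excluded.
=# SELECT hostname, count(*) AS primaries FROM gp_segment_configuration WHERE content >= 0 AND role = 'p' GROUP BY hostname ORDER BY hostname;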

4.2 View data distribution

The following SQL statement shows that the original table's data still resides only on the old data nodes. Now it is time to plan how the data should be distributed.

=#  SELECT gp_segment_id,count(*) FROM public.test GROUP BY gp_segment_id;

5 Planning data distribution

As noted above, data redistribution is an important and time-consuming task after cluster expansion. Before the expansion, business data is spread over the old nodes; the expansion adds new available nodes, and the process of spreading the old nodes' data evenly across all nodes of the current cluster is called data "redistribution". When it matches the needs of the scenario, redistribution greatly improves service response; when it does not, it wastes considerable resources and time. It is therefore well worth thinking through and planning how the data should be distributed.

Two approaches are described below; **choose one** based on the characteristics of your time-series scenario.

5.1 Redistribute all data

The newly added data nodes hold no data yet. Use the following commands to redistribute all the old data across the new cluster; you can either schedule the operation for a later time or run it immediately.

Note!
It is recommended to perform data redistribution during an off-peak business period.

Execute now:

$ sudo /opt/ymatrix/matrixdb5/bin/mxctl expand redistribute

Scheduled execution:

$ sudo /opt/ymatrix/matrixdb5/bin/mxctl expand redistribute --schedule "2022-07-20 22:00:00"

After scheduling, you can re-run the above command at any time before the scheduled time arrives, or remove the --schedule parameter to trigger the redistribution immediately.

During and after the redistribution you can check its status via the system tables described in [Expand System Catalogs](/doc/5.1/maintain/expand_system_catalogs), and use the following statement to view the cluster's data distribution again.

=# SELECT gp_segment_id,count(*) FROM public.test GROUP BY gp_segment_id;

5.2 Redistribute part of the data

This is what YMatrix calls **smooth expansion**. As planned earlier, choose one of the four available methods according to your actual situation.

| Method | Description |
| --- | --- |
| (1) | Create the required data node collection (CREATE SEGMENT_SET) and specify it when creating a new table or partition |
| (2) | When creating a table, directly specify a data node collection that already existed before the expansion |
| (3) | Do not specify a data node collection for the new table; distribution then follows the mx_default_segment_set parameter, whose default is global distribution |
| (4) | Besides new data, some old data also needs its distribution re-planned: handle old and new data separately, redistributing the old data with ALTER TABLE SET SEGMENT_SET and applying one of methods (1), (2), or (3) to the new data |

5.2.1 Create a new data node collection

Create a new collection of data nodes using the SQL statement CREATE SEGMENT_SET in YMatrix.

=# CREATE SEGMENT_SET name SEGMENTS (content_id, ...);
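For example, to create a set named ss1 (used in the examples below) covering only the four segments added in this walkthrough, whose content IDs are 4 through 7 as shown in section 4.1:

=# -- ss1 covers only the segments added during the expansion.
=# CREATE SEGMENT_SET ss1 SEGMENTS (4, 5, 6, 7);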

5.2.2 Specify an existing data node collection

Use SQL in YMatrix to place a table on an already created data node collection. The three statements below show the three available distribution policies, each creating t1 on the set ss1, so run only one of them:

=# CREATE TABLE t1(a int, b int) USING MARS3 DISTRIBUTED BY(a) SEGMENT_SET ss1 ORDER BY (a);
=# CREATE TABLE t1(a int, b int) USING MARS3 DISTRIBUTED REPLICATED SEGMENT_SET ss1 ORDER BY (a);
=# CREATE TABLE t1(a int, b int) USING MARS3 DISTRIBUTED RANDOMLY SEGMENT_SET ss1 ORDER BY (a);

For more usage information, see CREATE SEGMENT_SET.

5.2.3 Use the default value

When you create a new table in YMatrix without specifying a data node collection, the data is distributed over the collection named by the mx_default_segment_set parameter, which defaults to all nodes of the cluster. You can change this default with the SET statement, for example:

=# SET mx_default_segment_set TO 'ss1';
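Assuming the parameter behaves like a regular configuration setting, the current value can be read back with SHOW:

=# SHOW mx_default_segment_set;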

5.2.4 Redistribute part of the old data

Besides new data, some old data may also need its distribution re-planned. In that case, handle old and new data separately: redistribute the old data with the ALTER TABLE SET SEGMENT_SET statement, and apply one of the methods from 5.2.1, 5.2.2, or 5.2.3 to the new data. In the following example, the table t, which already existed before the expansion, is redistributed onto the data node set ss1.

=# ALTER TABLE t SET SEGMENT_SET ss1;

Table t can also be redistributed across all nodes of the expanded cluster:

=# ALTER TABLE t SET SEGMENT_SET all_segments;

or

=# ALTER TABLE t EXPAND TABLE;
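After either statement, you can verify the new placement with the same distribution query used earlier:

=# SELECT gp_segment_id, count(*) FROM t GROUP BY gp_segment_id ORDER BY 1;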

At this point, you have completed an efficient cluster expansion entirely from the command line!