High Availability FAQ

This document answers frequently asked questions about high availability in YMatrix 5.X.

1 Is etcd deployed on each host?


etcd storage can be thought of as a fully replicated table, so the cluster does not need many nodes. In practice, the recommended configuration is 1, 3, 5, or 7 nodes. The installation and deployment process automatically decides how many etcd instances to deploy based on the number of hosts.
Therefore, if the cluster has an even number of hosts, or more than 7 hosts, some machines will not run an etcd node. You can use the ps command to check whether an etcd process exists on the current host.
Usually, etcd nodes are placed on the Master and Standby hosts.
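
For a quick check, here is a minimal sketch that looks for an etcd process on the local host, assuming the binary shows up as etcd in the process table (the name may differ if your deployment wraps it):

```python
import subprocess

def etcd_is_running() -> bool:
    """Return True if a process named 'etcd' appears in the process table."""
    result = subprocess.run(
        ["ps", "-eo", "comm"],   # list the command name of every process
        capture_output=True, text=True, check=True,
    )
    return any(line.strip() == "etcd" for line in result.stdout.splitlines())

if __name__ == "__main__":
    print("etcd running on this host:", etcd_is_running())
```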

2 Why does the number of members of the etcd cluster need to be odd?


In etcd, an odd number of nodes is used to ensure the stability and consistency of the election process. In the Raft protocol, leader election follows the majority principle: more than half of the nodes must vote for a candidate before it can become the leader, which makes it easier for an election to reach consensus. For example, with 5 nodes at least 3 must agree on the same leader; with 7 nodes, at least 4 must agree.

This configuration not only guarantees that the election result is unique, but also keeps the election as short as possible. In addition, an odd number of nodes provides better fault tolerance: in the event of a failure or network anomaly, a majority of nodes can still carry the election through, preserving system availability and consistency. With an even number of nodes, a tie can occur, leaving the election unable to complete or its outcome undetermined.
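
The majority arithmetic can be written out directly. A short sketch showing how quorum size and fault tolerance relate to the member count:

```python
def quorum(n: int) -> int:
    """Minimum number of members that must agree (a strict majority)."""
    return n // 2 + 1

def fault_tolerance(n: int) -> int:
    """How many members can fail while a quorum still remains."""
    return n - quorum(n)

for n in (1, 3, 4, 5, 6, 7):
    print(f"{n} members: quorum={quorum(n)}, tolerates {fault_tolerance(n)} failure(s)")

# 4 members tolerate only 1 failure, the same as 3 members, which is why
# even-sized clusters add cost without adding resilience.
```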

3 What routine operation and maintenance does etcd require?


First, you need to deploy monitoring for etcd. 5.0 currently supports deploying monitoring (Prometheus + Grafana) for etcd.
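
etcd exposes Prometheus-format metrics over HTTP, which is what the Prometheus + Grafana stack scrapes. A minimal sketch that pulls a couple of commonly watched metrics, assuming the metrics endpoint is reachable at http://localhost:2379/metrics (your listen address and port may differ):

```python
from urllib.request import urlopen

METRICS_URL = "http://localhost:2379/metrics"   # assumption: default client port

# Standard etcd metric names worth keeping an eye on.
WATCHED = (
    "etcd_server_has_leader",            # 1 if this member currently sees a leader
    "etcd_mvcc_db_total_size_in_bytes",  # on-disk database size
)

with urlopen(METRICS_URL, timeout=5) as resp:
    for line in resp.read().decode().splitlines():
        if line.startswith(WATCHED):
            print(line)
```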

4 How much data does etcd hold? Does it require any special operation and maintenance work?


In 5.X, etcd cleans up its data automatically at regular intervals, so the size of its data directory and its memory footprint stay within a relatively fixed range.
However, after a node failure and the subsequent recovery operation, etcd's data will grow slightly for a short period. As long as the etcd data directory does not keep growing beyond 1.5 GB, this is normal. It is recommended to track this through monitoring and check it regularly.
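
A minimal sketch of such a periodic check against the 1.5 GB guideline (the data directory path is an assumption; substitute the directory your deployment actually uses):

```python
import os

ETCD_DATA_DIR = "/var/lib/etcd"        # assumption: adjust to your deployment
THRESHOLD_BYTES = 1.5 * 1024 ** 3      # the 1.5 GB guideline mentioned above

def dir_size(path: str) -> int:
    """Total size in bytes of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if os.path.isfile(full):
                total += os.path.getsize(full)
    return total

size = dir_size(ETCD_DATA_DIR)
print(f"etcd data directory: {size / 1024 ** 2:.1f} MiB")
if size > THRESHOLD_BYTES:
    print("WARNING: etcd data directory exceeds 1.5 GB; investigate before it keeps growing.")
```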

5 After the introduction of etcd, what has changed about deploying a database cluster through the graphical interface?


There is no change in the user experience; the installation and deployment pages and workflow are exactly the same as before.

6 5.X claims to have implemented Master Auto-failover. Why doesn't the Master recover automatically after it is shut down and a switchover happens?


There is a conceptual misunderstanding here. A Segment has two dimensions:

  • One dimension is role: Primary or Mirror
  • One dimension is state: Up or Down

In the current version, the Master Auto-failover feature refers to the automatic switching of node roles. That is, after the Master goes offline, the Standby can automatically take over the Master role; this action is called Promote or Failover.
Once a Segment's state changes from Up to Down, it must be brought back to Up through a manual node recovery operation (mxrecover).
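
Both dimensions can be seen at once in the cluster catalog. A minimal sketch, assuming a Greenplum-style gp_segment_configuration catalog and the psycopg2 driver (connection parameters are placeholders):

```python
import psycopg2

# Assumption: Greenplum-style catalog; adjust connection parameters to your cluster.
conn = psycopg2.connect(host="localhost", port=5432, dbname="postgres", user="mxadmin")

with conn, conn.cursor() as cur:
    cur.execute("""
        SELECT content, hostname,
               CASE role   WHEN 'p' THEN 'Primary' ELSE 'Mirror' END AS role,
               CASE status WHEN 'u' THEN 'Up'      ELSE 'Down'   END AS state
        FROM gp_segment_configuration
        ORDER BY content, role;
    """)
    for content, hostname, role, state in cur.fetchall():
        # role answers "who is serving"; state answers "is it alive";
        # only the Down -> Up transition requires a manual mxrecover.
        print(f"content={content:<3} host={hostname:<12} role={role:<8} state={state}")
```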

7 How long does an automatic Master switchover take to occur?


  1. When the Master's postmaster process crashes but the host and network are otherwise healthy, the Master can switch over to the Standby quickly.
  2. When the Master host is powered off, network-isolated, or similarly unreachable, the automatic switchover is delayed and retried for a period of time determined by the relevant configuration parameters.

8 Is it normal for the cluster to crash after more than half of the etcd processes are abnormal (killed or unable to start)?


Yes.

The postgres processes can stay alive only as long as their lease in the etcd cluster stays alive. When the etcd service itself fails (more than half of the etcd nodes are down or unreachable), the postgres processes cannot maintain their leases, and downtime is the inevitable result. Therefore, deploy etcd monitoring rigorously and pay close attention to its health. Items to monitor include, but are not limited to: remaining disk space, disk I/O, network connectivity, and the liveness of the supervisor process on each host.
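
As a complement to Prometheus alerts, here is a minimal liveness sketch that polls each member's /health endpoint (the endpoint list is an assumption; list the client URLs of your actual etcd members):

```python
import json
from urllib.error import URLError
from urllib.request import urlopen

# Assumption: replace with the client URLs of your actual etcd members.
ENDPOINTS = [
    "http://mdw:2379",
    "http://smdw:2379",
    "http://sdw1:2379",
]

healthy = 0
for ep in ENDPOINTS:
    try:
        with urlopen(f"{ep}/health", timeout=3) as resp:
            body = json.loads(resp.read().decode())
            ok = str(body.get("health")).lower() == "true"
    except (URLError, ValueError, OSError):
        ok = False
    healthy += ok
    print(f"{ep}: {'healthy' if ok else 'UNHEALTHY'}")

# More than half of the members must be healthy for the cluster to keep serving leases.
if healthy <= len(ENDPOINTS) // 2:
    print("ALERT: etcd has lost quorum; postgres leases cannot be maintained.")
```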