This document describes the cluster health monitoring feature of the graphical interface.
While supporting daily operations, YMatrix databases execute a large volume of SQL statements. Issues such as hardware failures (e.g., network outages) or lock contention due to transaction concurrency may occur. If not addressed promptly, these issues can slow down client responses or cause direct errors, affecting business efficiency. To better address such problems, the graphical health monitoring feature helps you quickly identify abnormal behaviors in the database cluster.
Health monitoring periodically checks relevant system catalog tables based on different detection items. It evaluates whether query execution states meet expected business conditions. When an unexpected state is detected, a notification is immediately sent. Notifications can be viewed within the graphical interface. For more timely alerts, you can also configure email notifications if checking the web page is inconvenient.
To log in to the graphical interface, enter the IP address (by default, the Master host's IP) and port number of the machine where MatrixGate is running into your browser's address bar.
http://<IP>:8240
After logging in successfully, navigate to the Health Monitoring page.
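If the page does not load, it can help to first confirm that the port is reachable from your machine. The following is a minimal sketch using only the Python standard library. The helper names `ui_url` and `port_is_open` are illustrative assumptions and are not part of YMatrix.

```python
import socket

DEFAULT_UI_PORT = 8240  # port shown in the URL above


def ui_url(host: str, port: int = DEFAULT_UI_PORT) -> str:
    """Build the graphical interface URL for a given host."""
    return f"http://{host}:{port}"


def port_is_open(host: str, port: int = DEFAULT_UI_PORT, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

For example, `port_is_open("<IP>")` with your Master host's IP returns `False` when a firewall blocks the port or the service is not running. A `True` result only shows that something is listening on the port, not that login will succeed.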

You may choose whether to configure email settings based on your needs. Once configured, you will receive alert notifications via email.
Graphical Interface Domain Name
To facilitate quick access to detailed alert information, we include a link in the email that redirects to the graphical interface. If recipients cannot access the default domain, modify this field accordingly.
SMTP Server Address
The SMTP server address consists of a host (IP address or domain name) and a port number, separated by a colon. Example: smtp.example.com:465.
Common third-party email service addresses:
Alibaba Cloud Mail:
Google Mail:
Enable IMAP or POP service first; see documentation.
NetEase Mail:
QQ Mail:
Note!
If the email service is self-hosted, consult your email administrator or service provider.
Username
The account used for authentication on the SMTP server. This field is optional and is required only when the SMTP server requires username-based authentication.
Password
The password for the SMTP user. This field is optional and required only when the SMTP server requires both username and password for authentication.
Note!
For self-hosted email services, consult your email administrator or service provider.
Sender
For third-party email services, this field should match the "Username".
For self-hosted services, enter the sender email address.
Recipients
Enter one or more recipient email addresses.
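Before saving these settings, you may want to verify them independently of the graphical interface. The sketch below uses Python's standard `smtplib` and `email` modules and mirrors the fields described above. The function names are hypothetical, and the use of implicit TLS is an assumption that fits port 465 but may not match your server.

```python
import smtplib
from email.message import EmailMessage


def parse_smtp_address(address: str) -> tuple[str, int]:
    """Split 'host:port' (e.g. 'smtp.example.com:465') into (host, port)."""
    host, sep, port = address.rpartition(":")
    if not sep or not host or not port.isdigit():
        raise ValueError(f"expected 'host:port', got {address!r}")
    return host, int(port)


def build_test_message(sender: str, recipients: list[str]) -> EmailMessage:
    """Compose a plain-text test message from the Sender and Recipients fields."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)
    msg["Subject"] = "SMTP settings test"
    msg.set_content("This is a test message to verify the SMTP settings.")
    return msg


def send_test_message(address, sender, recipients, username=None, password=None):
    """Send the test message. Assumes implicit TLS, which is typical for port 465."""
    host, port = parse_smtp_address(address)
    with smtplib.SMTP_SSL(host, port, timeout=10) as server:
        if username and password:  # authenticate only when credentials are given
            server.login(username, password)
        server.send_message(build_test_message(sender, recipients))
```

If your server expects STARTTLS instead (commonly on port 587), `smtplib.SMTP` followed by `starttls()` would be the usual substitute for `SMTP_SSL`. A message that arrives with this script suggests the same values should work in the form, although the graphical interface may connect differently.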

If you have configured email settings, you will receive an alert email whenever an event meets the failure condition of any detection item.
Regardless of whether email notifications are configured, you can view historical records of events that met detection failure conditions under Event History.

The following is a list of monitoring items provided by YMatrix:
| No. | Monitoring Item | Description |
| --- | --- | --- |
| 1 | Cluster Unavailable | Periodically runs the query `SELECT * FROM gp_dist_random('gp_id');` to check cluster availability. If this query fails three times consecutively, the cluster is likely down. Possible causes include a primary Segment and its mirror Segment failing simultaneously, network failure, power outage, or hardware failure. |
| 2 | Segment Failure | A failed primary Segment causes resource skew on the corresponding mirror Segment host. The mirror Segment’s host experiences increased load, slowing queries. In severe cases, memory exhaustion on the skewed node may render the cluster unavailable. A failed mirror Segment reduces high availability. If the corresponding primary Segment then fails, the cluster becomes unavailable. |
| 3 | Query/Transaction Running Over 12 Hours | Long-running queries or transactions consume excessive memory and CPU resources, degrading database response performance and potentially triggering OOM (out-of-memory). They may also delay VACUUM processes. |
| 4 | Transaction Idle in Transaction for Over 1 Hour | A transaction remaining idle in transaction state for a long time blocks most queries involving its tables and prevents VACUUM from reclaiming dead rows, leading to table bloat. |
| 5 | A Single Query/Transaction Blocks More Than 5 Others for Over 15 Minutes | If a query or transaction blocks many others for a prolonged period, it may trigger cascading blockages among other statements, reducing service responsiveness. |
| 6 | Query Requesting Exclusive or AccessExclusive Lock Blocked for Over 15 Minutes | A query requesting an Exclusive or AccessExclusive table-level lock, if blocked for a long duration, may cause a backlog of blocked queries, affecting response efficiency. |
| 7 | Query/Transaction Holding Exclusive or AccessExclusive Lock for Over 2 Hours | A query or transaction holding an Exclusive or AccessExclusive table-level lock for a long time blocks all queries accessing the locked table, impacting service responsiveness. |
| 8 | Transaction Holding Exclusive or AccessExclusive Lock in Idle-in-Transaction State for Over 15 Minutes | A transaction holding an Exclusive or AccessExclusive lock while idle in transaction for 15 minutes blocks most queries on related tables, affecting service responsiveness. |
| 9 | Disk | You can quickly enable or disable disk monitoring options including: “Disk Full”, “Disk Space Below 20%”, “Disk Will Be Exhausted Within 7 Days”, and “Abnormal Disk Growth in Last 24 Hours”. Click the “Edit” button to adjust thresholds according to business needs. |
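
The "three consecutive failures" rule of item 1 can be illustrated with a small sketch. This is not YMatrix's implementation, which the documentation does not describe. The class name `ConsecutiveFailureDetector` and the `probe` callable are assumptions made for illustration. In practice, `probe` would run the availability query through a database driver and raise an exception on failure.

```python
from typing import Callable

AVAILABILITY_QUERY = "SELECT * FROM gp_dist_random('gp_id');"  # query from item 1


class ConsecutiveFailureDetector:
    """Report a failure once `threshold` probes in a row have failed."""

    def __init__(self, probe: Callable[[], None], threshold: int = 3):
        self.probe = probe          # hypothetical callable; must raise on failure
        self.threshold = threshold
        self.failures = 0           # length of the current failure streak

    def check(self) -> bool:
        """Run one probe; return True if the cluster should be reported unavailable."""
        try:
            self.probe()
        except Exception:
            self.failures += 1
        else:
            self.failures = 0       # any success resets the streak
        return self.failures >= self.threshold
```

Counting consecutive failures rather than total failures keeps a single transient error, such as a brief network hiccup, from raising an alert.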
All items are enabled by default but can be toggled as needed.
If the default parameters do not suit your business scenario, you can edit them.
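
To make the idea of editable thresholds concrete, the following sketch models items 3 and 4 with their default values and shows how an override changes which sessions are flagged. All names here (`Thresholds`, `Session`, `violations`) are hypothetical, and the single `in_state_for` duration is a simplification. A real check would derive durations from system catalog data such as `pg_stat_activity`.

```python
from dataclasses import dataclass, replace
from datetime import timedelta


@dataclass(frozen=True)
class Thresholds:
    """Defaults taken from items 3 and 4 in the table above."""
    max_running: timedelta = timedelta(hours=12)
    max_idle_in_transaction: timedelta = timedelta(hours=1)


@dataclass(frozen=True)
class Session:
    """Hypothetical snapshot of one database session."""
    pid: int
    state: str                # 'active' or 'idle in transaction'
    in_state_for: timedelta   # simplified: time spent in the current state


def violations(sessions: list[Session],
               limits: Thresholds = Thresholds()) -> list[tuple[int, str]]:
    """Return (pid, reason) pairs for sessions that exceed a threshold."""
    found = []
    for s in sessions:
        if s.state == "active" and s.in_state_for > limits.max_running:
            found.append((s.pid, "running too long"))
        elif (s.state == "idle in transaction"
              and s.in_state_for > limits.max_idle_in_transaction):
            found.append((s.pid, "idle in transaction too long"))
    return found
```

A stricter policy can be expressed with `replace(Thresholds(), max_idle_in_transaction=timedelta(minutes=30))`, which leaves the other defaults unchanged.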

More
For Grafana alert configuration, refer to Grafana Cluster Alerts.