Alert Configuration

Grafana's alerting feature evaluates monitored metrics against the threshold conditions defined in your alert rules and notifies users when a condition is met. This helps you identify and resolve issues promptly and keeps your cluster secure and stable.

This section describes how to configure and manage alerting components. YMatrix supports alerting for the following metrics:

  • Database connectivity status
  • Instance failure in the cluster
  • Data directory disk usage exceeds a specified threshold
  • Disk read/write I/O exceeds a specified threshold

Four alert notification methods are available. You may choose one or combine multiple methods as needed:

  • SMS alerts
  • Voice call alerts
  • Email alerts
  • DingTalk alerts

To use alerting, you must first deploy and enable monitoring. For detailed steps, see: Monitor Deployment

After deployment, you will see an interface similar to the following:
Default Monitoring Alert Dashboard

1 Deployment

Before importing the predefined alert dashboard, you must copy the alert.json file from the server to your local machine. As with dashboard.json and database.json, the process involves locating the file on the server, copying it to your local machine, and uploading it to Grafana from there. Follow these steps:

Log in to the server and switch to the mxadmin user. Locate the alert.json file at the path shown below. You can use the cd command, the find command, or any other method you prefer.

[mxadmin@mdw ~]$ cd /opt/ymatrix/matrixdb5/share/doc/postgresql/extension

##or

[mxadmin@mdw ~]$ find /opt/ymatrix/matrixdb5/share/doc/postgresql/extension -name alert.json

Next, use the scp command to copy the file to your local machine. Permission issues may arise, so consider first copying the file to the shared /tmp/ directory, then transferring it from /tmp/ to your local system.

Note!
When copying from /tmp/, ensure you switch users appropriately to avoid permission issues.

[mxadmin@mdw ~]$ scp mxadmin@<server IP address>:/opt/ymatrix/matrixdb5/share/doc/postgresql/extension/alert.json mxadmin@<server IP address>:/tmp/

##then, on your local machine

$ scp waq@<server IP address>:/tmp/alert.json /Users/akkepler/workplace/Grafana

Finally, open your local folder or use a command-line tool to verify the file was copied successfully. After confirmation, batch-replace all instances of the variable $host in the local alert.json file with actual hostnames. For example, if your cluster consists of a master and two segments named mdw, sdw1, and sdw2, replace every $host with 'mdw', 'sdw1', 'sdw2'. Once modified, import the file into the Grafana interface.
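
The batch replacement can be scripted. The following is a minimal sketch, assuming the placeholder appears literally as $host in the file and that hostnames contain no characters special to sed (such as / or &). The function name replace_host_var is hypothetical and is not part of YMatrix or Grafana.

```shell
# Hypothetical helper: replace every literal $host in FILE with a
# quoted, comma-separated host list such as 'mdw', 'sdw1', 'sdw2'.
# Usage: replace_host_var FILE HOST...
replace_host_var() {
  file=$1
  shift
  list=""
  for h in "$@"; do
    # Append each host in single quotes, separated by ", ".
    list="${list:+$list, }'$h'"
  done
  # -i.bak edits in place and keeps the original as FILE.bak
  # (this form works with both GNU and BSD sed).
  sed -i.bak "s/\\\$host/$list/g" "$file"
}
```

For the example cluster, this would be called as: replace_host_var alert.json mdw sdw1 sdw2. Compare the result with alert.json.bak before importing it.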

Note!
The dashboard.json and database.json files uploaded to your Grafana interface must also have their variables modified, and they must be imported before alert.json. In dashboard.json, replace all ${cluster} with local and $host with actual hostnames. In database.json, only replace ${cluster} with local.
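
The ${cluster} substitution can be sketched the same way. This is an illustration only, assuming the placeholder appears literally as ${cluster}; the function name replace_cluster_var is hypothetical.

```shell
# Hypothetical helper: replace every literal ${cluster} in FILE with "local".
# Usage: replace_cluster_var FILE
replace_cluster_var() {
  # In a basic regular expression, \$ matches a literal dollar sign and
  # the braces are ordinary characters. The original is kept as FILE.bak.
  sed -i.bak 's/\${cluster}/local/g' "$1"
}
```

Run it once for dashboard.json and once for database.json, then import those two files before alert.json.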

When importing, select the alert.json file that you copied from /opt/ymatrix/matrixdb5/share/doc/postgresql/extension and modified locally.

The alert dashboard functions the same as the monitoring dashboard. You can switch between them using the dropdown menu located in the upper-right corner of each panel.

2 Configuration Guide

2.1 Configure Notification Channels

First, configure notification channels. Access the configuration page as shown:

Default Monitoring Alert Panel - Notification Channels

2.1.1 SMS Alerts

YMatrix provides built-in SMS alerting via Alibaba Cloud SMS service, which is recommended. When an alert is triggered, SMS messages are sent to configured phone numbers. Bulk sending is supported. Before use, apply for and configure Alibaba Cloud SMS service. For details, refer to:

Alibaba Cloud Documentation: Quick Start for Domestic SMS

Alternatively, you can implement custom SMS alerts using Webhooks. This requires programming the Webhook logic yourself. For more information, see Grafana Alert Documentation: Webhook Notifications.

2.1.1.1 Create Configuration File

Create an alert.yaml file under /etc/matrixdb/:

# aliyun service config
aliyun_service:
  access_key_id: "your access_key_id"
  access_key_secret: "your access_key_secret"
  signature: "signature"
  sms_template_code: "SMS_123445678"

Here:

  • access_key_id and access_key_secret: Provided by Alibaba Cloud after service activation.
  • signature: The approved signature text registered in the Alibaba Cloud SMS console.
  • sms_template_code: Template ID starting with "SMS_", registered and approved in the Alibaba Cloud SMS console.
  • The alert message variable in the template must be ${name}.

Example template:

Dear administrator, your database system has triggered a ${name} alert. Please log in to view details and take action promptly.

2.1.1.2 Configure WebHook in Grafana

As shown in the image:
Notification Channel Config - Alibaba Cloud SMS

Name the channel following this convention:

Aliyun Batch Short Message - For SuperAdministrators

"SuperAdministrators" represents a group of recipients. To define another group, create a separate rule. Note that the phoneNumbers parameter is case-sensitive. Multiple numbers can be specified, separated by commas:

http://:/api/alert/batch-sms?phoneNumbers=18311111111,13811111111
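
To avoid typos in long recipient lists, the URL can be assembled from its parts. The following is a sketch only: the function name sms_webhook_url is hypothetical, and HOST and PORT are placeholders that you must supply for your own deployment, since the URL above does not state them.

```shell
# Hypothetical helper: print the batch-SMS webhook URL for a list of numbers.
# HOST and PORT are placeholders for the address of the alert service.
# Usage: sms_webhook_url HOST PORT NUMBER...
sms_webhook_url() {
  host=$1
  port=$2
  shift 2
  # Join the remaining arguments with commas: "$*" uses the first
  # character of IFS as the separator.
  numbers=$(IFS=,; printf '%s' "$*")
  # phoneNumbers is written with this exact capitalization (case-sensitive).
  printf 'http://%s:%s/api/alert/batch-sms?phoneNumbers=%s\n' "$host" "$port" "$numbers"
}
```

Paste the printed URL into the Url field of the webhook notification channel.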

2.1.2 Voice Call Alerts

In addition to SMS, YMatrix supports voice call alerts via Alibaba Cloud's voice service. When triggered, calls are made to configured numbers. Bulk calling is supported. Before use, apply for and configure Alibaba Cloud Voice Service. For details, see: Alibaba Cloud Documentation: Quick Start for Domestic Voice Service

Custom voice alerts via Webhook are possible but require custom logic implementation. Refer to Grafana Alert Documentation: Webhook Notifications for details.

2.1.2.1 Edit Configuration File

Add tts_template_code and region_id to the alert.yaml file under /etc/matrixdb/:

# aliyun service config
aliyun_service:
  access_key_id: "your access_key_id"
  access_key_secret: "your access_key_secret"
  signature: "signature"
  sms_template_code: "SMS_123445678"
  tts_template_code: "TTS_123456788"
  region_id: "cn-hangzhou"

Here:

  • access_key_id and access_key_secret: Shared with SMS service.
  • tts_template_code: TTS template ID starting with "TTS_", registered and approved in the Alibaba Cloud TTS console.
  • The alert message variable must be ${name}.
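
Before restarting anything, it can help to confirm that the file contains the keys described above. The following is a rough sketch that only looks for indented "key:" lines and does not validate YAML syntax or the values themselves; the function name check_alert_keys is hypothetical.

```shell
# Hypothetical helper: report keys missing from an alert.yaml-style file.
# Usage: check_alert_keys FILE KEY...
# Returns 0 when every KEY appears as an indented "key:" line, 1 otherwise.
check_alert_keys() {
  file=$1
  shift
  rc=0
  for key in "$@"; do
    # Match lines such as '  region_id: "cn-hangzhou"'.
    if ! grep -Eq "^[[:space:]]+$key:" "$file"; then
      echo "missing key: $key" >&2
      rc=1
    fi
  done
  return "$rc"
}
```

For voice alerts, one possible call is: check_alert_keys /etc/matrixdb/alert.yaml access_key_id access_key_secret tts_template_code region_id.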

Example:

Dear administrator, your database system has triggered a ${name} alert. Please log in to view details and take action promptly.

2.1.2.2 Configure WebHook in Grafana

Notification Channel Config - Alibaba Cloud Voice

Use a channel name following this format:

Aliyun Voice Message - For SuperAdministrators

"SuperAdministrators" refers to the recipient group. Define additional rules for other groups. The phoneNumbers parameter is case-sensitive and supports multiple comma-separated numbers:

http://:/api/alert/vms?phoneNumbers=18311111111,13811111111

2.1.3 Email Alerts

Grafana includes built-in email alerting, which is simple to configure and use.

2.1.3.1 Create Configuration File

Email alerts use Grafana's native SMTP support. Configure the SMTP server in Grafana's main configuration file. The file location varies by deployment. On CentOS 7, the default path is:

/etc/grafana/grafana.ini

For more information, see: Grafana Official Documentation: Configuration

A sample configuration is provided below:

#################################### SMTP / Emailing #####################
[smtp]
enabled = true
host = <your smtp host>
user = <your user>
# If the password contains # or ; you have to wrap it with triple quotes. Ex """#password;"""
password = <your password>
;cert_file =
;key_file =
skip_verify = true
from_address = <your email address>
from_name = Grafana
;ehlo_identity =
;startTLS_policy =

[emails]
welcome_email_on_sign_up = false
templates_pattern = emails/*.html

2.1.3.2 Configure Email Notification Channel in Grafana

As shown:
Notification Channel Config - Email

Name the channel as:

Email - For SuperAdministrators

"SuperAdministrators" represents the recipient group. Define additional rules for other groups.

Note!
If configuration fails, check /var/log/grafana/grafana.log for error details.
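
One way to narrow down the log is to filter for error-level lines. This is a sketch under an assumption: that error lines contain "lvl=eror" (older Grafana log output) or "level=error" (newer output). Adjust the pattern if your log format differs. The function name recent_grafana_errors is hypothetical.

```shell
# Hypothetical helper: print the last 20 error-level lines of a Grafana log.
# Assumption: error lines carry "lvl=eror" or "level=error".
# Usage: recent_grafana_errors [FILE]
recent_grafana_errors() {
  grep -E 'lvl=eror|level=error' "${1:-/var/log/grafana/grafana.log}" | tail -n 20
}
```

Called without an argument, it reads /var/log/grafana/grafana.log, which may require elevated privileges.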

2.1.4 DingTalk Alerts

YMatrix supports DingTalk alerts. When triggered, a pre-configured DingTalk robot sends a message to the group. To use this feature, configure both DingTalk and Grafana.

2.1.4.1 Configure DingTalk

Create a DingTalk group and add one or more alert robots, as shown:

Notification Channel Config - DingTalk

Click "Add Robot", then select "Custom". You can customize the robot name, set alert keywords, and save the webhook URL.
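
Before wiring the robot into Grafana, the saved webhook URL can be checked from a shell. The following is a minimal sketch that posts a plain text message in the JSON format used by DingTalk custom robots. The function names are hypothetical, and the text is assumed to contain no double quotes or backslashes because no JSON escaping is performed.

```shell
# Hypothetical helper: build the JSON body for a DingTalk text message.
# Assumption: TEXT contains no double quotes or backslashes.
dingtalk_payload() {
  printf '{"msgtype":"text","text":{"content":"%s"}}\n' "$1"
}

# Hypothetical helper: post TEXT to the robot's webhook URL with curl.
# Usage: dingtalk_send WEBHOOK_URL TEXT
dingtalk_send() {
  curl -sS -X POST -H 'Content-Type: application/json' \
    -d "$(dingtalk_payload "$2")" "$1"
}
```

If you set alert keywords for the robot, the text must contain one of them, otherwise DingTalk rejects the message.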

Notification Channel Config - DingTalk

2.1.4.2 Create Alert Rule in Grafana

In Grafana, create a notification rule and fill in the required fields.

Notification Channel Config - DingTalk

After configuration, click "Test" to verify connectivity between Grafana and DingTalk. If successful, clicking the alert in DingTalk should redirect to the Grafana dashboard. If you see "localhost refused to connect", replace localhost with your actual IP address.

Notification Channel Config - DingTalk

2.2 Configure Alerts

After setting up notification channels, configure alert rules for each monitoring panel.

For more information on alert configuration, see: Grafana Official Documentation: Alert

Hover over a panel title; a dropdown arrow appears to its right. Click the arrow and select "Edit" to enter edit mode.

Switch from "Query" to the "Alert" tab below "Edit Panel". If a popup appears saying "Notifier with invalid ID is detected", do not click "Delete", as this may remove the entire panel. Click "Cancel" and close the window instead.

Scroll down to the "Notifications" section. Enter a concise alert message (4–6 words recommended). This message appears in emails and replaces the ${name} variable in SMS and voice templates. Keep it short due to message length limits. Select one or more notification channels (configured earlier) under "Send to".
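
To preview how a given alert message will read once it fills the ${name} variable, the substitution can be imitated locally. This is an illustration only: the real substitution is performed by the SMS or voice service, the function name render_alert_template is hypothetical, and the message is assumed to contain no characters special to sed (/, & or \).

```shell
# Hypothetical helper: fill the ${name} variable of a message template.
# Assumption: MESSAGE contains no characters special to sed (/, &, \).
# Usage: render_alert_template TEMPLATE MESSAGE
render_alert_template() {
  # \\\$ reaches sed as \$, which matches a literal dollar sign.
  printf '%s\n' "$1" | sed "s/\\\${name}/$2/g"
}
```

Pass the template in single quotes so that the shell does not expand ${name} itself.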

Alert-Send-to

2.2.1 Connection Alert

Time-series databases aggregate data over time intervals. A "No Data" state occurs when no data is collected during an interval, for example because of clock skew (the Grafana time running 5 minutes ahead of the system time) or a data collection failure. The connection alert treats "No Data" as an alert condition.

Connection Alert Configuration

2.2.2 Instance Status Alert

As shown:
Instance Alert Configuration

2.2.3 Disk Space Alert

The red line in the "Used Disk Space" graph indicates the alert threshold. The shaded red area exceeds this threshold and indicates a critical condition. The default threshold is 85%, but you can customize the "IS ABOVE" value.
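
The same comparison can be reproduced on a host to sanity-check a threshold. The following is a local illustration only, since Grafana evaluates the condition from its own query results; the function names disk_used_percent and is_above are hypothetical.

```shell
# Print the used-space percentage (an integer) of the filesystem holding PATH.
# Usage: disk_used_percent PATH
disk_used_percent() {
  # df -P prints one POSIX-format line per filesystem; field 5 is "Capacity".
  df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Succeed when VALUE is strictly greater than THRESHOLD (default 85),
# mirroring the panel's "IS ABOVE" condition.
# Usage: is_above VALUE [THRESHOLD]
is_above() {
  [ "$1" -gt "${2:-85}" ]
}
```

For example, is_above "$(disk_used_percent <data directory>)" succeeds only when usage on that filesystem is above 85%, where <data directory> is a placeholder for your own path.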

Disk Space Alert Configuration

2.2.4 Disk I/O Read/Write Alert

Disk I/O read and write alerts (Disk IO Reading, Disk IO Writing) are disabled by default so that you can define thresholds suited to your environment before enabling them.

Note!
Click Save after making changes to apply your alert settings.

Disk I/O Read Alert
Disk I/O Write Alert

FAQ

  1. If you modify the alert.yaml configuration file, run the following commands to apply changes:
source /opt/ymatrix/matrixdb5/greenplum_path.sh
supervisorctl stop mxui
supervisorctl start mxui