This document describes how to use resource groups to manage transaction concurrency, CPU, and memory resource allocation in YMatrix. After defining a resource group, you can assign it to one or more database roles to control the resources available to those roles.
YMatrix uses Linux-based control groups (cgroups) to manage database resources and employs Runaway detection for memory accounting, tracking, and management.
The following table lists configurable resource group parameters:
| Parameter | Description | Range | Default |
|---|---|---|---|
| CONCURRENCY | Maximum number of concurrent transactions allowed in the resource group, including active and idle transactions. | [0 - max_connections] | 20 |
| CPU_MAX_PERCENT | Maximum percentage of CPU resources the resource group can use. | [1 - 100] | -1 (disabled) |
| CPU_WEIGHT | Scheduling priority for the resource group. | [1 - 500] | 100 |
| CPUSET | Specific CPU logical cores (or logical threads in hyper-threading) reserved for this resource group. | System-dependent | -1 |
| MEMORY_QUOTA | Memory limit assigned to the resource group. | Integer (MB) | -1 (disabled; uses statement_mem as per-query memory limit) |
| MIN_COST | Minimum cost threshold for a query plan to be governed by the resource group. | Integer | 0 |
| IO_LIMIT | Sets I/O usage on devices to manage maximum read/write throughput and maximum read/write operations per second. Configured per tablespace. | [2 - 4294967295] or max | -1 (disabled) |
Note!
Resource groups do not apply to `SET`, `RESET`, or `SHOW` commands.
When a user runs a query, YMatrix evaluates the query state based on the limits defined for its associated resource group.
CONCURRENCY controls the maximum number of concurrent transactions allowed in a resource group. The default is 20. A value of 0 means no queries can run in this group.
If resource limits are not exceeded and the query will not exceed the concurrency limit, YMatrix executes the query immediately. If the maximum concurrency has been reached, any new transaction submitted after that point is queued until other queries complete.
`gp_resource_group_queuing_timeout` specifies the maximum time, in milliseconds, that a queued transaction waits before being canceled. The default is 0, meaning transactions wait indefinitely.
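For example, a session could limit queuing to 30 seconds with a statement along these lines (a minimal sketch; it assumes the parameter may be changed at the session level, and the value is in milliseconds):

```sql
-- Cancel a queued transaction after 30 seconds instead of waiting indefinitely
SET gp_resource_group_queuing_timeout = 30000;
```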
## Bypassing Resource Group Allocation Limits
`gp_resource_group_bypass` enables or disables the concurrency limit of a resource group, allowing queries to execute immediately. When set to `true`, queries bypass the concurrency limit. In this case, memory allocation follows `statement_mem`; if insufficient memory is available, the query fails. This parameter can only be set at the session level, not within a transaction or function.

`gp_resource_group_bypass_catalog_query` determines whether pure system catalog queries bypass resource group limits. The default is `true`. This is useful for GUI clients that run metadata queries. These queries run outside resource groups with a per-query memory quota of `statement_mem`.

`gp_resource_group_bypass_direct_dispatch` controls whether direct dispatch queries bypass resource group limits. When set to `true`, such queries are not constrained by the CPU or memory limits of their assigned resource group and execute immediately. Memory allocation follows `statement_mem`; if memory is insufficient, the query fails. This parameter can only be set at the session level, not within a transaction or function.
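As an illustrative sketch, a session might temporarily bypass its resource group's concurrency limit as follows (the table name `my_table` is hypothetical):

```sql
-- Bypass the resource group concurrency limit for this session only;
-- memory for these queries is governed by statement_mem instead.
SET gp_resource_group_bypass = true;

SELECT count(*) FROM my_table;

-- Restore normal resource group behavior
RESET gp_resource_group_bypass;
```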
## Allocating CPU Resources

YMatrix allocates CPU resources in two ways: by reserving specific CPU cores for a resource group (`CPUSET`), or by assigning a percentage of CPU resources (`CPU_MAX_PERCENT`).

Different CPU allocation modes can be configured for different resource groups on the same cluster. However, each resource group supports only one mode at a time. The CPU allocation mode can be changed at runtime.
Use `gp_resource_group_cpu_limit` to define the maximum percentage of system CPU resources allocated to resource groups on each segment node.
## Allocating CPU Resources by Core
`CPUSET` identifies the specific CPU logical cores reserved for a resource group. When `CPUSET` is configured, YMatrix disables `CPU_MAX_PERCENT` and `CPU_WEIGHT` for that group and sets their values to -1.
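For example, reserving cores for a group might look like the following sketch (it assumes a resource group named `rgroup1` already exists; the value lists master cores before the semicolon and segment cores after it):

```sql
-- Reserve core 1 on the master and cores 1, 3, and 4 on each segment
ALTER RESOURCE GROUP rgroup1 SET CPUSET '1;1,3-4';
```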
Usage Notes:
- The `CPUSET` value is specified as two lists separated by a semicolon: cores for the master, then cores for segments. For example, `'1;1,3-4'` uses core 1 on the master and cores 1, 3, and 4 on segments.

## Allocating CPU Resources by Percentage
`CPU_MAX_PERCENT` sets the upper limit on CPU usage for segments. For example, setting it to 40 means up to 40% of available CPU resources can be used. When tasks in a resource group are idle, unused CPU cycles go into a global pool that other groups can borrow from.
`CPU_WEIGHT` sets the scheduling priority for the current group. The default is 100; the range is 1-500. It defines the relative share of CPU time available to tasks in the group.
Usage Notes:
- If one resource group has `CPU_WEIGHT = 200` and two other groups each have the default (`CPU_WEIGHT = 100`), the first group gets 50% of total CPU time, and the other two get 25% each.
- Adding a fourth group with `CPU_WEIGHT = 200` reduces the first group's share to 33%, while the remaining three groups get approximately 16.5%, 16.5%, and 33%.

**Configuration Example**
| Group Name | CONCURRENCY | CPU_MAX_PERCENT | CPU_WEIGHT |
|---|---|---|---|
| default_group | 20 | 50 | 10 |
| admin_group | 10 | 70 | 30 |
| system_group | 10 | 30 | 10 |
| test | 10 | 10 | 10 |
- Queries in `default_group` have a guaranteed CPU share of 10/(10+30+10+10)=16% under high load, determined by `CPU_WEIGHT`. They can use more when CPU is idle, up to the hard cap of 50% set by `CPU_MAX_PERCENT`.
- Queries in `admin_group` have a baseline CPU share of 30/(10+30+10+10)=50% under high load. Their maximum usage during idle periods is capped at 70% via `CPU_MAX_PERCENT`.
- Queries in `test` have a baseline CPU share of 10/(10+30+10+10)=16% under high load. However, due to the hard cap (`CPU_MAX_PERCENT`) of 10%, they cannot exceed 10% even when the system is idle.
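The configuration in the table above could be expressed with `ALTER RESOURCE GROUP` statements along these lines (a sketch only; it assumes the `test` group has already been created, since `default_group`, `admin_group`, and `system_group` exist by default):

```sql
-- Set concurrency, the hard CPU cap, and the scheduling weight for each group
ALTER RESOURCE GROUP default_group SET CONCURRENCY 20;
ALTER RESOURCE GROUP default_group SET CPU_MAX_PERCENT 50;
ALTER RESOURCE GROUP default_group SET CPU_WEIGHT 10;

ALTER RESOURCE GROUP admin_group SET CPU_MAX_PERCENT 70;
ALTER RESOURCE GROUP admin_group SET CPU_WEIGHT 30;

ALTER RESOURCE GROUP test SET CPU_MAX_PERCENT 10;
ALTER RESOURCE GROUP test SET CPU_WEIGHT 10;
```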
## Allocating Memory Resources

`MEMORY_QUOTA` specifies the maximum amount of memory reserved for the resource group on a segment. This is the total memory that all active queries in the group can consume across all worker processes on the segment. Per-query memory is calculated as `MEMORY_QUOTA / CONCURRENCY`.

To allow a query to use more memory, set `gp_resgroup_memory_query_fixed_mem` at the session level. This overrides the memory allocated by the resource group.
Usage Notes
- If `gp_resgroup_memory_query_fixed_mem` is set, its value bypasses the resource group's memory limit.
- If `gp_resgroup_memory_query_fixed_mem` is not set, per-query memory is `MEMORY_QUOTA / CONCURRENCY`.
- When the group's `MEMORY_QUOTA` is reached and a query requests additional memory that cannot be satisfied, YMatrix raises an out-of-memory (OOM) error.

**Configuration Example**
Consider a resource group named `adhoc` with `MEMORY_QUOTA` = 1500 MB (1.5 GB) and `CONCURRENCY` = 3. Each statement is allocated 500 MB by default. Consider the following sequence:
1. User `ADHOC_1` submits query `Q1`, overriding `gp_resgroup_memory_query_fixed_mem` to 800 MB. Query `Q1` is admitted.
2. User `ADHOC_2` submits query `Q2`, using the default 500 MB.
3. While `Q1` and `Q2` are running, user `ADHOC_3` submits query `Q3`, using 500 MB. `Q1` and `Q2` have consumed 1300 MB of the group's 1500 MB limit. If sufficient memory is available, `Q3` can still run.
4. User `ADHOC_4` submits query `Q4` with `gp_resgroup_memory_query_fixed_mem` = 700 MB. Because `Q4` bypasses resource group limits, it runs immediately.
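A minimal sketch of how this scenario might be set up (the group, role, and memory-value syntax shown here are illustrative assumptions):

```sql
-- Group sized so that 1500 MB / CONCURRENCY 3 = 500 MB per statement by default
CREATE RESOURCE GROUP adhoc WITH (CONCURRENCY=3, CPU_MAX_PERCENT=20, MEMORY_QUOTA=1500);
CREATE ROLE adhoc_1 LOGIN RESOURCE GROUP adhoc;

-- In ADHOC_1's session: claim 800 MB for the next query instead of the default 500 MB
SET gp_resgroup_memory_query_fixed_mem = '800MB';
```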
**Special Usage Notes**

- If `gp_resource_group_bypass` or `gp_resource_group_bypass_catalog_query` is used to bypass resource group limits, the query's memory limit is `statement_mem`.
- If (`MEMORY_QUOTA` / `CONCURRENCY`) < `statement_mem`, `statement_mem` is used as the fixed per-query memory.
- `statement_mem` is capped at `max_statement_mem`.
- Resource groups with `MEMORY_QUOTA` disabled (-1) use `statement_mem` as their memory quota.

## Allocating Disk I/O Resources

`IO_LIMIT` limits the maximum read/write disk I/O throughput and the maximum read/write I/O operations per second (IOPS) for queries in a specific resource group. This ensures fair usage by high-priority groups and prevents excessive disk bandwidth consumption. This parameter must be set per tablespace.
Note!
`IO_LIMIT` is supported only on cgroup v2.
When configuring disk I/O limiting, use the following parameters:
- The tablespace name (or tablespace OID) identifies the tablespace to limit; use `*` to apply limits to all tablespaces.
- `rbps` and `wbps` limit the maximum read and write throughput for disk I/O in the resource group, measured in MB/s. The default value is `max`, indicating no limit.
- `riops` and `wiops` limit the maximum number of read and write I/O operations per second in the resource group. The default value is `max`, indicating no limit.

**Configuration Notes**
If the IO_LIMIT parameter is not set, the default values for rbps, wbps, riops, and wiops are set to max, meaning disk I/O is unlimited. If only some values of IO_LIMIT are set (e.g., rbps), the unset parameters will default to max (in this example, wbps, riops, and wiops would have the default value of max).
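As an illustrative sketch, an I/O limit might be applied roughly as follows (the exact `IO_LIMIT` value syntax shown here, `tablespace:key=value` pairs, is an assumption; consult the `CREATE RESOURCE GROUP`/`ALTER RESOURCE GROUP` reference for the precise format):

```sql
-- Cap queries in rgroup1 to 1000 MB/s reads and 500 MB/s writes on the
-- pg_default tablespace, leaving IOPS unlimited
ALTER RESOURCE GROUP rgroup1 SET IO_LIMIT 'pg_default:rbps=1000,wbps=500,riops=max,wiops=max';
```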
Determine the cgroup version configured in your environment by checking the filesystem mounted at /sys/fs/cgroup during system startup:

stat -fc %T /sys/fs/cgroup/
For cgroup v1, the output is tmpfs. For cgroup v2, the output is cgroup2fs.
If no cgroup version change is needed, skip to Configuring cgroup v1 or Configuring cgroup v2.
To switch from cgroup v1 to v2, run as root:
grubby --update-kernel=/boot/vmlinuz-$(uname -r) --args="systemd.unified_cgroup_hierarchy=1"
vim /etc/default/grub
# add or modify: GRUB_CMDLINE_LINUX="systemd.unified_cgroup_hierarchy=1"
update-grub
To switch from cgroup v2 to v1, run as root:
grubby --update-kernel=/boot/vmlinuz-$(uname -r) --args="systemd.unified_cgroup_hierarchy=0 systemd.legacy_systemd_cgroup_controller"
vim /etc/default/grub
# add or modify: GRUB_CMDLINE_LINUX="systemd.unified_cgroup_hierarchy=0"
update-grub
## Configuring cgroup v1

If you wish to continue using cgroup v1, ensure that each memory.limit_in_bytes file under /sys/fs/cgroup/memory/gpdb (including /sys/fs/cgroup/memory/gpdb/memory.limit_in_bytes and /sys/fs/cgroup/memory/gpdb/[OID]/memory.limit_in_bytes) has no limit value. If a limit is set, remove it by running:
echo -1 >> memory.limit_in_bytes
Then reboot the host for changes to take effect.
Note!
Requires superuser or a user with sudo access to edit this file.
vi /etc/cgconfig.conf
Add the following to the configuration file:
group gpdb {
perm {
task {
uid = mxadmin;
gid = mxadmin;
}
admin {
uid = mxadmin;
gid = mxadmin;
}
}
cpu {
}
cpuacct {
}
cpuset {
}
memory {
}
}
This configures CPU, CPU accounting, CPU set, and memory cgroups managed by the mxadmin user.
Load the configuration and start the cgroup service on every node in the YMatrix cluster:

cgconfigparser -l /etc/cgconfig.conf
systemctl enable cgconfig.service
systemctl start cgconfig.service
Identify the cgroup directory mount point on the node:

grep cgroup /proc/mounts
The following example output shows a cgroup directory mount point of /sys/fs/cgroup:

tmpfs /sys/fs/cgroup tmpfs ro,nosuid,nodev,noexec,mode=755 0 0
cgroup /sys/fs/cgroup/systemd cgroup rw,nosuid,nodev,noexec,relatime,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd 0 0
cgroup /sys/fs/cgroup/devices cgroup rw,nosuid,nodev,noexec,relatime,devices 0 0
cgroup /sys/fs/cgroup/net_cls,net_prio cgroup rw,nosuid,nodev,noexec,relatime,net_prio,net_cls 0 0
cgroup /sys/fs/cgroup/pids cgroup rw,nosuid,nodev,noexec,relatime,pids 0 0
cgroup /sys/fs/cgroup/freezer cgroup rw,nosuid,nodev,noexec,relatime,freezer 0 0
cgroup /sys/fs/cgroup/cpu,cpuacct cgroup rw,nosuid,nodev,noexec,relatime,cpuacct,cpu 0 0
cgroup /sys/fs/cgroup/perf_event cgroup rw,nosuid,nodev,noexec,relatime,perf_event 0 0
cgroup /sys/fs/cgroup/hugetlb cgroup rw,nosuid,nodev,noexec,relatime,hugetlb 0 0
cgroup /sys/fs/cgroup/memory cgroup rw,nosuid,nodev,noexec,relatime,memory 0 0
cgroup /sys/fs/cgroup/cpuset cgroup rw,nosuid,nodev,noexec,relatime,cpuset 0 0
cgroup /sys/fs/cgroup/blkio cgroup rw,nosuid,nodev,noexec,relatime,blkio 0 0
Verify that the gpdb cgroup directories exist on the node:

ls -l <cgroup_mount_point>/cpu/gpdb
ls -l <cgroup_mount_point>/cpuset/gpdb
ls -l <cgroup_mount_point>/memory/gpdb
If the directories exist and are owned by mxadmin:mxadmin, the cgroup configuration for YMatrix database resource management has been successfully set up.
## Configuring cgroup v2

Configure the system to mount cgroups-v2 by default at boot via systemd. Run the following as root:
grubby --update-kernel=ALL --args="systemd.unified_cgroup_hierarchy=1"
Reboot the system for changes to take effect.
reboot now
Create the directory /sys/fs/cgroup/matrixdb6.service, add the required controllers, and ensure `mxadmin` has read-write access:
mkdir -p /sys/fs/cgroup/matrixdb6.service
echo "+cpuset +io +cpu +memory" | tee -a /sys/fs/cgroup/cgroup.subtree_control
chown -R mxadmin:mxadmin /sys/fs/cgroup/matrixdb6.service
You may encounter an "Invalid argument" error. This occurs because cgroup v2 does not support controlling real-time processes unless all such processes are in the root cgroup. In this case, move all real-time processes to the root cgroup and re-enable the controller.
Ensure that mxadmin has write permission to /sys/fs/cgroup/cgroup.procs. This allows moving YMatrix processes into /sys/fs/cgroup/matrixdb6.service/ after cluster startup to manage the postmaster and auxiliary processes:

chmod a+w /sys/fs/cgroup/cgroup.procs
Because resource groups manage these cgroup files manually, the settings above are lost after a reboot. Add the following systemd service so they are reapplied automatically during system startup. Perform the following steps as the root user:
1. Create the service file matrixdb6.service:
vim /etc/systemd/system/matrixdb6.service
2. Write the following content into matrixdb6.service, replacing mxadmin with the appropriate user if different.
[Unit]
Description=Greenplum Cgroup v2 Configuration Service
[Service]
Type=simple
WorkingDirectory=/sys/fs/cgroup/matrixdb6.service
Delegate=yes
Slice=-.slice
ExecCondition=bash -c '[ xcgroup2fs = x$(stat -fc "%%T" /sys/fs/cgroup) ] || exit 1'
ExecStartPre=bash -ec " \
    chown -R mxadmin:mxadmin .; \
    chmod a+w ../cgroup.procs; \
    mkdir -p helper.scope"
ExecStart=sleep infinity
ExecStartPost=bash -ec "echo $MAINPID > /sys/fs/cgroup/cgroup.procs;"

[Install]
WantedBy=basic.target
3. Reload `systemd` daemon and enable the service:
systemctl daemon-reload
systemctl enable matrixdb6.service
## Enabling Resource Groups
1. Set the `gp_resource_manager` server configuration parameter to `group`:
gpconfig -c gp_resource_manager -v "group"
2. Restart the YMatrix database cluster:
mxstop -arf
After enabling, any transaction submitted by a role will be directed to the resource group assigned to that role and will be subject to the concurrency, memory, and CPU limits of that group.
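After the restart, you can confirm that resource-group-based management is in effect, for example:

```sql
-- Should return "group" when resource groups are enabled
SHOW gp_resource_manager;
```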
YMatrix creates the resource groups `admin_group`, `default_group`, and `system_group` by default. When resource groups are enabled, any role not explicitly assigned to a resource group is assigned to a default group based on its function: `SUPERUSER` roles are assigned to `admin_group`, non-administrator roles to `default_group`, and system processes to `system_group`. Note that no role can be manually assigned to `system_group`.
Default configurations:
| Parameter | admin_group | default_group | system_group |
|---------|-------------|---------------|--------------|
| CONCURRENCY | 10 | 5 | 0 |
| CPU_MAX_PERCENT | 10 | 20 | 10 |
| CPU_WEIGHT | 100 | 100 | 100 |
| CPUSET | -1 | -1 | -1 |
| IO_LIMIT | -1 | -1 | -1 |
| MEMORY_QUOTA | -1 | -1 | -1 |
| MIN_COST | 0 | 0 | 0 |
## Creating Resource Groups
The `CREATE RESOURCE GROUP` command creates a new resource group. When creating a resource group for a role, provide a name and the CPU resource allocation mode (core or percentage). You must provide a value for either `CPU_MAX_PERCENT` or `CPUSET`.
**Usage Example**
Create a resource group named `rgroup1` with concurrency 20, CPU limit 20, memory quota 250 MB, CPU weight 500, and minimum cost 50:
CREATE RESOURCE GROUP rgroup1 WITH (CONCURRENCY=20, CPU_MAX_PERCENT=20, MEMORY_QUOTA=250, CPU_WEIGHT=500, MIN_COST=50);
CPU and memory limits are shared among all roles assigned to `rgroup1`.
## Altering Resource Groups

The `ALTER RESOURCE GROUP` command updates the limits of a resource group.
ALTER RESOURCE GROUP rg_role_light SET CONCURRENCY 7;
ALTER RESOURCE GROUP exec SET MEMORY_QUOTA 30;
ALTER RESOURCE GROUP rgroup1 SET CPUSET '1;2,4';
> ***Note!***
> The `CONCURRENCY` value for `admin_group` cannot be set or changed to 0.
## Dropping Resource Groups

Use `DROP RESOURCE GROUP` to drop a resource group. The group must not be assigned to any role and must have no active or queued transactions.
DROP RESOURCE GROUP exec;
## Configuring Automatic Query Termination Based on Memory Usage
YMatrix supports Runaway detection. Queries managed by resource groups can be automatically terminated based on memory usage.
Relevant parameters:
- `gp_vmem_protect_limit`: Sets the total memory available to all postgres processes on active segment instances. Exceeding this limit causes memory allocation failure and query termination.
- `runaway_detector_activation_percent`: When resource groups are enabled, if memory usage exceeds `gp_vmem_protect_limit` × `runaway_detector_activation_percent`, YMatrix terminates queries (except those in `system_group`) based on memory consumption, starting with the highest consumer, until usage drops below the threshold.
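For example, you can inspect the current values of these settings from any SQL session (parameter names as described above):

```sql
-- Total memory available to postgres processes on each active segment
SHOW gp_vmem_protect_limit;
-- Percentage of that limit at which Runaway termination begins
SHOW runaway_detector_activation_percent;
```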
## Assigning Resource Groups to Roles
- Use the `RESOURCE GROUP` clause in `CREATE ROLE` or `ALTER ROLE` to assign a resource group to a database role.
ALTER ROLE bill RESOURCE GROUP rg_light;
CREATE ROLE mary RESOURCE GROUP exec;
A resource group can be assigned to multiple roles. In role hierarchies, a parent role’s resource group assignment does not propagate to members.
- To remove a resource group assignment and assign the default group, set the role’s group to `NONE`:
ALTER ROLE mary RESOURCE GROUP NONE;
## Monitoring Resource Group Status
- View resource group limits
SELECT * FROM gp_toolkit.gp_resgroup_config;
- View resource group query status
SELECT * FROM gp_toolkit.gp_resgroup_status;
- View memory usage of resource groups on each host
SELECT * FROM gp_toolkit.gp_resgroup_status_per_host;
- View resource groups assigned to roles
SELECT rolname, rsgname FROM pg_roles, pg_resgroup WHERE pg_roles.rolresgroup = pg_resgroup.oid;
- View running and queued queries in resource groups
SELECT query, rsgname, wait_event_type, wait_event FROM pg_stat_activity;
- Cancel transactions running or queued in a resource group
To manually cancel a running or queued transaction, first identify the process ID (pid) associated with the transaction. After obtaining the pid, call `pg_cancel_backend()` to terminate the process.
Follow these steps:
- Run the following query to view process information for all currently active or idle statements across all resource groups. If no rows are returned, there are no running or queued transactions in any resource group.
SELECT rolname, g.rsgname, pid, waiting, state, query, datname
FROM pg_roles, gp_toolkit.gp_resgroup_status g, pg_stat_activity
WHERE pg_roles.rolresgroup = g.groupid
AND pg_stat_activity.usename = pg_roles.rolname;
- Example query output:
rolname | rsgname | pid | waiting | state | query | datname
---------+----------+---------+---------+--------+--------------------------+---------
sammy | rg_light | 31861 | f | idle | SELECT * FROM mytesttbl; | testdb
billy | rg_light | 31905 | t | active | SELECT * FROM topten; | testdb
- To cancel the transaction, pass its pid to `pg_cancel_backend()`. For example, to cancel the waiting query shown above:

SELECT pg_cancel_backend(31905);
Note!
Do not use the operating system's `KILL` command to cancel any YMatrix database process.
## Moving Queries Between Resource Groups

Superuser role members can use the gp_toolkit.pg_resgroup_move_query() function to move a running query from one resource group to another without stopping the query. This function can improve query performance by moving long-running queries to a resource group with higher resource allocation or availability.
pg_resgroup_move_query() moves only the specified query to the target resource group. Subsequent queries submitted by the session remain assigned to the original resource group.
Note!
Only active or running queries can be moved to a new resource group. Queries that are queued or pending in an idle state due to concurrency or memory constraints cannot be moved.
pg_resgroup_move_query() requires the process ID (pid) of the running query and the name of the destination resource group.
pg_resgroup_move_query(pid int4, group_name text);
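For example, to move the waiting query from the earlier output (pid 31905) into a destination group, assuming here a hypothetical group named `rg_heavy`:

```sql
-- Move the running query owned by backend 31905 into resource group rg_heavy
SELECT gp_toolkit.pg_resgroup_move_query(31905, 'rg_heavy');
```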
As described in "Cancel transactions running or queued in a resource group", you can use the gp_toolkit.gp_resgroup_status view to list each resource group's name, ID, and status.
When pg_resgroup_move_query() is called, the running query becomes subject to the configuration limits of the target resource group, including concurrency and memory limits.
If the target resource group has reached its concurrency limit, the database queues the query until a slot becomes available, or queues it for the number of milliseconds specified by gp_resource_group_queuing_timeout if set.
If the target resource group has free slots, pg_resgroup_move_query() attempts to transfer slot ownership to the target process, waiting up to the number of milliseconds specified by gp_resource_group_move_timeout. If the target process fails to respond within gp_resource_group_move_timeout, the database returns an error.
If pg_resgroup_move_query() is canceled but the target process has already acquired a slot, the segment process does not move to the new group, and the target process retains the slot. This inconsistent state is resolved when the transaction ends or when the next command is dispatched by the target process within the same transaction.
If the target resource group does not have sufficient available memory to meet the query’s current memory requirements, the database returns an error. You may either increase the shared memory allocated to the target resource group or wait until some running queries complete before calling the function again.
After moving a query, there is no guarantee that currently running queries in the target resource group will stay within the group's memory quota. In such cases, one or more running queries in the target group—including the moved query—may fail. To minimize this risk, reserve sufficient global shared memory for the resource group.