This document describes how to use resource groups to manage transaction concurrency, CPU, and memory resource allocation in YMatrix. After defining a resource group, you can assign it to one or more database roles to control the resources available to those roles.
YMatrix uses Linux-based control groups (cgroups) to manage database resources and employs Runaway detection for memory accounting, tracking, and management.
The following table lists configurable resource group parameters:
| Parameter | Description | Range | Default |
|---|---|---|---|
| CONCURRENCY | Maximum number of concurrent transactions allowed in the resource group, including active and idle transactions. | [0 - max_connections] | 20 |
| CPU_MAX_PERCENT | Maximum percentage of CPU resources the resource group can use. | [1 - 100] | -1 (disabled) |
| CPU_WEIGHT | Scheduling priority for the resource group. | [1 - 500] | 100 |
| CPUSET | Specific CPU logical cores (or logical threads in hyper-threading) reserved for this resource group. | System-dependent | -1 |
| MEMORY_QUOTA | Memory limit assigned to the resource group. | Integer (MB) | -1 (disabled; uses statement_mem as per-query memory limit) |
| MIN_COST | Minimum cost threshold for a query plan to be governed by the resource group. | Integer | 0 |
| IO_LIMIT | Sets I/O usage on devices to manage maximum read/write throughput and maximum read/write operations per second. Configured per tablespace. | [2 - 4294967295] or max | -1 (disabled) |
Note!
Resource groups do not apply to `SET`, `RESET`, or `SHOW` commands.
When a user runs a query, YMatrix evaluates the query state based on the limits defined for its associated resource group.
CONCURRENCY controls the maximum number of concurrent transactions allowed in a resource group. The default is 20. A value of 0 means no queries can run in this group.
If resource limits are not exceeded and the query will not exceed the concurrency limit, YMatrix executes the query immediately. If the maximum concurrency has been reached, any new transaction submitted after that point is queued until other queries complete.
`gp_resource_group_queuing_timeout` specifies the maximum time, in milliseconds, that a queued transaction waits before being canceled. The default is 0, meaning transactions wait indefinitely.
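For example, a session could limit queuing to 30 seconds with a statement along these lines (a minimal sketch; it assumes the parameter may be changed at the session level, and the value is in milliseconds):

```sql
-- Cancel a queued transaction after 30 seconds instead of waiting indefinitely
SET gp_resource_group_queuing_timeout = 30000;
```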
## Bypassing Resource Group Allocation Limits
`gp_resource_group_bypass` enables or disables the concurrency limit of a resource group, allowing queries to execute immediately. When set to `true`, queries bypass the concurrency limit. In this case, memory allocation follows `statement_mem`; if insufficient memory is available, the query fails. This parameter can only be set at the session level, not within a transaction or function.

`gp_resource_group_bypass_catalog_query` determines whether pure system catalog queries bypass resource group limits. The default is `true`. This is useful for GUI clients that run metadata queries. These queries run outside resource groups with a per-query memory quota of `statement_mem`.

`gp_resource_group_bypass_direct_dispatch` controls whether direct dispatch queries bypass resource group limits. When set to `true`, such queries are not constrained by the CPU or memory limits of their assigned resource group and execute immediately. Memory allocation follows `statement_mem`; if memory is insufficient, the query fails. This parameter can only be set at the session level, not within a transaction or function.
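As an illustrative sketch, a session might temporarily bypass its resource group's concurrency limit as follows (the table name `my_table` is hypothetical):

```sql
-- Bypass the resource group concurrency limit for this session only;
-- memory for these queries is governed by statement_mem instead.
SET gp_resource_group_bypass = true;

SELECT count(*) FROM my_table;

-- Restore normal resource group behavior
RESET gp_resource_group_bypass;
```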
## Allocating CPU Resources

YMatrix allocates CPU resources in two ways: by reserving specific CPU cores for a resource group (`CPUSET`), or by assigning a percentage of CPU resources (`CPU_MAX_PERCENT`).

Different CPU allocation modes can be configured for different resource groups on the same cluster. However, each resource group supports only one mode at a time. The CPU allocation mode can be changed at runtime.
Use `gp_resource_group_cpu_limit` to define the maximum percentage of system CPU resources allocated to resource groups on each segment node.
## Allocating CPU Resources by Core
`CPUSET` identifies the specific CPU logical cores reserved for a resource group. When `CPUSET` is configured, YMatrix disables `CPU_MAX_PERCENT` and `CPU_WEIGHT` for that group and sets their values to -1.
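For example, reserving cores for a group might look like the following sketch (it assumes a resource group named `rgroup1` already exists; the value lists master cores before the semicolon and segment cores after it):

```sql
-- Reserve core 1 on the master and cores 1, 3, and 4 on each segment
ALTER RESOURCE GROUP rgroup1 SET CPUSET '1;1,3-4';
```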
Usage Notes:
- The `CPUSET` value is specified as two lists separated by a semicolon: cores for the master, then cores for segments. For example, `'1;1,3-4'` uses core 1 on the master and cores 1, 3, and 4 on segments.

## Allocating CPU Resources by Percentage
`CPU_MAX_PERCENT` sets the upper limit on CPU usage for segments. For example, setting it to 40 means up to 40% of available CPU resources can be used. When tasks in a resource group are idle, unused CPU cycles go into a global pool that other groups can borrow from.
`CPU_WEIGHT` sets the scheduling priority for the current group. The default is 100; the range is 1-500. It defines the relative share of CPU time available to tasks in the group.
Usage Notes:
- If one resource group has `CPU_WEIGHT = 200` and two other groups each have the default (`CPU_WEIGHT = 100`), the first group gets 50% of total CPU time, and the other two get 25% each.
- Adding a fourth group with `CPU_WEIGHT = 200` reduces the first group's share to 33%, while the remaining three groups get approximately 16.5%, 16.5%, and 33%.

**Configuration Example**
| Group Name | CONCURRENCY | CPU_MAX_PERCENT | CPU_WEIGHT |
|---|---|---|---|
| default_group | 20 | 50 | 10 |
| admin_group | 10 | 70 | 30 |
| system_group | 10 | 30 | 10 |
| test | 10 | 10 | 10 |
- Queries in `default_group` have a guaranteed CPU share of 10/(10+30+10+10)=16% under high load, determined by `CPU_WEIGHT`. They can use more when CPU is idle, up to the hard cap of 50% set by `CPU_MAX_PERCENT`.
- Queries in `admin_group` have a baseline CPU share of 30/(10+30+10+10)=50% under high load. Their maximum usage during idle periods is capped at 70% via `CPU_MAX_PERCENT`.
- Queries in `test` have a baseline CPU share of 10/(10+30+10+10)=16% under high load. However, due to the hard cap (`CPU_MAX_PERCENT`) of 10%, they cannot exceed 10% even when the system is idle.
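The configuration in the table above could be expressed with `ALTER RESOURCE GROUP` statements along these lines (a sketch only; it assumes the `test` group has already been created, since `default_group`, `admin_group`, and `system_group` exist by default):

```sql
-- Set concurrency, the hard CPU cap, and the scheduling weight for each group
ALTER RESOURCE GROUP default_group SET CONCURRENCY 20;
ALTER RESOURCE GROUP default_group SET CPU_MAX_PERCENT 50;
ALTER RESOURCE GROUP default_group SET CPU_WEIGHT 10;

ALTER RESOURCE GROUP admin_group SET CPU_MAX_PERCENT 70;
ALTER RESOURCE GROUP admin_group SET CPU_WEIGHT 30;

ALTER RESOURCE GROUP test SET CPU_MAX_PERCENT 10;
ALTER RESOURCE GROUP test SET CPU_WEIGHT 10;
```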
## Allocating Memory Resources

`MEMORY_QUOTA` specifies the maximum amount of memory reserved for the resource group on a segment. This is the total memory that all active queries in the group can consume across all worker processes on the segment. Per-query memory is calculated as `MEMORY_QUOTA / CONCURRENCY`.

To allow a query to use more memory, set `gp_resgroup_memory_query_fixed_mem` at the session level. This overrides the memory allocated by the resource group.
Usage Notes
- If `gp_resgroup_memory_query_fixed_mem` is set, its value bypasses the resource group's memory limit.
- If `gp_resgroup_memory_query_fixed_mem` is not set, per-query memory is `MEMORY_QUOTA / CONCURRENCY`.
- When the group's `MEMORY_QUOTA` is reached and a query requests additional memory that cannot be satisfied, YMatrix raises an out-of-memory (OOM) error.

**Configuration Example**
Consider a resource group named `adhoc` with `MEMORY_QUOTA` = 1500 MB (1.5 GB) and `CONCURRENCY` = 3. Each statement is allocated 500 MB by default. Consider the following sequence:
1. User `ADHOC_1` submits query `Q1`, overriding `gp_resgroup_memory_query_fixed_mem` to 800 MB. Query `Q1` is admitted.
2. User `ADHOC_2` submits query `Q2`, using the default 500 MB.
3. While `Q1` and `Q2` are running, user `ADHOC_3` submits query `Q3`, using 500 MB. `Q1` and `Q2` have consumed 1300 MB of the group's 1500 MB limit. If sufficient memory is available, `Q3` can still run.
4. User `ADHOC_4` submits query `Q4` with `gp_resgroup_memory_query_fixed_mem` = 700 MB. Because `Q4` bypasses resource group limits, it runs immediately.
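A minimal sketch of how this scenario might be set up (the group, role, and memory-value syntax shown here are illustrative assumptions):

```sql
-- Group sized so that 1500 MB / CONCURRENCY 3 = 500 MB per statement by default
CREATE RESOURCE GROUP adhoc WITH (CONCURRENCY=3, CPU_MAX_PERCENT=20, MEMORY_QUOTA=1500);
CREATE ROLE adhoc_1 LOGIN RESOURCE GROUP adhoc;

-- In ADHOC_1's session: claim 800 MB for the next query instead of the default 500 MB
SET gp_resgroup_memory_query_fixed_mem = '800MB';
```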
**Special Usage Notes**

- If `gp_resource_group_bypass` or `gp_resource_group_bypass_catalog_query` is used to bypass resource group limits, the query's memory limit is `statement_mem`.
- If (`MEMORY_QUOTA` / `CONCURRENCY`) < `statement_mem`, `statement_mem` is used as the fixed per-query memory.
- `statement_mem` is capped at `max_statement_mem`.
- Resource groups with `MEMORY_QUOTA` disabled (-1) use `statement_mem` as their memory quota.

## Allocating Disk I/O Resources

`IO_LIMIT` limits the maximum read/write disk I/O throughput and the maximum read/write I/O operations per second (IOPS) for queries in a specific resource group. This ensures fair usage by high-priority groups and prevents excessive disk bandwidth consumption. This parameter must be set per tablespace.
Note!
`IO_LIMIT` is supported only on cgroup v2.
When configuring disk I/O limiting, use the following parameters:
- The tablespace name (or tablespace OID) identifies the tablespace to limit; use `*` to apply limits to all tablespaces.
- `rbps` and `wbps` limit the maximum read and write throughput for disk I/O in the resource group, measured in MB/s. The default value is `max`, indicating no limit.
- `riops` and `wiops` limit the maximum number of read and write I/O operations per second in the resource group. The default value is `max`, indicating no limit.

**Configuration Notes**
If the IO_LIMIT parameter is not set, the default values for rbps, wbps, riops, and wiops are set to max, meaning disk I/O is unlimited. If only some values of IO_LIMIT are set (e.g., rbps), the unset parameters will default to max (in this example, wbps, riops, and wiops would have the default value of max).
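As an illustrative sketch, an I/O limit might be applied roughly as follows (the exact `IO_LIMIT` value syntax shown here, `tablespace:key=value` pairs, is an assumption; consult the `CREATE RESOURCE GROUP`/`ALTER RESOURCE GROUP` reference for the precise format):

```sql
-- Cap queries in rgroup1 to 1000 MB/s reads and 500 MB/s writes on the
-- pg_default tablespace, leaving IOPS unlimited
ALTER RESOURCE GROUP rgroup1 SET IO_LIMIT 'pg_default:rbps=1000,wbps=500,riops=max,wiops=max';
```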
Determine the cgroup version configured in your environment by checking the filesystem mounted at /sys/fs/cgroup during system startup:

stat -fc %T /sys/fs/cgroup/
For cgroup v1, the output is tmpfs. For cgroup v2, the output is cgroup2fs.
If no cgroup version change is needed, skip to Configuring cgroup v1 or Configuring cgroup v2.
To switch from cgroup v1 to v2, run as root:
grubby --update-kernel=/boot/vmlinuz-$(uname -r) --args="systemd.unified_cgroup_hierarchy=1"
vim /etc/default/grub
# add or modify: GRUB_CMDLINE_LINUX="systemd.unified_cgroup_hierarchy=1"
update-grub
To switch from cgroup v2 to v1, run as root:
grubby --update-kernel=/boot/vmlinuz-$(uname -r) --args="systemd.unified_cgroup_hierarchy=0 systemd.legacy_systemd_cgroup_controller"
vim /etc/default/grub
# add or modify: GRUB_CMDLINE_LINUX="systemd.unified_cgroup_hierarchy=0"
update-grub
## Configuring cgroup v1

If you wish to continue using cgroup v1, ensure that each memory.limit_in_bytes file under /sys/fs/cgroup/memory/gpdb (including /sys/fs/cgroup/memory/gpdb/memory.limit_in_bytes and /sys/fs/cgroup/memory/gpdb/[OID]/memory.limit_in_bytes) has no limit value. If a limit is set, remove it by running:
echo -1 >> memory.limit_in_bytes
Then reboot the host for changes to take effect.
Note!
Requires superuser or a user with sudo access to edit this file.
vi /etc/cgconfig.conf
Add the following to the configuration file:
group gpdb {
perm {
task {
uid = mxadmin;
gid = mxadmin;
}
admin {
uid = mxadmin;
gid = mxadmin;
}
}
cpu {
}
cpuacct {
}
cpuset {
}
memory {
}
}
This configures CPU, CPU accounting, CPU set, and memory cgroups managed by the mxadmin user.
Load the configuration and start the cgroup service on every node in the YMatrix cluster:

cgconfigparser -l /etc/cgconfig.conf
systemctl enable cgconfig.service
systemctl start cgconfig.service
Identify the cgroup directory mount point on the node:

grep cgroup /proc/mounts
The following example output shows a cgroup directory mount point of /sys/fs/cgroup:

tmpfs /sys/fs/cgroup tmpfs ro,nosuid,nodev,noexec,mode=755 0 0
cgroup /sys/fs/cgroup/systemd cgroup rw,nosuid,nodev,noexec,relatime,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd 0 0
cgroup /sys/fs/cgroup/devices cgroup rw,nosuid,nodev,noexec,relatime,devices 0 0
cgroup /sys/fs/cgroup/net_cls,net_prio cgroup rw,nosuid,nodev,noexec,relatime,net_prio,net_cls 0 0
cgroup /sys/fs/cgroup/pids cgroup rw,nosuid,nodev,noexec,relatime,pids 0 0
cgroup /sys/fs/cgroup/freezer cgroup rw,nosuid,nodev,noexec,relatime,freezer 0 0
cgroup /sys/fs/cgroup/cpu,cpuacct cgroup rw,nosuid,nodev,noexec,relatime,cpuacct,cpu 0 0
cgroup /sys/fs/cgroup/perf_event cgroup rw,nosuid,nodev,noexec,relatime,perf_event 0 0
cgroup /sys/fs/cgroup/hugetlb cgroup rw,nosuid,nodev,noexec,relatime,hugetlb 0 0
cgroup /sys/fs/cgroup/memory cgroup rw,nosuid,nodev,noexec,relatime,memory 0 0
cgroup /sys/fs/cgroup/cpuset cgroup rw,nosuid,nodev,noexec,relatime,cpuset 0 0
cgroup /sys/fs/cgroup/blkio cgroup rw,nosuid,nodev,noexec,relatime,blkio 0 0
Verify that the gpdb cgroup directories exist on the node:

ls -l <cgroup_mount_point>/cpu/gpdb
ls -l <cgroup_mount_point>/cpuset/gpdb
ls -l <cgroup_mount_point>/memory/gpdb
If the directories exist and are owned by mxadmin:mxadmin, the cgroup configuration for YMatrix database resource management has been successfully set up.
## Configuring cgroup v2

Configure the system to mount cgroups-v2 by default at boot via systemd. Run the following as root:
grubby --update-kernel=ALL --args="systemd.unified_cgroup_hierarchy=1"
Reboot the system for changes to take effect.
reboot now
Create the directory /sys/fs/cgroup/matrixdb6.service, add the required controllers, and ensure `mxadmin` has read-write access:
mkdir -p /sys/fs/cgroup/matrixdb6.service
echo "+cpuset +io +cpu +memory" | tee -a /sys/fs/cgroup/cgroup.subtree_control
chown -R mxadmin:mxadmin /sys/fs/cgroup/matrixdb6.service
You may encounter an "Invalid argument" error. This occurs because cgroup v2 does not support controlling real-time processes unless all such processes are in the root cgroup. In this case, move all real-time processes to the root cgroup and re-enable the controller.
Ensure that mxadmin has write permission to /sys/fs/cgroup/cgroup.procs. This allows moving YMatrix processes into /sys/fs/cgroup/matrixdb6.service/ after cluster startup to manage the postmaster and auxiliary processes:

chmod a+w /sys/fs/cgroup/cgroup.procs
Because resource groups manage these cgroup files manually, the settings above are lost after a reboot. Add the following systemd service so they are reapplied automatically during system startup. Perform the following steps as the root user:
1. Create the service file matrixdb6.service:
vim /etc/systemd/system/matrixdb6.service
2. Write the following content into matrixdb6.service, replacing mxadmin with the appropriate user if different.
[Unit]
Description=Greenplum Cgroup v2 Configuration Service
[Service]
Type=simple
WorkingDirectory=/sys/fs/cgroup/matrixdb6.service
Delegate=yes
Slice=-.slice
ExecCondition=bash -c '[ xcgroup2fs = x$(stat -fc "%%T" /sys/fs/cgroup) ] || exit 1'
ExecStartPre=bash -ec " \
    chown -R mxadmin:mxadmin .; \
    chmod a+w ../cgroup.procs; \
    mkdir -p helper.scope"
ExecStart=sleep infinity
ExecStartPost=bash -ec "echo $MAINPID > /sys/fs/cgroup/cgroup.procs;"

[Install]
WantedBy=basic.target
3. Reload `systemd` daemon and enable the service:
systemctl daemon-reload
systemctl enable matrixdb6.service
## Enabling Resource Groups
1. Set the `gp_resource_manager` server configuration parameter to `group`:
gpconfig -c gp_resource_manager -v "group"
2. Restart the YMatrix database cluster:
mxstop -arf
After enabling, any transaction submitted by a role will be directed to the resource group assigned to that role and will be subject to the concurrency, memory, and CPU limits of that group.
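After the restart, you can confirm that resource-group-based management is in effect, for example:

```sql
-- Should return "group" when resource groups are enabled
SHOW gp_resource_manager;
```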
YMatrix creates the resource groups `admin_group`, `default_group`, and `system_group` by default. When resource groups are enabled, any role not explicitly assigned to a resource group is assigned to a default group based on its function: `SUPERUSER` roles are assigned to `admin_group`, non-administrator roles to `default_group`, and system processes to `system_group`. Note that no role can be manually assigned to `system_group`.
Default configurations:
| Parameter | admin_group | default_group | system_group |
|---------|-------------|---------------|--------------|
| CONCURRENCY | 10 | 5 | 0 |
| CPU_MAX_PERCENT | 10 | 20 | 10 |
| CPU_WEIGHT | 100 | 100 | 100 |
| CPUSET | -1 | -1 | -1 |
| IO_LIMIT | -1 | -1 | -1 |
| MEMORY_QUOTA | -1 | -1 | -1 |
| MIN_COST | 0 | 0 | 0 |
## Creating Resource Groups
The `CREATE RESOURCE GROUP` command creates a new resource group. When creating a resource group for a role, provide a name and the CPU resource allocation mode (core or percentage). You must provide a value for either `CPU_MAX_PERCENT` or `CPUSET`.
**Usage Example**
Create a resource group named `rgroup1` with concurrency 20, CPU limit 20, memory quota 250 MB, CPU weight 500, and minimum cost 50:
CREATE RESOURCE GROUP rgroup1 WITH (CONCURRENCY=20, CPU_MAX_PERCENT=20, MEMORY_QUOTA=250, CPU_WEIGHT=500, MIN_COST=50);
CPU and memory limits are shared among all roles assigned to `rgroup1`.
## Altering Resource Groups

The `ALTER RESOURCE GROUP` command updates the limits of a resource group.
ALTER RESOURCE GROUP rg_role_light SET CONCURRENCY 7;
ALTER RESOURCE GROUP exec SET MEMORY_QUOTA 30;
ALTER RESOURCE GROUP rgroup1 SET CPUSET '1;2,4';
> ***Note!***
> The `CONCURRENCY` value for `admin_group` cannot be set or changed to 0.
## Dropping Resource Groups

Use `DROP RESOURCE GROUP` to drop a resource group. The group must not be assigned to any role and must have no active or queued transactions.
DROP RESOURCE GROUP exec;
## Configuring Automatic Query Termination Based on Memory Usage
YMatrix supports Runaway detection. Queries managed by resource groups can be automatically terminated based on memory usage.
Relevant parameters:
- `gp_vmem_protect_limit`: Sets the total memory available to all postgres processes on active segment instances. Exceeding this limit causes memory allocation failure and query termination.
- `runaway_detector_activation_percent`: When resource groups are enabled, if memory usage exceeds `gp_vmem_protect_limit` × `runaway_detector_activation_percent`, YMatrix terminates queries (except those in `system_group`) based on memory consumption, starting with the highest consumer, until usage drops below the threshold.
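For example, you can inspect the current values of these settings from any SQL session (parameter names as described above):

```sql
-- Total memory available to postgres processes on each active segment
SHOW gp_vmem_protect_limit;
-- Percentage of that limit at which Runaway termination begins
SHOW runaway_detector_activation_percent;
```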
## Assigning Resource Groups to Roles
- Use the `RESOURCE GROUP` clause in `CREATE ROLE` or `ALTER ROLE` to assign a resource group to a database role.
ALTER ROLE bill RESOURCE GROUP rg_light;
CREATE ROLE mary RESOURCE GROUP exec;
A resource group can be assigned to multiple roles. In role hierarchies, a parent role’s resource group assignment does not propagate to members.
- To remove a resource group assignment and assign the default group, set the role’s group to `NONE`:
ALTER ROLE mary RESOURCE GROUP NONE;
## Monitoring Resource Group Status
- View resource group limits
SELECT * FROM gp_toolkit.gp_resgroup_config;
- View resource group query status
SELECT * FROM gp_toolkit.gp_resgroup_status;
- View memory usage of resource groups on each host
SELECT * FROM gp_toolkit.gp_resgroup_status_per_host;
- View resource groups assigned to roles
SELECT rolname, rsgname FROM pg_roles, pg_resgroup WHERE pg_roles.rolresgroup = pg_resgroup.oid;
- View running and queued queries in resource groups
SELECT query, rsgname, wait_event_type, wait_event FROM pg_stat_activity;
- Cancel transactions running or queued in a resource group
To manually cancel a running or queued transaction, first identify the process ID (pid) associated with the transaction. After obtaining the pid, call `pg_cancel_backend()` to terminate the process.
Follow these steps:
- Run the following query to view process information for all currently active or idle statements across all resource groups. If no rows are returned, there are no running or queued transactions in any resource group.
SELECT rolname, g.rsgname, pid, waiting, state, query, datname
FROM pg_roles, gp_toolkit.gp_resgroup_status g, pg_stat_activity
WHERE pg_roles.rolresgroup = g.groupid
AND pg_stat_activity.usename = pg_roles.rolname;
- Example query output:
rolname | rsgname | pid | waiting | state | query | datname
---------+----------+---------+---------+--------+--------------------------+---------
sammy | rg_light | 31861 | f | idle | SELECT * FROM mytesttbl; | testdb
billy | rg_light | 31905 | t | active | SELECT * FROM topten; | testdb
- To cancel the transaction, pass its pid to `pg_cancel_backend()`. For example, to cancel the waiting query shown above:

SELECT pg_cancel_backend(31905);
Note!
Do not use the operating system's `KILL` command to cancel any YMatrix database process.
## Moving Queries Between Resource Groups

Superuser role members can use the gp_toolkit.pg_resgroup_move_query() function to move a running query from one resource group to another without stopping the query. This function can improve query performance by moving long-running queries to a resource group with higher resource allocation or availability.
pg_resgroup_move_query() moves only the specified query to the target resource group. Subsequent queries submitted by the session remain assigned to the original resource group.
Note!
Only active or running queries can be moved to a new resource group. Queries that are queued or pending in an idle state due to concurrency or memory constraints cannot be moved.
pg_resgroup_move_query() requires the process ID (pid) of the running query and the name of the destination resource group.
pg_resgroup_move_query(pid int4, group_name text);
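For example, to move the waiting query from the earlier output (pid 31905) into a destination group, assuming here a hypothetical group named `rg_heavy`:

```sql
-- Move the running query owned by backend 31905 into resource group rg_heavy
SELECT gp_toolkit.pg_resgroup_move_query(31905, 'rg_heavy');
```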
As described in "Cancel transactions running or queued in a resource group", you can use the gp_toolkit.gp_resgroup_status view to list each resource group's name, ID, and status.
When pg_resgroup_move_query() is called, the running query becomes subject to the configuration limits of the target resource group, including concurrency and memory limits.
If the target resource group has reached its concurrency limit, the database queues the query until a slot becomes available, or queues it for the number of milliseconds specified by gp_resource_group_queuing_timeout if set.
If the target resource group has free slots, pg_resgroup_move_query() attempts to transfer slot ownership to the target process, waiting up to the number of milliseconds specified by gp_resource_group_move_timeout. If the target process fails to respond within gp_resource_group_move_timeout, the database returns an error.
If pg_resgroup_move_query() is canceled but the target process has already acquired a slot, the segment process does not move to the new group, and the target process retains the slot. This inconsistent state is resolved when the transaction ends or when the next command is dispatched by the target process within the same transaction.
If the target resource group does not have sufficient available memory to meet the query’s current memory requirements, the database returns an error. You may either increase the shared memory allocated to the target resource group or wait until some running queries complete before calling the function again.
After moving a query, there is no guarantee that currently running queries in the target resource group will stay within the group's memory quota. In such cases, one or more running queries in the target group—including the moved query—may fail. To minimize this risk, reserve sufficient global shared memory for the resource group.