Configuring Resource Groups

Verify the cgroup version configured in your environment by checking the filesystems mounted by default during system boot:

stat -fc %T /sys/fs/cgroup/

For cgroup v1, the output is tmpfs. For cgroup v2, the output is cgroup2fs.

If you do not need to change the cgroup version, skip directly to Configuring cgroup v1 or Configuring cgroup v2 to complete the setup.
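For scripting the check, the mapping from filesystem type to cgroup version can be expressed as a small helper (the function name is illustrative, not part of YMatrix):

```shell
#!/bin/sh
# Map the filesystem type reported by `stat -fc %T /sys/fs/cgroup/`
# to the cgroup version it indicates.
cgroup_version_from_fstype() {
    case "$1" in
        cgroup2fs) echo "v2" ;;
        tmpfs)     echo "v1" ;;
        *)         echo "unknown" ;;
    esac
}

# On a live system:
#   cgroup_version_from_fstype "$(stat -fc %T /sys/fs/cgroup/)"
```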

To switch from cgroup v1 to v2, run the following commands as root, then reboot for the change to take effect:

  • Red Hat 8 / Rocky 8 / Oracle 8:
    grubby --update-kernel=/boot/vmlinuz-$(uname -r) --args="systemd.unified_cgroup_hierarchy=1"
  • Ubuntu:
    vim /etc/default/grub
    # add or modify: GRUB_CMDLINE_LINUX="systemd.unified_cgroup_hierarchy=1"
    update-grub

To switch from cgroup v2 to v1, run the following commands as root, then reboot for the change to take effect:

  • Red Hat 8 / Rocky 8 / Oracle 8:
    grubby --update-kernel=/boot/vmlinuz-$(uname -r) --args="systemd.unified_cgroup_hierarchy=0 systemd.legacy_systemd_cgroup_controller"
  • Ubuntu:
    vim /etc/default/grub
    # add or modify: GRUB_CMDLINE_LINUX="systemd.unified_cgroup_hierarchy=0"
    update-grub
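After rebooting, you can confirm that the kernel parameter took effect by inspecting /proc/cmdline. A minimal sketch (the helper name is an illustration, not an official tool):

```shell
#!/bin/sh
# Print the value of systemd.unified_cgroup_hierarchy found in a
# kernel command line string, or "unset" if the parameter is absent.
unified_hierarchy_setting() {
    for tok in $1; do
        case "$tok" in
            systemd.unified_cgroup_hierarchy=*)
                echo "${tok#*=}"
                return 0
                ;;
        esac
    done
    echo "unset"
}

# On a live system:
#   unified_hierarchy_setting "$(cat /proc/cmdline)"
```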

If you intend to continue using cgroup v1, ensure that no limit is set in any memory.limit_in_bytes file under /sys/fs/cgroup/memory/gpdb/. This includes both /sys/fs/cgroup/memory/gpdb/memory.limit_in_bytes and each per-group file /sys/fs/cgroup/memory/gpdb/[OID]/memory.limit_in_bytes. If a limit exists, clear it by running the following in each directory that contains such a file:

echo -1 > memory.limit_in_bytes

Then reboot the host for the changes to take effect.
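When several per-group limit files exist, they can be cleared in one pass. A sketch, assuming the directory layout described above (requires root on a live system):

```shell
#!/bin/sh
# Write -1 (meaning "no limit") into every memory.limit_in_bytes file
# found under the given cgroup v1 memory directory.
clear_memory_limits() {
    base="$1"    # e.g. /sys/fs/cgroup/memory/gpdb
    find "$base" -name memory.limit_in_bytes |
    while read -r f; do
        echo -1 > "$f"
    done
}
```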

Configuring cgroup v1

  1. On every node in the cluster:

Note!
You must use a superuser or a user with sudo privileges to edit this file.

vi /etc/cgconfig.conf

Add the following content to the configuration file:

group gpdb {
     perm {
         task {
             uid = mxadmin;
             gid = mxadmin;
         }
         admin {
             uid = mxadmin;
             gid = mxadmin;
         }
     }
     cpu {
     }
     cpuacct {
     }
     cpuset {
     }
     memory {
     }
}

This configuration creates a gpdb cgroup with the cpu, cpuacct, cpuset, and memory controllers enabled, giving the mxadmin user control over CPU bandwidth, CPU usage accounting, CPU core affinity, and memory limits.

  2. Enable the cgroup service on every node of the YMatrix cluster:
cgconfigparser -l /etc/cgconfig.conf
systemctl enable cgconfig.service
systemctl start cgconfig.service
  3. Identify the cgroup mount point on the node:
grep cgroup /proc/mounts
  • The first line of output shows the tmpfs mounted at /sys/fs/cgroup, which serves as the cgroup mount point; the individual controller hierarchies are mounted beneath it:
tmpfs /sys/fs/cgroup tmpfs ro,nosuid,nodev,noexec,mode=755 0 0
cgroup /sys/fs/cgroup/systemd cgroup rw,nosuid,nodev,noexec,relatime,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd 0 0
cgroup /sys/fs/cgroup/devices cgroup rw,nosuid,nodev,noexec,relatime,devices 0 0
cgroup /sys/fs/cgroup/net_cls,net_prio cgroup rw,nosuid,nodev,noexec,relatime,net_prio,net_cls 0 0
cgroup /sys/fs/cgroup/pids cgroup rw,nosuid,nodev,noexec,relatime,pids 0 0
cgroup /sys/fs/cgroup/freezer cgroup rw,nosuid,nodev,noexec,relatime,freezer 0 0
cgroup /sys/fs/cgroup/cpu,cpuacct cgroup rw,nosuid,nodev,noexec,relatime,cpuacct,cpu 0 0
cgroup /sys/fs/cgroup/perf_event cgroup rw,nosuid,nodev,noexec,relatime,perf_event 0 0
cgroup /sys/fs/cgroup/hugetlb cgroup rw,nosuid,nodev,noexec,relatime,hugetlb 0 0
cgroup /sys/fs/cgroup/memory cgroup rw,nosuid,nodev,noexec,relatime,memory 0 0
cgroup /sys/fs/cgroup/cpuset cgroup rw,nosuid,nodev,noexec,relatime,cpuset 0 0
cgroup /sys/fs/cgroup/blkio cgroup rw,nosuid,nodev,noexec,relatime,blkio 0 0
  4. Verify the configuration:
ls -l <cgroup_mount_point>/cpu/gpdb
ls -l <cgroup_mount_point>/cpuset/gpdb
ls -l <cgroup_mount_point>/memory/gpdb

If these directories exist and are owned by mxadmin:mxadmin, then cgroup has been successfully configured for YMatrix resource management.
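The verification above can be scripted; this sketch (the function name is hypothetical) checks that each gpdb directory exists and has the expected owner:

```shell
#!/bin/sh
# Check that <mount_point>/{cpu,cpuset,memory}/gpdb exist and are
# owned by the expected user; print a status line per directory and
# return non-zero if any check fails.
check_gpdb_cgroups() {
    mount_point="$1"
    owner="$2"
    rc=0
    for ctrl in cpu cpuset memory; do
        dir="$mount_point/$ctrl/gpdb"
        if [ -d "$dir" ] && [ "$(stat -c %U "$dir")" = "$owner" ]; then
            echo "OK   $dir"
        else
            echo "FAIL $dir"
            rc=1
        fi
    done
    return $rc
}

# Example: check_gpdb_cgroups /sys/fs/cgroup mxadmin
```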

Configuring cgroup v2

  1. Configure the system to mount cgroup v2 by default at boot via the systemd manager (run as root):
grubby --update-kernel=ALL --args="systemd.unified_cgroup_hierarchy=1"
  2. Reboot the system to apply the change:
reboot
  3. Create the directory /sys/fs/cgroup/matrixdb6.service, enable all required controllers, and ensure the mxadmin user has read-write access:
mkdir -p /sys/fs/cgroup/matrixdb6.service
echo "+cpuset +io +cpu +memory" | tee -a /sys/fs/cgroup/cgroup.subtree_control
chown -R mxadmin:mxadmin /sys/fs/cgroup/matrixdb6.service

You may encounter an "invalid argument" error when running the above commands. This occurs because cgroup v2 does not support controlling real-time processes unless all real-time processes reside in the root cgroup. In this case, identify all real-time processes, move them into the root cgroup, and then re-enable the controllers.
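One way to locate real-time processes is to filter ps output on the RTPRIO column, which is "-" for normal processes. A sketch (the filter name is illustrative; moving a process means writing its PID to /sys/fs/cgroup/cgroup.procs as root):

```shell
#!/bin/sh
# Read `ps -eo pid,rtprio,comm` output on stdin and print the PID and
# command of every process with a real-time priority set.
list_realtime() {
    awk 'NR > 1 && $2 != "-" { print $1, $3 }'
}

# On a live system (as root):
#   ps -eo pid,rtprio,comm | list_realtime
#   echo <pid> > /sys/fs/cgroup/cgroup.procs   # for each listed PID
```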

Ensure mxadmin has write permission to /sys/fs/cgroup/cgroup.procs. This allows YMatrix processes to be moved from the user slice into /sys/fs/cgroup/matrixdb6.service/ after cluster startup, enabling management of the postmaster process and its auxiliary processes:

chmod a+w /sys/fs/cgroup/cgroup.procs

Because resource groups manage these cgroup files manually rather than through systemd, the permission setting is lost after a system reboot. To make it persistent, configure systemd to run a script automatically at boot. Perform the following steps as root:

  1. Create the matrixdb6.service unit file:

    vim /etc/systemd/system/matrixdb6.service
  2. Add the following content to matrixdb6.service. Replace mxadmin with the appropriate user if different:

[Unit]
Description=Greenplum Cgroup v2 Configuration Service
[Service]
Type=simple
WorkingDirectory=/sys/fs/cgroup/matrixdb6.service
Delegate=yes
Slice=-.slice

# set hierarchies only if cgroup v2 mounted
ExecCondition=bash -c '[ xcgroup2fs = x$(stat -fc "%%T" /sys/fs/cgroup) ] || exit 1'
ExecStartPre=bash -ec " \
chown -R mxadmin:mxadmin .; \
chmod a+w ../cgroup.procs; \
mkdir -p helper.scope"
ExecStart=sleep infinity
ExecStartPost=bash -ec "echo $MAINPID > /sys/fs/cgroup/cgroup.procs;"
[Install]
WantedBy=basic.target
  3. Reload the systemd daemon and enable the service:
systemctl daemon-reload
systemctl enable matrixdb6.service