error: cgroup namespace 'freezer' not mounted. aborting
I am trying to start slurmd:
sudo systemctl start slurmd
When I check the status of the daemon, it reports an error:
>>sudo systemctl status slurmd
● slurmd.service - Slurm node daemon
Loaded: loaded (/lib/systemd/system/slurmd.service; enabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Mon 2020-06-29 18:13:06 MSK; 2s ago
Docs: man:slurmd(8)
Process: 13402 ExecStart=/usr/sbin/slurmd $SLURMD_OPTIONS (code=exited, status=1/FAILURE)
июн 29 18:13:06 ecm systemd[1]: Starting Slurm node daemon...
июн 29 18:13:06 ecm slurmd-ecm[13402]: Message aggregation disabled
июн 29 18:13:06 ecm slurmd-ecm[13402]: error: cgroup namespace 'freezer' not mounted. aborting
июн 29 18:13:06 ecm slurmd-ecm[13402]: error: unable to create freezer cgroup namespace
июн 29 18:13:06 ecm slurmd-ecm[13402]: error: Couldn't load specified plugin name for proctrack/cgroup: Plugin init() callback failed
июн 29 18:13:06 ecm slurmd-ecm[13402]: error: cannot create proctrack context for proctrack/cgroup
июн 29 18:13:06 ecm systemd[1]: slurmd.service: Control process exited, code=exited, status=1/FAILURE
июн 29 18:13:06 ecm slurmd-ecm[13402]: error: slurmd initialization failed
июн 29 18:13:06 ecm systemd[1]: slurmd.service: Failed with result 'exit-code'.
июн 29 18:13:06 ecm systemd[1]: Failed to start Slurm node daemon.
I don't know how to fix this and would appreciate any help. I am using Slurm 18.08.05 on Debian 10.
UPD: I changed the ProctrackType value in slurm.conf to proctrack/linuxproc:
ProctrackType=proctrack/linuxproc
Now everything works.
Solution 1:[1]
Contrary to what the documentation (man cgroup.conf) states, the default value of the CgroupMountpoint parameter does not work here. Set it explicitly:
echo CgroupMountpoint=/sys/fs/cgroup >> /etc/slurm-llnl/cgroup.conf
After that, you can set ProctrackType back to proctrack/cgroup. Tested on Debian 10.7 with slurmd from slurm-wlm 18.08.5-2.
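Before editing the config, it can help to see where the freezer controller is actually mounted, and to avoid appending the same line twice. The following is only a sketch: the helper names `freezer_mountpoint` and `set_if_missing` are hypothetical, and it assumes cgroup v1 mounts listed in /proc/mounts. As far as I understand, CgroupMountpoint should be the parent directory that contains the per-controller mounts (for example /sys/fs/cgroup when freezer is at /sys/fs/cgroup/freezer).

```shell
#!/bin/sh
# Hypothetical helper: print the mount point of the cgroup v1 'freezer'
# controller, reading a mounts table (defaults to /proc/mounts).
freezer_mountpoint() {
    mounts_file="${1:-/proc/mounts}"
    # Fields per line: device mountpoint fstype options dump pass
    awk '$3 == "cgroup" && $4 ~ /(^|,)freezer(,|$)/ { print $2; exit }' "$mounts_file"
}

# Hypothetical helper: append key=value to a config file only if the key
# is not already set there, so repeated runs do not duplicate the line.
set_if_missing() {
    conf_file="$1"; key="$2"; value="$3"
    if ! grep -q "^${key}=" "$conf_file" 2>/dev/null; then
        printf '%s=%s\n' "$key" "$value" >> "$conf_file"
    fi
}

# Example use (run as root; the path is taken from the answer above):
#   freezer_mountpoint
#   set_if_missing /etc/slurm-llnl/cgroup.conf CgroupMountpoint /sys/fs/cgroup
```

If `freezer_mountpoint` prints nothing, the freezer controller is not mounted as a cgroup v1 hierarchy at all, and setting CgroupMountpoint alone is unlikely to help.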
Solution 2:[2]
In my case, this happened because I didn't create and configure my cgroup.conf on the nodes running slurmd. Once this was added to the same directory as slurm.conf, it worked fine. CgroupMountpoint did not need to be defined as the default was sufficient.
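A quick way to check for this situation on a node is to test whether cgroup.conf sits next to slurm.conf. This is a minimal sketch; the helper name `has_cgroup_conf` is hypothetical, and the config directory is an assumption that depends on the package (the other answers here use /etc/slurm-llnl and /etc/slurm).

```shell
#!/bin/sh
# Hypothetical helper: succeed only if both slurm.conf and cgroup.conf
# exist in the given configuration directory.
has_cgroup_conf() {
    conf_dir="$1"
    if [ -f "$conf_dir/slurm.conf" ] && [ -f "$conf_dir/cgroup.conf" ]; then
        echo "ok: $conf_dir/cgroup.conf found"
        return 0
    fi
    echo "missing: slurm.conf or cgroup.conf not found in $conf_dir"
    return 1
}

# Example use on a compute node (directory is an assumption):
#   has_cgroup_conf /etc/slurm-llnl
```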
Solution 3:[3]
I had the same error on my cluster because my cgroup.conf wasn't configured.
A simple /etc/slurm/cgroup.conf containing
CgroupAutomount=yes
ConstrainCores=no
ConstrainRAMSpace=no
then
systemctl restart slurmd
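After restarting, one way to confirm the fix is to look for the original error message in the recent slurmd log. The sketch below assumes the log is available through journalctl, as in the question; the helper name `has_freezer_error` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read log text on stdin and succeed if the
# "freezer not mounted" error from the question is present.
has_freezer_error() {
    grep -q "cgroup namespace 'freezer' not mounted"
}

# Example use after the restart:
#   journalctl -u slurmd -n 50 --no-pager | has_freezer_error \
#       && echo "slurmd still reports the freezer error"
#   systemctl is-active slurmd
```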
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | Adam Maulis |
Solution 2 | Paul |
Solution 3 | gaeldb |