systemd having issues with cgroups: Operation not permitted on devices.list
Another SkySilk beta user asked me to help fix a "mistake" in the OpenVPN instance he had installed on a Debian 9 VPS.
I looked into it, and the problem was not with his OpenVPN configuration at all. I was able to start OpenVPN from the command line with the same configuration just fine, and I kept stopping and starting it over the course of two weeks; it never failed or had any issues while running.
As a systemd service, it "failed" to start only because systemd tries to do something with a 'devices.list' entry in a cgroup mounted on /sys/fs/cgroup/devices, and when that fails, it appears to give up without even trying to start the OpenVPN program.
The exact error message, visible in /var/log/daemon.log or by running "service openvpn status", is:
openvpn.service: Failed to reset devices.list: Operation not permitted
Here's the real mystery: on his test server, the same error showed up for many different services. Some were custom-installed, but others were part of the basic Debian install (like cron or networking.service). systemd was hitting the same error for all of those programs but still running them. I have no idea why it specifically chose not to start OpenVPN.
Further, the last effort to explore this problem resulted in a sudden change: now OpenVPN does get started by systemd, even though the error message still shows up. It has become a silent error, just like the other programs that systemd appears to start regardless.
I won't be able to look into this VPS any further as the Beta period has officially ended, but I will keep an eye out for this if anyone on the paid side of things is having trouble starting a program as a systemd service.
If it is true that systemd will sometimes decide not to start a service because of the devices.list error (and not because of something completely unexpected), then it should be taken seriously even if programs usually do start anyway: it means there is a risk of software breakage in the future.
Here are some examples of the error messages from the Beta VPS:
/var/log/daemon.log:
...
Sep 30 06:32:33 Beta systemd: rsyslog.service: Failed to reset devices.list: Operation not permitted
...
Sep 30 06:32:33 Beta systemd: cron.service: Failed to reset devices.list: Operation not permitted
...
Sep 30 06:32:33 Beta systemd: networking.service: Failed to reset devices.list: Operation not permitted
...
Sep 30 06:32:33 Beta systemd: openvpn.service: Failed to reset devices.list: Operation not permitted
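If you want to see how widespread the error is on your own VPS, a small script can pull the affected unit names out of the log. This is only a rough sketch: the function name `affected_units` and the regular expression are my own, and it assumes your log lines look like the ones quoted in this post.

```python
import re

# Matches the systemd message quoted in this post:
#   "<unit>.service: Failed to reset devices.list: <reason>"
PATTERN = re.compile(r"(\S+\.service): Failed to reset devices\.list:")


def affected_units(lines):
    """Return the sorted, de-duplicated unit names that logged the error."""
    units = set()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            units.add(match.group(1))
    return sorted(units)


# Sample lines copied from the log excerpt in this post.
SAMPLE = [
    "Sep 30 06:32:33 Beta systemd: rsyslog.service: Failed to reset devices.list: Operation not permitted",
    "Sep 30 06:32:33 Beta systemd: cron.service: Failed to reset devices.list: Operation not permitted",
    "Sep 30 06:32:33 Beta systemd: networking.service: Failed to reset devices.list: Operation not permitted",
    "Sep 30 06:32:33 Beta systemd: openvpn.service: Failed to reset devices.list: Operation not permitted",
]

for unit in affected_units(SAMPLE):
    print(unit)
```

To run it against a real log, pass an open file instead of `SAMPLE`, for example `affected_units(open("/var/log/daemon.log"))`. Reading that file usually requires root or membership in the adm group.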
I am not experienced with systemd or cgroups, so I can't figure out which operation is not permitted. For openvpn.service, three devices.list files exist in paths under /sys/fs/cgroup/devices with openvpn in the path name. When I tell systemd to stop the service, two of those disappear. I have tried to delete the last one and I get an "Operation not permitted" error, but that seems to be meaningless, as I also cannot delete the other two, which systemd mysteriously can.
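To repeat that check before and after stopping the service, something like the following lists the devices.list files for a given service. It is a minimal sketch: the helper name `find_devices_lists` is made up for illustration, and the default root assumes the cgroup v1 devices hierarchy is mounted at /sys/fs/cgroup/devices as it was on this VPS.

```python
import os


def find_devices_lists(root="/sys/fs/cgroup/devices", needle="openvpn"):
    """Return paths of devices.list files whose cgroup path contains `needle`."""
    hits = []
    # os.walk silently yields nothing if `root` does not exist.
    for dirpath, _dirnames, filenames in os.walk(root):
        if needle in dirpath and "devices.list" in filenames:
            hits.append(os.path.join(dirpath, "devices.list"))
    return sorted(hits)


if __name__ == "__main__":
    for path in find_devices_lists():
        print(path)
```

Running it once while the service is up and once after stopping it shows which cgroup directory is left behind.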
My best guess is that there is something about that last devices.list file that makes it un-removable through whatever mechanism systemd is able to remove the other 2, but I can't confirm that as I don't know how to make a devices.list file disappear inside a cgroup.
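For anyone investigating further, one thing worth checking (this is general cgroup v1 behaviour, not something confirmed on this VPS): control files such as devices.list are not deleted individually. They go away when the cgroup directory itself is removed with rmdir, and the kernel refuses that while the cgroup still contains processes or child cgroups. The sketch below reports both for a given cgroup directory. The helper name `removal_blockers` is hypothetical, and it assumes the standard cgroup.procs file is present.

```python
import os


def removal_blockers(cgroup_dir):
    """Return (pids, children): member PIDs and child cgroup names.

    In cgroup v1, either a non-empty PID list or any child cgroup is
    enough to make rmdir() on the directory fail.
    """
    pids = []
    procs_file = os.path.join(cgroup_dir, "cgroup.procs")
    if os.path.exists(procs_file):
        with open(procs_file) as handle:
            pids = [int(token) for token in handle.read().split()]
    children = sorted(
        entry.name for entry in os.scandir(cgroup_dir) if entry.is_dir()
    )
    return pids, children
```

If both lists come back empty for the leftover directory, the cause is more likely a permissions restriction imposed from outside the VPS than anything inside the cgroup itself, but that is a guess.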