This Article Is: Created at 2024-05-03. Last Modified at 2024-05-03. Referenced as ia.www.f33.

Linux: Only One New PID Namespace Per Process

I have the general feeling that Linux does not like sandboxing.

I was trying to write my own process supervisor using PID namespaces. In Erlang/OTP, one supervisor can have multiple supervisors as children. On Linux though… not so easy. (Yes, it works if the main process forks an intermediate child for each namespace, and that child does the unshare, fork, wait.)
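The workaround in the parenthesis can be sketched roughly as follows. This is only a sketch: `spawn_in_new_pidns`, `check_is_pid1`, `count_namespaces_started`, and the exit codes 125 and 126 are names and values made up for illustration, and everything returns -1 when the kernel refuses `unshare` (for example without CAP_SYS_ADMIN).

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define UNSHARE_REFUSED 125 /* hypothetical exit code: unshare() failed */
#define HELPER_FAILED 126   /* hypothetical exit code: fork/wait failed */

/* Example init: exits 0 only if it really is PID 1 of its namespace. */
static int check_is_pid1(void) { return getpid() == 1 ? 0 : 1; }

/* Fork a throwaway helper; the helper unshares, forks init, and waits.
 * Returns the helper's pid as seen by the caller, or -1 if fork failed. */
pid_t spawn_in_new_pidns(int (*init_main)(void)) {
    assert(init_main != NULL);
    pid_t helper = fork();
    if (helper != 0) return helper;      /* caller: helper pid, or -1 */
    if (unshare(CLONE_NEWPID) != 0) _exit(UNSHARE_REFUSED);
    pid_t init = fork();                 /* becomes PID 1 of the new namespace */
    if (init < 0) _exit(HELPER_FAILED);
    if (init == 0) _exit(init_main());
    int status;
    if (waitpid(init, &status, 0) < 0) _exit(HELPER_FAILED);
    _exit(WIFEXITED(status) ? WEXITSTATUS(status) : HELPER_FAILED);
}

/* Start up to 16 namespaces from one process. Returns how many inits saw
 * themselves as PID 1, or -1 if any namespace could not be created. */
int count_namespaces_started(int count) {
    pid_t helpers[16];
    int ok = 0, failed = 0;
    if (count > 16) count = 16;
    for (int i = 0; i < count; i++)
        helpers[i] = spawn_in_new_pidns(check_is_pid1);
    for (int i = 0; i < count; i++) {
        int status;
        if (helpers[i] < 0 || waitpid(helpers[i], &status, 0) < 0) {
            failed = 1;
            continue;
        }
        if (WIFEXITED(status) && WEXITSTATUS(status) == 0) ok++;
        else failed = 1;
    }
    return failed ? -1 : ok;
}
```

Calling `count_namespaces_started(2)` from a privileged process should start two independent namespaces, at the cost of one extra helper process per namespace.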

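After unshare, the process can only usefully fork once. The first child becomes PID 1 (init) of the new namespace, and once that child has exited, every later fork fails with ENOMEM.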

unshare(CLONE_NEWPID); // ok
fork(); // ok
fork(); // ENOMEM
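A self-contained way to try this, assuming the first child exits before the second fork. The helper name `second_fork_errno` and the exit code 255 are invented here. The experiment runs in a throwaway child so the calling process keeps its own namespace state.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sched.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define SETUP_FAILED 255 /* hypothetical exit code: experiment could not run */

/* Returns the errno of the second fork (0 if it succeeded), or -1 if the
 * experiment could not be set up (e.g. unshare refused without CAP_SYS_ADMIN). */
int second_fork_errno(void) {
    pid_t probe = fork();
    if (probe < 0) return -1;
    if (probe == 0) {
        if (unshare(CLONE_NEWPID) != 0) _exit(SETUP_FAILED);
        pid_t init = fork();             /* first fork: PID 1 of the new namespace */
        if (init < 0) _exit(SETUP_FAILED);
        if (init == 0) _exit(0);         /* init exits right away */
        if (waitpid(init, NULL, 0) < 0) _exit(SETUP_FAILED);
        pid_t second = fork();           /* second fork: the namespace has lost its init */
        if (second == 0) _exit(0);
        if (second > 0) {
            waitpid(second, NULL, 0);
            _exit(0);
        }
        int err = errno;
        assert(err > 0 && err < SETUP_FAILED); /* must fit in the exit status */
        _exit(err);
    }
    int status;
    if (waitpid(probe, &status, 0) < 0 || !WIFEXITED(status)) return -1;
    int code = WEXITSTATUS(status);
    return code == SETUP_FAILED ? -1 : code;
}
```

With enough privilege the return value should be ENOMEM. A return of -1 only means the experiment could not run.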

The process cannot unshare twice.

unshare(CLONE_NEWPID); // ok
fork(); // ok
unshare(CLONE_NEWPID); // EINVAL
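The same kind of probe for the second unshare, again as a sketch with invented names (`second_unshare_errno`, exit code 255) and again run in a throwaway child:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sched.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define SETUP_FAILED 255 /* hypothetical exit code: experiment could not run */

/* Returns the errno of the second unshare (0 if it succeeded), or -1 if the
 * first unshare was refused or the experiment otherwise could not run. */
int second_unshare_errno(void) {
    pid_t probe = fork();
    if (probe < 0) return -1;
    if (probe == 0) {
        if (unshare(CLONE_NEWPID) != 0) _exit(SETUP_FAILED);
        pid_t init = fork();             /* PID 1 of the first new namespace */
        if (init < 0) _exit(SETUP_FAILED);
        if (init == 0) _exit(0);
        if (waitpid(init, NULL, 0) < 0) _exit(SETUP_FAILED);
        if (unshare(CLONE_NEWPID) == 0) _exit(0); /* second attempt */
        int err = errno;
        assert(err > 0 && err < SETUP_FAILED);    /* must fit in the exit status */
        _exit(err);
    }
    int status;
    if (waitpid(probe, &status, 0) < 0 || !WIFEXITED(status)) return -1;
    int code = WEXITSTATUS(status);
    return code == SETUP_FAILED ? -1 : code;
}
```

With enough privilege this should return EINVAL, matching the snippet above.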

Run the source code yourself and see if you can overcome this limitation: https://git.envs.net/iacore/newpid-supervisor

Also, as a sidenote, you need CAP_SYS_ADMIN to use unshare(CLONE_NEWPID). You get it either by running as root or, in bwrap's case, by creating an unprivileged uid/gid sandbox (a user namespace) that maps a subuid to 0/root.
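A rough sketch of the unprivileged route, assuming the kernel allows unprivileged user namespaces. This is not bwrap's actual code: it maps the caller's own effective uid (not a subuid) to 0, skips the gid mapping, and the names `write_file` and `pidns_without_root` are invented.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <sched.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: write a short string to an existing file; 0 on success. */
static int write_file(const char *path, const char *text, size_t len) {
    int fd = open(path, O_WRONLY);
    if (fd < 0) return -1;
    ssize_t written = write(fd, text, len);
    close(fd);
    return written == (ssize_t)len ? 0 : -1;
}

/* Returns 0 if a PID namespace was created without root and its first child
 * saw itself as PID 1 with uid 0; returns -1 if any step was refused. */
int pidns_without_root(void) {
    pid_t probe = fork();
    if (probe < 0) return -1;
    if (probe == 0) {
        uid_t outer_uid = geteuid();
        /* The new user namespace grants CAP_SYS_ADMIN over the new PID namespace. */
        if (unshare(CLONE_NEWUSER | CLONE_NEWPID) != 0) _exit(1);
        char map[64];
        int len = snprintf(map, sizeof map, "0 %u 1\n", (unsigned)outer_uid);
        assert(len > 0 && (size_t)len < sizeof map);
        if (write_file("/proc/self/uid_map", map, (size_t)len) != 0) _exit(1);
        pid_t init = fork();             /* PID 1 of the new namespace */
        if (init < 0) _exit(1);
        if (init == 0) _exit(getpid() == 1 && getuid() == 0 ? 0 : 1);
        int status;
        if (waitpid(init, &status, 0) < 0) _exit(1);
        _exit(WIFEXITED(status) ? WEXITSTATUS(status) : 1);
    }
    int status;
    if (waitpid(probe, &status, 0) < 0 || !WIFEXITED(status)) return -1;
    return WEXITSTATUS(status) == 0 ? 0 : -1;
}
```

Whether this returns 0 depends on the system: some distributions and container runtimes disable or restrict unprivileged user namespaces, in which case it returns -1.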

The feeling is mutual.

Capsicum is good. Use capsicum.