killing trees of Unix processes

Unix semantics around processes and signals are a bit gnarly.

One issue I have with signals is that they conflate doing some simple IPC with a process, and doing something to a process.  In other words, the catchable signals and the uncatchable signals (and perhaps uncaught catchable signals -- it would depend on the design) are such categorically different things that they should not be lumped together under "signals".  You want to kill a process, you kill it.  It doesn't have a say.  You want to let a process know about something, or request it do something?  Great, you talk to it; that's a different thing.

An issue I have with processes is that they are nominally organised into a hierarchy, via each process's parent process id, established at fork time -- but this hierarchy is hard to make use of, and easily falls apart.

Trees of processes naturally represent useful things.  The OS as a whole is the tree rooted (conventionally) at init, with its PID of 1.  Trees managed under process supervisors like svscan or systemd represent servers.  The tree under a text console login process is a user's text console session.  The tree under a display manager may be a user's graphical session.  Jobs in a shell, including foreground jobs, are their own trees.  And an application instance, if so architected, will be its own process tree.

It's clear from the above that these trees arise naturally from various aspects of the system's operation.  Just as trees of processes get started together, so one wants to stop them together as logical units, and this is where things are not so nice.

In general there is no easy way to signal a subtree of processes rooted at a given PID.

Example time.  mupdf is implemented as a shell script calling some binary, so you end up with two levels in the process hierarchy: an extra sh, and then some binary like mupdfx11.  Press ctrl-C in the terminal it was launched from, and mupdf quits as expected, meaning this little process tree is all gone.  But send an INT signal (the one ctrl-C generates) to the intermediate sh via the kill command instead, and just this sh exits, observable in the terminal: the prompt comes back.  The mupdf graphical application, which was in the same process tree, keeps going.  Why the difference?  Ctrl-C ends up signalling not an individual process, but a whole process group.

The shell in the terminal originally launched the process tree arising from the user's mupdf command in its own process group, for example by calling setpgrp(2) in the first child before doing any exec-ing or further fork-ing.  I think this is known as a "job", but I'm not clear on the distinction between job and process group (or indeed session, to be honest).  When the user presses ctrl-C, the terminal driver generates INT and delivers it to every process in the foreground process group in one go -- the group the shell set up for the job.  The same all-at-once addressing is available from user space, because kill is defined so that a negative integer means send the signal to every process in that process group.
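
Here is that convention in miniature, in Perl (whose kill builtin passes it straight through to kill(2)); signal 0 delivers nothing and merely checks that delivery would work:

    #!/usr/bin/perl
    # Probe one process, then a whole process group, using kill's
    # negative-PID convention.  Signal 0 checks deliverability only.
    use strict;
    use warnings;

    my $pgid = getpgrp;                  # our own group, for demonstration
    kill(0, $$)     or warn "no such process: $$\n";    # one process
    kill(0, -$pgid) or warn "no such group: $pgid\n";   # the whole group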

This illustrates one issue with the design: programs have to set up job control deliberately, by setting the process group id at the right points in the tree.  You can't just kill an arbitrary subtree.  Being able to kill arbitrary subtrees just seems like a much cleaner and smaller design.

You can try selecting the processes in the tree programmatically, because each process has a parent pid (it doesn't have child pids!  So you have to look at all the processes...).  But there is a race condition: you don't know which processes have come or gone while you walked the tree.  kill(-pgid) is at least an atomic way of saying what you want, even if the resulting actions are not going to be (can't be) atomic.
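
For illustration, here's roughly what that programmatic selection might look like -- a sketch only, with exactly the race just described (anything forked after ps ran is invisible, anything exited is stale):

    #!/usr/bin/perl
    # Collect the subtree of processes under a given PID by inverting the
    # parent links reported by ps, then signal the lot.
    use strict;
    use warnings;

    my $root = shift // die "usage: $0 PID\n";

    my %children;                        # ppid => [child pids]
    for (`ps -e -o pid= -o ppid=`) {
        my ($pid, $ppid) = split ' ';
        push @{ $children{$ppid} }, $pid;
    }

    my @subtree = ($root);               # breadth-first walk from the root
    for (my $i = 0; $i < @subtree; $i++) {
        push @subtree, @{ $children{ $subtree[$i] } // [] };
    }

    kill('TERM', @subtree);              # some PIDs may be gone already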

If a parent process exits, its child processes' parent PIDs are set to 1, which is init.  This is the main way the process hierarchy falls apart.  There is a clear use case for child processes outliving their parents (think of a server restarting while worker subprocesses are mid-way through serving requests or sessions), but it's not clear it should happen by default.  It's easy to end up with processes of unknown provenance, effectively outside the tree (they are arbitrarily attached at the top level, as children of process 1).
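
This is easy to watch happen.  In this little demonstration, the child prints its parent PID while the parent is alive, and again after the parent has exited -- the second line typically says 1:

    #!/usr/bin/perl
    # Watch a child get reparented when its parent exits.
    use strict;
    use warnings;

    if (my $pid = fork) {
        sleep 1;                         # let the child print once
        exit;                            # orphan the child
    }
    else {
        print "ppid while parent alive: ", getppid, "\n";
        sleep 2;                         # outlive the parent
        print "ppid after parent exit:  ", getppid, "\n";   # typically 1
    }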

Why are processes reparented straight to init?  Why not "re-attach" them at the closest point in the hierarchy that still exists?  If the grandparent process still exists when the parent process exits, reparent the children to the grandparent (please note: "exists" and "exits" are different words in the above, with almost opposite meanings).

The possibility of breaking out of the process tree has led to the contagious brain cancer that is daemons all doing the "detachment dance", and maybe offering a special way to disable this for anyone sane.
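
For reference, the dance in question usually looks something like the following -- a sketch of the classic recipe, not any particular daemon's code:

    #!/usr/bin/perl
    # The "detachment dance": double-fork plus setsid, leaving a process
    # reparented to init, in its own session, with no controlling terminal
    # and no way to reacquire one.
    use strict;
    use warnings;
    use POSIX qw(setsid);

    exit if fork;                        # parent returns to the caller
    die "setsid: $!\n" if setsid() < 0;  # new session, no controlling tty
    exit if fork;                        # session leader exits: no tty, ever
    chdir '/';                           # don't pin the old working directory
    open STDIN,  '<', '/dev/null';
    open STDOUT, '>', '/dev/null';
    open STDERR, '>', '/dev/null';
    # ... the daemon body runs here, outside the tree it started in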

Another point in favour of working with the existing process group setup: when a new process group is created via setpgid() (or the setpgrp seen below), its id is the PID of the process that becomes the group leader, as it's called.  So, if you know which process is the leader, you don't have to track the process group id separately.  It's just the PID.
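
A two-line check of that claim:

    #!/usr/bin/perl
    # After setpgrp, the process group id is just our own PID.
    use strict;
    use warnings;

    setpgrp;                             # new group, led by this process
    printf "pid=%d pgid=%d\n", $$, getpgrp;    # the two numbers match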

And why is no one playing with basic OS design any more?

PS: to adapt my view-and-rename-pdfs script, in perl, to properly terminate the pdf viewer once the pdf has been renamed, and not just kill some intermediate process, the changes were tiny.

In the parent process there's now a minus sign when signalling the child by PID (now PGID) with kill, so:

    if (my $child = fork) {
        ...
        kill('KILL', $child) or die "kill: $!\n";
        ...
    }

changed to

    kill('KILL', -$child) or die "kill: $!\n";

and in the child process, there's now a setpgrp before doing the exec:

    else {
        setpgrp;
        exec("mupdf", $f_full) or die;
    }
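
Putting the two fragments together, a minimal self-contained version of the pattern might look like this (mupdf and a five-second wait stand in for the real script's viewing-and-renaming logic):

    #!/usr/bin/perl
    # Launch a viewer in its own process group, then kill the whole group.
    use strict;
    use warnings;

    my $f_full = shift // die "usage: $0 file.pdf\n";

    if (my $child = fork) {
        sleep 5;                                    # stand-in for real work
        kill('KILL', -$child) or die "kill: $!\n";  # -PID: the whole group
        waitpid($child, 0);                         # reap the group leader
    }
    else {
        die "fork: $!\n" unless defined $child;
        setpgrp;                                    # new group, we lead it
        exec("mupdf", $f_full) or die "exec: $!\n";
    }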
