On Unix systems, fork is a rather simple system call.
Our implementation in glibc is, and needs to be, rather bulky.
For example, it has to duplicate all port rights for the new Mach task. The address space can simply be duplicated by standard Mach means, but file descriptors (for example) are a concept that is implemented inside glibc (based on Mach ports), so they have to be duplicated from userspace. This requires a small number of RPCs for each of them, and in sum this affects performance when new processes are continuously being spawned, for example from the shell.
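As a small illustration of why descriptors have to be duplicated at all, the following sketch (plain POSIX, nothing Hurd-specific) has a forked child write through its inherited copy of a pipe descriptor. The helper name byte_from_forked_child is made up for this example. It only shows the semantics that fork has to provide, not how glibc provides them.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns the byte that the child wrote through its
   inherited copy of the pipe's write end, or -1 on error. */
int byte_from_forked_child(void)
{
    int fds[2];
    if (pipe(fds) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: both descriptors are valid here because fork duplicated them. */
        close(fds[0]);
        char c = 'x';
        ssize_t n = write(fds[1], &c, 1);
        _exit(n == 1 ? 0 : 1);
    }

    /* Parent: close the write end so that read sees EOF if the child fails. */
    close(fds[1]);
    char c = 0;
    ssize_t n = read(fds[0], &c, 1);
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return n == 1 ? (unsigned char) c : -1;
}
```

Every descriptor that is open at the time of the call has to be usable in the child like this, whether or not the child ever touches it.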
Often, a fork call will eventually be followed by an exec, which may in turn close (most of) the duplicated port rights. Unfortunately, this cannot be known at the time fork is executing, so in order to optimize this, the code calling fork has to be modified instead: the fork, exec combination is replaced by a posix_spawn call, for example, which avoids duplicating each port right only to close it again.
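A minimal sketch of what such a change could look like in calling code, assuming a POSIX system that provides posix_spawnp. The helper name spawn_and_wait is made up for this example. Whether this actually saves work depends on how the C library implements posix_spawn, since some implementations build it on top of fork or vfork internally.

```c
#include <spawn.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/* Hypothetical helper: run argv[0] (looked up in PATH) with the caller's
   environment and wait for it.  Returns the child's exit status, or -1 if
   spawning or waiting failed or the child did not exit normally. */
int spawn_and_wait(char *const argv[])
{
    pid_t pid;
    int status;

    /* posix_spawnp returns an error number directly instead of setting errno.
       The two NULL arguments mean: no file actions, default attributes. */
    int err = posix_spawnp(&pid, argv[0], NULL, NULL, argv, environ);
    if (err != 0) {
        fprintf(stderr, "posix_spawnp: %s\n", strerror(err));
        return -1;
    }

    if (waitpid(pid, &status, 0) < 0) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Code that does work between fork and exec (redirecting descriptors, for instance) would additionally need posix_spawn_file_actions_t or posix_spawnattr_t objects in place of the two NULL arguments.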
As far as we know, Cygwin has the same problem of fork being a nontrivial operation. Perhaps we can learn from what they've been doing? Also, perhaps they have patches for software packages that avoid using fork followed by exec, for example.
TODO
fork: mach_port_mod_refs: EKERN_UREFS_OVERFLOW (open issue glibc).
Include and de-duplicate information from elsewhere: hurd-paper, hurd-talk, trivialconfinementvsconstructorvsfork, zalloc panics (open issue glibc, open issue documentation).
We no longer support MACH_IPC_COMPAT, thus we can get rid of the err = __mach_port_allocate_name ([...]); if (err == KERN_NAME_EXISTS) code (open issue glibc).
Can we/why can't we use the concept of inherited ports arrays/mach_ports_register (open issue glibc)?
GNUnet vfork signal race issue: id:"[email protected]".
Related
External
D. J. Bernstein's self-pipe trick.
Richard Kettlewell's suggestions about how fork(2) ought to be.