Talk:Fork (system call)

This article is within the scope of WikiProject Computing, a collaborative effort to improve the coverage of computers, computing, and information technology on Wikipedia.
This article has been rated as Low-importance on the project's importance scale.
This article is supported by WikiProject Software (assessed as Low-importance).

This article is within the scope of WikiProject Linux, a collaborative effort to improve the coverage of Linux on Wikipedia.
This article has been rated as Low-importance on the project's importance scale.

Mythos

What about deleting that paragraph? I'm not sure whether the author was aware that he was describing the infamous "forkbomb", which has its own article and is already linked below. Furthermore, "mythos" is never actually explained. I also doubt this is the right place for "honorable mentions" of fork() or anything else; there would be no end to it if this were done for all articles. If anything, the article about the Matrix should link to fork, not vice versa. Anyway, I have removed it from the article. --82.141.49.144 03:55, 4 December 2005 (UTC)

While I agree with much of this, rather than saying mythos, I think that "geek culture" is probably a bit better, as the term has seeped into "pop" culture. McKay 00:07, 9 December 2005 (UTC)

Make code compilable

I can't compile the code. Could someone add the appropriate header files? That would invite readers to experiment with fork(). The line with the /* Note */ comment should be changed and the note removed; I have no clue what the author meant by it. --Bernard François 14:58, 15 January 2006 (UTC)

From what I can see, this code only needs the unistd.h header file (or stdlib.h if that doesn't work). Sometimes it also needs an extra include such as sys/types.h, but this is system dependent. Also, the code should be in a main function. I'm not going to change the code on the page because it is only sample code. Regarding the /* Note... */, this line uses _exit(), which is different from exit() (which has no underscore). These may be effectively the same (actually they probably are; an exit() works fine here), but you'd need to see the source code for each of them. --Pyrofysh 06:36, 25 April 2006 (UTC)

But why did the author use _exit() instead of exit() at all in the child? --84.188.202.21 14:31, 12 October 2006 (UTC)

The standard library does some clean-up in the library call exit(): it flushes stdio buffers and closes open streams. Any files created with tmpfile() are also removed, and other things may happen that one would prefer not to happen until the parent exits. The system call _exit() exits unconditionally. In addition, had vfork() been used instead of fork(), calling exit() before an exec could cause very unexpected behavior, because the memory is shared with the parent; but sensible people don't use vfork. Agarvin 20:03, 12 November 2006 (UTC)

SMP

Forks fully utilize SMP, right?

If you mean Symmetric MultiProcessing, this depends more on the operating system's implementation of process handling than on the fork call. All fork does in most implementations is make a copy of the current process. --Pyrofysh 06:41, 25 April 2006 (UTC)

Clean code

Why is there a declaration of "int i" at the start of the code when it is only used in one of the for loops? Why not either declare both i and j at the start, or drop the declaration of "i" at the beginning? Consistency? Maybe I am missing something?

Nope, not missing anything - this is fixed. Steeltoe 05:13, 19 April 2006 (UTC)

On that note, some compilers will not compile the code unless all variables are initialised at some point in the function. Currently i is only initialised under the "else if(pid <0)" condition, and j is only initialised under the "if(pid==0)" condition. As neither of these variables is initialised in all cases, those compilers will report an error. Ideally both i and j should be declared at the start, because in C a declared int always has some value (even if it is sometimes nonsense).

I'm just putting this note here in case someone doesn't understand why their code doesn't compile. I think that the current code should remain unchanged in this respect, because it is simpler to have variables declared where they are used (in this case). Additionally, most modern compilers don't suffer from this problem. --Pyrofysh 10:22, 7 June 2006 (UTC)

exit()

I've created an article for the inverse operation, exit. - Loadmaster 16:36, 18 August 2006 (UTC)

Fork is not critical to the Unix design philosophy

Forking is an important part of Unix, critical to the support of its design philosophy, which encourages the development of filters.

Concurrent processes are necessary for the development of filters. Whether they are implemented using fork, or some other primitive such as spawn, is surely an implementation detail? --DavidHopwood 18:26, 5 April 2007 (UTC)

Yet fork predates spawn by at least a decade or two. I'll wager that many Unix programmers don't even know there is a spawn function. Likewise, a huge amount of Unix code relies on fork/exec to operate. -- Loadmaster 03:51, 20 June 2007 (UTC)

No, I don't think it is just an implementation detail. I believe fork conforms well to some basic Unix principles, if not by intent then at least by coincidence, and so has been preserved by natural selection over time.

Among others, Unix has two important principles:

1. Each component should do one thing, and do it well.

2. Fundamental components (especially the kernel) should provide mechanism rather than policy.

Among the kernel's jobs, creating a new task and deciding what the new task should do are two separate things. Although most newly created tasks load another executable from disk, that is not always the case. A newly created task may run the same code as its parent, with the same or different input; it may load new executable code from the network or from media other than disk; or it may even compute new code on its own.

Therefore, implementing a system call that creates a new task AND loads new executable code is a bad idea. If you insist on doing so, you run into a second difficulty: how to determine the way the new executable code is loaded. Forcing the new code to always be loaded from disk is crude. Providing several predefined ways is better, but it still imposes policy on users.

On the other hand, creating an empty task is not practical either. Can an empty task exist at all? I don't know. But how would an empty task decide what to do next? It can't, so either the kernel or another task has to make the decision for it. The kernel should not provide policy, so it shouldn't make such a decision, and another task doing so would basically violate process separation. Letting the newly created task duplicate its parent is a handy and elegant choice. It involves only a memory operation (and with COW even that is minimized). It does not violate process separation, and it gives the child process maximal flexibility to decide what to do next. So fork() does one thing only, does it well, and pushes as much policy as possible up to user space. -- Feng Dong 11:44, 28 May 2008 (+0800)

I don't really know how to reply in a wiki, sorry.

It may be nice to imagine that UNIX has some fundamental principles that it does not violate, but sadly that's not true. There are examples with fork/exec itself.

FD_CLOEXEC: ancient Unix programs would have a little for loop to close all fds from 3 to 255 in the child, but then it was realized that, say, a library may want to indicate that some of them should not be closed. This is an example of exec now doing more than one thing.

alarm() and fork: old versions had the bug that the child could get the signal, so not only is fork doing peculiar things about masking certain signals, it is now also canceling timers.

stdio did not mesh well with exit (say, when the exec failed): exit was doing more than just terminating the process, so _exit was added.

Soon people realized that fork was expensive (even with COW) just to exec a new image shortly afterwards, so vfork was added, again doing less than fork.

signal was originally racy, so sigaction was added to do more.

wait, later waitpid and wait4, SIGCHLD, sigaltstack, termio differences, all the work that went into the STREAMS dead end: the list goes on about how things were extended to do more and less over the years as programmers discovered the need.
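To make the FD_CLOEXEC point above concrete, here is a small sketch in C contrasting the two approaches. It assumes a POSIX system; the helper names set_cloexec and close_from_3 are invented for this illustration.

```c
#include <fcntl.h>   /* fcntl, F_GETFD, F_SETFD, FD_CLOEXEC */
#include <unistd.h>  /* close */

/* Hypothetical helper: mark one descriptor close-on-exec, so that a later
 * exec closes it automatically. Returns 0 on success, -1 on error. */
int set_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFD, flags | FD_CLOEXEC) < 0 ? -1 : 0;
}

/* Hypothetical helper for the older idiom: in the child, before exec,
 * blindly close every descriptor from 3 up to max_fd. */
void close_from_3(int max_fd)
{
    for (int fd = 3; fd <= max_fd; fd++)
        close(fd);   /* EBADF for descriptors that were never open is ignored */
}
```

With the flag, the code that owns a descriptor decides its fate at exec time; with the loop, the child has to guess an upper bound and cannot spare a descriptor it does not know about.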

A spawn-like call is needed in UNIX; vfork is simply pretty gross, and old man pages even say it is not safe to do much more than _exit or exec from the child. There are now huge VM spaces, which it is wasteful to mark COW only to tear them down again during the exec. Moreover, on pretty much anything now (even Linux has a tunable for this) there needs to be enough swap available for the fork to succeed. This is just another case of systems programmers seeing, over time, that something was lacking.

It's sort of a mess though; look at the SUSv3 posix_spawn and the related interfaces around posix_spawn_file_actions_init and posix_spawnattr_init. All of this just to deal with some of the typical things a programmer might wish to do between a fork and an exec. It is also poorly designed: say you are in a multi-threaded program and you close a fd that happens to get closed by another thread; the spawn then fails with EBADF, so you might as well use FD_CLOEXEC.

That alludes to another issue with UNIX: some poor decisions made other things more complicated 20 or so years later. Think about when pthreads were added, and the semantics of address lookup in BSD sockets, the errno workarounds, the signal issues, and fork itself (look at the mess of fork, fork1, and forkall in Solaris in particular). Actually, something should be added to the article itself about locked mutexes in the child, and about the standard now specifying that only the calling thread is duplicated in the child; it is a detail like the stdio one already in the article.

So no matter how much we wish it to be true, UNIX is not pretty, and systems programming is hairy and always will be. Yes, fork is a do-one-simple-thing concept, but the later vfork did even less, and when you run into not having enough swap to fork that 8GB db process just to run a shell script, you realize that something like system() that takes exec-like array args and does not do all the VM work was a sorely lacking feature. --Preceding unsigned comment added by 131.225.103.35 (talk) 22:33, 11 June 2009 (UTC)
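For readers unfamiliar with the posix_spawn interfaces mentioned in the comment above, the following is a rough sketch in C of the "dup2 between fork and exec" case expressed as a file action. It assumes a POSIX system with /bin/sh present; the helper name spawn_sh is hypothetical.

```c
#include <spawn.h>      /* posix_spawn, posix_spawn_file_actions_* */
#include <sys/types.h>  /* pid_t */
#include <sys/wait.h>   /* waitpid, WIFEXITED, WEXITSTATUS */
#include <unistd.h>     /* STDOUT_FILENO */

extern char **environ;

/* Hypothetical helper: run "/bin/sh -c cmd" with the child's stdout
 * redirected to out_fd. Returns the exit status, or -1 on error. */
int spawn_sh(const char *cmd, int out_fd)
{
    posix_spawn_file_actions_t fa;
    if (posix_spawn_file_actions_init(&fa) != 0)
        return -1;

    /* Stands in for the dup2() one would do between fork and exec. */
    if (posix_spawn_file_actions_adddup2(&fa, out_fd, STDOUT_FILENO) != 0) {
        posix_spawn_file_actions_destroy(&fa);
        return -1;
    }

    char *argv[] = { "sh", "-c", (char *)cmd, NULL };
    pid_t pid;
    /* posix_spawn returns an error number instead of setting errno. */
    int err = posix_spawn(&pid, "/bin/sh", &fa, NULL, argv, environ);
    posix_spawn_file_actions_destroy(&fa);
    if (err != 0)
        return -1;

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Each thing the child should do before the new image runs has to be declared up front as a file action or attribute, which is the design trade-off the comment complains about.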

Fork, IIRC, was done in response to the hairy spawn-equivalent task-starter in MULTICS. (I believe this is citable, though I don't have the citation.) The idea was that, rather than creating an N-argument task starter that was always in a state of flux as people came up with new OS features, a simple fork call in conjunction with inheritance and optional overriding would be much simpler, especially when it came to creating pipelines. As it was. It has nothing whatsoever to do with the ability to create filters, shells, pipelines, etc., and the opening section of this wiki entry is misleading. Also, original Unix generally favored small over efficient due to the machines upon which it was developed. The downside of fork is that this simplicity is largely predicated on Unix's single-threaded process model, and once that went away we got into the current situation. ISC PB (talk) 00:28, 8 March 2011 (UTC)

Merge with Fork Paging

It has been suggested to merge Fork Paging with this article. Add your comments below.

Splitting of section

This page is good and helped me learn how to use fork in C. But I would suggest, first, moving the discussion of more technical aspects such as vfork to a separate page. Second, could some more examples in C be provided? What about communication between forked processes, for example? --Preceding unsigned comment added by 161.53.128.59 (talk) 09:01, 21 August 2009 (UTC)
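As one possible answer to the question about communication between forked processes, here is a minimal sketch in C using a pipe, under the assumption of a POSIX system. The helper name recv_from_child is made up for the example; it is not from the article.

```c
#include <string.h>     /* strlen */
#include <sys/types.h>  /* pid_t, ssize_t */
#include <sys/wait.h>   /* waitpid */
#include <unistd.h>     /* pipe, fork, read, write, close, _exit */

/* Hypothetical helper: the child sends msg through a pipe; the parent reads
 * up to cap-1 bytes into buf and NUL-terminates it.
 * Returns the number of bytes read, or -1 on error. */
ssize_t recv_from_child(const char *msg, char *buf, size_t cap)
{
    int fds[2];
    if (cap == 0 || pipe(fds) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {                 /* child: writer */
        close(fds[0]);              /* not reading */
        ssize_t w = write(fds[1], msg, strlen(msg));
        _exit(w < 0 ? 1 : 0);
    }

    close(fds[1]);                  /* parent: close the write end so read() can see EOF */
    size_t total = 0;
    ssize_t n = 0;
    while (total < cap - 1 &&
           (n = read(fds[0], buf + total, cap - 1 - total)) > 0)
        total += (size_t)n;
    buf[total] = '\0';
    close(fds[0]);
    waitpid(pid, NULL, 0);          /* reap the child */
    return n < 0 ? -1 : (ssize_t)total;
}
```

The pipe is created before the fork so that both processes inherit its two ends; each side then closes the end it does not use.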

Does vfork() really fork?

vfork seems to be a very broken implementation. I cannot fork with it: the parent always blocks on the child as soon as the child is started, which defeats the purpose of creating a fork at all (no concurrency). I do not see this mentioned in the article. Is that behaviour so "normal" that it does not even need mentioning? --89.247.43.241 (talk) 01:49, 21 November 2011 (UTC)

vfork is a rather special case of fork, and isn't applicable to all the uses where fork could be used. I believe the Linux man-pages describe it rather well. Note that there is some controversy as to whether or not vfork should be in libc at all; just look for the heading Bugs on the page linked. Imho, and this is a rather subjective opinion, vfork has its niche use cases but shouldn't be undertaken lightly by the novice programmer; without a proper understanding of the limitations and advantages it's likely to run up more debugging hours than it's worth.

--Knneth (talk) 21:02, 11 December 2011 (UTC)
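The blocking behaviour asked about in this thread is what vfork is documented to do: the parent is suspended until the child calls exec or _exit. A minimal sketch of the one pattern usually considered safe, in C, assuming a POSIX-like system that still provides vfork (the feature-test macro is a glibc-specific assumption, and the helper name vfork_run is hypothetical):

```c
#define _DEFAULT_SOURCE  /* assumption: lets glibc declare vfork() in strict modes */
#include <sys/types.h>   /* pid_t */
#include <sys/wait.h>    /* waitpid, WIFEXITED, WEXITSTATUS */
#include <unistd.h>      /* vfork, execv, _exit */

/* Hypothetical helper: run a program via vfork + exec.
 * Returns its exit status, or -1 on error. */
int vfork_run(const char *path, char *const argv[])
{
    pid_t pid = vfork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: the parent is suspended and we may be borrowing its
         * memory, so do nothing here except exec or _exit. */
        execv(path, argv);
        _exit(127);              /* exec failed */
    }
    /* The parent resumes only after the child has called execv() or _exit(). */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

So vfork gives no concurrency between parent and child until the exec has happened; concurrent work in a child that keeps running the same program needs plain fork.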

Even on SunOS, which introduced a copy-on-write fork() in 1987 (together with many other new address space features), vfork() is still 3x faster than fork(). For this reason, vfork() is of course kept in the OS, even though it has some special constraints. vfork() helps to increase the performance of shells, even though there are few shells that support vfork(): csh (since ~ 1978), bsh (since 1986), ksh93 (since ~ 2000), Bourne Shell (since 2014). For the performance improvements for ksh93, you need to ask David Korn; for the Bourne Shell I did the conversion, and a typical "configure" run results in approx. 30% better performance with a Bourne Shell that uses vfork() for only 1/3 of the fork() calls. Schily (talk) 12:46, 23 October 2015 (UTC)

Copyright problem removed

Prior content in this article duplicated one or more previously published sources. Copied or closely paraphrased material has been rewritten or removed and must not be restored, unless it is duly released under a compatible license. (For more information, please see "using copyrighted works from others" if you are not the copyright holder of this material, or "donating copyrighted materials" if you are.) For legal reasons, we cannot accept copyrighted text or images borrowed from other web sites or published material; such additions will be deleted. Contributors may use copyrighted publications as a source of information, but not as a source of sentences or phrases. Accordingly, the material may be rewritten, but only if it does not infringe on the copyright of the original or plagiarize from that source. Please see our guideline on non-free text for how to properly implement limited quotations of copyrighted text. Wikipedia takes copyright violations very seriously, and persistent violators will be blocked from editing. While we appreciate contributions, we must require all contributors to understand and comply with these policies. Thank you. Osiris (talk) 07:02, 7 July 2012 (UTC)

Fork only duplicates the calling thread

At least with POSIX threads, forking a process creates a process with a single thread, i.e. only the thread that called fork(2) is duplicated; see e.g. "Is it safe to fork from within a thread?" or Chapter 6 in David R. Butenhof's book "Programming with POSIX Threads". Should the article mention this, or would this be out of scope? -- Tobias Bergemann (talk) 13:39, 8 August 2013 (UTC)
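If it helps the discussion, the behaviour can be observed with a small experiment. The sketch below is in C and assumes POSIX threads and C11 atomics; the names worker and worker_survives_fork are invented for the illustration. A worker thread keeps incrementing a counter, and the forked child checks whether its own copy of the counter still advances.

```c
#include <poll.h>       /* poll, used here only as a short sleep */
#include <pthread.h>    /* pthread_create, pthread_join */
#include <stdatomic.h>  /* atomic_ulong, atomic_int */
#include <sys/types.h>  /* pid_t */
#include <sys/wait.h>   /* waitpid, WIFEXITED, WEXITSTATUS */
#include <unistd.h>     /* fork, _exit */

static atomic_ulong ticks;   /* incremented by the worker thread */
static atomic_int stop;      /* tells the worker to finish */

static void *worker(void *arg)
{
    (void)arg;
    while (!atomic_load(&stop))
        atomic_fetch_add(&ticks, 1);
    return NULL;
}

/* Hypothetical helper: returns 1 if the counter keeps advancing in the
 * forked child (the worker was duplicated), 0 if it stands still,
 * -1 on error. */
int worker_survives_fork(void)
{
    pthread_t tid;
    atomic_store(&stop, 0);
    if (pthread_create(&tid, NULL, worker, NULL) != 0)
        return -1;

    pid_t pid = fork();
    if (pid == 0) {
        /* Child: sample its own copy of the counter twice, 50 ms apart. */
        unsigned long before = atomic_load(&ticks);
        poll(NULL, 0, 50);
        unsigned long after = atomic_load(&ticks);
        _exit(after != before);
    }

    int status = 0, result = -1;
    if (pid > 0 && waitpid(pid, &status, 0) == pid && WIFEXITED(status))
        result = WEXITSTATUS(status);

    atomic_store(&stop, 1);      /* stop and reap the worker in the parent */
    pthread_join(tid, NULL);
    return result;
}
```

On a POSIX-conforming system worker_survives_fork() should return 0, since the child contains only the thread that called fork(). The same reasoning explains the locked-mutex hazard: a mutex held by another thread at the time of the fork stays locked in the child with nobody left to unlock it.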

Importance of forking in Unix

The section "Importance of forking in Unix" claims (without a source) that

Forking is an important part of Unix, critical to the support of its design philosophy

I don't think this is true. The ability to run multiple programs is critical to the Unix philosophy, and for that matter to any multiprocessing operating system. The section does have a point when it describes the combination of forks and pipes, which is quite simple in Unix. But then again, (1) fork is older than pipe (V1 vs. V3 Research Unix, IIRC) and (2) the pipelining mechanism could have been implemented in a different way, e.g. by a combined fork/exec system call that can also set file descriptors. Maybe that would violate "worse is better", but I see nothing particularly important about fork for the support of shell pipelines. QVVERTYVS (hm?) 14:08, 16 October 2013 (UTC)

Origin of vfork()

The article says that vfork() first appeared in 3BSD, but FreeBSD's manual page claims that vfork() first appeared in 2.9BSD. Stevens and Rago, in Advanced Programming in the Unix Environment, also claim that vfork() originated in 2.9BSD (p. 234). I think this conflict arises because the vfork() system call was removed in 4.4BSD (according to Stevens and Rago), but later systems added it back (as mentioned in the Linux manual page). -- Bkouhi (talk) 02:35, 14 March 2015 (UTC)

Seems like they're wrong. 3BSD has a vfork(2) manual page, and 3BSD predates 2.9BSD by three years. (The 2.9 series was continued into the 1990s, though not by Berkeley; 2.9 was a 4.1cBSD backported to the PDP-11.) The NetBSD people know this too. QVVERTYVS (hm?) 11:39, 14 March 2015 (UTC)

Wrong text for vfork() in the article

Quoting Bach does not seem to be a good way of providing citations for the relevance of vfork(). Bach never really used vfork()...

You should rather ask people who implemented vfork() in a shell and did performance analysis on the results. So rather ask Bill Joy, David Korn or me. vfork() is still 3x faster than the COW-based fork() introduced by SunOS-4.0 that is now available in SVr4 as well. Schily (talk) 12:51, 23 October 2015 (UTC)

fork in other operating systems

According to project genie fork originated there. The two articles should be reconciled. (I do not understand why the preview shows project genie as a missing article.)

"The fork mechanism (1969) in Unix and Linux maintains implicit assumptions on the underlying hardware". This sentence is self-contradictory. The listed assumptions did not hold for the hardware on which Unix ran in 1969 and some years after. Mdmi (talk) 06:10, 3 January 2016 (UTC)[reply]

It's spelled Project Genie; article titles are case-sensitive except that the first character is capitalized automatically. You're right about the hardware requirements: paging (VM) Unix post-dates fork by a decade. Removed that. QVVERTYVS (hm?) 15:38, 3 January 2016 (UTC)

posix_spawn() based on fork is slow

Just for your information: Linux is not a POSIX OS and what Linux does is not a gauge for POSIX OS.

A fast posix_spawn() needs vfork(); requiring posix_spawn() to be implemented on top of fork() is a slow and useless restriction.

If you want to do a decent comparison of fork() and vfork(), you need to check an OS that implements both a modern fork() and a vfork(). Solaris implements both fork variants:

the new copy on write fork() call, that was invented by SunOS-4.0 in late 1987.

the traditional vfork() that shares MMU descriptors instead of sharing memory in the fork() above.

vfork() is still 2.6x faster than the new copy-on-write fork() for the recent version of the Bourne Shell (160kB size) on my home machine. This causes a noticeable speed-up for "configure" compared to an old non-vfork()-aware Bourne Shell.

Note that there are currently 4 known shells that support vfork:

csh - the first - vfork was written for csh before 1980

my bsh uses vfork since 1985

ksh93 uses vfork since ~ 1993

The recent Bourne Shell uses vfork since 2014

The shells using vfork() are noticeably faster than other shells. Schily (talk) 13:47, 5 January 2016 (UTC)

Who says we're restricting our attention to POSIX-compliant OS's? The source finds Linux important enough to mention it; the Solaris source code is not enough to establish one implementation strategy as "typical". QVVERTYVS (hm?) 14:05, 5 January 2016 (UTC)

Didn't you read the article? It mentions that vfork() was removed from POSIX. BTW: You are welcome to present source code for other POSIX certified implementations (e.g. AIX). Schily (talk) 14:08, 5 January 2016 (UTC)

double fork

I think double fork is a notion that should be mentioned. The attempt I am aware of is at line 110 at https://en.wikipedia.org/w/index.php?title=Fork_%28system_call%29&action=historysubmit&type=revision&diff=748268941&oldid=748250332

-- Preceding unsigned comment added by 188.120.133.176 (talk) 14:47, 7 November 2016 (UTC)
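For reference, the idiom usually meant by "double fork" can be sketched as follows in C, assuming a POSIX system. The helper name run_detached and the one-byte message are made up for this illustration and are not taken from the linked revision.

```c
#include <sys/types.h>  /* pid_t */
#include <sys/wait.h>   /* waitpid, WIFEXITED, WEXITSTATUS */
#include <unistd.h>     /* fork, write, _exit */

/* Hypothetical helper: start a detached grandchild that writes one byte
 * to fd. Returns 0 once the intermediate child has been reaped, -1 on error. */
int run_detached(int fd)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* First child: fork again and exit at once. */
        pid_t gpid = fork();
        if (gpid < 0)
            _exit(1);
        if (gpid > 0)
            _exit(0);

        /* Grandchild: once its parent has exited it is adopted by init
         * (or a subreaper), so the original process never has to wait
         * for it and it cannot linger as that process's zombie. */
        if (write(fd, "x", 1) != 1)
            _exit(1);
        _exit(0);
    }

    /* Original process: reap the short-lived first child only. */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

Daemon code often also calls setsid() in the first child before the second fork; that step is left out here to keep the sketch short.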

Should add reference to Microsoft's "A fork in the road" paper

https://dl.acm.org/doi/abs/10.1145/3317550.3321435 -- Preceding unsigned comment added by 66.25.27.1 (talk) 03:55, 28 June 2022 (UTC)

Perhaps (see also this), provided that it doesn't devolve into yet another mess. There are a few related sources (see this), but so far no third-party sources which contrast those. TEDickey (talk) 07:58, 28 June 2022 (UTC)

unfortunate, etc

There's a quote from version 3.something of the Linux kernel manpages, which isn't in release 5. Actually (see change) it was silently removed in 2012. The link could be repaired, e.g., with this, or trimmed. TEDickey (talk) 22:45, 25 November 2019 (UTC)