Does fork() immediately copy the entire process heap in Linux?

A fork() system call clones a child process from the running process. The two processes are nearly identical; they differ in their PID and a few other per-process attributes.

Naturally, if the processes are just reading from their heaps rather than writing to them, copying the heap would be a huge waste of memory.

Is the entire process heap copied? Is it optimized in a way that only writing triggers a heap copy?

Answers:

Method 1

The Linux kernel does implement Copy-on-Write when fork() is called. When the syscall is executed, the pages that the parent and child share are marked read-only.

If a write is performed on the read-only page, it is then copied, as the memory is no longer identical between the two processes. Therefore, if only read-operations are being performed, the pages will not be copied at all.
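The behavior described above can be observed from user space. The following is a minimal sketch, not a definitive test: the function name `cow_isolation_demo` and its two output parameters are hypothetical, chosen only for illustration. The child overwrites a heap value it inherited, and the parent then checks that its own copy is unchanged.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: forks once, lets the child overwrite an inherited
 * heap value, and reports what each side ends up seeing.
 * Returns 0 on success, -1 on error.
 */
int cow_isolation_demo(int *parent_sees, int *child_saw)
{
    int *value = malloc(sizeof *value);
    if (value == NULL)
        return -1;
    *value = 1;

    pid_t pid = fork();
    if (pid < 0) {
        free(value);
        return -1;
    }
    if (pid == 0) {
        /* Child: the page holding *value is shared and write-protected,
         * so this store faults and the child ends up with a private copy. */
        *value = 42;
        _exit(*value);          /* report the child's view via exit status */
    }

    int status = 0;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status)) {
        free(value);
        return -1;
    }
    *child_saw = WEXITSTATUS(status);
    *parent_sees = *value;      /* the child's write never reaches the parent */
    free(value);
    return 0;
}
```

Calling `cow_isolation_demo(&p, &c)` from a `main` should leave `p` at 1 and `c` at 42. Note that this only shows the two processes are isolated after a write; it does not by itself prove that the copy was deferred.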

Method 2

The entirety of fork() is implemented using mmap / copy on write.

This affects not only the heap but also shared libraries, the stack, and BSS areas.

Which, incidentally, means that fork is an extremely lightweight operation until the resulting two processes (parent and child) actually start writing to memory ranges. This feature is a major contributor to the lethality of fork bombs: you end up with way too many processes before the kernel gets overloaded with page replication and differentiation.

You’ll be hard-pressed to find, in a modern OS, an example of an operation where the kernel performs a hard copy (device drivers being the exception); it’s just far, far easier and more efficient to employ VM functionality.

Even execve() is essentially “please mmap the binary / ld.so / whatnot, followed by execute”, and the VM handles the actual loading of the process into RAM and its execution. Uninitialized global and static variables (the BSS) end up being mapped from a ‘zero-page’, a special read-only copy-on-write page containing zeroes; initialized global and static variables end up being mapped (copy-on-write, again) from the binary file itself, etc.
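The same copy-on-write mechanism is available directly through mmap() with MAP_PRIVATE, which is roughly how a program's writable data is mapped from its binary. Here is a small sketch of that idea; the function name `private_mapping_demo` and the temporary file template are assumptions made up for this example. It writes to a private file mapping and then checks that the file on disk is untouched.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: stores "disk" in a temporary file, maps it with
 * MAP_PRIVATE, overwrites the mapping with "copy", then re-reads the file.
 * Returns 1 if the file still holds "disk", 0 if it changed, -1 on error.
 */
int private_mapping_demo(void)
{
    char path[] = "/tmp/cow_demo_XXXXXX";   /* illustrative template only */
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                           /* file persists until close() */

    int result = -1;
    if (write(fd, "disk", 4) == 4) {
        char *map = mmap(NULL, 4, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
        if (map != MAP_FAILED) {
            /* The store goes to a private copy of the page, not the file. */
            memcpy(map, "copy", 4);

            char buf[4];
            if (pread(fd, buf, sizeof buf, 0) == 4)
                result = (memcmp(buf, "disk", 4) == 0);
            munmap(map, 4);
        }
    }
    close(fd);
    return result;
}
```

On Linux this is expected to return 1: the write lands in a private page belonging to the process, and the underlying file keeps its original contents.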

Method 3

Linux does Copy-on-Write. As fork creates a new process, the allocated pages are marked read-only and shared between the parent and child. When either of them tries to modify a page, a page fault is generated, which results in the kernel copying the page and adjusting the page table appropriately.
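The page faults mentioned above can be counted with getrusage(). The sketch below is an illustration under assumptions, not a benchmark: `cow_fault_demo` and `minor_faults` are hypothetical names, and the exact counts depend on the kernel version, transparent huge pages, and the environment (some sandboxes do not report fault counts at all). The child first reads and then writes one byte per page of a buffer inherited from the parent, and reports how many minor faults each pass caused.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <string.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Minor (no disk I/O) page faults charged to the calling process so far. */
static long minor_faults(void)
{
    struct rusage ru;
    if (getrusage(RUSAGE_SELF, &ru) != 0)
        return -1;
    return ru.ru_minflt;
}

/*
 * Hypothetical helper: the child touches every page of an inherited buffer,
 * read-only first and then with writes, and sends both fault counts back
 * through a pipe. Returns 0 on success, -1 on error.
 */
int cow_fault_demo(size_t bytes, long *read_faults, long *write_faults)
{
    long page = sysconf(_SC_PAGESIZE);
    int fds[2];
    unsigned char *buf = malloc(bytes);
    if (buf == NULL || page <= 0 || pipe(fds) != 0) {
        free(buf);
        return -1;
    }
    memset(buf, 1, bytes);      /* make every page resident before fork() */

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        free(buf);
        return -1;
    }
    if (pid == 0) {
        volatile unsigned char *p = buf;
        long counts[2];
        long before = minor_faults();
        for (size_t i = 0; i < bytes; i += (size_t)page)
            (void)p[i];         /* read pass: pages stay shared */
        long middle = minor_faults();
        for (size_t i = 0; i < bytes; i += (size_t)page)
            p[i] = 2;           /* write pass: each store may trigger a copy */
        long after = minor_faults();
        counts[0] = middle - before;
        counts[1] = after - middle;
        ssize_t n = write(fds[1], counts, sizeof counts);
        _exit(n == (ssize_t)sizeof counts ? 0 : 1);
    }

    close(fds[1]);
    long counts[2] = {0, 0};
    ssize_t n = read(fds[0], counts, sizeof counts);
    close(fds[0]);

    int status = 0;
    int ok = waitpid(pid, &status, 0) == pid && WIFEXITED(status) &&
             WEXITSTATUS(status) == 0 && n == (ssize_t)sizeof counts;
    free(buf);
    if (!ok)
        return -1;
    *read_faults = counts[0];
    *write_faults = counts[1];
    return 0;
}
```

With a buffer of a few megabytes, the read pass would typically report close to zero faults while the write pass reports a count on the order of the number of pages, which is consistent with pages being copied only when written.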

All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.
