Bug #79660 Shared memory is dirty in child proc after pcntl_fork
Submitted: 2020-06-01 14:17 UTC
Modified: 2020-06-09 10:49 UTC
From: phpbugs at muglug dot com
Assigned:
Status: Not a bug
Package: Performance problem
PHP Version: 7.4.6
OS: Linux
Private report: No
CVE-ID: None
 [2020-06-01 14:17 UTC] phpbugs at muglug dot com
Description:
------------
I have a cross-platform command-line application that uses pcntl_fork to spawn child processes.

pcntl_fork's implementation is very simple – it's just a call to fork(), which has copy-on-write behaviour (no memory should be duplicated unless the program modifies it).

On Macs, memory appears to be shared effectively – it's only dirtied if there is some change made to the data.

When the same script is run on Linux, it appears that the shared memory is dirtied instantly, which increases the cost of pcntl_fork both in memory (which is multiplied by the number of calls to that function) and in time, as each call to pcntl_fork is more costly.

Test script:
---------------
<?php

$a = [];

for ($i = 0; $i < 1000000; $i++) {
    $a[] = "$i";
}

$pid = pcntl_fork();
if ($pid == -1) {
     die('could not fork');
} else if ($pid) {
     // we are the parent
     pcntl_wait($status); //Protect against Zombie children
} else {
     echo shell_exec('cat /proc/' . posix_getpid() . '/smaps | grep Shared_ | grep -v "0 kB"');
     echo $a[0];
}

Expected result:
----------------
There should be no large areas of Shared_Dirty memory in the output, since most memory should be shared.


 [2020-06-08 20:11 UTC] alexinbeijing at gmail dot com
Thanks for raising this interesting question. In short, it appears you are misinterpreting what "Shared_Dirty" memory areas actually mean in Linux.

Try this: Do the same check on /proc/<PID>/smaps *before* forking. You will find that the "Shared_Dirty" regions are just the same both before and after the fork.

So there is nothing in the child which is writing to all those COW'd pages. They were "dirty" before the fork (meaning they have been written to, and if they were mmap'd from a file, the changes would need to be flushed out to disk). They are still "dirty" after the fork. That is not a problem.

The fact that the memory regions are "Shared_Dirty" is actually *good* for you. It means that the memory is... well... shared! If all the pages had been written in the child, causing the OS to copy them, they would display as either "Private_Clean" or "Private_Dirty", not "shared". "Shared" is what you want.

So this is not actually a bug. Thanks for submitting your concern, though. It was interesting to investigate.
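
The before-and-after check suggested above can be sketched roughly as follows. This is only an illustration, not part of the original report: the helper names `sum_smaps_counters`, `report` and `compare_around_fork` are made up for this example, it assumes Linux with the pcntl extension, and it reads `/proc/self/smaps` directly instead of going through `shell_exec` so that no helper process is spawned while measuring. As far as I understand the kernel's accounting, a page is counted under `Shared_*` when more than one process currently maps it, and as dirty once it has been written to, so the exact split between shared and private before the fork may differ from what the comment above describes. The sketch just prints the numbers so they can be compared.

```php
<?php
// Sum the Shared_*/Private_* counters (in kB) over all mappings in smaps output.
function sum_smaps_counters(array $lines): array
{
    $totals = ['Shared_Clean' => 0, 'Shared_Dirty' => 0, 'Private_Clean' => 0, 'Private_Dirty' => 0];
    foreach ($lines as $line) {
        // Matches e.g. "Shared_Dirty:       1234 kB"; skips Shared_Hugetlb and friends.
        if (preg_match('/^(Shared|Private)_(Clean|Dirty):\s+(\d+) kB/', $line, $m)) {
            $totals["{$m[1]}_{$m[2]}"] += (int) $m[3];
        }
    }
    return $totals;
}

// Print the dirty totals for the calling process (parent or child).
function report(string $label): void
{
    $lines = @file('/proc/self/smaps', FILE_IGNORE_NEW_LINES);
    if ($lines === false) {
        fwrite(STDERR, "$label: /proc/self/smaps is not readable (not Linux?)\n");
        return;
    }
    $t = sum_smaps_counters($lines);
    printf("%-22s Shared_Dirty=%d kB  Private_Dirty=%d kB\n", $label, $t['Shared_Dirty'], $t['Private_Dirty']);
}

// Hypothetical driver: same array as the test script, measured around the fork.
function compare_around_fork(int $n = 1000000): void
{
    $a = [];
    for ($i = 0; $i < $n; $i++) {
        $a[] = "$i";
    }

    report('before fork');

    $pid = pcntl_fork();
    if ($pid === -1) {
        fwrite(STDERR, "could not fork\n");
        return;
    }
    if ($pid > 0) {
        pcntl_wait($status); // reap the child
        return;
    }

    // Child process from here on.
    report('child, untouched');
    for ($i = 0; $i < $n; $i++) {
        $a[$i] .= 'x'; // writing forces the kernel to copy the affected pages
    }
    report('child, after writes');
    exit(0);
}

// Only run when executed directly from the CLI with pcntl available.
if (PHP_SAPI === 'cli' && extension_loaded('pcntl')
    && realpath($_SERVER['argv'][0] ?? '') === __FILE__) {
    compare_around_fork();
}
```

If copy-on-write is working, the "child, untouched" line should show most of the array under Shared_Dirty, and the "child, after writes" line should show that memory moving to Private_Dirty.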
 [2020-06-08 22:12 UTC] phpbugs at muglug dot com
-Status: Open
+Status: Closed
 [2020-06-08 22:12 UTC] phpbugs at muglug dot com
Thanks for digging into this.

I think I'm just confused by the discrepancy between the way Resident Set Size is reported on macOS and Linux when calling `ps`: on the former it excludes shared memory, while on the latter shared memory is apparently included.
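
One way to see how much of a Linux process's RSS is shared with other processes is to compare the `Rss` and `Pss` totals from `/proc/<PID>/smaps`. Pss ("proportional set size") charges each shared page to a process only in proportion to the number of processes mapping it. The sketch below is an illustration under those assumptions: the function names `sum_smaps_field` and `rss_and_pss` are invented here, and it says nothing about how macOS computes its figures.

```php
<?php
// Total a single smaps field (e.g. "Rss" or "Pss") across all mappings, in kB.
function sum_smaps_field(array $lines, string $field): int
{
    $total = 0;
    // The colon directly after the name keeps "Pss" from matching "Pss_Anon" etc.
    $pattern = '/^' . preg_quote($field, '/') . ':\s+(\d+) kB/';
    foreach ($lines as $line) {
        if (preg_match($pattern, $line, $m)) {
            $total += (int) $m[1];
        }
    }
    return $total;
}

// Return Rss and Pss totals for a process, or null if smaps cannot be read.
function rss_and_pss(int $pid): ?array
{
    $lines = @file("/proc/$pid/smaps", FILE_IGNORE_NEW_LINES);
    if ($lines === false) {
        return null; // not Linux, no such process, or no permission
    }
    return [
        'rss_kb' => sum_smaps_field($lines, 'Rss'),
        'pss_kb' => sum_smaps_field($lines, 'Pss'),
    ];
}

// Only run when executed directly from the CLI.
if (PHP_SAPI === 'cli' && realpath($_SERVER['argv'][0] ?? '') === __FILE__) {
    $usage = rss_and_pss(getmypid());
    if ($usage === null) {
        fwrite(STDERR, "smaps is not readable on this system\n");
    } else {
        printf("Rss=%d kB  Pss=%d kB\n", $usage['rss_kb'], $usage['pss_kb']);
    }
}
```

For a set of forked workers that share most of their memory, adding up Rss counts the shared pages once per worker, while adding up Pss should come much closer to the memory actually in use.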
 [2020-06-09 10:49 UTC] cmb@php.net
-Status: Closed
+Status: Not a bug
 
Last updated: Mon Aug 03 21:01:24 2020 UTC