Bug #70984 Script extreme slow compared to 5.6, MAP_HUGETLB problem?
Submitted: 2015-11-27 12:54 UTC Modified: 2016-04-27 16:48 UTC
From: arjen at react dot com Assigned: jpauli (profile)
Status: Closed Package: Scripting Engine problem
PHP Version: 7.0.0RC8 OS: Linux
Private report: No CVE-ID: None

 [2015-11-27 12:54 UTC] arjen at react dot com
The test case runs in 6.7 seconds under PHP 7.0.0RC8, versus 1.7 seconds under PHP 5.6.

strace shows many mmap calls failing with ENOMEM (Cannot allocate memory):

munmap(0x7fd131400000, 2097152)         = 0
mmap(NULL, 2097152, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_HUGETLB, -1, 0) = -1 ENOMEM (Cannot allocate memory)
mmap(NULL, 2097152, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fd131400000
madvise(0x7fd131400000, 2097152, MADV_HUGEPAGE) = 0
munmap(0x7fd131400000, 2097152)         = 0
mmap(NULL, 2097152, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_HUGETLB, -1, 0) = -1 ENOMEM (Cannot allocate memory)
mmap(NULL, 2097152, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fd131400000
madvise(0x7fd131400000, 2097152, MADV_HUGEPAGE) = 0

Ultimately the memory allocation succeeds. With bigger JavaScript inputs (the script is a JavaScript minifier), the running time can reach 10-60 minutes, compared to 1-5 minutes under 5.6.

The PHP binary was compiled on a host with HUGETLB support, but the target host (a systemd container) has no huge pages configured. The 5.6 tests were run with the same amount of memory, and there is enough normal free memory.

cat /proc/meminfo
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
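
For anyone who wants to check this on their own host, here is a minimal sketch that reads the HugePages_* counters shown above. It is written in Python purely for illustration; the function names are made up for this example and are not part of PHP or any tool mentioned in this report.

```python
def hugepage_counters(meminfo_text):
    """Return the HugePages_* counters from /proc/meminfo-style text as ints."""
    counters = {}
    for line in meminfo_text.splitlines():
        key, sep, value = line.partition(":")
        key = key.strip()
        if sep and key.startswith("HugePages_"):
            counters[key] = int(value.split()[0])
    return counters


def has_free_hugepages(meminfo_text):
    """True when the huge page pool is non-empty and has free pages left."""
    counters = hugepage_counters(meminfo_text)
    return counters.get("HugePages_Total", 0) > 0 and counters.get("HugePages_Free", 0) > 0


# Sample taken from the report above; on a live system read /proc/meminfo instead.
SAMPLE = """\
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
"""

if __name__ == "__main__":
    print(hugepage_counters(SAMPLE))
    print(has_free_hugepages(SAMPLE))
```

On a real machine the text would come from open("/proc/meminfo").read(). With the values from this report the pool is empty, which matches the failing MAP_HUGETLB calls in the strace output.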

Test script:

Expected result:
Execution at least as fast as 5.6.

Actual result:
3-4x slower compared to 5.6.


 [2015-11-27 21:40 UTC]
-Status: Open +Status: Analyzed
 [2015-11-27 21:40 UTC]
Just to verify that this is due to the lack of huge pages, can you configure some and re-run your test? Something like:

    sysctl -w vm.nr_hugepages=256
 [2015-11-29 10:08 UTC] sjon at hortensius dot net
I tried testing this. This specific machine (bare-metal/archlinux-4.2.5) refused to increase hugepages until I dropped caches to free some memory. After I did that, the speed immediately increased, even without changing the actual hugepages setting. On this machine, I have the following stats before dropping caches:

 Performance counter stats for '/srv/http/ ./hugepages.php':

      39181.707776      task-clock (msec)         #    0.997 CPUs utilized          
               132      context-switches          #    0.003 K/sec                  
                 1      cpu-migrations            #    0.000 K/sec                  
            23,446      page-faults               #    0.598 K/sec                  
   133,246,942,916      cycles                    #    3.401 GHz                    
   <not supported>      stalled-cycles-frontend  
   <not supported>      stalled-cycles-backend   
   131,441,748,754      instructions              #    0.99  insns per cycle        
    35,752,523,403      branches                  #  912.480 M/sec                  
       273,889,941      branch-misses             #    0.77% of all branches        

      39.303890805 seconds time elapsed


 Performance counter stats for '/srv/http/ ./hugepages.php':

       1511.186576      task-clock (msec)         #    0.931 CPUs utilized          
               165      context-switches          #    0.109 K/sec                  
                13      cpu-migrations            #    0.009 K/sec                  
             8,528      page-faults               #    0.006 M/sec                  
     4,878,265,167      cycles                    #    3.228 GHz                    
   <not supported>      stalled-cycles-frontend  
   <not supported>      stalled-cycles-backend   
     4,934,451,314      instructions              #    1.01  insns per cycle        
     1,079,838,152      branches                  #  714.563 M/sec                  
        22,318,535      branch-misses             #    2.07% of all branches        

       1.623016972 seconds time elapsed

**************************** After drop_caches: ****************************

 Performance counter stats for '/srv/http/ ./hugepages.php':

        698.424928      task-clock (msec)         #    0.711 CPUs utilized          
               124      context-switches          #    0.178 K/sec                  
                21      cpu-migrations            #    0.030 K/sec                  
             2,303      page-faults               #    0.003 M/sec                  
     2,221,973,568      cycles                    #    3.181 GHz                    
   <not supported>      stalled-cycles-frontend  
   <not supported>      stalled-cycles-backend   
     1,864,390,002      instructions              #    0.84  insns per cycle        
       447,618,521      branches                  #  640.897 M/sec                  
         9,212,550      branch-misses             #    2.06% of all branches        

       0.982939191 seconds time elapsed


 Performance counter stats for '/srv/http/ ./hugepages.php':

       1004.733041      task-clock (msec)         #    0.579 CPUs utilized          
               117      context-switches          #    0.116 K/sec                  
                 6      cpu-migrations            #    0.006 K/sec                  
             5,651      page-faults               #    0.006 M/sec                  
     3,191,485,063      cycles                    #    3.176 GHz                    
   <not supported>      stalled-cycles-frontend  
   <not supported>      stalled-cycles-backend   
     3,684,100,051      instructions              #    1.15  insns per cycle        
       762,235,655      branches                  #  758.645 M/sec                  
        17,702,881      branch-misses             #    2.32% of all branches        

       1.736382087 seconds time elapsed
 [2015-11-29 17:01 UTC] sjon at hortensius dot net
Not sure if it'll help; but here are the first few lines from perf-report:

  33.41%  php-7.0.0RC8  [kernel.vmlinux]     [k] pageblock_pfn_to_page
  31.77%  php-7.0.0RC8  [kernel.vmlinux]     [k] isolate_freepages_block
   3.78%  php-7.0.0RC8  [kernel.vmlinux]     [k] get_pfnblock_flags_mask
   2.96%  php-7.0.0RC8         [.] __memcpy_avx_unaligned
   2.51%  php-7.0.0RC8  [kernel.vmlinux]     [k] memcmp
   2.23%  php-7.0.0RC8  [kernel.vmlinux]     [k] compaction_alloc

valgrind-cachegrind tells me most instruction cost (56%) goes towards __memcpy_avx_unaligned, then 34% into unknown and the rest is < 4% each.

Strace shows the same as reported. If you are unable to reproduce or need anything else, I'm happy to help!
 [2015-12-08 09:03 UTC]
My system is Fedora 22 with 32GB memory. 'sysctl -a | grep huge' showed only 1 huge page. With the test script, PHP 7 took at least twice as long as PHP 5.6.

I set vm.nr_hugepages=512
in /etc/sysctl.conf and rebooted the system, since a reboot seemed to be required.

PHP 7.0 (debug build)
$ time ./php-bin t.php

real	 0m0.938s
user	 0m0.910s
sys	 0m0.027s

PHP 5.6 (Fedora22)
$ time php t.php

real	0m1.099s
user	0m0.882s
sys	0m0.226s

It appears vm.nr_hugepages=512 helped a lot. Thanks for the tip, Rasmus.

vm.nr_hugepages=0 may help too, but I haven't tested this yet.
 [2015-12-08 09:15 UTC]
It appears vm.nr_hugepages=0 also helps.

PHP 7.0 debug build
[yohgaki@dev PHP-7.0]$ time ./php-bin t.php

real	0m0.839s
user	0m0.812s
sys	0m0.026s
[yohgaki@dev PHP-7.0]$ time ./php-bin t.php

real	0m0.847s
user	0m0.819s
sys	0m0.027s

PHP 5.6 Fedora 22 rpm package
[yohgaki@dev PHP-7.0]$ time php t.php

real	0m1.166s
user	0m0.805s
sys	0m0.195s
[yohgaki@dev PHP-7.0]$ time php t.php

real	0m1.012s
user	0m0.816s
sys	0m0.205s
 [2015-12-08 09:20 UTC] sjon at hortensius dot net
Setting vm.nr_hugepages reserves some pages as huge pages, which makes the allocation succeed. @yohgaki: it is probably the reboot that 'fixes' this for you, as I already posted.

However, PHP should fail better when no hugepages can be allocated. Imo the allocator should remember a HUGETLB failure for some number of runs or some amount of time instead of retrying every time (which seems to have a significant impact on performance).

Increasing nr_hugepages is only a workaround, and it potentially reserves memory that applications not using HUGETLB allocations cannot use.
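
To make the suggestion concrete, here is a rough sketch of "remember the HUGETLB failure for a while". It is a toy model in Python, not PHP's allocator code: alloc_huge and alloc_normal are hypothetical callables standing in for mmap() with and without MAP_HUGETLB, and the 60-second retry window is an arbitrary assumption.

```python
import time


class HugePageFallbackAllocator:
    """Try a huge-page allocation first, but back off after a failure.

    Illustrative model only: `alloc_huge` and `alloc_normal` are hypothetical
    callables standing in for mmap() with and without MAP_HUGETLB.
    """

    def __init__(self, alloc_huge, alloc_normal, retry_after=60.0, clock=time.monotonic):
        self._alloc_huge = alloc_huge
        self._alloc_normal = alloc_normal
        self._retry_after = retry_after  # seconds to skip huge pages after a failure
        self._clock = clock
        self._skip_until = None          # None: huge pages are not known to fail

    def alloc(self, size):
        now = self._clock()
        if self._skip_until is None or now >= self._skip_until:
            try:
                chunk = self._alloc_huge(size)
                self._skip_until = None
                return chunk
            except MemoryError:
                # Remember the failure so later calls go straight to the fallback.
                self._skip_until = now + self._retry_after
        return self._alloc_normal(size)
```

With an alloc_huge that always raises MemoryError, only the first call in each retry window pays for the failed attempt; every other call goes directly to the normal allocation.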
 [2015-12-08 09:28 UTC]
The huge page setting affects 7.0 more.
I occasionally observed extremely slow execution (60 seconds or more) when HugePages_Total was 1. The reporter's system has no huge pages and experiences slow execution. My Fedora 22 system seems to perform better without huge pages.

It seems we had better document the huge page setting somewhere in the manual.
 [2015-12-08 09:39 UTC]

Your guess seems right. I rebooted my system with 1 huge page, and PHP 7.0 execution is slower only on the first run.

I should have captured strace output when my system was performing poorly.

Anyway, PHP could handle mmap errors more gracefully.
 [2015-12-08 09:40 UTC] sjon at hortensius dot net
Yes, this is all obvious. This behavior is caused by which I think might be a good change as hugepages CAN result in a performance increase; however, it currently seems to cause more issues than it's worth (as misses are too costly).

Documenting nr_hugepages isn't going to fix the significant slowdowns that this feature currently causes.
 [2015-12-08 14:49 UTC]
But why compile in this optional feature if you are not going to use it?
I would assume that general-purpose distro builds are not going to compile in this option.
 [2015-12-08 15:29 UTC] arjen at react dot com
Optional feature?

It's optional for the opcache (see;a=commitdiff;h=669f0b39b184593e01e677360fd79b2b63058ca0 can be enabled with --enable-huge-code-pages "PHP should be configured and built with --enable-huge-code-pages, OS should be configured to provide huge pages.")

However, the MAP_HUGETLB usage in cannot be disabled. And if the OS isn't configured to provide huge pages (but MAP_HUGETLB is still defined), every mmap call with MAP_HUGETLB apparently fails.
 [2016-03-18 08:41 UTC] arjen at react dot com
Added ability to disable huge pages in Zend Memory Manager through the environment variable USE_ZEND_ALLOC_HUGE_PAGES=0.

So like I said, it's not optional. At least it can be disabled now. But only on trunk, not 7.0?
 [2016-03-22 12:21 UTC]
-Status: Analyzed +Status: Feedback
 [2016-03-22 12:21 UTC]
Please try using this snapshot:
For Windows:

The commit has been merged to 7.0 and should be part of PHP 7.0.5
 [2016-04-03 04:22 UTC] php-bugs at lists dot php dot net
No feedback was provided. The bug is being suspended because
we assume that you are no longer experiencing the problem.
If this is not the case and you are able to provide the
information that was requested earlier, please do so and
change the status of the bug back to "Re-Opened". Thank you.
 [2016-04-27 16:48 UTC]
-Status: No Feedback +Status: Closed -Assigned To: +Assigned To: jpauli
 [2016-04-27 16:48 UTC]
Closing, as a fix has been committed to 7.0.5 and huge pages are now disabled by default, which should solve any related problem.

Huge pages can be enabled for the Zend Memory Manager by setting the environment variable USE_ZEND_ALLOC_HUGE_PAGES=1.
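
To compare both settings on a given machine, a small timing helper along these lines may be useful. It is only a sketch in Python: the helper names are invented for this example, "php" and the script path are placeholders for your own binary and test script, and only the USE_ZEND_ALLOC_HUGE_PAGES variable itself comes from this report.

```python
import os
import subprocess
import time


def php_env(huge_pages_enabled, base_env=None):
    """Copy the environment and set USE_ZEND_ALLOC_HUGE_PAGES to 1 or 0."""
    env = dict(os.environ if base_env is None else base_env)
    env["USE_ZEND_ALLOC_HUGE_PAGES"] = "1" if huge_pages_enabled else "0"
    return env


def time_php(script, huge_pages_enabled, php_binary="php"):
    """Run the script once and return the wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(
        [php_binary, script],
        env=php_env(huge_pages_enabled),
        check=True,
        stdout=subprocess.DEVNULL,
    )
    return time.perf_counter() - start
```

For example, calling time_php("t.php", True) and time_php("t.php", False) a few times each gives a rough comparison; a single run can be misleading, as the first-run effect described earlier in this thread shows.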