Bug #57255 Locking performance under SMP
Submitted: 2006-09-22 06:47 UTC Modified: 2006-09-25 08:05 UTC
From: oliver at realtsp dot com Assigned:
Status: Not a bug Package: APC (PECL)
PHP Version: 5.1.6 OS: FreeBSD 6.1
Private report: No CVE-ID: None

 

 [2006-09-22 06:47 UTC] oliver at realtsp dot com
Description:
------------
We have just switched from eA to APC. Previously we had 
segfault problems with APC on our setup, but it seems much 
better now, great work! Both exhibit the same problem.

We are caching around 400-500 files on a 2 x dual-core 
Opteron machine (i.e. 4 cores, 64-bit). 

We have 128MB of cache configured; about 40-50MB is needed 
for the 400-500 files. When the load is small, loading the 
files from the cache takes 100ms. When the number of pages 
per second increases, loading the class files quickly 
becomes the bottleneck and takes 1-2s. 

We have discussed this problem with the devs over at eA, 
and it appears that eA obtains RW locks on the relevant 
memory block for each cache request. When 4 CPUs are 
trying to do this at 10 pages/sec for 400 files per page, 
you quickly get into a runaway lock situation.
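As a rough back-of-the-envelope illustration (this assumes 
one lock acquisition per cached include, which is our guess 
rather than something we have verified):

<?php
// Back-of-the-envelope only, using the figures above and assuming
// one lock acquisition per cached include.
$pagesPerSecond = 10;
$filesPerPage   = 400;
$cpus           = 4;

// Lock acquisitions per second that all cores end up competing for:
echo $pagesPerSecond * $filesPerPage, " acquisitions/sec across ",
     $cpus, " cores\n";   // 4000 acquisitions/sec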

Questions:
a) Does APC have a similar locking strategy (i.e. RW) to 
eA? It seems that maybe it does, since the runaway lock 
situation quickly occurs even when none of the files are 
being changed (i.e. when a RW lock seems unnecessary and a 
read-only lock would have sufficed).

b) Do you have experience testing APC on SMP systems, and 
if so, do you have any advice on how to make APC scale to 
this kind of hardware?

The only way we are managing to run our application 
successfully with this many files/CPUs is by using "load 
on demand", i.e. autoload etc. We are also experimenting 
with a "package file system" where many files are grouped 
together into one larger file. However, this presents quite 
a few problems too, not least the fact that debugging line 
numbers are no longer correct.
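
A rough sketch of what we mean by "load on demand" (the 
class names and paths here are made up for illustration):

<?php
// Illustrative only: a generated class map plus __autoload, so a class
// file is compiled (or fetched from the cache) only the first time a
// request actually uses that class.
$GLOBALS['classMap'] = array(
    'Order'     => '/var/www/app/classes/Order.php',
    'OrderItem' => '/var/www/app/classes/OrderItem.php',
    // ... several hundred more entries, generated by a build step
);

function __autoload($className)
{
    if (isset($GLOBALS['classMap'][$className])) {
        require $GLOBALS['classMap'][$className];
    }
}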

Thanks in advance for your help

Oliver




Reproduce code:
---------------
40-50MB is needed for the 400-500 files

Expected result:
----------------
no significant locking delays with 4 CPUs

Actual result:
--------------
very significant locking slowdown

History

 [2006-09-22 07:56 UTC] gopalv82 at yahoo dot com
APC locks the cache in the fetch path only for a minimal period of time. Essentially, the memory copy inside the cache fetch is not done under the lock; only the cache lookup and refcount increment are inside the lock.

Using __autoload and require_once basically destroys any advantage APC is likely to bring you, mainly because you move the compilation sequence into mid-runtime land, where the engine stops execution and proceeds to compile stuff. 
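To illustrate the difference (the file names are made up):

<?php
// Hypothetical file names, for illustration only.

// Includes resolved up front: every file is compiled (or served from
// the opcode cache) before the request's own code starts running.
require '/var/www/lib/Order.php';
require '/var/www/lib/Invoice.php';

function renderInvoice($id)
{
    // On-demand include: execution stops here, mid-request, while the
    // engine resolves the path, stats the file and compiles it or
    // fetches it from the cache.
    require_once '/var/www/lib/InvoicePdf.php';
    // ...
}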

So please be clearer about what you mean when you say :-
""" 
  When 4 CPUs are trying to do this at 10 pages/sec for 
  400 files per page, you quickly get into a runaway 
  lock situation.
"""

I am working on a lockless per-process fast cache. But it is an optimization I'd rather do very slowly than rush a patch in for, having made a couple of false starts. 

All said and done, I think most of your performance hit will come from the large number of files. But if you don't mind running an instrumented version of APC, I can probably hack one up for you to pinpoint the bottleneck.

(and yes, I do most of my testing on a quad-cpu fbsd4.10, though not with 400 files per page).
 [2006-09-25 08:05 UTC] oliver at realtsp dot com
OK, I have done quite a lot of investigating on this 
subject, and I have to admit that I was on the wrong 
track. The thing that doesn't scale as traffic increases 
is *_once.

It seems that *_once gives a caching benefit when the load 
is low, but the fstat and fopen calls send system load 
through the roof as pages/sec increases, particularly with 
multiple CPUs. 

We have investigated three strategies without using 
*_once: "single filing", "a long list of requires in the 
right order" and "autoload". Our findings are here:
http://propel.tigris.org/servlets/ReadMsg?list=dev&msgNo=1782
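
For what it's worth, a build step for the "single filing" 
approach could look something like this (the paths are 
illustrative, and it assumes the class files have no 
closing ?> tag):

<?php
// Illustrative build step: concatenate many class files into one larger
// "package" file, so a page needs a single include instead of hundreds.
$classDir = '/var/www/app/classes';
$package  = '/var/www/app/build/package.php';

$out = fopen($package, 'w');
fwrite($out, "<?php\n");

foreach (glob($classDir . '/*.php') as $file) {
    $code = file_get_contents($file);
    // Strip the opening tag so the concatenated file stays valid PHP.
    $code = preg_replace('/^<\?php\s*/', '', $code);
    fwrite($out, "\n// --- " . basename($file) . " ---\n" . $code);
}

fclose($out);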

Thanks for your input.

Going to close this bug.
 