|  support |  documentation |  report a bug |  advanced search |  search howto |  statistics |  random bug |  login
Request #53310 fpm_atomic.h uses SPARC v9 only code, doesn't work on v8
Submitted: 2010-11-15 00:21 UTC Modified: 2010-11-24 02:02 UTC
Avg. Score:4.7 ± 0.5
Reproduced:3 of 3 (100.0%)
Same Version:1 (33.3%)
Same OS:1 (33.3%)
From: stefan at whocares dot de Assigned: fat (profile)
Status: Wont fix Package: FPM related
PHP Version: 5.3.3 OS: Linux (Debian for Sparc)
Private report: No CVE-ID: None
Have you experienced this issue?
Rate the importance of this bug to you:

 [2010-11-15 00:21 UTC] stefan at whocares dot de
Compiling with PHP-FPM enabled on an older SPARC system will result in 

/tmp/cc6w5Fh0.s: Assembler messages:
/tmp/cc6w5Fh0.s:39: Error: Architecture mismatch on "cas".
/tmp/cc6w5Fh0.s:39:  (Requires v9|v9a|v9b; requested architecture is sparclite.)

Unfortunately my knowledge of SPARC assembly language isn't nearly good enough to fix that. I know that the v9 "cas" opcode does an atomic "compare and swap" operation but I wouldn't know how to translate that into v8 code. 

Test script:
Copy /sapi/fpm/fpm/fpm_atomic.h to fpm_atomic.c and add bogus main() function:

int main () {
	int result;
	atomic_t mylock;
	result = fpm_spinlock(&mylock, 1);

Compile using "gcc -mcpu=v8 fpm_atomic.c" will result in error message given.

Expected result:
Should compile without error.

Actual result:
sparky:~# gcc -mcpu=v8 fpm_atomic.c
/tmp/cciAbMrC.s: Assembler messages:
/tmp/cciAbMrC.s:121: Error: Architecture mismatch on "cas".
/tmp/cciAbMrC.s:121:  (Requires v9|v9a|v9b; requested architecture is sparclite.)


fpm_atomics_sparc_v8.patch (last revision 2010-11-15 02:04 UTC by stefan at whocares dot de)

Add a Patch

Pull Requests

Add a Pull Request


AllCommentsChangesGit/SVN commitsRelated reports
 [2010-11-15 00:53 UTC]
-Status: Open +Status: Analyzed
 [2010-11-15 00:53 UTC]
As the sparc documentation says 
The SPARC v9 manual introduced the newest atomic instruction: compare and swap 

I don't know how to fix this right now. If you know someone who can, he's 
welcome. I've already asked for help.

wait and see
 [2010-11-15 03:05 UTC] stefan at whocares dot de
Well, I blatantly copied from PostgreSQL's s_lock.h and came up with this:

diff -Nau fpm_atomic.h 
---	2009-12-14 09:18:53.000000000 +0000
+++ fpm_atomic.h	2010-11-15 01:50:31.000000000 +0000
@@ -82,7 +82,7 @@
 #endif /* defined (__GNUC__) &&... */
 #elif ( __sparc__ || __sparc ) /* Marcin Ochab */
+#if (__sparc_v9__)
 #if (__arch64__ || __arch64)
 typedef uint64_t                    atomic_uint_t;
 typedef volatile atomic_uint_t      atomic_t;
@@ -118,7 +118,23 @@
 /* }}} */
+#else /* sparcv9 */
+typedef uint32_t                    atomic_uint_t;
+typedef volatile atomic_uint_t      atomic_t;
+static inline int atomic_cas_32(atomic_t *lock) /* {{{ */
+	        register atomic_uint_t _res;
+		        __asm__ __volatile__("ldstub [%2], %0" : "=r"(_res), 
"+m"(*lock) : "r"(lock) : "memory");
+			        return (int) _res;
+/* }}} */
+static inline atomic_uint_t atomic_cmp_set(atomic_t *lock, atomic_uint_t old, 
atomic_uint_t set) /* {{{ */
+	        return (atomic_cas_32(lock)==0);
+/* }}} */
 #error unsupported processor. please write a patch and send it to me

If I'm reading the original code correctly, there's no actual locking done but 
instead the code only tests whether it could acquire a lock. 'ldstub' works such 
that it returns the current value of the memory region specified and sets it to 
all '1' afterwards. Thus, if the return value is '-1' the lock was already set 
by another process whereas if it's '0' we acquired the lock. Well, at least in 
my certainly flawed logic ;) Since ldstub is atomic I didn't see a need to 
explicitly "lock;" the code.

The patch should leave the 'cas' code intact when being compiled on v9 type 
SPARC systems. Tested (for successful compilation only!) on Debian (etch) using 
gcc 3.3.5. Thus I believe further testing is necessary to verify this is 
actually working.

Well, please test and incorporate if you feel the code is doing what it's 
supposed to do.
 [2010-11-15 09:09 UTC]
-Assigned To: +Assigned To: fat
 [2010-11-16 19:48 UTC] sriram dot natarajan at gmail dot com
May I know as to why you need to compile with v8 ? compiling with v9 does not 
automatically make your application 64-bit . if that is the reason you want to 
choose -v8 in here.

v8 sparc instruction is decade old - and is not 
being used in any hardware. so, i see no reason as to why we need to use / support 
this specific instruction set.
 [2010-11-16 23:02 UTC]
Automatic comment from SVN on behalf of fat
Log: - Fixed #53310 (sparc < v9 won't is not supported)
 [2010-11-16 23:05 UTC]
-Status: Analyzed +Status: Wont fix
 [2010-11-16 23:05 UTC]
we've decided sparc < v9 won't be supported. I've just updated the source code to 
warn specificaly about this.
 [2010-11-16 23:52 UTC] stefan at whocares dot de
Of course you may ask: because I'm porting PHP to the ReadyNAS platform which 
happens to use a SPARC v8 compatible CPU and thus *needs* the v8 instruction set.
Seeing that you've already made up your mind though, so I guess there's nothing 
more to add here. Makes me wonder why I can't get a response in > 24 hours as to 
my patch but you can't wait for me to answer for like 4 hours.
 [2010-11-17 00:03 UTC]
you should be able to compile with a gcc version which provides the 
__sync_bool_compare_and_swap builtin function (>= 4.1).

It's supported by FPM. If with this version of GCC FPM is not able to be compiled, 
there is a bug in FPM. We'll take care of it.

It this a reasonable solution ?
 [2010-11-17 00:22 UTC]
And you can still use FastCGI, btw. FPM is fairly new, and if new SAPIs have to support soon to be dead OSes, then we will cruelly need more developers to maintain everything :)
 [2010-11-17 00:24 UTC] stefan at whocares dot de
As you may have read in my initial post, the compiler I (have to) use is gcc 
3.3.5 which falls a bit short of 4.1 ;) Also, you may want to read the 
backend/port/tas/solaris_sparc.s file from the official PostgreSQL sources:

        ! "cas" only works on sparcv9 and sparcv8plus chips, and
        ! requies a compiler targeting these CPUs.  It will fail
        ! on a compiler targeting sparcv8, and of course will not
        ! be understood by a sparcv8 CPU.  gcc continues to use
        ! "ldstub" because it targets sparcv7.

There they work around this by using a condition (for the SUN compiler) like 

#if defined(__sparcv9) || defined(__sparcv8plus)
        cas     [%o0],%o2,%o1
        ldstub [%o0],%o1

and in their actual generic lock implementation (src/include/storage/s_lock.h) 
the code is this:

#if defined(__sparc__)          /* Sparc */

typedef unsigned char slock_t;

#define TAS(lock) tas(lock)

static __inline__ int
tas(volatile slock_t *lock)
        register slock_t _res;

         *      See comment in /pg/backend/port/tas/solaris_sparc.s for why this
         *      uses "ldstub", and that file uses "cas".  gcc currently 
         *      sparcv7-targeted binaries, so "cas" use isn't possible.
        __asm__ __volatile__(
                "       ldstub  [%2], %0        \n"
:               "=r"(_res), "+m"(*lock)
:               "r"(lock)
:               "memory");
        return (int) _res;

#endif   /* __sparc__ */

Now my general idea was that if there's a reason for PostgreSQL to keep that 
code around, there might be a reason for PHP to do so as well. Obviously I was 
wrong there.

I also do not see the real advantage of 'cas' over 'ldstub' in the current 
scenario since both are atomic, both are supported (ldstub even on v7) and both 
do the job perfectly well.
 [2010-11-17 01:09 UTC] stefan at whocares dot de
Did it ever occur to you that I found this bug/problem because I *specifically 
wanted to use FPM* in the first place? Had I wanted to use FastCGI I'd have 
certainly done so. Seeing that there already was a solution for Sparc v9 I 
thought there might be interest in a solution that would allow PHP to run on 
older machines.

Hardware that maybe you're laughing about. But hardware that's still in fairly 
wide use. And, come to think of it, hardware that may also be the only hardware 
people in poorer countries than the one you're obviously living in are able to 
get their hands on. So I first asked and then got my hands dirty and even 
provided a possible solution - and one that could be easily implemented, too.

And what for? Only to get ignorance and witty remarks in return. Well, I almost 
forgot that the PHP project has such a bad reputation when it comes to bugs and 
patches. Thanks for reminding me why. Now you can safely go back to your ivory 
tower and think about supporting next decade's hardware only. For my part, I 
promise to keep any bugs/problems in PHP I may find in the future to myself and 
will do the same for any patches I may come up with.

Btw: the boxes I'm talking about are running Linux (which you could have seen by 
looking at the "OS:" tag) and I really have no idea why you'd call that a "soon 
to be dead OS". If you have a problem understanding the difference between a CPU 
and an OS, may I ask what exactly makes you think you can give some valuable 
input here? As for the "cruelly needed developers" you mention: I don't see why 
you should need those as long as the community comes up with patches you could 
use. Ok, if you keep driving away people like this, I start to have an idea as 
to why ;)
 [2010-11-17 01:15 UTC]
It was not badly meant, only trying to show you alternative. 

I can't know nor judge the reason why you need v8 support, but have been there many times in the past for my numerous projects.

We have to make decisions about which platforms we can support, and also which we stop to support. There is nothing personal or aggressive in our replies, only trying to explain the status and the reasoning behind it.

Sorry if you took it so badly, that's not the aim of our comments, or mines in particular.
 [2010-11-17 01:16 UTC]
and I was wiling to write arch, not OS....
 [2010-11-17 01:18 UTC]
@stefan at whocares dot de
Did you run your patch on a ReadyNAS box ? If you test it and tell us it works, 
there is not reason not to integrate it. As far as I know, it's not been tested 
but for compilation only.

We don't want to leave someone behind, but as pierre told you there is priorities. 
We'll be glad if you help us.
 [2010-11-17 02:19 UTC] stefan at whocares dot de
First of all: thanks for not taking my rant badly :)

Of course I can run this code and, well "test" it. I would have been happier 
however if someone besides me had looked over the code and said "yes, that looks 
like it could work" ;)

Right now it *is* running on two ReadyNAS (Sparc) boxes as well as on my SunFire 
280R. It doesn't segfault which to me is a good sign and it's producing normale 
output from the small test scripts I have run. Haven't done extensive testing so 
far but will try running Wordpress and Drupal in the next couple of days. If 
there's any special test you'd like to see me run against the patched version of 
PHP, let me know.
 [2010-11-17 08:27 UTC]
one simple test is to make php core the less as possible. You can create a file 
test.php wich does nothing but an "echo". 

Then you stress this page with FPM with ab (ab -c 100 -n 10000 

While the test is running you check the status page and see how it's goin' on.

it should be a good primary test.
 [2010-11-18 01:53 UTC] sriram dot natarajan at gmail dot com
there is difference between these 2 instructions:

ldstub -> operates on a 8 byte value
casa -> operates on a 32-bit word 

now, if some one wanted to use these instructions to implement a atomic mutex 
lock, then one could argue that both instruction set are interchangeable. in 
this case, that is not the case. hence, i would argue that there is a valid case 
for using this specific 'compare and swap' instruction set. 

using 'ldstub instruction set' in this context is not what we want.  

few curl or http get requests cannot display the potential race conditions. 

not using 'atomic' operation will be the better approach for your scenario.
 [2010-11-18 02:11 UTC] stefan at whocares dot de
Thank you for your input. Just so I understand better:
You said "in this context is not what we want". Could you clarify that a bit? As 
far as I understand the code (and I don't claim to fully understand it) it just 
checks whether if a specific memory region contains a value of "0" and if so, it 
tries to set it to "1". At least I couldn't find any calls to the atomic "add" 
functions although they are provided for some architectures.

Also you said: "not using 'atomic' operation will be the better approach for 
your scenario". Would you have an example / explanation on how to do that? I'd 
really be interested in that so I could eventually come up with a good and 
working solution.
 [2010-11-18 12:30 UTC] stefan at whocares dot de
As per request here the results when running 'ab' against a patched version of 
PHP 5.3.3 running on the ReadyNAS. 

Web Server: Nginx 0.8.53
PHP-FPM   : Running with dynamic processes, 2 min, 2 minspare, 3 maxspare

Please keep in mind that the CPU of the ReadyNAS is running at whooping 186 MHz, 
so the results obviously won't be lightning fast:


desktop:~$ ab -c 100 -n 10000 http://develnas:8880/test.php
This is ApacheBench, Version 2.3 <$Revision: 655654 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd,
Licensed to The Apache Software Foundation,

Benchmarking develnas (be patient)
Completed 1000 requests
Completed 2000 requests
Completed 3000 requests
Completed 4000 requests
Completed 5000 requests
Completed 6000 requests
Completed 7000 requests
Completed 8000 requests
Completed 9000 requests
Completed 10000 requests
Finished 10000 requests

Server Software:        nginx/0.8.53
Server Hostname:        develnas
Server Port:            8880

Document Path:          /test.php
Document Length:        16 bytes

Concurrency Level:      100
Time taken for tests:   110.629 seconds
Complete requests:      10000
Failed requests:        0
Write errors:           0
Total transferred:      1700000 bytes
HTML transferred:       160000 bytes
Requests per second:    90.39 [#/sec] (mean)
Time per request:       1106.289 [ms] (mean)
Time per request:       11.063 [ms] (mean, across all concurrent requests)
Transfer rate:          15.01 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.6      0      10
Processing:   105 1101 212.8   1122    2348
Waiting:      105 1100 212.8   1121    2348
Total:        107 1101 212.7   1122    2349

Percentage of the requests served within a certain time (ms)
  50%   1122
  66%   1129
  75%   1139
  80%   1145
  90%   1547
  95%   1552
  98%   1564
  99%   1571
 100%   2349 (longest request)
 [2010-11-24 01:44 UTC]
well, what i meant by 'in this context' is -> the compare and set is a general 
purpose function within fpm and is not intended to simply test&swap a single 
byte . hence, it won't be appropriate to replace cas with ldstub instruction.

also, what i meant by not using the atomic option is - fpm currently implements 
varieties of common api's by using assembly instructions. It is possible to do  
the same by using mutex lock and regular c code. this alternate way allows some 
one to run in varieties of architecture. though, this might run slightly slower. 
doing this way would be a better option for your case.
 [2010-11-24 02:02 UTC] stefan at whocares dot de
Thanks for your input, greatly appreciated.
It'd be nice if you could help me out a bit more. I grepped through the whole 
source and the only use of 'atomic_cmp_set' I could find was within fpm_atomic.h 

(333)sparky:~/devel/build/php5-5.3.3# grep -R 'atomic_cmp_set' *
sapi/fpm/fpm/fpm_atomic.h:static inline atomic_uint_t atomic_cmp_set(atomic_t 
*lock, atomic_uint_t old, atomic_uint_t set) /* {{{ */
sapi/fpm/fpm/fpm_atomic.h:static inline atomic_uint_t atomic_cmp_set(atomic_t 
*lock, atomic_uint_t old, atomic_uint_t set) /* {{{ */
sapi/fpm/fpm/fpm_atomic.h:static inline atomic_uint_t atomic_cmp_set(atomic_t 
*lock, atomic_uint_t old, atomic_uint_t set) /* {{{ */
sapi/fpm/fpm/fpm_atomic.h:static inline atomic_uint_t atomic_cmp_set(atomic_t 
*lock, atomic_uint_t old, atomic_uint_t set) /* {{{ */
sapi/fpm/fpm/fpm_atomic.h:static inline atomic_uint_t atomic_cmp_set(atomic_t 
*lock, atomic_uint_t old, atomic_uint_t set) /* {{{ */
sapi/fpm/fpm/fpm_atomic.h:		return atomic_cmp_set(lock, 0, 1) ? 0 : 
sapi/fpm/fpm/fpm_atomic.h:		if (atomic_cmp_set(lock, 0, 1)) {

Since you're saying it's a general purpose function I'm obviously missing 
something here and because I'm an enquiring mind I'd like to know what it is I'm 

I agree that using C code would improve portability. But since the machine I'm 
building for is slow enough already, I'd prefer to stick with assembler if 
 [2013-02-25 17:35 UTC] zulrang at dontspamme dot com
stefan, wasn't sure if you're still subscribed to this, but I'd like to know if 
you found a workaround/patch/etc.?

ALL of our servers are run on Sparcv8 / Solaris 11, so this issues is very 
important to me.
PHP Copyright © 2001-2020 The PHP Group
All rights reserved.
Last updated: Fri May 29 05:01:25 2020 UTC