Bug #50189 [PATCH] - unicode byte order difference between SPARC and x86
Submitted: 2009-11-16 12:20 UTC Modified: 2011-03-08 18:39 UTC
From: yoarvi at gmail dot com Assigned:
Status: Not a bug Package: Unicode Engine related
PHP Version: 6SVN-2009-11-16 (SVN) OS: Solaris 10 (SPARC)
Private report: No CVE-ID: None
 [2009-11-16 12:20 UTC] yoarvi at gmail dot com
Description:
------------
zspprintf() produces Unicode strings/chars with the wrong byte order on Solaris (SPARC).

The byte order of the Unicode (UTF-16) representation differs between x86 and SPARC.

For example, the Unicode representation of '/tmp' on x86 (I've grouped the bytes in sets of 2 chars) is
'/''\0' 't''\0' 'm''\0' 'p''\0'

and on SPARC it is
'\0''/' '\0''t' '\0''m' '\0''p'
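
The two layouts above can be reproduced with a short sketch. The helper name ascii_to_utf16 and its parameters are hypothetical and are not part of PHP; the byte order is passed in explicitly so that both layouts can be inspected on any machine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a PHP API): write an ASCII string as UTF-16
 * code units in an explicitly chosen byte order.
 * Returns the number of bytes written to out. */
static size_t ascii_to_utf16(const char *src, int big_endian,
                             unsigned char *out, size_t out_cap)
{
    size_t n = 0;

    for (; *src != '\0'; src++) {
        uint16_t unit = (unsigned char)*src;   /* ASCII maps 1:1 to UTF-16 */

        assert(n + 2 <= out_cap);              /* caller must size the buffer */
        if (big_endian) {                      /* SPARC-style: high byte first */
            out[n++] = (unsigned char)(unit >> 8);
            out[n++] = (unsigned char)(unit & 0xFF);
        } else {                               /* x86-style: low byte first */
            out[n++] = (unsigned char)(unit & 0xFF);
            out[n++] = (unsigned char)(unit >> 8);
        }
    }
    return n;
}
```

Calling it on "/tmp" with big_endian set to 0 gives the x86 layout shown above, and with big_endian set to 1 gives the SPARC layout.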

http://marc.info/?l=php-internals&m=125811990106419&w=2 has some more details.

The problem seems to be in the smart_str_append2c macro that zspprintf()/xbuf_format_converter end up using.

The following patch fixes the problem:
Index: ext/standard/php_smart_str.h
===================================================================
--- ext/standard/php_smart_str.h        (revision 290471)
+++ ext/standard/php_smart_str.h        (working copy)
@@ -86,10 +86,17 @@
        smart_str_appendc_ex((dest), (c), 0)

 /* appending of a single UTF-16 code unit (2 byte)*/
+#if (defined(i386) || defined(__i386__) || defined(_X86_))
 #define smart_str_append2c(dest, c) do {       \
        smart_str_appendc_ex((dest), (c&0xFF), 0);      \
        smart_str_appendc_ex((dest), (c&0xFF00 ? c>>8 : '\0'), 0);      \
 } while (0)
+#else
+#define smart_str_append2c(dest, c) do {       \
+       smart_str_appendc_ex((dest), (c&0xFF00 ? c>>8 : '\0'), 0);      \
+       smart_str_appendc_ex((dest), (c&0xFF), 0);      \
+} while (0)
+#endif

 #define smart_str_free(s) \
        smart_str_free_ex((s), 0)



Reproduce code:
---------------
% sapi/cli/php ext/spl/tests/DirectoryIterator_getBasename_basic_test.php

Expected result:
----------------
getBasename_test

Actual result:
--------------
php goes into an infinite loop

 [2009-11-16 13:07 UTC] tokul at users dot sourceforge dot net
It is not "#if (defined(i386) || defined(__i386__) || defined(_X86_))" vs. others.

It is little endian vs big endian. I suspect that code should not assume that all other archs are big endian.
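
As an illustration of testing the byte order itself instead of a list of architecture macros, here is a minimal runtime probe. The function name host_is_big_endian is hypothetical. A build-time macro such as WORDS_BIGENDIAN from configure would normally be preferred, so treat this only as a sketch of the idea.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: report the host byte order by inspecting the
 * first byte in memory of a known 16-bit value. */
static int host_is_big_endian(void)
{
    const uint16_t probe = 0x0102;
    unsigned char first;

    memcpy(&first, &probe, 1);   /* copy the lowest-addressed byte */
    return first == 0x01;        /* high byte stored first => big endian */
}
```

This makes no assumption about which architectures are big endian; it only looks at how the running machine stores a 16-bit value.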
 [2009-11-16 16:35 UTC] yoarvi at gmail dot com
ext/sqlite3/libsqlite/sqlite3.c uses
#if defined(i386) || defined(__i386__) || defined(_M_IX86)\
                             || defined(__x86_64) || defined(__x86_64__)


Is that better?
 [2009-11-17 10:51 UTC] yoarvi at gmail dot com
Updated patch using WORDS_BIGENDIAN (suggested by christopher dot jones at oracle dot com)

Index: ext/standard/php_smart_str.h
===================================================================
--- ext/standard/php_smart_str.h        (revision 290471)
+++ ext/standard/php_smart_str.h        (working copy)
@@ -86,10 +86,17 @@
        smart_str_appendc_ex((dest), (c), 0)

 /* appending of a single UTF-16 code unit (2 byte)*/
+#ifndef WORDS_BIGENDIAN
 #define smart_str_append2c(dest, c) do {       \
        smart_str_appendc_ex((dest), (c&0xFF), 0);      \
        smart_str_appendc_ex((dest), (c&0xFF00 ? c>>8 : '\0'), 0);      \
 } while (0)
+#else
+#define smart_str_append2c(dest, c) do {       \
+       smart_str_appendc_ex((dest), (c&0xFF00 ? c>>8 : '\0'), 0);      \
+       smart_str_appendc_ex((dest), (c&0xFF), 0);      \
+} while (0)
+#endif

 #define smart_str_free(s) \
        smart_str_free_ex((s), 0)
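
For comparison, a possible alternative that needs no #ifdef at all is to copy the 16-bit code unit in the host's native byte order. The sketch below uses a hypothetical fixed-size buffer (struct byte_buf) as a stand-in; it is not PHP's smart_str and it is not the patch proposed here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a byte buffer; NOT PHP's smart_str. */
struct byte_buf {
    unsigned char data[64];
    size_t len;
};

/* Append one UTF-16 code unit in whatever byte order the host uses.
 * memcpy copies the in-memory representation, so the result is
 * little endian on x86 and big endian on SPARC without any #ifdef. */
static void buf_append_unit(struct byte_buf *buf, uint16_t unit)
{
    assert(buf->len + sizeof unit <= sizeof buf->data);
    memcpy(buf->data + buf->len, &unit, sizeof unit);
    buf->len += sizeof unit;
}
```

Reading the buffer back as an array of 16-bit units then yields the original values on either architecture, which is the property the two macro variants in the patch provide by hand.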
 [2010-12-17 14:33 UTC] jani@php.net
-Package: Unicode Function Upgrades relate
+Package: Unicode Engine related
 [2011-03-08 18:39 UTC] felipe@php.net
-Status: Open
+Status: Bogus
 [2011-03-08 18:39 UTC] felipe@php.net
php6 code is dead.
 [2019-07-28 08:05 UTC] chen dot yufeng at m2k dot com dot tw
The following pull request has been associated:

Patch Name: Fix $revisions number
On GitHub:  https://github.com/php/doc-en/pull/6
Patch:      https://github.com/php/doc-en/pull/6.patch
 