Bug #30549 incorrect character translations for some ISO-8859 charsets
Submitted: 2004-10-25 09:53 UTC Modified: 2005-02-21 08:50 UTC
From: david at davidheath dot org Assigned: moriyoshi
Status: Closed Package: mbstring related
PHP Version: 4.3.9 OS: linux
Private report: No CVE-ID:
 [2004-10-25 09:53 UTC] david at davidheath dot org
mbstring appears to map some characters incorrectly for the following ISO-8859 charsets:

Encoding: ISO-8859-7
  incorrect mapping of char 0xa4: got 0x3f, expected 0x20ac
  incorrect mapping of char 0xa5: got 0x3f, expected 0x20af
  incorrect mapping of char 0xaa: got 0x3f, expected 0x37a
Encoding: ISO-8859-8
  incorrect mapping of char 0xaf: got 0x203e, expected 0xaf
  incorrect mapping of char 0xfd: got 0x3f, expected 0x200e
  incorrect mapping of char 0xfe: got 0x3f, expected 0x200f
Encoding: ISO-8859-10
  incorrect mapping of char 0xa4: got 0x124, expected 0x12a

This is based on the mappings provided at on 25th Oct 2004. 

Note, there are undated comments in the "Version history" for the above files, as follows:

#	2.0 version updates 1.0 version by adding mappings for the
#	three newly added characters 0xA4, 0xA5, 0xAA.

#       1.1 version updates to the published 8859-8:1999, correcting
#          the mapping of 0xAF and adding mappings for LRM and RLM.

#       1.1 corrected mistake in mapping of 0xA4

So I guess these mappings have changed since mbstring was first written. I'm not sure whether changing the mappings now would cause a backward-compatibility problem.



Reproduce code:
Code for this test is available at:

Expected result:
Mappings as stated "expected xxx" above.

Actual result:
Mappings as stated "got xxx" above.
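
The mismatches listed above can also be checked with a much shorter script. The following is a minimal sketch, not the reporter's original test: the helper name `codePointOf` is hypothetical, and the expected values are simply the ones quoted in this report. It assumes the mbstring extension is loaded.

```php
<?php
// Hypothetical helper (not from the original script): decode one byte in the
// given single-byte encoding and return the Unicode code point mbstring picks.
function codePointOf($byte, $encoding)
{
    // UCS-4BE stores each character as one big-endian 32-bit integer.
    $ucs4 = mb_convert_encoding(chr($byte), 'UCS-4BE', $encoding);
    // 'N' = unsigned 32-bit big-endian; array_values() avoids depending on
    // which key unpack() assigns to an unnamed element.
    $parts = array_values(unpack('N', $ucs4));
    return $parts[0];
}

// Expected values are the ones quoted in the report above.
$expected = array(
    'ISO-8859-7'  => array(0xa4 => 0x20ac, 0xa5 => 0x20af, 0xaa => 0x37a),
    'ISO-8859-8'  => array(0xaf => 0xaf, 0xfd => 0x200e, 0xfe => 0x200f),
    'ISO-8859-10' => array(0xa4 => 0x12a),
);

foreach ($expected as $encoding => $map) {
    foreach ($map as $byte => $want) {
        $got = codePointOf($byte, $encoding);
        printf("%s 0x%02x: got 0x%x, expected 0x%x%s\n",
            $encoding, $byte, $got, $want, $got === $want ? '' : '  <-- mismatch');
    }
}
```

The sketch uses `UCS-4BE` with the `N` format so that the byte order is explicit; pairing `UCS-4LE` with the machine-dependent `L` format gives the same number only on little-endian hosts.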


 [2004-10-25 10:33 UTC]
Hello David,

Can you please make a *short* script that shows that the warnings are wrong? It takes quite some time to figure out exactly what your script is doing.

 [2004-10-25 13:32 UTC] david at davidheath dot org
Oops, minor bug in that script. Line 35 should read:

            printf("  incorrect mapping of char 0x%x: got 0x%x, expected 0x%x\n", $fromChar, $unicodeCharNumber[''], $expectChar);

Corrected version of script for your cut+paste convenience:





function testMapping($targetEncoding, $map) {
    print "Encoding: $targetEncoding\n";

    foreach($map as $fromChar=>$toChar) {
        $expectChar = $toChar;

        // convert to UCS-4, which represents every possible unicode
        // char as a single fixed width 32bit value
        $unicodeChar=mb_convert_encoding(chr($fromChar), 'UCS-4LE', $targetEncoding);
        $unicodeCharNumber = unpack('L', $unicodeChar);
        if ($expectChar!=$unicodeCharNumber[''] and ($expectChar!=0 and $unicodeCharNumber!=0x3f)) {
            printf("  incorrect mapping of char 0x%x: got 0x%x, expected 0x%x\n", $fromChar, $unicodeCharNumber[''], $expectChar);
 [2005-02-11 01:00 UTC] php-bugs at lists dot php dot net
No feedback was provided for this bug for over a week, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
 [2005-02-17 15:09 UTC]
 [2005-02-21 08:50 UTC]
This bug has been fixed in CVS.

Snapshots of the sources are packaged every three hours; this change
will be in the next snapshot. You can grab the snapshot at
Thank you for the report, and for helping us make PHP better.
