Bug #62545 wrong unicode mapping in some charsets
Submitted: 2012-07-12 23:04 UTC Modified: -
From: sageptr at gmail dot com Assigned:
Status: Closed Package: mbstring related
PHP Version: 5.4.4 OS: Any
Private report: No CVE-ID: None


 [2012-07-12 23:04 UTC] sageptr at gmail dot com
static const unsigned short cp1251_ucs_table[] = {
 0x0402, 0x0403, 0x201a, 0x0453, 0x201e, 0x2026, 0x2020, 0x2021, 
 0x20ac, 0x2030, 0x0409, 0x2039, 0x040a, 0x040c, 0x040b, 0x040f, 
 0x0452, 0x2018, 0x2019, 0x201c, 0x201d, 0x2022, 0x2013, 0x2014, 
 0x003f, 0x2122, 0x0459, 0x203a, 0x045a, 0x045c, 0x045b, 0x045f, 
Byte 0x98 is mapped to 0x003f (question mark), but it is actually unassigned
in the cp1251 charset. It should be mapped to 0xfffd (the Unicode replacement
character), not to 0x003f.
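
A minimal sketch of the proposed mapping, not the actual mbstring patch. It assumes the quoted excerpt covers bytes 0x80..0x9f in order (so 0x98 is entry 24), and the names cp1251_excerpt_fixed and cp1251_excerpt_to_ucs are hypothetical, introduced only for illustration:

```c
#include <assert.h>

/* Assumption: the excerpt starts at byte 0x80 and covers 0x80..0x9f. */
#define EXCERPT_MIN 0x80
#define EXCERPT_MAX 0x9f
#define UCS_REPLACEMENT 0xfffd /* U+FFFD REPLACEMENT CHARACTER */

/* The 32 entries quoted in the report, with the 0x98 slot changed
 * from 0x003f to 0xfffd as the report proposes. */
static const unsigned short cp1251_excerpt_fixed[32] = {
    0x0402, 0x0403, 0x201a, 0x0453, 0x201e, 0x2026, 0x2020, 0x2021,
    0x20ac, 0x2030, 0x0409, 0x2039, 0x040a, 0x040c, 0x040b, 0x040f,
    0x0452, 0x2018, 0x2019, 0x201c, 0x201d, 0x2022, 0x2013, 0x2014,
    0xfffd, 0x2122, 0x0459, 0x203a, 0x045a, 0x045c, 0x045b, 0x045f,
};

/* Hypothetical helper: ASCII passes through, 0x80..0x9f is looked up in
 * the excerpt, and -1 means the byte lies outside the quoted range. */
static int cp1251_excerpt_to_ucs(unsigned char byte)
{
    if (byte < EXCERPT_MIN) {
        return byte;
    }
    if (byte > EXCERPT_MAX) {
        return -1;
    }
    return cp1251_excerpt_fixed[byte - EXCERPT_MIN];
}

/* Small self-check of the one slot the report is about. */
static void cp1251_excerpt_self_check(void)
{
    assert(cp1251_excerpt_to_ucs(0x98) == UCS_REPLACEMENT);
}
```

With this table a caller can tell an unassigned byte (0xfffd) apart from a real question mark (0x003f) in the input, which the original 0x003f entry made impossible.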

static const unsigned short cp1252_ucs_table[] = {
Missing characters are mapped to 0xfffe. But U+FFFE is a noncharacter (the
byte-swapped form of the BOM, U+FEFF), not the replacement character U+FFFD
that is expected here.
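
As a rough illustration of the second point, the sketch below rewrites 0xfffe placeholders to 0xfffd in a table copy. The function name replace_fffe_sentinels is hypothetical, and it works on a mutable copy because the real tables are static const. This is only one way to express the idea, not how the fix was committed:

```c
#include <assert.h>
#include <stddef.h>

enum {
    UCS_BOM = 0xfeff,           /* U+FEFF, the byte order mark */
    UCS_REPLACEMENT = 0xfffd,   /* U+FFFD REPLACEMENT CHARACTER */
    UCS_NONCHAR_FFFE = 0xfffe   /* U+FFFE, a noncharacter (byte-swapped BOM) */
};

/* Hypothetical helper: replace every 0xfffe entry with 0xfffd and
 * return how many entries were changed. */
static size_t replace_fffe_sentinels(unsigned short *table, size_t count)
{
    size_t changed = 0;
    size_t i;

    for (i = 0; i < count; i++) {
        if (table[i] == UCS_NONCHAR_FFFE) {
            table[i] = UCS_REPLACEMENT;
            changed++;
        }
    }
    return changed;
}

/* Self-check on three slots for bytes 0x80..0x82 of cp1252, where 0x81
 * is unassigned and is written here with the 0xfffe placeholder. */
static void replace_fffe_self_check(void)
{
    unsigned short slots[3] = { 0x20ac, 0xfffe, 0x201a };

    assert(replace_fffe_sentinels(slots, 3) == 1);
    assert(slots[1] == UCS_REPLACEMENT);
}
```

In cp1252 the unassigned bytes are 0x81, 0x8d, 0x8f, 0x90 and 0x9d, so those are the slots such a pass would be expected to touch.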


 [2018-03-11 17:21 UTC]
Automatic comment on behalf of
Log: Fix #62545: wrong unicode mapping in some charsets
 [2018-03-11 17:21 UTC]
-Status: Open +Status: Closed