Bug #47366 mb_convert_encoding() converts some symbols incorrectly from EUC-JP to UTF-8
Submitted: 2009-02-12 10:04 UTC Modified: 2009-02-16 15:13 UTC
From: max at injapan dot ru Assigned:
Status: Closed Package: mbstring related
PHP Version: 5.3CVS-2009-02-12 (snap) OS: CentOS 5.2
Private report: No CVE-ID: None

 [2009-02-12 10:04 UTC] max at injapan dot ru
Description:
------------
mb_convert_encoding() converts the byte sequences \xAD\xB5 through 
\xAD\xBF incorrectly from EUC-JP to UTF-8. Other characters may be 
converted incorrectly as well, but I have not been able to check the 
full range.

Unicode has corresponding code points, e.g. U+2161 for Ⅱ.

The majority of EUC-JP text is converted normally.
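
To see how each byte pair in the reported range is converted, a loop like the following could be used. This is only a sketch: the variable names are illustrative, and it assumes the mbstring extension is loaded. It prints hex dumps rather than the characters themselves, so the result does not depend on the terminal encoding.

```php
<?php
// Lead byte 0xAD followed by second bytes 0xB5..0xBF, the range named in this report.
$results = [];
for ($low = 0xB5; $low <= 0xBF; $low++) {
    $bytes = "\xAD" . chr($low);
    // Convert one two-byte sequence, treating the input as EUC-JP.
    $utf8 = mb_convert_encoding($bytes, "UTF-8", "EUC-JP");
    $results[bin2hex($bytes)] = bin2hex($utf8);
}

// Print "input bytes -> output bytes" in hex, one pair per line.
foreach ($results as $in => $out) {
    printf("%s -> %s\n", $in, $out);
}
```

A correctly converted Ⅱ (U+2161) would appear as the UTF-8 bytes e285a1; any other value on the adb6 line points to the problem described here.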

Reproduce code:
---------------
echo mb_convert_encoding("\xAD\xB6", "UTF-8", "EUC-JP");

Expected result:
----------------
string ?Ⅱ? (U+2161)
printed to STDOUT

Actual result:
--------------
string ???
printed to STDOUT

 [2009-02-12 10:06 UTC] max at injapan dot ru
The text in the "Expected result" field is slightly garbled: the 
expected output is, of course, just the single character U+2161.
 [2009-02-16 15:13 UTC] max at injapan dot ru
The problem is solved by using the encoding EUCJP-WIN instead of EUC-JP.
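
For reference, the workaround applied to the reproduce code might look like the sketch below. It assumes the mbstring extension is loaded; the variable names are illustrative only.

```php
<?php
// Byte sequence from the reproduce code in this report.
$bytes = "\xAD\xB6";

// Workaround: name the source encoding as EUCJP-WIN rather than EUC-JP.
$utf8 = mb_convert_encoding($bytes, "UTF-8", "EUCJP-WIN");

// Dump the result in hex so it can be checked regardless of terminal encoding.
// U+2161 encoded as UTF-8 is the three bytes e2 85 a1.
echo bin2hex($utf8), "\n";
```

If the conversion behaves as described in the comment above, the hex dump should read e285a1.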
 