[2008-09-19 20:16 UTC] areid at lumerical dot com
Description:
------------
The mb_check_encoding function returns false when a particular Japanese character is used with the ISO-2022-JP character set. The offending character has hex code 0x2d6a. It is a special character representing "incorporated". The character itself does not seem to be in the JIS X 0208-1983 character table, but most Windows applications seem to recognize it (Outlook, Firefox, Explorer, etc.). In this particular case, the original text was composed in Outlook.
Reproduce code:
---------------
//This is valid iso-2022-jp code for
//this single Japanese character representing incorporated
$txt = "\x1b\x24\x42\x2d\x6a";
//The output of the below code will be "bad encoding"
if(mb_check_encoding($txt,'ISO-2022-JP')){
echo 'good encoding';
}else{
echo 'bad encoding';
}
Expected result:
----------------
"good encoding" should be printed
Actual result:
--------------
"bad encoding" is printed
$txt = "\x1b\x24\x42\x2d\x6a" is not a valid ISO-2022-JP encoded string. It should be $txt = "\x1b\x24\x42\x2d\x6a\x1b\x28\x42".
Please try:
<?php
$txt = "\x1b\x24\x42\x2d\x6a\x1b\x28\x42";
if(mb_check_encoding($txt,'ISO-2022-JP')){
echo 'good encoding';
}else{
echo 'bad encoding';
}
?>
result: good encoding

ISO-2022-JP doesn't include the vendor-specific characters. Please use ISO-2022-JP-MS instead of ISO-2022-JP.
Also, $txt = "\x1b\x24\x42\x2d\x6a" is not a valid ISO-2022-JP encoded string. It should be $txt = "\x1b\x24\x42\x2d\x6a\x1b\x28\x42".
Please try:
<?php
$txt = "\x1b\x24\x42\x2d\x6a\x1b\x28\x42";
if(mb_check_encoding($txt,'ISO-2022-JP-MS')){
echo 'good encoding';
}else{
echo 'bad encoding';
}
?>
result: good encoding
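
The difference between the two strings can be made visible without relying on mbstring. The appended bytes \x1b\x28\x42 are the escape sequence ESC ( B, which switches back to ASCII; as far as I understand RFC 1468, ISO-2022-JP text is expected to end in ASCII. The following is only a rough sketch: finalIso2022JpMode() is a hypothetical helper written for this illustration, not an mbstring function, and it does not validate the characters themselves.

```php
<?php
// Hypothetical helper (not part of mbstring): walks an ISO-2022-JP byte
// string and reports which character set is active when the string ends.
function finalIso2022JpMode(string $bytes): string
{
    // The four escape sequences used by ISO-2022-JP.
    $escapes = [
        "\x1b\x28\x42" => 'ASCII',
        "\x1b\x28\x4a" => 'JIS X 0201 Roman',
        "\x1b\x24\x40" => 'JIS X 0208-1978',
        "\x1b\x24\x42" => 'JIS X 0208-1983',
    ];

    $mode = 'ASCII'; // text starts in ASCII
    $i = 0;
    $len = strlen($bytes);

    while ($i < $len) {
        $seq = substr($bytes, $i, 3);
        if (isset($escapes[$seq])) {
            // Escape sequence: switch the active character set.
            $mode = $escapes[$seq];
            $i += 3;
        } elseif (strncmp($mode, 'JIS X 0208', 10) === 0) {
            // Double-byte mode: each character occupies two bytes.
            $i += 2;
        } else {
            // Single-byte mode.
            $i += 1;
        }
    }

    return $mode;
}

// String from the original report: still in double-byte mode at the end.
echo finalIso2022JpMode("\x1b\x24\x42\x2d\x6a"), "\n";
// String from the comments: the trailing ESC ( B returns to ASCII.
echo finalIso2022JpMode("\x1b\x24\x42\x2d\x6a\x1b\x28\x42"), "\n";
```

The first call reports JIS X 0208-1983 and the second reports ASCII. Whether mb_check_encoding() then accepts the character 0x2d6a itself is a separate question and depends on the encoding name passed to it, as the comments above describe.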