Bug #66254 htmlentities does not convert single-byte katakana chars on Shift-JIS
Submitted: 2013-12-10 02:30 UTC
Modified: 2014-12-30 10:42 UTC
Votes:1
Avg. Score:1.0 ± 0.0
Reproduced:0 of 1 (0.0%)
From: charles123zelrax456 at yahoo dot com
Assigned:
Status: No Feedback
Package: mbstring related
PHP Version: 5.5.7RC1
OS: Linux
Private report: No
CVE-ID: None

 [2013-12-10 02:30 UTC] charles123zelrax456 at yahoo dot com
Description:
------------
Japanese katakana can be encoded as single bytes (half-width katakana), which is especially common in input from mobile phones. Most Japanese mobile sites use Shift-JIS encoding. For security, htmlentities() is applied to all user input. Unfortunately, when a Shift-JIS string containing single-byte katakana is passed to htmlentities(), garbage characters are returned.

Test script:
---------------
<?php
// Make sure that this script file and the browser are both set to Shift-JIS.
$txt = "。「」、・ヲァィゥェォャュョッ ーアイウエオカキクケコサシスセソタチツテトナニヌネノハヒフヘホマミムメモヤユヨラリルレロワン゙゚アバ日本語 DOUBLEBYTE SINGLE BYTE<br>";
echo "Original String: " . $txt;
$new_txt = htmlentities($txt, ENT_QUOTES, "SJIS");
echo "<br>Converted String: " . $new_txt;
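
Because the script above depends on the encoding the file is saved in, a variant that builds the input from explicit byte values may be easier to compare across machines. The following is only a sketch: the helper name `sjis_halfwidth_katakana` is hypothetical, and it assumes the single-byte katakana occupy bytes 0xA1 through 0xDF in Shift-JIS. It prints hex dumps so that the browser's charset setting does not affect what is seen. Depending on the PHP version, htmlentities() may also raise a notice for this charset.

```php
<?php
// Hypothetical helper (not part of the original report): returns the
// single-byte katakana range of Shift-JIS, bytes 0xA1 through 0xDF.
function sjis_halfwidth_katakana()
{
    $bytes = '';
    for ($b = 0xA1; $b <= 0xDF; $b++) {
        $bytes .= chr($b);
    }
    return $bytes;
}

// Same call as in the report, but the input no longer depends on how
// the script file itself is encoded.
$txt = sjis_halfwidth_katakana() . "<br>";
$new_txt = htmlentities($txt, ENT_QUOTES, "SJIS");

// Compare raw bytes instead of rendered text.
echo "Original:  " . bin2hex($txt) . "\n";
echo "Converted: " . bin2hex($new_txt) . "\n";
```

If the katakana bytes in the two hex dumps differ, the change happens inside htmlentities(); if they match, the garbage is more likely introduced when the output is displayed.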

Actual result:
--------------
Original String: 。「」、・ヲァィゥェォャュョッ ーアイウエオカキクケコサシスセソタチツテトナニヌネノハヒフヘホマミムメモヤユヨラリルレロワン゙゚アバ日本語 DOUBLEBYTE SINGLE BYTE

Converted String: ¡¢£¤¥¦§¨©ª«¬­®¯ °±²³´µ¶·¸¹º»¼½¾¿ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ×ØÙÚÛÜÝÞßアバ日本語 DOUBLEBYTE SINGLE BYTE<br>
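
One observation about the converted string: the garbage characters run from ¡ to ß, which are the ISO-8859-1 characters at code points 0xA1 through 0xDF, the same byte values that single-byte katakana use in Shift-JIS. The sketch below illustrates that correspondence with a hypothetical helper, `latin1_bytes_to_utf8`, that reads each byte as an ISO-8859-1 code point. It does not identify where the misinterpretation happens; it only shows that the output is consistent with the katakana bytes being read as ISO-8859-1 at some stage.

```php
<?php
// Hypothetical helper: re-encode a byte string as UTF-8, treating every
// input byte as an ISO-8859-1 code point.
function latin1_bytes_to_utf8($bytes)
{
    $out = '';
    for ($i = 0, $n = strlen($bytes); $i < $n; $i++) {
        $b = ord($bytes[$i]);
        if ($b < 0x80) {
            // ASCII bytes are unchanged in UTF-8.
            $out .= chr($b);
        } else {
            // Code points 0x80-0xFF become a two-byte UTF-8 sequence.
            $out .= chr(0xC0 | ($b >> 6)) . chr(0x80 | ($b & 0x3F));
        }
    }
    return $out;
}

// 0xB1 0xB2 0xB3 are the single-byte katakana A, I, U in Shift-JIS.
// Read as ISO-8859-1 they are the characters seen in the report.
echo latin1_bytes_to_utf8("\xB1\xB2\xB3"), "\n"; // prints ±²³ (as UTF-8)
```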

 [2014-01-04 20:50 UTC] yohgaki@php.net
-Status: Open
+Status: Feedback
 [2014-01-04 20:50 UTC] yohgaki@php.net
I tried this on Fedora 19 with PHP 5.5 and 5.6-dev, using an SJIS-encoded PHP script, and got:

Original String: 。「」、・ヲァィゥェォャュョッ ーアイウエオカキクケコサシスセソタチツテトナニヌネノハヒフヘホマミムメモヤユヨラリルレロワン゙゚アバ日本語 DOUBLEBYTE SINGLE BYTE<br>
Converted String: 。「」、・ヲァィゥェォャュョッ ーアイウエオカキクケコサシスセソタチツテトナニヌネノハヒフヘホマミムメモヤユヨラリルレロワン゙゚アバ日本語 DOUBLEBYTE SINGLE BYTE&lt;br&gt;

Since you are using an RC, I suppose you built PHP from source. What is your environment?
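
For anyone answering the environment question, a short script along these lines may help gather the relevant details. It is only a sketch: the function name `collect_environment` and the choice of fields are assumptions about what is useful here, not a list requested by the maintainers.

```php
<?php
// Hypothetical helper: collect settings that commonly affect how
// htmlentities() and mbstring treat Shift-JIS input.
function collect_environment()
{
    $has_mb = extension_loaded('mbstring');
    return array(
        'php_version'          => PHP_VERSION,
        'os'                   => PHP_OS,
        'sapi'                 => PHP_SAPI,
        'default_charset'      => ini_get('default_charset'),
        'mbstring_loaded'      => $has_mb,
        // mb_internal_encoding() only exists when mbstring is loaded.
        'mb_internal_encoding' => $has_mb ? mb_internal_encoding() : null,
    );
}

foreach (collect_environment() as $key => $value) {
    echo $key . ': ' . var_export($value, true) . "\n";
}
```

Posting this output together with the configure options used for the build would make the two environments easier to compare.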
 [2014-12-30 10:42 UTC] php-bugs at lists dot php dot net
No feedback was provided. The bug is being suspended because
we assume that you are no longer experiencing the problem.
If this is not the case and you are able to provide the
information that was requested earlier, please do so and
change the status of the bug back to "Re-Opened". Thank you.
 