Bug #44868 Replaces UTF-8 symbol with incorrect symbol
Submitted: 2008-04-30 11:45 UTC
Modified: 2009-04-28 01:00 UTC
Votes:5
Avg. Score:4.8 ± 0.4
Reproduced:3 of 4 (75.0%)
Same Version:2 (66.7%)
Same OS:2 (66.7%)
From: colourmusic at gmail dot com
Assigned:
Status: No Feedback
Package: mbstring related
PHP Version: 6CVS-2008-04-30 (snap)
OS: Win XP SP2
Private report: No
CVE-ID: None
 [2008-04-30 11:45 UTC] colourmusic at gmail dot com
Description:
------------
I parsed a UTF-8 encoded URL with parse_url() and noticed that the character ８ (U+FF18 FULLWIDTH DIGIT EIGHT, bytes EF BC 98) is replaced with the bytes EF BC 5F, which is not a valid UTF-8 sequence.

The script did not generate any errors or warnings.

I also noticed that the characters ０ (U+FF10) through ７ (U+FF17) and ９ (U+FF19) are parsed by parse_url() without any problems.

This bug also appears on PHP 5.2.3 and PHP 5.2.5.
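
The byte values quoted above can be checked with a short script. This is only a sketch: codepoint_to_utf8() and is_valid_utf8() are hypothetical helper names, not PHP built-ins, and the validity check relies on preg_match() with the /u modifier returning false for malformed UTF-8.

```php
<?php
// Hypothetical helper: encode a Unicode code point as UTF-8 via a numeric entity.
function codepoint_to_utf8(int $cp): string
{
    return html_entity_decode('&#' . $cp . ';', ENT_QUOTES | ENT_HTML5, 'UTF-8');
}

// Hypothetical helper: preg_match() with /u fails (returns false) on malformed UTF-8.
function is_valid_utf8(string $s): bool
{
    return preg_match('//u', $s) === 1;
}

// Fullwidth digits U+FF10..U+FF19 encode as EF BC 90 .. EF BC 99.
for ($cp = 0xFF10; $cp <= 0xFF19; $cp++) {
    printf("U+%04X = %s\n", $cp, strtoupper(bin2hex(codepoint_to_utf8($cp))));
}

var_dump(is_valid_utf8("\xEF\xBC\x98")); // bool(true):  the original character
var_dump(is_valid_utf8("\xEF\xBC\x5F")); // bool(false): the sequence seen after parse_url()
```

The byte 5F is an ASCII underscore, which matches the "_" visible in the actual result.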

 

Reproduce code:
---------------
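<?php
// mb_convert_encoding() gives the same result as html_entity_decode() in this example:
//$url = mb_convert_encoding("https://example.com/?SHAMEI=&#12521;&#12531;&#12489;&#12463;&#12523;&#12540;&#12470;&#12540;&#65304;&#65296;&#12496;&#12531;&amp;SHAMEI_CD=01465,", "utf-8", "html-entities");

// &#65304; is the fullwidth digit eight (U+FF18), the character that triggers the bug.
// html_entity_decode() also turns &amp; into a literal "&".
// ENT_NOQUOTES (0) is what the original null flags argument evaluated to.
$url = html_entity_decode("https://example.com/?SHAMEI=&#12521;&#12531;&#12489;&#12463;&#12523;&#12540;&#12470;&#12540;&#65304;&#65296;&#12496;&#12531;&amp;SHAMEI_CD=01465,", ENT_NOQUOTES, "utf-8");
echo "Original URL = $url <br />\n";
$result = parse_url($url);
// print_r() prints by itself; wrapping it in echo would append a stray "1".
print_r($result);
?>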

Expected result:
----------------
Original URL = https://example.com/?SHAMEI=&#12521;&#12531;&#12489;&#12463;&#12523;&#12540;&#12470;&#12540;&#65304;&#65296;&#12496;&#12531;&SHAMEI_CD=01465,

Array
(
    [scheme] => https
    [host] => example.com
    [path] => /
    [query] => SHAMEI=&#12521;&#12531;&#12489;&#12463;&#12523;&#12540;&#12470;&#12540;&#65304;&#65296;&#12496;&#12531;&SHAMEI_CD=01465,
)

Actual result:
--------------
Original URL = https://example.com/?SHAMEI=&#12521;&#12531;&#12489;&#12463;&#12523;&#12540;&#12470;&#12540;&#65304;&#65296;&#12496;&#12531;&SHAMEI_CD=01465,

Array
(
    [scheme] => https
    [host] => example.com
    [path] => /
    [query] => &#12521;&#12531;&#12489;&#12463;&#12523;&#12540;&#12470;&#12540;&#65533;_&#65296;&#12496;&#12531;&SHAMEI_CD=01465,
)
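
To see exactly which byte changes, the query string passed in can be compared with the one parse_url() returns. The following is a rough sketch under assumptions: first_byte_diff() is a hypothetical helper, not a PHP function, and the URL is shortened to the two fullwidth digits from the report. On a build that is not affected, it should simply report that the query is unchanged.

```php
<?php
// Hypothetical helper: offset of the first differing byte, or null if the strings are identical.
function first_byte_diff(string $a, string $b): ?int
{
    $n = min(strlen($a), strlen($b));
    for ($i = 0; $i < $n; $i++) {
        if ($a[$i] !== $b[$i]) {
            return $i;
        }
    }
    // Same common prefix: the strings differ only if their lengths differ.
    return strlen($a) === strlen($b) ? null : $n;
}

// Shortened input: fullwidth 8 and 0 (U+FF18, U+FF10), as in the report.
$query  = 'SHAMEI=' . html_entity_decode('&#65304;&#65296;', ENT_QUOTES, 'UTF-8');
$parsed = (string) parse_url('https://example.com/?' . $query, PHP_URL_QUERY);

$offset = first_byte_diff($query, $parsed);
if ($offset === null) {
    echo "query unchanged: " . bin2hex($parsed) . "\n";
} else {
    echo "first difference at byte offset $offset\n";
    echo "input : " . bin2hex($query) . "\n";
    echo "parsed: " . bin2hex($parsed) . "\n";
}
```

Printing both strings with bin2hex() avoids any dependence on how the terminal or browser renders the malformed sequence.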

 [2009-04-20 18:18 UTC] jani@php.net
Please try using this CVS snapshot:

  http://snaps.php.net/php6.0-latest.tar.gz
 
For Windows:

  http://windows.php.net/snapshots/


 [2009-04-28 01:00 UTC] php-bugs at lists dot php dot net
No feedback was provided for this bug for over a week, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
 