Bug #44868 Replaces UTF-8 symbol with incorrect symbol
Submitted: 2008-04-30 11:45 UTC Modified: 2009-04-28 01:00 UTC
Votes:5
Avg. Score:4.8 ± 0.4
Reproduced:3 of 4 (75.0%)
Same Version:2 (66.7%)
Same OS:2 (66.7%)
From: colourmusic at gmail dot com Assigned:
Status: No Feedback Package: mbstring related
PHP Version: 6CVS-2008-04-30 (snap) OS: Win XP SP2
Private report: No CVE-ID: None

 

 [2008-04-30 11:45 UTC] colourmusic at gmail dot com
Description:
------------
I parsed a UTF-8 encoded URL with parse_url() and noticed that the fullwidth digit ８ (U+FF18, UTF-8 bytes EF BC 98) is replaced with the bytes EF BC 5F, which is not a valid UTF-8 sequence.

The script did not generate any errors or warnings.

I also noticed that the fullwidth digits ０ through ７, and ９, are parsed by parse_url() without any problems.

This bug also appears on PHP 5.2.3 and PHP 5.2.5.
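
To make the byte values quoted above concrete, here is a minimal sketch (not part of the original report) that prints the UTF-8 bytes of the fullwidth digits U+FF10 through U+FF19. The helper name utf8_hex() is hypothetical and exists only for this illustration.

```php
<?php
// Hypothetical helper: hex-dump the bytes of a string, e.g. "EF BC 98".
function utf8_hex(string $s): string
{
    return strtoupper(implode(' ', str_split(bin2hex($s), 2)));
}

// Fullwidth digits occupy U+FF10..U+FF19; in UTF-8 each one is
// EF BC followed by a third byte in the range 90..99.
for ($d = 0; $d <= 9; $d++) {
    $char = "\xEF\xBC" . chr(0x90 + $d);
    printf("U+FF1%d  %s\n", $d, utf8_hex($char));
}
```

The line for U+FF18 shows EF BC 98, the sequence the description says ends up as EF BC 5F after parse_url().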

 

Reproduce code:
---------------
<?php
// mb_convert_encoding() provides the same result as html_entity_decode() in this example:
//$url = mb_convert_encoding("https://example.com/?SHAMEI=&#12521;&#12531;&#12489;&#12463;&#12523;&#12540;&#12470;&#12540;&#65304;&#65296;&#12496;&#12531;&amp;SHAMEI_CD=01465,", "utf-8", "html-entities");

// &#65304; is the fullwidth digit U+FF18 and &#65296; is U+FF10.
$url = html_entity_decode("https://example.com/?SHAMEI=&#12521;&#12531;&#12489;&#12463;&#12523;&#12540;&#12470;&#12540;&#65304;&#65296;&#12496;&#12531;&amp;SHAMEI_CD=01465,", ENT_NOQUOTES, "utf-8");
echo "Original URL = $url <br />\n";
$result = parse_url($url);
// print_r() prints the array itself; echoing its return value would append a stray "1".
print_r($result);
?>

Expected result:
----------------
Original URL = https://example.com/?SHAMEI=&#12521;&#12531;&#12489;&#12463;&#12523;&#12540;&#12470;&#12540;８０&#12496;&#12531;&SHAMEI_CD=01465,

Array
(
    [scheme] => https
    [host] => example.com
    [path] => /
    [query] => SHAMEI=&#12521;&#12531;&#12489;&#12463;&#12523;&#12540;&#12470;&#12540;８０&#12496;&#12531;&SHAMEI_CD=01465,
)

Actual result:
--------------
Original URL = https://example.com/?SHAMEI=&#12521;&#12531;&#12489;&#12463;&#12523;&#12540;&#12470;&#12540;８０&#12496;&#12531;&SHAMEI_CD=01465,

Array
(
    [scheme] => https
    [host] => example.com
    [path] => /
    [query] => &#12521;&#12531;&#12489;&#12463;&#12523;&#12540;&#12470;&#12540;&#65533;_０&#12496;&#12531;&SHAMEI_CD=01465,
)
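
Because the corruption is easy to miss in rendered output, a byte-level comparison can help confirm it. The following is a minimal sketch, not part of the original report: it assumes a shortened URL that contains only the two fullwidth digits (written as raw UTF-8 bytes) and compares the query string returned by parse_url() with the one cut directly from the input. Whether the bytes differ depends on the PHP build and platform, so no particular result is claimed here.

```php
<?php
// Assumed input: the fullwidth digits U+FF18 U+FF10 as raw UTF-8 bytes.
$url = "https://example.com/?SHAMEI=\xEF\xBC\x98\xEF\xBC\x90&SHAMEI_CD=01465,";

$parts = parse_url($url);
$query = $parts['query'] ?? '';

// The query string exactly as it appears in the input, taken without parse_url().
$expected = substr($url, strpos($url, '?') + 1);

// Hex dumps make a single changed byte (such as 98 becoming 5f) visible.
echo 'expected: ', bin2hex($expected), "\n";
echo 'actual:   ', bin2hex($query), "\n";
echo $query === $expected ? "query bytes intact\n" : "query bytes were altered\n";
```

If the two hex dumps differ only in one byte (98 versus 5f), the behaviour matches the one described in this report.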

 [2009-04-20 18:18 UTC] jani@php.net
Please try using this CVS snapshot:

  http://snaps.php.net/php6.0-latest.tar.gz
 
For Windows:

  http://windows.php.net/snapshots/


 [2009-04-28 01:00 UTC] php-bugs at lists dot php dot net
No feedback was provided for this bug for over a week, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
 