[2009-04-20 18:18 UTC] jani@php.net
[2009-04-28 01:00 UTC] php-bugs at lists dot php dot net
Description:
------------
I parsed a UTF-8 encoded URL and noticed that the character ８ (U+FF18, bytes EF BC 98) is replaced by the bytes EF BC 5F, which is not a valid UTF-8 sequence. The script generated no errors or warnings. The characters ０ (U+FF10) to ７ (U+FF17) and ９ (U+FF19) are parsed by parse_url() without any problems.

This bug also appears on PHP 5.2.3 and PHP 5.2.5.

Reproduce code:
---------------
<?php
// mb_convert_encoding() gives the same result as html_entity_decode() in this example
//$url = mb_convert_encoding("https://example.com/?SHAMEI=ランドクルーザー８０バン&SHAMEI_CD=01465,", "utf-8", "html-entities");
$url = html_entity_decode("https://example.com/?SHAMEI=ランドクルーザー８０バン&SHAMEI_CD=01465,", ENT_COMPAT, "utf-8");
echo "Original URL = $url <br />\n";
$result = parse_url($url);
print_r($result);
?>

Expected result:
----------------
Original URL = https://example.com/?SHAMEI=ランドクルーザー８０バン&SHAMEI_CD=01465,
Array
(
    [scheme] => https
    [host] => example.com
    [path] => /
    [query] => SHAMEI=ランドクルーザー８０バン&SHAMEI_CD=01465,
)

Actual result:
--------------
Original URL = https://example.com/?SHAMEI=ランドクルーザー８０バン&SHAMEI_CD=01465,
Array
(
    [scheme] => https
    [host] => example.com
    [path] => /
    [query] => ランドクルーザー�_０バン&SHAMEI_CD=01465,
)
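
One possible way to sidestep the byte replacement described above is to make sure parse_url() only ever sees ASCII. The sketch below is not taken from the report or from PHP itself: `parse_url_utf8_safe()` is a hypothetical helper that percent-encodes every byte in the range 0x80 to 0xFF before parsing and turns those escapes back into raw bytes afterwards. It assumes the input URL does not already contain percent-escaped high bytes that must stay escaped, since those would be decoded as well.

```php
<?php
// Hypothetical helper (not part of PHP): hide non-ASCII bytes from parse_url().
// Returns the same array shape as parse_url(), or false on a malformed URL.
function parse_url_utf8_safe(string $url)
{
    // Percent-encode each byte in 0x80..0xFF. No /u flag, so the pattern
    // matches single bytes rather than whole UTF-8 characters.
    $encoded = preg_replace_callback(
        '/[\x80-\xFF]/',
        static function (array $m): string {
            return rawurlencode($m[0]);
        },
        $url
    );

    $parts = parse_url($encoded);
    if ($parts === false) {
        return false;
    }

    // Restore only high-byte escapes (%80..%FF); ASCII escapes such as %26
    // are left alone so the structure of the query string is unchanged.
    foreach ($parts as $key => $value) {
        if (is_string($value)) {
            $parts[$key] = preg_replace_callback(
                '/%([89A-Fa-f][0-9A-Fa-f])/',
                static function (array $m): string {
                    return chr(hexdec($m[1]));
                },
                $value
            );
        }
    }

    return $parts;
}

$url   = "https://example.com/?SHAMEI=ランドクルーザー８０バン&SHAMEI_CD=01465,";
$parts = parse_url_utf8_safe($url);

// Byte-level view of the character the report singles out.
echo bin2hex("８"), "\n";
// Reports whether the query component is still valid UTF-8.
var_dump(mb_check_encoding($parts['query'], 'UTF-8'));
```

Comparing `bin2hex()` of the query returned by plain parse_url() with that of the original string is a quick way to check whether a given PHP build and locale is affected at all.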