[2008-01-29 21:59 UTC] rasmus@php.net
[2008-01-29 23:21 UTC] rasmus@php.net
Description:
------------
utf8_decode() outputs a random character when supplied with bad input.

When invalid sequences are encountered, utf8_decode() usually replaces the sequence with the character "?". But when a lone high-bit byte is present at the end of the input, the output seems to be a random character.

Reproduce code:
---------------
for($a=0;$a<20;$a++)printf("%02x ",ord(utf8_decode(chr(0xE0))));

Expected result:
----------------
3f 3f 3f 3f 3f 3f 3f 3f 3f 3f 3f 3f 3f 3f 3f 3f 3f 3f 3f 3f

(utf8_decode() returns a question mark)

Actual result:
--------------
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
or
09 09 09 09 09 09 09 09 09 09 09 09 09 09 09 09 09 09 09 09
or
05 05 05 05 05 05 05 05 05 05 05 05 05 05 05 05 05 05 05 05
or
02 02 02 02 02 02 02 02 02 02 02 02 02 02 02 02 02 02 02 02
or some other random value

It seems to differ more between individual runs:

$ for a in `seq 1 20`; do php -r 'printf("%02x ",ord(utf8_decode(chr(0xE0))));'; done
08 00 00 02 00 00 00 00 00 05 00 00 00 05 00 00 07 00 09 00
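
For illustration only, here is a minimal userland sketch of the expected behaviour, assuming the random output comes from reading past the end of the input when a lead byte such as 0xE0 is not followed by its continuation bytes. The function name decode_utf8_to_latin1() is hypothetical and this is not PHP's actual implementation of utf8_decode(). It only shows the bounds check that would yield "?" for a truncated sequence, and it does not reject overlong encodings.

```php
<?php
// Hypothetical helper, NOT the real utf8_decode() implementation.
// Decodes UTF-8 to ISO-8859-1 and emits "?" for anything it cannot represent.
function decode_utf8_to_latin1(string $s): string
{
    $out = '';
    $len = strlen($s);
    for ($i = 0; $i < $len; ) {
        $b = ord($s[$i]);
        if ($b < 0x80) {                      // plain ASCII byte
            $out .= $s[$i];
            $i++;
            continue;
        }
        // Work out how many continuation bytes the lead byte announces.
        if (($b & 0xE0) === 0xC0) {
            $need = 1; $cp = $b & 0x1F;
        } elseif (($b & 0xF0) === 0xE0) {
            $need = 2; $cp = $b & 0x0F;
        } elseif (($b & 0xF8) === 0xF0) {
            $need = 3; $cp = $b & 0x07;
        } else {                              // stray continuation or invalid lead byte
            $out .= '?';
            $i++;
            continue;
        }
        // Bounds check: never read beyond the end of the input.
        if ($i + $need >= $len) {
            $out .= '?';                      // truncated sequence at end of string
            break;
        }
        $valid = true;
        for ($k = 1; $k <= $need; $k++) {
            $c = ord($s[$i + $k]);
            if (($c & 0xC0) !== 0x80) {       // not a continuation byte
                $valid = false;
                break;
            }
            $cp = ($cp << 6) | ($c & 0x3F);
        }
        if (!$valid) {
            $out .= '?';
            $i++;                             // skip the lead byte and resynchronise
            continue;
        }
        // ISO-8859-1 can only hold code points up to U+00FF.
        $out .= ($cp <= 0xFF) ? chr($cp) : '?';
        $i += $need + 1;
    }
    return $out;
}

printf("%02x ", ord(decode_utf8_to_latin1(chr(0xE0))));  // prints "3f "
```

With a check of this kind, the lone 0xE0 byte from the reproduce code produces the same question mark on every run, because no byte outside the input string is ever read.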