Bug #47990 mb_check_encoding() accepts surrogates for UTF-8
Submitted: 2009-04-16 15:53 UTC
Modified: 2017-07-28 17:42 UTC
Votes:3
Avg. Score:3.3 ± 0.5
Reproduced:1 of 2 (50.0%)
Same Version:0 (0.0%)
Same OS:1 (100.0%)
From: mercator+bugs at gmail dot com
Assigned: moriyoshi
Status: Closed
Package: mbstring related
PHP Version: 5.2.9
OS: Windows XP
Private report: No
CVE-ID: None
 [2009-04-16 15:53 UTC] mercator+bugs at gmail dot com
Description:
------------
mb_check_encoding() wrongly treats surrogate code points (Unicode range U+D800 to U+DFFF) as valid in UTF-8. Surrogates are reserved for UTF-16 and must not appear in well-formed UTF-8.

Reproduce code:
---------------
<?php
var_dump(mb_check_encoding("\xed\xa0\x80", 'UTF-8')); // bytes ED A0 80 are U+D800 written in UTF-8 form

Expected result:
----------------
bool(false)

Actual result:
--------------
bool(true)
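
Why the three bytes in the reproduce code count as a surrogate can be checked by hand. The sketch below is only an illustration, not the fix applied to mbstring: `decode_three_byte()` and `contains_encoded_surrogate()` are hypothetical helper names, and the type declarations assume PHP 7 or later.

```php
<?php
// Hypothetical helper (not part of mbstring): decodes one three-byte
// UTF-8-style sequence (1110xxxx 10xxxxxx 10xxxxxx) into its code point.
function decode_three_byte(string $seq): int
{
    return ((ord($seq[0]) & 0x0F) << 12)
         | ((ord($seq[1]) & 0x3F) << 6)
         |  (ord($seq[2]) & 0x3F);
}

// Hypothetical helper: reports whether a byte string contains the
// three-byte form ED A0..BF 80..BF, which is how U+D800..U+DFFF would be
// written in UTF-8. The pattern is matched byte-wise (no /u modifier).
function contains_encoded_surrogate(string $bytes): bool
{
    return preg_match('/\xED[\xA0-\xBF][\x80-\xBF]/', $bytes) === 1;
}

printf("U+%04X\n", decode_three_byte("\xed\xa0\x80"));  // prints U+D800
var_dump(contains_encoded_surrogate("\xed\xa0\x80"));   // bool(true)
var_dump(contains_encoded_surrogate("\xed\x9f\xbf"));   // bool(false), this is U+D7FF
```

A check like this could be used as a workaround on an affected build by rejecting any input for which `contains_encoded_surrogate()` returns true before trusting `mb_check_encoding()`.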

 [2012-01-18 08:53 UTC] deceze at gmail dot com
This seems to be fixed in PHP 5.3; it returns false as expected. Close?
 [2017-07-28 17:42 UTC] nikic@php.net
-Status: Assigned
+Status: Closed
 [2017-07-28 17:42 UTC] nikic@php.net
Closing per previous comment.
 