Bug #47990 mb_check_encoding() accepts surrogates for UTF-8
Submitted: 2009-04-16 15:53 UTC Modified: 2017-07-28 17:42 UTC
Votes:3
Avg. Score:3.3 ± 0.5
Reproduced:1 of 2 (50.0%)
Same Version:0 (0.0%)
Same OS:1 (100.0%)
From: mercator+bugs at gmail dot com Assigned: moriyoshi (profile)
Status: Closed Package: mbstring related
PHP Version: 5.2.9 OS: Windows XP
Private report: No CVE-ID: None
 [2009-04-16 15:53 UTC] mercator+bugs at gmail dot com
Description:
------------
mb_check_encoding() wrongly considers surrogates (Unicode range U+D800 - U+DFFF) to be valid for the UTF-8 encoding.

Reproduce code:
---------------
var_dump(mb_check_encoding("\xed\xa0\x80",'UTF-8'));

Expected result:
----------------
bool(false)

Actual result:
--------------
bool(true)
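
Why the reproduce sequence is a surrogate: a minimal sketch, not taken from mbstring itself. The byte sequence "\xed\xa0\x80" has the shape of a three-byte UTF-8 sequence, and decoding it yields U+D800, the first code point of the surrogate range. The helper names below (decode_three_byte_sequence, contains_encoded_surrogate) are hypothetical and introduced only for illustration. The byte-level pattern assumes the rest of the input is otherwise well-formed UTF-8, where 0xED can only appear as a lead byte.

```php
<?php
// Hypothetical helper (not part of mbstring): decode a 3-byte UTF-8
// sequence to its code point without validating it.
function decode_three_byte_sequence($bytes)
{
    $b0 = ord($bytes[0]);
    $b1 = ord($bytes[1]);
    $b2 = ord($bytes[2]);
    // 1110xxxx 10yyyyyy 10zzzzzz -> xxxxyyyyyyzzzzzz
    return (($b0 & 0x0F) << 12) | (($b1 & 0x3F) << 6) | ($b2 & 0x3F);
}

// Hypothetical helper: true if $s contains bytes that would decode to a
// surrogate (U+D800..U+DFFF), i.e. ED A0 80 through ED BF BF.
// The pattern runs in byte mode (no /u modifier).
function contains_encoded_surrogate($s)
{
    return preg_match('/\xED[\xA0-\xBF][\x80-\xBF]/', $s) === 1;
}

$cp = decode_three_byte_sequence("\xed\xa0\x80");
printf("U+%04X\n", $cp);                               // U+D800
var_dump($cp >= 0xD800 && $cp <= 0xDFFF);              // bool(true)
var_dump(contains_encoded_surrogate("\xed\xa0\x80"));  // bool(true)
var_dump(contains_encoded_surrogate("\xed\x9f\xbf"));  // bool(false), U+D7FF
```

On an affected version, a check like contains_encoded_surrogate() could be combined with mb_check_encoding() as a workaround; whether that is needed depends on the PHP version in use.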

History

 [2012-01-18 08:53 UTC] deceze at gmail dot com
This seems to be fixed in PHP 5.3; it returns false as expected. Close?
 [2017-07-28 17:42 UTC] nikic@php.net
-Status: Assigned +Status: Closed
 [2017-07-28 17:42 UTC] nikic@php.net
Closing per previous comment.
 