Bug #74230 iconv fails to fail on surrogates
Submitted: 2017-03-09 10:48 UTC Modified: 2017-03-27 18:24 UTC
From: paul dot crovella at gmail dot com Assigned: ab (profile)
Status: Closed Package: ICONV related
PHP Version: 7.1.2 OS: Windows
Private report: No CVE-ID: None

 [2017-03-09 10:48 UTC] paul dot crovella at gmail dot com
Description:
------------
Surrogate codepoints cannot be validly encoded in UTF-8. Iconv should return false when given a byte sequence that would encode them and an in_charset of 'UTF-8'.

This works as expected on Linux (https://3v4l.org/Ff6cr) but not on Windows (tested with PHP 7.0.16 and 7.1.2).

Test script:
---------------
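<?php
$high = "\xED\xA1\x92"; // would encode U+D852 (high surrogate)
$low = "\xED\xBD\xA2"; // would encode U+DF62 (low surrogate)
$pair = $high.$low;
// Each call should fail (return false) because the input is not valid UTF-8.
var_dump(
    @\iconv('UTF-8', 'UTF-8', $high) === false,
    @\iconv('UTF-8', 'UTF-8', $low) === false,
    @\iconv('UTF-8', 'UTF-8', $pair) === false
);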

Expected result:
----------------
bool(true)
bool(true)
bool(true)

Actual result:
--------------
bool(false)
bool(false)
bool(false)
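
On builds where iconv still accepts these sequences, the input can be screened before conversion. The following is only a minimal sketch: `containsUtf8Surrogate()` is a hypothetical helper name, not part of PHP or iconv. It relies on the fact that code points U+D800..U+DFFF would be encoded as the byte 0xED followed by a byte in 0xA0..0xBF and one continuation byte.

```php
<?php
// Hypothetical helper (not a PHP or iconv function).
// Returns true if $bytes contains a 3-byte sequence that would encode
// a surrogate code point (U+D800..U+DFFF): ED A0..BF 80..BF.
function containsUtf8Surrogate(string $bytes): bool
{
    // No /u modifier: the pattern is matched byte by byte.
    return preg_match('/\xED[\xA0-\xBF][\x80-\xBF]/', $bytes) === 1;
}
```

A caller could check `containsUtf8Surrogate($input)` first and treat a true result the same way as a false return from iconv. This only covers surrogates; it does not validate the rest of the string.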

 [2017-03-13 12:41 UTC] ab@php.net
Thanks for the report. This seems to be a libiconv issue. The library used is quite old, but surprisingly libiconv-1.15 was released a month ago after 6 years of silence :) I'm going to check with it. Otherwise, in 7.1 you might get an acceptable result with sapi_windows_cp_conv().

Thanks.
 [2017-03-13 20:29 UTC] paul dot crovella at gmail dot com
Wow, and their very first bullet point for that release is:

 - The UTF-8 converter now rejects surrogates and out-of-range code points.

http://savannah.gnu.org/forum/forum.php?forum_id=8790

Talk about timing :D
 [2017-03-27 18:24 UTC] ab@php.net
-Status: Open
+Status: Closed
-Assigned To:
+Assigned To: ab
 [2017-03-27 18:24 UTC] ab@php.net
The deps were updated to libiconv-1.15 for all 7.x branches. For testing, you can currently fetch any of the latest master snapshots, and of course the upcoming RCs this week.

Thanks.
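
To see whether a particular build is linked against a libiconv that includes this fix, the `ICONV_IMPL` and `ICONV_VERSION` constants of the iconv extension can be inspected. This is a rough sketch: `libiconvRejectsSurrogates()` is a hypothetical helper, the 1.15 threshold is taken from the release note quoted above, and it assumes `ICONV_IMPL` contains the string "libiconv" on such builds.

```php
<?php
// Hypothetical helper (not part of PHP). Returns true/false for libiconv
// builds, or null when another iconv implementation is in use and this
// report says nothing about its behaviour.
function libiconvRejectsSurrogates(string $impl, string $version): ?bool
{
    if (stripos($impl, 'libiconv') === false) {
        return null; // e.g. glibc: not covered by this check
    }
    // libiconv 1.15 is the release whose notes mention rejecting surrogates.
    return version_compare($version, '1.15', '>=');
}

if (defined('ICONV_IMPL') && defined('ICONV_VERSION')) {
    var_dump(libiconvRejectsSurrogates(ICONV_IMPL, ICONV_VERSION));
}
```

Running the test script from the report remains the more direct check, since it probes the actual behaviour rather than a version string.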