Bug #74230 iconv fails to fail on surrogates
Submitted: 2017-03-09 10:48 UTC Modified: 2017-03-27 18:24 UTC
From: paul dot crovella at gmail dot com Assigned: ab (profile)
Status: Closed Package: ICONV related
PHP Version: 7.1.2 OS: Windows
Private report: No CVE-ID: None
 [2017-03-09 10:48 UTC] paul dot crovella at gmail dot com
Description:
------------
Surrogate codepoints cannot be validly encoded in UTF-8. iconv() should return false when given a byte sequence that would encode them and an in_charset of 'UTF-8'.

This works as expected on Linux (https://3v4l.org/Ff6cr) but not on Windows (tested with PHP 7.0.16 and 7.1.2).

Test script:
---------------
$high = "\xED\xA1\x92"; // codepoint U+D852 (high surrogate)
$low = "\xED\xBD\xA2"; // codepoint U+DF62 (low surrogate)
$pair = $high.$low;
var_dump(
    @\iconv('UTF-8', 'UTF-8', $high) === false,
    @\iconv('UTF-8', 'UTF-8', $low) === false,
    @\iconv('UTF-8', 'UTF-8', $pair) === false
);

Expected result:
----------------
bool(true)
bool(true)
bool(true)

Actual result:
--------------
bool(false)
bool(false)
bool(false)
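
To make the byte patterns in the test script concrete, here is a minimal sketch that decodes a three-byte sequence by hand and flags surrogate encodings without relying on iconv. Both function names are hypothetical helpers invented for illustration; they are not part of PHP, libiconv, or the fix for this bug. The check assumes that any sequence of the form ED A0..BF 80..BF is to be rejected, which follows from surrogates occupying U+D800..U+DFFF.

```php
<?php
// Hypothetical helper: decode one three-byte UTF-8-style sequence
// (1110xxxx 10xxxxxx 10xxxxxx) into its code point, with no validation.
function decode_three_byte(string $s): int
{
    return ((ord($s[0]) & 0x0F) << 12)
         | ((ord($s[1]) & 0x3F) << 6)
         |  (ord($s[2]) & 0x3F);
}

// Hypothetical helper: true when $bytes contains a sequence that would
// decode to a surrogate (U+D800..U+DFFF). Those are exactly the sequences
// with lead byte ED and second byte in A0..BF. The pattern is byte-oriented
// (no /u modifier), so it also works on input that is not valid UTF-8.
function contains_encoded_surrogate(string $bytes): bool
{
    return preg_match('/\xED[\xA0-\xBF][\x80-\xBF]/', $bytes) === 1;
}

$high = "\xED\xA1\x92";
$low  = "\xED\xBD\xA2";

printf("%04X %04X\n", decode_three_byte($high), decode_three_byte($low));
var_dump(contains_encoded_surrogate($high . $low));
```

On a build whose bundled libiconv accepts surrogates, a guard like this could run before the iconv() call as a stopgap; it only covers surrogates, not other malformed input.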

 [2017-03-13 12:41 UTC] ab@php.net
Thanks for the report. This seems to be a libiconv issue. The library used is quite old, but surprisingly libiconv-1.15 was released a month ago after 6 years of silence :) I'm going to check with it. Otherwise, in 7.1 you might get an acceptable result with sapi_windows_cp_conv().

Thanks.
 [2017-03-13 20:29 UTC] paul dot crovella at gmail dot com
Wow, and their very first bullet point for that release is:

 - The UTF-8 converter now rejects surrogates and out-of-range code points.

http://savannah.gnu.org/forum/forum.php?forum_id=8790

Talk about timing :D
 [2017-03-27 18:24 UTC] ab@php.net
-Status: Open +Status: Closed -Assigned To: +Assigned To: ab
 [2017-03-27 18:24 UTC] ab@php.net
The deps were updated with libiconv-1.15 for all 7.x branches. Currently you can fetch any latest master snap for the tests, and of course the upcoming RCs this week.

Thanks.