Bug #63450 iconv returns false when illegal character encountered
Submitted: 2012-11-06 21:45 UTC Modified: 2016-07-29 08:35 UTC
Votes:10
Avg. Score:4.9 ± 0.3
Reproduced:10 of 10 (100.0%)
Same Version:2 (20.0%)
Same OS:6 (60.0%)
From: trollofdarkness at gmail dot com Assigned: cmb (profile)
Status: Duplicate Package: ICONV related
PHP Version: 5.4.8 OS: Debian 5 Lenny
Private report: No CVE-ID: None
 [2012-11-06 21:45 UTC] trollofdarkness at gmail dot com
Description:
------------
Hi everyone,

Since roughly the release of version 5.3.x (and still with 5.4.8), I have been 
experiencing issues with iconv.


In particular, when an illegal character is encountered and the //IGNORE flag is 
set on the target charset, the function returns FALSE instead of just skipping 
this character.

This is problematic because if a single character in a 50,000-character string 
is "illegal", the whole output is lost because of that one character.

It does not happen with the TRANSLIT flag.

I experienced this with UTF-8 (from) and ISO-8859-15 (to); I did not 
test other charsets. Below is an example to reproduce the bug.

Note: I saw there are other bug reports about similar issues, but they all 
say the string is cut off. In my case, it literally returns false, so it might 
be a different problem.

Test script:
---------------
<?php

$str = "
foo
è
foo
";
$result = iconv("UTF-8", "ISO-8859-15"."//IGNORE", $str);

var_dump($result); // false, instead of "foo ... foo"

?>

Expected result:
----------------
foo

foo


Actual result:
--------------
false

 [2012-11-06 21:54 UTC] rasmus@php.net
-Status: Open +Status: Not a bug
 [2012-11-06 21:54 UTC] rasmus@php.net
This is not a PHP issue. This is a change in recent versions of libiconv. If you 
link PHP against an older version of libiconv it will work again or you can use 
mb_convert_encoding(). And we have a new uconverter extension feature 
coming that will do a better job than either of these. See 
https://wiki.php.net/rfc/uconverter
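
The mbstring route mentioned above might look roughly like the following. This is only a sketch, assuming the mbstring extension is loaded; convertDroppingUnmappable() is a hypothetical helper name, not part of PHP. It temporarily sets the substitute character to "none" so that characters the target charset cannot represent are dropped, which is close to what //IGNORE is expected to do.

```php
<?php
// Hypothetical helper (not a PHP built-in): convert $text and drop every
// character that cannot be represented in the target charset.
// Assumes the mbstring extension is available.
function convertDroppingUnmappable($text, $to, $from = 'UTF-8')
{
    // Remember the current substitute character so it can be restored.
    $previous = mb_substitute_character();

    // "none" tells mbstring to emit nothing for unconvertible characters.
    mb_substitute_character('none');
    $result = mb_convert_encoding($text, $to, $from);

    mb_substitute_character($previous);
    return $result;
}

// U+03B1 (Greek alpha, bytes CE B1) has no ISO-8859-15 equivalent.
var_dump(convertDroppingUnmappable("foo\n\xCE\xB1\nfoo", 'ISO-8859-15'));
```

The substitute character is a global mbstring setting, which is why the sketch saves and restores it around the call.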
 [2012-11-06 22:19 UTC] trollofdarkness at gmail dot com
Hi Rasmus,

Thanks for your help!

I will look into that right away and post an update on whether downgrading 
libiconv works.
 [2012-11-07 21:58 UTC] trollofdarkness at gmail dot com
Hi,

So, I had a look at it and this is not a libiconv related bug. It is a glibc 
related bug (so, iconv, but the glibc implementation) as I was not using the GNU 
libiconv implementation but the glibc one.

Actually, I had the 2.7 version of glibc. I tested on another machine - an Ubuntu 
12.04 LTS server - where the glibc version was 2.14 and, indeed, the bug was not 
present. So it is in recent versions of glibc.

To correct the problem on Debian, you can recompile PHP to use the libiconv 
implementation instead of the glibc one.

But it is NOT that easy, because PHP looks for the glibc implementation BEFORE 
libiconv and selects it if present... whatever --with-iconv=something 
parameter you pass when running ./configure.

I used the solution presented here: 
<http://stackoverflow.com/questions/4743080/how-can-i-force-php-to-use-the-libiconv-version-of-iconv-instead-of-the-centos-i/4851065#4851065> 
and, as one of the comments states, I had to change the global configure file 
and not (only) the one in ext/iconv. (Note that, first, you have to actually 
download libiconv and compile it... but that's just 
wget && ./configure && make && make install.)

I now have the libiconv implementation in use and it's working perfectly.

I strongly think PHP should change the behaviour of the configure file. We 
should not have to edit it to use the libiconv implementation; we should just be 
able to use the right configure parameter!
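
To check which iconv implementation a given PHP build ended up with, the constants exposed by the iconv extension can be printed. This is a small sketch that assumes the iconv extension is loaded; the exact strings reported depend on the build.

```php
<?php
// ICONV_IMPL names the implementation PHP was linked against
// (for example "glibc" or "libiconv"); ICONV_VERSION is its version string.
printf("iconv implementation: %s\n", ICONV_IMPL);
printf("iconv version: %s\n", ICONV_VERSION);
```

Running this before and after a rebuild is one way to confirm that the configure change actually switched implementations.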
 [2012-11-08 02:24 UTC] aharvey@php.net
-Status: Not a bug +Status: Re-Opened
 [2012-11-08 02:24 UTC] aharvey@php.net
Reopening per above. Anyone more familiar with iconv and the build system want to opine?
 [2013-03-07 20:00 UTC] ezyang@php.net
This is a dupe of https://bugs.php.net/bug.php?id=48147 (not that I don't think it should be fixed!) Here is the glibc bug: http://sourceware.org/bugzilla/show_bug.cgi?id=13541
 [2013-10-23 16:18 UTC] daniel at kukiela dot pl
Hi.
Will this bug be resolved?

I use PHP on Debian and I cannot upgrade Debian to version 7.x and PHP to 5.5.x (via dotdeb.org) because of this problem.

Functions that use glibc (like iconv or htmlspecialchars) return an empty string.

How to test iconv:
function testIconv1(){

    set_error_handler('doNothing');
    $r = iconv('utf-8', 'ascii//IGNORE', "\xCE\xB1" . str_repeat('a', 9000));
    restore_error_handler();

    if ($r === false) {
        $code = "UNUSABLE";
    } elseif (($c = strlen($r)) < 9000) {
        $code = "TRUNCATES";
    } elseif ($c > 9000) {
        $code = "BUGGY";
    } else {
        $code = "OK";
    }
    return $code;
}
function doNothing(){}
echo testIconv1();

How to test htmlspecialchars:
var_dump(htmlspecialchars('żółw')); //returns an empty string
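
One possible explanation, which is an assumption and not confirmed anywhere in this thread: htmlspecialchars() returns an empty string when the input is not valid in the charset it assumes, and since PHP 5.4 that default is UTF-8. The sketch below uses the ISO-8859-2 bytes for the same word to show the effect and the ENT_SUBSTITUTE flag (available since PHP 5.4) that avoids the empty result.

```php
<?php
// "żółw" encoded as ISO-8859-2; these bytes are not valid UTF-8.
$bytes = "\xBF\xF3\xB3w";

// Invalid input for the given charset yields an empty string.
var_dump(htmlspecialchars($bytes, ENT_QUOTES, 'UTF-8'));

// ENT_SUBSTITUTE replaces invalid sequences with U+FFFD instead.
var_dump(htmlspecialchars($bytes, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8'));

// Naming the real input charset keeps the original bytes intact.
var_dump(htmlspecialchars($bytes, ENT_QUOTES, 'ISO-8859-1') === $bytes);
```

ISO-8859-1 is used in the last call only because htmlspecialchars() treats it as a single-byte charset and none of these bytes are special characters; it is not a claim about the reporter's actual setup.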

I can't upgrade the server software because of this bug, which is more than a year old.
Please do something about it (I would like to use PHP 5.5 instead of PHP 5.3).


Regards,
Daniel
 [2013-10-23 16:25 UTC] daniel at kukiela dot pl
...and I know that there is a glibc bug. That glibc bug causes the text to be cut off, but the functions remain usable (with some kind of workaround). The bug in PHP 5.4 and 5.5 makes them completely unusable.

For example, HTML Purifier can work around the glibc bug (it has code that splits the text into smaller chunks). But when the function does not return anything, no workaround is possible.
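
The chunking idea can be sketched as follows. This is not HTML Purifier's actual code; iconvInChunks() and its $chunkBytes parameter are hypothetical, and the sketch assumes UTF-8 input and a stateless target charset. It also shows the limitation described above: if iconv() returns false for a chunk, there is nothing left to salvage.

```php
<?php
// Hypothetical helper: convert UTF-8 text piece by piece so that each
// iconv() call only sees a small chunk. Assumes UTF-8 input.
function iconvInChunks($from, $to, $text, $chunkBytes = 8000)
{
    $out = '';
    $len = strlen($text);
    $pos = 0;

    while ($pos < $len) {
        $end = min($pos + $chunkBytes, $len);

        // If the next chunk would start on a UTF-8 continuation byte
        // (10xxxxxx), move the boundary back so no character is split.
        while ($end < $len && $end > $pos + 1
            && (ord($text[$end]) & 0xC0) === 0x80) {
            $end--;
        }

        // Suppress the notice iconv() raises on illegal input.
        $piece = @iconv($from, $to, substr($text, $pos, $end - $pos));
        if ($piece === false) {
            return false; // nothing to work around when iconv gives up
        }

        $out .= $piece;
        $pos = $end;
    }

    return $out;
}

// Ten two-byte characters converted three bytes at a time.
var_dump(iconvInChunks('UTF-8', 'ISO-8859-15', str_repeat("\xC3\xA9", 10), 3));
```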


Regards,
Daniel
 [2014-02-20 22:37 UTC] daniel at kukiela dot pl
Hi.
Are you going to fix this?
I'm stuck with PHP 5.3...


Regards,
Daniel
 [2016-07-29 08:35 UTC] cmb@php.net
-Status: Re-Opened +Status: Duplicate -Assigned To: +Assigned To: cmb
 [2016-07-29 08:35 UTC] cmb@php.net
Indeed, that is a duplicate of bug #48147.
 