Doc Bug #73750 Additions to documentation on iconv() //TRANSLIT option
Submitted: 2016-12-15 16:43 UTC Modified: 2017-01-07 14:50 UTC
From: codedokode at gmail dot com Assigned: cmb (profile)
Status: Closed Package: ICONV related
PHP Version: Irrelevant OS: *
Private report: No CVE-ID: None
 [2016-12-15 16:43 UTC] codedokode at gmail dot com
Description:
------------
---
From manual page: http://www.php.net/function.iconv
---

The documentation for iconv() mentions the //TRANSLIT option, which transliterates characters. But this option does not always work; its behavior depends on which library provides the iconv() function. I think we should add a note about this so that users do not have to dig through the PHP sources, build configurations and the documentation of external libraries to find out why their code does not work as expected.

For example, this code behaves differently on different operating systems:

<?php 
var_dump(iconv("utf-8", "ASCII//TRANSLIT", "Hello Привет ßö"));

On Linux it prints "Hello ?????? sso" (the Cyrillic characters are turned into question marks). On Windows iconv() returns false and a notice is generated ("PHP Notice:  iconv(): Detected an illegal character in input string").

That is because PHP uses the glibc version of iconv() on Linux and something else (libiconv, I guess) on Windows. This is difficult to find out unless the developer knows C and has the patience to work out how the PHP iconv() function is implemented. I understand that PHP just provides a wrapper around an external function, but the documentation currently asserts that the //TRANSLIT option works.
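
One way to check which implementation a given build uses, without reading the C sources, might be the ICONV_IMPL and ICONV_VERSION constants defined by the iconv extension. A minimal sketch follows; iconv_backend_info() is a hypothetical helper name used only for illustration, not a PHP function:

```php
<?php
// Hypothetical helper (not part of PHP): reports which iconv
// implementation the running PHP binary was built against.
function iconv_backend_info(): array
{
    return [
        // ICONV_IMPL is a string such as "glibc" or "libiconv";
        // null here means the iconv extension is not loaded.
        'impl'    => defined('ICONV_IMPL') ? constant('ICONV_IMPL') : null,
        'version' => defined('ICONV_VERSION') ? constant('ICONV_VERSION') : null,
    ];
}

var_dump(iconv_backend_info());
```

The reported implementation name only identifies the library; whether //TRANSLIT behaves as expected with it still has to be tested.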

I suggest adding the following notice to the documentation:

The //TRANSLIT option currently only works in Linux versions of PHP built with the iconv() function imported from glibc (<how to find out what library is used>). On Windows this option usually doesn't work, and iconv() will generate a notice and return false if a character cannot be represented in the output charset. Developers of portable programs are advised not to use it.

The //TRANSLIT option does not transliterate characters from non-Latin alphabets into Latin characters, even if they look or sound the same. For example, when converting from utf-8 to ASCII//TRANSLIT it will convert non-Latin letters to question marks, not Latin characters.
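
For portable code, a possible workaround is to treat a false return value as "transliteration not available" and fall back to a plain substitution. This is only a sketch under that assumption; to_ascii_portable() is a hypothetical helper name, and the '?' replacement is an arbitrary choice, not something iconv() guarantees:

```php
<?php
// Hypothetical helper (not part of PHP): converts UTF-8 text to ASCII,
// using //TRANSLIT where the backend accepts it and a plain '?'
// substitution where it does not.
function to_ascii_portable(string $utf8): string
{
    if (function_exists('iconv')) {
        // @ suppresses the "illegal character" notice; failure is
        // detected through the false return value instead.
        $converted = @iconv('UTF-8', 'ASCII//TRANSLIT', $utf8);
        if ($converted !== false) {
            return $converted;
        }
    }

    // Fallback: replace each non-ASCII character with a single '?'.
    $fallback = preg_replace('/[^\x00-\x7F]/u', '?', $utf8);
    if ($fallback === null) {
        // The /u pattern fails on invalid UTF-8; replace byte by byte.
        $fallback = (string) preg_replace('/[^\x00-\x7F]/', '?', $utf8);
    }
    return $fallback;
}

var_dump(to_ascii_portable("Hello Привет ßö"));
```

The exact output still depends on the iconv implementation and the locale, but the function always returns an ASCII string instead of false.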


Test script:
---------------
<?php 
var_dump(iconv("utf-8", "ASCII//TRANSLIT", "Hello Привет ßö"));



History

 [2016-12-31 18:41 UTC] cmb@php.net
-Assigned To: +Assigned To: cmb
 [2017-01-07 14:48 UTC] cmb@php.net
-Status: Assigned +Status: Verified -Package: Documentation problem +Package: ICONV related -Operating System: Windows +Operating System: *
 [2017-01-07 14:50 UTC] cmb@php.net
Automatic comment from SVN on behalf of cmb
Revision: http://svn.php.net/viewvc/?view=revision&revision=341618
Log: Fix #73750: Additions to documentation on iconv() //TRANSLIT option
 [2017-01-07 14:50 UTC] cmb@php.net
-Status: Verified +Status: Closed
 [2017-01-07 14:50 UTC] cmb@php.net
This bug has been fixed in the documentation's XML sources. Since the
online and downloadable versions of the documentation need some time
to get updated, we would like to ask you to be a bit patient.

Thank you for the report, and for helping us make our documentation better.
 [2020-02-07 06:06 UTC] phpdocbot@php.net
Automatic comment on behalf of cmb
Revision: http://git.php.net/?p=doc/en.git;a=commit;h=9e5c6d9150a2fff81dae5d92880740d68f2913f8
Log: Fix #73750: Additions to documentation on iconv() //TRANSLIT option
 
PHP Copyright © 2001-2021 The PHP Group
All rights reserved.
Last updated: Thu May 13 22:01:24 2021 UTC