Bug #72803 Failure to transliterate on Linux
Submitted: 2016-08-10 13:10 UTC
Modified: 2018-08-25 13:17 UTC
From: rasmus at mindplay dot dk
Assigned: cmb
Status: Not a bug
Package: ICONV related
PHP Version: Irrelevant
OS: Linux
Private report: No
CVE-ID: None
 [2016-08-10 13:10 UTC] rasmus at mindplay dot dk
Description:
------------
I've come across one character that doesn't transliterate correctly with iconv() with the "TRANSLIT" option on Linux.

On Windows, it works - on every Linux build (PHP 5.3, 5.4, 5.5, 5.6, 7.0 and HHVM) it fails for this one character.

Unfortunately, this particular character is a normal character in the Danish language, and the sites we build are in Danish.

The expected result posted below is that from a Windows build - the actual result is the erroneous result from a Linux build.

I know that libiconv is largely unmaintained, so maybe there is nothing to be done about this - a comment in the manual says to use mbstring and intl instead, but intl isn't a standard extension, and mbstring doesn't seem to have this feature?

I can work around this particular case, of course, by replacing ø and Ø myself first, but I find it problematic that this common operation isn't otherwise supported (or doesn't work) in PHP.
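
The manual workaround mentioned above could look roughly like the sketch below. The helper name `translit_danish` is hypothetical, and the mapping only covers ø and Ø; everything else is still left to iconv(), so the result for other characters remains platform-dependent.

```php
<?php

// Hypothetical helper: pre-replace the characters that iconv fails on
// (per this report, ø and Ø on Linux), then let iconv handle the rest.
function translit_danish(string $string): string
{
    // strtr() with an array replaces whole UTF-8 byte sequences.
    $prepared = strtr($string, ['ø' => 'o', 'Ø' => 'O']);

    $clean = iconv('UTF-8', 'ASCII//TRANSLIT', $prepared);

    // iconv() returns false on failure; fall back to the pre-mapped string.
    return $clean === false ? $prepared : $clean;
}

var_dump(translit_danish('æøå'));
```

This only hides the one failing character; any other character that the local iconv implementation cannot transliterate would still come out as "?".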


Test script:
---------------
<?php

// https://gist.github.com/mindplay-dk/9b2fa55ba5f08ab0b9308295f936cd41

$string = 'æøå'; // HEX: c3 a6 c3 b8 c3 a5

$clean = iconv('UTF-8', 'ASCII//TRANSLIT', $string);

var_dump($clean); // "aeoa" on Windows, "ae?a" on Linux???


Expected result:
----------------
string(4) "aeoa"

Actual result:
--------------
string(4) "ae?a"

 [2016-08-10 17:07 UTC] cmb@php.net
Looks even worse for glibc-2.2.3/2: <https://3v4l.org/JNOga>.
Which version did you use?

Anyhow, I think that's not a PHP issue, but rather should be
solved upstream.
 [2016-08-15 08:01 UTC] rasmus at mindplay dot dk
To others looking for a solution, the only viable approach appears to be the "intl" extension - it's not enabled by default, but it does ship with the PHP binaries.

See here:

http://php.net/transliterator_transliterate
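
A minimal sketch of that approach, assuming the "intl" extension is loaded. The helper name `to_ascii` and the choice of the ICU transform ID 'Any-Latin; Latin-ASCII' are assumptions made for illustration, not something stated in this report.

```php
<?php

// Hypothetical helper: transliterate to ASCII via ICU when intl is available.
function to_ascii(string $string): string
{
    if (!function_exists('transliterator_transliterate')) {
        // intl is not loaded; return the input unchanged rather than guess.
        return $string;
    }

    // 'Any-Latin; Latin-ASCII' is an ICU compound transform ID (assumed choice).
    $result = transliterator_transliterate('Any-Latin; Latin-ASCII', $string);

    // The function returns false on failure.
    return $result === false ? $string : $result;
}

var_dump(to_ascii('æøå'));
```

Because the transliteration is done by ICU rather than the system's iconv, the result should not depend on whether PHP runs on Windows or Linux.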
 [2018-08-25 13:17 UTC] cmb@php.net
-Status: Open
+Status: Not a bug
-Assigned To:
+Assigned To: cmb
 [2018-08-25 13:17 UTC] cmb@php.net
Closing as not a bug, since it is obviously a bug or limitation of
glibc's iconv implementation.
 