Doc Bug #80922 Domain names are not case sensitive and should be handled accordingly
Submitted: 2021-03-31 01:23 UTC
Modified: 2021-11-11 11:59 UTC
Votes:1
Avg. Score:4.0 ± 0.0
Reproduced:1 of 1 (100.0%)
Same Version:1 (100.0%)
Same OS:1 (100.0%)
From: JunkYardMail1 at Frontier dot com
Assigned:
Status: Verified
Package: I18N and L10N related
PHP Version: 7.4.16
OS: FreeBSD
Private report: No
CVE-ID: None

 [2021-03-31 01:23 UTC] JunkYardMail1 at Frontier dot com
Description:
------------
---
From manual page: https://php.net/function.idn-to-ascii
---
"This function converts a Unicode domain name to an IDNA ASCII-compatible format."

Clearly 'My.Domain.Example.com' is not a "Unicode domain name" and thus should not be altered. But it is altered: it is forced to lower case.

The idn_to_ascii() and idn_to_utf8() functions force domain names to lower case, even non-IDN domain names.

Domain names are not case sensitive, so their case should not be altered from what is passed to the functions. Case should be retained as provided.


Test script:
---------------
<?php
echo 'Non IDN:' . "\n";
echo 'idn_to_ascii(\'My.Domain.Example.com\');' . "\n";
echo '  Expected result: My.Domain.Example.com' . "\n";
echo '    Actual result: ' . idn_to_ascii('My.Domain.Example.com') . "\n\n";

echo 'idn_to_utf8(\'My.Domain.Example.com\');' . "\n";
echo '  Expected result: My.Domain.Example.com' . "\n";
echo '    Actual result: ' . idn_to_utf8('My.Domain.Example.com') . "\n\n";

echo 'IDN:' . "\n";
echo 'idn_to_ascii(\'Täst.de\');' . "\n";
echo '  Expected result: xn--Tst-qla.de' . "\n";
echo '    Actual result: ' . idn_to_ascii('Täst.de') . "\n\n";

echo 'idn_to_utf8(\'xn--Tst-qla.de\');' . "\n";
echo '  Expected result: Täst.de' . "\n";
echo '    Actual result: ' . idn_to_utf8('xn--Tst-qla.de') . "\n\n";
?>

Expected result:
----------------
# Non IDN:
My.Domain.Example.com
My.Domain.Example.com

# IDN:
xn--Tst-qla.de
Täst.de


Actual result:
--------------
# Non IDN:
my.domain.example.com
my.domain.example.com

# IDN:
xn--tst-qla.de
täst.de


History

 [2021-04-06 10:33 UTC] cmb@php.net
-Status: Open
+Status: Verified
-Type: Feature/Change Request
+Type: Documentation Problem
 [2021-04-06 10:33 UTC] cmb@php.net
idn_to_ascii() is a relatively simple wrapper of
uidna_nameToASCII_UTF8(), which is documented[1] as:

| Converts a whole domain name into its ASCII form for DNS lookup.

So the PHP documentation isn't quite right.  The fact that the
function returns the lower-cased domain name is just part of the
normalization.

I don't think it makes sense to change that behavior; if you don't
want the lower casing for ASCII domain names, just don't pass
these to idn_to_ascii().

[1] <https://unicode-org.github.io/icu-docs/apidoc/dev/icu4c/uidna_8h.html#aac0ee59298161bde6220f59927249f21>
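
The suggestion above can be sketched as a small guard in front of idn_to_ascii(). This is only an illustration: ascii_form_keep_case() is a hypothetical helper name, not part of PHP or the intl extension, and it assumes "ASCII domain name" means a string with no bytes outside the 7-bit range.

```php
<?php
/**
 * Hypothetical helper (not part of PHP or intl).
 * Returns ASCII-only names untouched, so their case is preserved, and
 * passes everything else to idn_to_ascii().
 *
 * @return string|false The ASCII form, or false if the conversion fails.
 */
function ascii_form_keep_case(string $domain)
{
    // No byte above 0x7F: treat the name as plain ASCII and skip conversion.
    if (!preg_match('/[^\x00-\x7F]/', $domain)) {
        return $domain;
    }

    // Names with non-ASCII characters are still normalized (and lower-cased).
    return idn_to_ascii($domain, IDNA_DEFAULT, INTL_IDNA_VARIANT_UTS46);
}

echo ascii_form_keep_case('My.Domain.Example.com'), "\n";
```

Note that ASCII-only input bypasses idn_to_ascii() entirely in this sketch, so it also bypasses whatever validation the function would have applied to that input.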
 [2021-04-06 19:31 UTC] JunkYardMail1 at Frontier dot com
The issue is broader than just not passing ASCII domain names.

It also affects Unicode domain names.  These need to be converted to ASCII IDN for DNS usage, but should retain their original case, as that is what the human input intended.  These names are sometimes stored and redisplayed for human consumption, either as the ASCII IDN or as the original Unicode, and should be presented with case unchanged.
 [2021-04-06 20:17 UTC] JunkYardMail1 at Frontier dot com
<?php
// This works to prevent forcing lowercase of ASCII domain names.
// But it does not work to prevent forcing lowercase of Unicode domain names.

if (($idn = idn_to_ascii($domain_name)) === strtolower($domain_name)) {
	// Use the original $domain_name
} else {
	// Use the converted $idn
}
?>
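
A per-label variant of the workaround above might look like the following. It is a sketch only: idn_to_ascii_per_label() is a hypothetical name, and the code assumes labels are separated by the ASCII dot. It keeps the case of ASCII-only labels in a mixed name, but labels containing non-ASCII characters are still lower-cased by idn_to_ascii(), so it does not address the Unicode case described in the comment.

```php
<?php
/**
 * Hypothetical helper (not part of PHP or intl).
 * Converts only the labels that contain non-ASCII bytes; ASCII-only
 * labels are copied through with their original case.
 *
 * @return string|false The converted name, or false if any label fails.
 */
function idn_to_ascii_per_label(string $domain)
{
    $labels = [];
    foreach (explode('.', $domain) as $label) {
        // Empty labels (e.g. a trailing dot) and ASCII-only labels pass through.
        if ($label === '' || !preg_match('/[^\x00-\x7F]/', $label)) {
            $labels[] = $label;
            continue;
        }

        // Non-ASCII label: let intl normalize and encode it.
        $ascii = idn_to_ascii($label, IDNA_DEFAULT, INTL_IDNA_VARIANT_UTS46);
        if ($ascii === false) {
            return false;
        }
        $labels[] = $ascii;
    }

    return implode('.', $labels);
}

echo idn_to_ascii_per_label('My.Domain.Example.com'), "\n";
```

Converting label by label means any checks that idn_to_ascii() performs on the whole name are not applied here, so the result should be treated as a display or storage form rather than as validated input.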
 [2021-11-11 11:59 UTC] nikic@php.net
-Package: idn
+Package: I18N and L10N related
 
Last updated: Wed Jan 19 17:03:13 2022 UTC