Bug #72809 Locale::lookup() / locale_lookup() wrong result with canonicalize option
Submitted: 2016-08-11 08:20 UTC Modified: 2021-06-11 17:23 UTC
Votes:4
Avg. Score:4.8 ± 0.4
Reproduced:4 of 4 (100.0%)
Same Version:3 (75.0%)
Same OS:-1 (-25.0%)
From: a dot schilder at gmx dot de Assigned: cmb (profile)
Status: Closed Package: intl (PECL)
PHP Version: 7.1.0beta2 OS: Windows 10 Pro
Private report: No CVE-ID: None
 [2016-08-11 08:20 UTC] a dot schilder at gmx dot de
Description:
------------
When I call Locale::lookup as follows, I get the expected result 'en-US':
Locale::lookup(['en', 'en-US'], 'en-US-u-cu-EUR-tz-deber-fw-mon', false);

I should get the same result when using the canonicalize option: according to the docs, all "arguments will be converted to canonical form before matching", and after conversion they should still match. Instead, I get 'en' rather than the expected locale 'en-US':
Locale::lookup(['en', 'en-US'], 'en-US-u-cu-EUR-tz-deber-fw-mon', true);

I also get 'en' when passing the canonicalized forms directly, regardless of whether the canonicalize option is set:
echo Locale::lookup(['en', 'en_US'], 'en_US@currency=eur;fw=mon;timezone=Europe/Berlin', false) . PHP_EOL;
echo Locale::lookup(['en', 'en_US'], 'en_US@currency=eur;fw=mon;timezone=Europe/Berlin', true) . PHP_EOL;

Test script:
---------------
<?php
echo Locale::lookup(['en', 'en-US'], 'en-US-u-cu-EUR-tz-deber-fw-mon', false) . PHP_EOL;
echo PHP_EOL;
echo Locale::canonicalize('en-US-u-cu-EUR-tz-deber-fw-mon') . PHP_EOL;
echo Locale::canonicalize('en-US') . PHP_EOL;
echo Locale::canonicalize('en') . PHP_EOL;
echo PHP_EOL;
echo Locale::lookup(['en', 'en-US'], 'en-US-u-cu-EUR-tz-deber-fw-mon', true) . PHP_EOL;
echo Locale::lookup(['en', 'en_US'], 'en_US@currency=eur;fw=mon;timezone=Europe/Berlin', false) . PHP_EOL;
echo Locale::lookup(['en', 'en_US'], 'en_US@currency=eur;fw=mon;timezone=Europe/Berlin', true) . PHP_EOL;

Expected result:
----------------
en-US

en_US@currency=eur;fw=mon;timezone=Europe/Berlin
en_US
en

en-US
en_US
en_US

Actual result:
--------------
en-US

en_US@currency=eur;fw=mon;timezone=Europe/Berlin
en_US
en

en
en
en

History

 [2021-06-11 17:16 UTC] cmb@php.net
-Status: Open +Status: Verified
 [2021-06-11 17:16 UTC] cmb@php.net
The actual lookup tries to match any given $langtag element to the
full $locale; if there is no match, the $locale's suffix starting
with - or _ is stripped, and matching is done on the stripped
$locale (and so on).  However, the canonicalized $locale is
separated by rather different characters.  Maybe only keyword
separators (@) would need to be catered to in addition?  Would
need to check against RFC 4647[1].

But still, that would return the canonicalized $langtag element,
which doesn't look right.

[1] <https://datatracker.ietf.org/doc/html/rfc4647>
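
The fallback loop described above can be modelled in a few lines of plain PHP. This is only a simplified sketch to illustrate why a canonicalized $locale collapses straight to 'en'. The function name sketch_lookup() is hypothetical, it does not use the intl extension, and it is not the actual implementation in php-src (for instance, it assumes a plain case-insensitive comparison and only treats - and _ as separators, as the comment above describes).

```php
<?php
// Hypothetical helper: a simplified model of the lookup fallback loop.
// Assumption: matching is a case-insensitive string comparison, and only
// '-' and '_' are treated as suffix separators.
function sketch_lookup(array $langtags, string $locale): string
{
    $candidate = strtolower($locale);
    while ($candidate !== '') {
        // Try to match every language tag against the current candidate.
        foreach ($langtags as $tag) {
            if (strtolower($tag) === $candidate) {
                return $tag;
            }
        }
        // No match: strip the last suffix that starts with '-' or '_'.
        $dash = strrpos($candidate, '-') ?: 0;
        $underscore = strrpos($candidate, '_') ?: 0;
        $pos = max($dash, $underscore);
        if ($pos === 0) {
            break; // nothing left to strip
        }
        $candidate = substr($candidate, 0, $pos);
    }
    return '';
}

// BCP 47 form: subtags are stripped one by one until 'en-US' matches.
echo sketch_lookup(['en', 'en-US'], 'en-US-u-cu-EUR-tz-deber-fw-mon'), PHP_EOL; // en-US

// Canonical form: the only '_' sits right after 'en', so the first strip
// removes '_US@currency=...' in one go and 'en_US' is never tried.
echo sketch_lookup(['en', 'en_US'], 'en_US@currency=eur;fw=mon;timezone=Europe/Berlin'), PHP_EOL; // en
```

In this model, the keyword part after @ contains no - or _, so it is never stripped on its own. That is consistent with the suggestion above that the @ separator would need to be handled in addition.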
 [2021-06-11 17:23 UTC] cmb@php.net
-Assigned To: +Assigned To: cmb
 [2021-06-14 13:03 UTC] cmb@php.net
The following pull request has been associated:

Patch Name: Fix #72809: Locale::lookup() wrong result with canonicalize option
On GitHub:  https://github.com/php/php-src/pull/7151
Patch:      https://github.com/php/php-src/pull/7151.patch
 [2021-06-16 08:39 UTC] git@php.net
-Status: Verified +Status: Closed
 [2021-06-16 08:39 UTC] git@php.net
Automatic comment on behalf of cmb69
Revision: https://github.com/php/php-src/commit/0f1b17e37894a77f297129a25368e68bacc31a08
Log: Fix #72809: Locale::lookup() wrong result with canonicalize option
 