Bug #68108 Collator does not properly sort some strings starting with a number
Submitted: 2014-09-26 18:24 UTC
Modified: 2020-10-23 13:06 UTC
Votes: 1
Avg. Score: 4.0 ± 0.0
Reproduced: 0 of 0 (0.0%)
From: maik at phpkuh dot de
Assigned: cmb
Status: Not a bug
Package: intl (PECL)
PHP Version: 5.5.17
OS: Ubuntu 14.04
Private report: No
CVE-ID: None
 [2014-09-26 18:24 UTC] maik at phpkuh dot de
Description:
------------
I am using the Collator to sort a big set of IDs generated from three fields, like

`sprintf('%010d#%s#%s', $someSmallNumber, $word1, $word2)`

Sorting these IDs with Collator gives the expected results if the ID starts with a word; with a numeric prefix, however, the sort fails.

Tested on 5.5.17-2+deb.sury.org~trusty+1 (cli) and 5.5.15-2+deb.sury.org~trusty+1 (cli), on a Vagrant VM and on native Ubuntu 14.04, with different data sets.

Sorting by the first key works fine; sorting by the secondary key fails. Using a space or another separator has no effect for the tested data set.

Test script:
---------------
<?php

$c = new Collator("en_US"); // should work with any locale

$a = ["02#aa", "02#ab", "03#a", "02#a", "03#x", "03#b", "03#ba", "03#b", "04#a", "03#f"];

echo 'collator sort:' . PHP_EOL;
$c->sort($a, SORT_STRING); // SORT_REGULAR produces same result
var_dump($a);

echo 'expected results:' . PHP_EOL;
sort($a, SORT_STRING);
var_dump($a);

Expected result:
----------------
array(10) {
  [0] =>
  string(4) "02#a"
  [1] =>
  string(5) "02#aa"
  [2] =>
  string(5) "02#ab"
  [3] =>
  string(4) "03#a"
  [4] =>
  string(4) "03#b"
  [5] =>
  string(4) "03#b"
  [6] =>
  string(5) "03#ba"
  [7] =>
  string(4) "03#f"
  [8] =>
  string(4) "03#x"
  [9] =>
  string(4) "04#a"
}


Actual result:
--------------
array(10) {
  [0] =>
  string(4) "02#a"
  [1] =>
  string(5) "02#ab"
  [2] =>
  string(5) "02#aa"
  [3] =>
  string(4) "03#b"
  [4] =>
  string(4) "03#f"
  [5] =>
  string(5) "03#ba"
  [6] =>
  string(4) "03#x"
  [7] =>
  string(4) "03#b"
  [8] =>
  string(4) "03#a"
  [9] =>
  string(4) "04#a"
}


History

 [2014-09-26 18:39 UTC] maik at phpkuh dot de
Using a prefix like `a#` will solve the issue.

array(10) {
  [0]=>
  string(6) "a#02#a"
  [1]=>
  string(7) "a#02#aa"
  [2]=>
  string(7) "a#02#ab"
  [3]=>
  string(6) "a#03#a"
  [4]=>
  string(6) "a#03#b"
  [5]=>
  string(6) "a#03#b"
  [6]=>
  string(7) "a#03#ba"
  [7]=>
  string(6) "a#03#f"
  [8]=>
  string(6) "a#03#x"
  [9]=>
  string(6) "a#04#a"
}
 [2020-10-23 13:06 UTC] cmb@php.net
-Status: Open
+Status: Not a bug
-Assigned To:
+Assigned To: cmb
 [2020-10-23 13:06 UTC] cmb@php.net
Thank you for taking the time to write to us, but this is not
a bug. Please double-check the documentation available at
http://www.php.net/manual/ and the instructions on how to report
a bug at http://bugs.php.net/how-to-report.php

SORT_STRING != Collator::SORT_STRING, see <https://3v4l.org/qJR25>.
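
For illustration, here is a minimal sketch of that difference, assuming the intl extension is loaded. The test script passes the global `SORT_STRING` flag (the one meant for `sort()`), while `Collator::sort()` expects the `Collator::SORT_*` class constants, which use different integer values. The array is the one from the test script above; this is a sketch of the likely fix, not a verified reproduction of the linked 3v4l snippet.

```php
<?php
// Requires the intl extension.

// The global sort flags and the Collator class constants do not share values.
var_dump(SORT_STRING);            // int(2)
var_dump(Collator::SORT_STRING);  // int(1)
var_dump(Collator::SORT_NUMERIC); // int(2), the same value as the global SORT_STRING

$c = new Collator("en_US");
$a = ["02#aa", "02#ab", "03#a", "02#a", "03#x", "03#b", "03#ba", "03#b", "04#a", "03#f"];

// Pass the class constant so that the elements are compared as strings
// under the collator's locale rules.
$c->sort($a, Collator::SORT_STRING);
var_dump($a);
```

Because the global `SORT_STRING` has the same value as `Collator::SORT_NUMERIC`, the original call most likely asked for a numeric comparison, which would explain why entries sharing the same numeric prefix came back in an unexpected order.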
 