Bug #68108 Collator not properly sorts some string starting with number
Submitted: 2014-09-26 18:24 UTC
Modified: 2020-10-23 13:06 UTC
Votes: 1
Avg. Score: 4.0 ± 0.0
Reproduced: 0 of 0 (0.0%)
From: maik at phpkuh dot de
Assigned: cmb
Status: Not a bug
Package: intl (PECL)
PHP Version: 5.5.17
OS: Ubuntu 14.04
Private report: No
CVE-ID: None
 [2014-09-26 18:24 UTC] maik at phpkuh dot de
Description:
------------
I am using the Collator to sort a large set of IDs, each generated from three fields like this:

`sprintf('%010d#%s#%s', $someSmallNumber, $word1, $word2)`

Sorting these IDs with Collator gives the expected results if the ID starts with a word; with a numeric prefix, however, the resulting order is wrong.

Tested on 5.5.17-2+deb.sury.org~trusty+1 (cli) and 5.5.15-2+deb.sury.org~trusty+1 (cli), on a Vagrant VM and on native Ubuntu 14.04, with different data sets.

Sorting by the first key works fine; sorting by the secondary key fails. Using a space or another separator made no difference for the tested data sets.

Test script:
---------------
<?php

$c = new Collator("en_US"); // should work with any locale

$a = ["02#aa", "02#ab", "03#a", "02#a", "03#x", "03#b", "03#ba", "03#b", "04#a", "03#f"];

echo 'collator sort:' . PHP_EOL;
$c->sort($a, SORT_STRING); // SORT_REGULAR produces same result
var_dump($a);

echo 'expected results:' . PHP_EOL;
sort($a, SORT_STRING);
var_dump($a);

Expected result:
----------------
array(10) {
  [0] =>
  string(4) "02#a"
  [1] =>
  string(5) "02#aa"
  [2] =>
  string(5) "02#ab"
  [3] =>
  string(4) "03#a"
  [4] =>
  string(4) "03#b"
  [5] =>
  string(4) "03#b"
  [6] =>
  string(5) "03#ba"
  [7] =>
  string(4) "03#f"
  [8] =>
  string(4) "03#x"
  [9] =>
  string(4) "04#a"
}


Actual result:
--------------
array(10) {
  [0] =>
  string(4) "02#a"
  [1] =>
  string(5) "02#ab"
  [2] =>
  string(5) "02#aa"
  [3] =>
  string(4) "03#b"
  [4] =>
  string(4) "03#f"
  [5] =>
  string(5) "03#ba"
  [6] =>
  string(4) "03#x"
  [7] =>
  string(4) "03#b"
  [8] =>
  string(4) "03#a"
  [9] =>
  string(4) "04#a"
}


 [2014-09-26 18:39 UTC] maik at phpkuh dot de
Using a prefix like `a#` will solve the issue.

array(10) {
  [0]=>
  string(6) "a#02#a"
  [1]=>
  string(7) "a#02#aa"
  [2]=>
  string(7) "a#02#ab"
  [3]=>
  string(6) "a#03#a"
  [4]=>
  string(6) "a#03#b"
  [5]=>
  string(6) "a#03#b"
  [6]=>
  string(7) "a#03#ba"
  [7]=>
  string(6) "a#03#f"
  [8]=>
  string(6) "a#03#x"
  [9]=>
  string(6) "a#04#a"
}
 [2020-10-23 13:06 UTC] cmb@php.net
-Status: Open
+Status: Not a bug
-Assigned To:
+Assigned To: cmb
 [2020-10-23 13:06 UTC] cmb@php.net
Thank you for taking the time to write to us, but this is not
a bug. Please double-check the documentation available at
http://www.php.net/manual/ and the instructions on how to report
a bug at http://bugs.php.net/how-to-report.php

SORT_STRING != Collator::SORT_STRING, see <https://3v4l.org/qJR25>.
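
For later readers, a minimal sketch of what the comment above appears to point at, assuming the intl extension is loaded. The global sort flags and the Collator class constants use different numbering, so passing the global SORT_STRING to Collator::sort() selects a different mode than intended. The variable names $ids, $collator and $sorted are illustrative only and are not taken from the linked 3v4l snippet.

```php
<?php
// Sketch only: requires the intl extension (Collator class).
$ids = ["02#aa", "02#ab", "03#a", "02#a", "03#x", "03#b", "03#ba", "03#b", "04#a", "03#f"];

// The global SORT_* flags are meant for sort(); Collator defines its own set.
var_dump(SORT_STRING === Collator::SORT_STRING);  // bool(false)
var_dump(SORT_STRING === Collator::SORT_NUMERIC); // bool(true)

$collator = new Collator("en_US");

// Pass the Collator class constant, not the global flag.
$sorted = $ids;
$collator->sort($sorted, Collator::SORT_STRING);
print_r($sorted);
```

With Collator::SORT_STRING the elements are compared as strings under the locale's collation rules, which for this data set should match the order listed under "Expected result".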
 