Bug #80961 PCRE treats ellipsis character as a letter
Submitted: 2021-04-16 09:14 UTC Modified: 2021-04-16 09:28 UTC
Avg. Score:4.0 ± 0.0
Reproduced:1 of 1 (100.0%)
Same Version:1 (100.0%)
Same OS:1 (100.0%)
From: f dot sowade at r9e dot de Assigned:
Status: Not a bug Package: PCRE related
PHP Version: 8.0.3 OS: Linux and Mac OS X
Private report: No CVE-ID: None
 [2021-04-16 09:14 UTC] f dot sowade at r9e dot de
The intl extension reports the Unicode ellipsis character (U+2026) as punctuation. PCRE, however, matches the ellipsis when matching letters but not when matching punctuation, so the character class of the ellipsis differs between the intl extension and PCRE. I think intl is correct here and the character class should be punctuation, but both extensions should at least agree on the same character class.

Test script:
var_dump(IntlChar::ispunct("\u{2026}")); // true
var_dump(IntlChar::isalpha("\u{2026}")); // false

var_dump(preg_match('/\\p{Z}/', "\u{2026}")); // 0 (but would expect 1)
var_dump(preg_match('/\\p{L}/', "\u{2026}")); // 1 (but would expect 0)

 [2021-04-16 09:19 UTC] f dot sowade at r9e dot de
Sorry - the Test script should have had a P instead of a Z:

var_dump(IntlChar::ispunct("\u{2026}")); // true
var_dump(IntlChar::isalpha("\u{2026}")); // false

var_dump(preg_match('/\\p{P}/', "\u{2026}")); // 0 (but would expect 1)
var_dump(preg_match('/\\p{L}/', "\u{2026}")); // 1 (but would expect 0)
 [2021-04-16 09:28 UTC]
-Status: Open +Status: Not a bug
 [2021-04-16 09:28 UTC]
You are missing the /u modifier and seem to have confused \p{P} with \p{Z}. The correct code is:

var_dump(preg_match('/\\p{P}/u', "\u{2026}")); // 1
var_dump(preg_match('/\\p{L}/u', "\u{2026}")); // 0
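
The effect of the missing /u modifier can be made visible with a short sketch. It assumes, as a likely explanation of the original result, that without /u PCRE treats the subject as individual bytes, so the three UTF-8 bytes of U+2026 (E2 80 A6) are matched one at a time and \p{L} matches the byte 0xE2, which on its own corresponds to the letter U+00E2. The variable names are illustrative only.

```php
<?php
// U+2026 HORIZONTAL ELLIPSIS is three bytes in UTF-8: E2 80 A6.
$ellipsis = "\u{2026}";
$hex = bin2hex($ellipsis);

// Count how many "characters" PCRE sees without and with /u.
$bytesSeen = preg_match_all('/./s', $ellipsis);   // byte mode
$charsSeen = preg_match_all('/./su', $ellipsis);  // UTF-8 mode

// In byte mode, find out which byte \p{L} actually matched.
preg_match('/\p{L}/', $ellipsis, $m);
$matchedByte = bin2hex($m[0]);

var_dump($hex, $bytesSeen, $charsSeen, $matchedByte);
```

If the assumption holds, the script reports three matches in byte mode, one match in UTF-8 mode, and "e2" as the byte matched by \p{L}, which is consistent with the results in the original test script.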
Last updated: Thu Jan 27 06:03:35 2022 UTC