Bug #79233 Unicode character properties not working properly
Submitted: 2020-02-06 03:55 UTC Modified: 2020-02-06 07:44 UTC
Votes:12
Avg. Score:4.8 ± 0.6
Reproduced:12 of 12 (100.0%)
Same Version:11 (91.7%)
Same OS:8 (66.7%)
From: yahiru1121 at gmail dot com Assigned:
Status: Open Package: *Regular Expressions
PHP Version: 7.4.2 OS: macOS Mojave 10.14.6
Private report: No CVE-ID: None
 [2020-02-06 03:55 UTC] yahiru1121 at gmail dot com
Description:
------------
Combining a Unicode general category property (e.g. \p{Nd}) with a script property (e.g. \p{Han}) in the same character class under the /u modifier produces unexpected results.


Test script:
---------------
<?php

$str = 'aAaA亜あアア11ⅰʰ';
$regs = [
    '/[\p{Nd}\p{Han}]/u',
    '/[\p{Nl}\p{Han}]/u',
    '/[\p{Lu}\p{Han}]/u',
    '/[\p{Ll}\p{Han}]/u',
    '/[\p{Lo}\p{Han}]/u',
    '/[\p{Lm}\p{Han}]/u',
];


foreach ($regs as $reg) {
    echo $reg, ' => ', preg_replace($reg, 'x️', $str), PHP_EOL;
}


Expected result:
----------------
/[\p{Nd}\p{Han}]/u => aAaAx️あアアx️x️ⅰʰ
/[\p{Nl}\p{Han}]/u => aAaAx️あアア11x️ʰ
/[\p{Lu}\p{Han}]/u => ax️ax️x️あアア11ⅰʰ
/[\p{Ll}\p{Han}]/u => x️Ax️Ax️あアア11ⅰʰ
/[\p{Lo}\p{Han}]/u => aAaAx️x️x️x️11ⅰʰ
/[\p{Lm}\p{Han}]/u => aAaAx️あアア11ⅰx️

Actual result:
--------------
/[\p{Nd}\p{Han}]/u => aAaAx️あアア11ⅰʰ
/[\p{Nl}\p{Han}]/u => aAaAx️あアア11ⅰʰ
/[\p{Lu}\p{Han}]/u => aAaAx️あアアx️x️ⅰʰ
/[\p{Ll}\p{Han}]/u => ax️ax️x️あアア11ⅰʰ
/[\p{Lo}\p{Han}]/u => aAaAx️あアア11ⅰʰ
/[\p{Lm}\p{Han}]/u => aAaAx️あアア11ⅰʰ
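
To narrow down which characters are affected, the following is a minimal diagnostic sketch. It is not part of the original report: `findMismatches()` is a hypothetical helper, and it assumes that each property behaves correctly when used on its own outside a combined class, which the report does not confirm. It tests every character of the string against the two properties separately and against the combined class, then lists the characters where the two answers disagree.

```php
<?php

/**
 * Hypothetical helper (not from the original report).
 * Returns the characters of $str for which the combined class
 * [\p{category}\p{script}] disagrees with testing each property separately.
 */
function findMismatches(string $str, string $category, string $script): array
{
    $mismatches = [];

    // With the /u modifier an empty pattern splits the string per code point.
    $chars = preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY);

    foreach ($chars as $ch) {
        // Union of the two properties, each tested in its own pattern.
        $separate = preg_match('/\p{' . $category . '}/u', $ch) === 1
            || preg_match('/\p{' . $script . '}/u', $ch) === 1;

        // The same union expressed as a single character class.
        $combined = preg_match('/[\p{' . $category . '}\p{' . $script . '}]/u', $ch) === 1;

        if ($separate !== $combined) {
            $mismatches[] = $ch;
        }
    }

    return $mismatches;
}

$str = 'aAaA亜あアア11ⅰʰ';

foreach (['Nd', 'Nl', 'Lu', 'Ll', 'Lo', 'Lm'] as $category) {
    $bad = findMismatches($str, $category, 'Han');
    echo $category, ' + Han: ', $bad ? implode(' ', $bad) : 'consistent', PHP_EOL;
}
```

On an unaffected build every line should read "consistent"; any character listed is one where the combined class and the separate tests disagree.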

 [2020-02-06 07:44 UTC] cmb@php.net
Might be related to bug #79175.
 