Bug #79233 Unicode character properties not working properly
Submitted: 2020-02-06 03:55 UTC
Modified: 2020-02-06 07:44 UTC
Votes: 2
Avg. Score: 5.0 ± 0.0
Reproduced: 2 of 2 (100.0%)
Same Version: 2 (100.0%)
Same OS: 2 (100.0%)
From: yahiru1121 at gmail dot com
Assigned:
Status: Open
Package: *Regular Expressions
PHP Version: 7.4.2
OS: macOS Mojave 10.14.6
Private report: No
CVE-ID: None

 [2020-02-06 03:55 UTC] yahiru1121 at gmail dot com
Description:
------------
Combining a Unicode general category property (such as \p{Nd}) with a script property (\p{Han}) in a single character class produces unexpected results.


Test script:
---------------
<?php

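// Sample: Latin letters, Han, kana, decimal digits, a letter number (ⅰ), and a modifier letter (ʰ).
$str = 'aAaA亜あアア11ⅰʰ';
// Each pattern combines one general category with the Han script in a single character class.
$regs = [
    '/[\p{Nd}\p{Han}]/u', // decimal number or Han
    '/[\p{Nl}\p{Han}]/u', // letter number or Han
    '/[\p{Lu}\p{Han}]/u', // uppercase letter or Han
    '/[\p{Ll}\p{Han}]/u', // lowercase letter or Han
    '/[\p{Lo}\p{Han}]/u', // other letter or Han
    '/[\p{Lm}\p{Han}]/u', // modifier letter or Han
];
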
foreach ($regs as $reg) {
    echo $reg, ' => ', preg_replace($reg, 'x️', $str), PHP_EOL;
}


Expected result:
----------------
/[\p{Nd}\p{Han}]/u => aAaAx️あアアx️x️ⅰʰ
/[\p{Nl}\p{Han}]/u => aAaAx️あアア11x️ʰ
/[\p{Lu}\p{Han}]/u => ax️ax️x️あアア11ⅰʰ
/[\p{Ll}\p{Han}]/u => x️Ax️Ax️あアア11ⅰʰ
/[\p{Lo}\p{Han}]/u => aAaAx️x️x️x️11ⅰʰ
/[\p{Lm}\p{Han}]/u => aAaAx️あアア11ⅰx️

Actual result:
--------------
/[\p{Nd}\p{Han}]/u => aAaAx️あアア11ⅰʰ
/[\p{Nl}\p{Han}]/u => aAaAx️あアア11ⅰʰ
/[\p{Lu}\p{Han}]/u => aAaAx️あアアx️x️ⅰʰ
/[\p{Ll}\p{Han}]/u => ax️ax️x️あアア11ⅰʰ
/[\p{Lo}\p{Han}]/u => aAaAx️あアア11ⅰʰ
/[\p{Lm}\p{Han}]/u => aAaAx️あアア11ⅰʰ
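
One way to narrow this down is to check whether the mismatch is specific to putting both properties in one character class. The sketch below is not part of the original report: build_patterns() is a hypothetical helper that produces the character-class form used above and an equivalent alternation form (\p{Nd}|\p{Han}), so the two outputs can be compared side by side. It assumes the same sample string and uses a plain 'x' as the replacement. No expected output is shown because the result depends on the PHP/PCRE build.

```php
<?php

// Hypothetical helper (not from the original report): returns the same
// "general category OR script" test written two ways.
function build_patterns(string $category, string $script = 'Han'): array
{
    return [
        // Both properties inside one character class, as in the test script.
        'class'       => sprintf('/[\p{%s}\p{%s}]/u', $category, $script),
        // The same two properties as separate alternatives.
        'alternation' => sprintf('/\p{%s}|\p{%s}/u', $category, $script),
    ];
}

$str = 'aAaA亜あアア11ⅰʰ';

foreach (['Nd', 'Nl', 'Lu', 'Ll', 'Lo', 'Lm'] as $category) {
    foreach (build_patterns($category) as $form => $pattern) {
        // Print the form, the pattern, and the string after replacement.
        echo str_pad($form, 12), $pattern, ' => ',
            preg_replace($pattern, 'x', $str), PHP_EOL;
    }
}
```

If the two forms disagree for a given category, the combination inside the character class is the more likely culprit. If they agree and are both wrong, the property lookup itself may be affected.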

 [2020-02-06 07:44 UTC] cmb@php.net
Might be related to bug #79175.
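
Since the behaviour may depend on the PCRE library that PHP was built against, it can help to attach the environment details when comparing the two reports. A minimal sketch, assuming the relevant details are the PHP version, the PCRE library version, and the pcre.jit setting. pcre_environment() is a hypothetical helper name, and whether JIT plays any role in this bug is not established here.

```php
<?php

// Hypothetical helper: collects version details that are commonly useful
// when reporting a regular expression bug.
function pcre_environment(): array
{
    return [
        'php'      => PHP_VERSION,          // PHP version string
        'pcre'     => PCRE_VERSION,         // PCRE library version and release date
        'pcre.jit' => ini_get('pcre.jit'),  // string value, or false if the setting is unavailable
    ];
}

foreach (pcre_environment() as $name => $value) {
    echo $name, ': ', var_export($value, true), PHP_EOL;
}
```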
 