Bug #54506 Regex Unicode problem
Submitted: 2011-04-11 16:39 UTC Modified: 2011-04-11 20:47 UTC
From: chsavio at gmail dot com Assigned:
Status: Not a bug Package: PCRE related
PHP Version: 5.3SVN-2011-04-08 (snap) OS: Centos 2.6.18
Private report: No CVE-ID: None
 [2011-04-11 16:39 UTC] chsavio at gmail dot com
Description:
------------
Combinations of Unicode characters inside a bracketed character class appear to match a completely unrelated character. We're using PHP 5.3.3 (cli), and I was able to reproduce the problem on a PHP regex test site:

http://www.pagecolumn.com/tool/pregtest.htm

Test script:
---------------
<?php
$ptn = "/[ÜŸ]/";
$str = "ø";
preg_match($ptn, $str, $matches);
print_r($matches);
?>

This should be equivalent to the following, but it is not:

<?php
$ptn = "/Ü|Ÿ/";
$str = "ø";
preg_match($ptn, $str, $matches);
print_r($matches);
?>

Expected result:
----------------
I'd expect no matches.

Array
(
)

Actual result:
--------------
I get a match.

Array
(
    [0] => ø
)

History

 [2011-04-11 16:52 UTC] johannes@php.net
-Status: Open +Status: Bogus
 [2011-04-11 16:52 UTC] johannes@php.net
Thank you for taking the time to write to us, but this is not
a bug. Please double-check the documentation available at
http://www.php.net/manual/ and the instructions on how to report
a bug at http://bugs.php.net/how-to-report.php

Use the /u modifier.
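
For context, here is a minimal sketch of why the modifier matters, assuming the script file and its strings are UTF-8 encoded (the report does not state the encoding). Without /u, PCRE reads the pattern and subject byte by byte, so [ÜŸ] becomes a class of the four single bytes C3, 9C, C5 and B8. The subject "ø" is the byte pair C3 B8, and its first byte is in that class. With /u, both are decoded as UTF-8 code points and the class contains only Ü and Ÿ. The variable names below are illustrative only.

```php
<?php
// Assumption: this file is saved as UTF-8, so the literals have these bytes:
//   Ü = C3 9C, Ÿ = C5 B8, ø = C3 B8
$str = "ø";

// Without /u: the class holds the single bytes C3, 9C, C5, B8,
// and the first byte of "ø" (C3) is one of them.
$m1 = array();
$bytewise = preg_match("/[ÜŸ]/", $str, $m1);

// With /u: pattern and subject are treated as UTF-8 code points,
// so the class holds exactly two characters and "ø" is not one of them.
$m2 = array();
$unicode = preg_match("/[ÜŸ]/u", $str, $m2);

var_dump($bytewise);                               // int(1)
var_dump(isset($m1[0]) ? bin2hex($m1[0]) : '');    // string(2) "c3"
var_dump($unicode);                                // int(0)
var_dump($m2);                                     // empty array
```

Under that assumption the byte-wise match is only the lone byte C3, which is not valid UTF-8 on its own. The alternation /Ü|Ÿ/ behaves as expected even without /u because each alternative is matched as a complete two-byte sequence.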
 [2011-04-11 20:47 UTC] chsavio at gmail dot com
Missed the PCRE modifiers section of the documentation. Sorry about that. Thanks for your time.
 
Last updated: Thu Apr 25 01:01:30 2024 UTC