Bug #43567 PCRE: compilation failure when using UTF-8
Submitted: 2007-12-11 17:02 UTC Modified: 2008-01-13 15:11 UTC
From: brito_victor at yahoo dot fr Assigned:
Status: Not a bug Package: PCRE related
PHP Version: 5.2.5 OS: Windows XP SP2
Private report: No CVE-ID: None
 [2007-12-11 17:02 UTC] brito_victor at yahoo dot fr
Description:
------------
In a test script, I call the preg_match() function with the u modifier to test a UTF-8 regular expression written with hexadecimal escapes. The mbstring extension is loaded and active.

Reproduce code:
---------------
$mb = function_exists('mb_detect_encoding');
$pregutf8 = preg_match("/\xf8\xa1\xa1\xa1\xa1/u", "\xf8\xa1\xa1\xa1\xa1");

Expected result:
----------------
Returns true for both variables.

Actual result:
--------------
Returns true for $mb.
Returns the following warning message for $pregutf8: "Warning:  preg_match() [function.preg-match]: Compilation failed: invalid UTF-8 string at offset 0"

 [2007-12-11 17:08 UTC] brito_victor at yahoo dot fr
Bug seen on Windows XP SP2.
 [2007-12-13 12:48 UTC] confins_de_l_univers at yahoo dot fr
I get the same error on Linux (Debian Etch).

But your pattern (and your subject) is not a valid UTF-8 string.
I checked that with mb_check_encoding("\xf8\xa1\xa1\xa1\xa1", 'UTF-8'),
which returns false.

So, I think it's not a bug.
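
The check described above can be sketched as follows. RFC 3629 limits UTF-8 to sequences of at most four bytes, so a lead byte of 0xF8 (which would announce a five-byte sequence) can never start a valid character. The four-byte comparison string is not from the report; it is added here only as an illustration, and the snippet assumes the mbstring extension is available.

```php
<?php
// Byte sequence from the report: lead byte 0xF8 followed by four continuation bytes.
$reported = "\xf8\xa1\xa1\xa1\xa1";

// Illustrative comparison value (an assumption, not from the report):
// U+1F600 encoded as a four-byte UTF-8 sequence.
$fourByte = "\xf0\x9f\x98\x80";

// mb_check_encoding() requires the mbstring extension.
$reportedIsValid = mb_check_encoding($reported, 'UTF-8');
$fourByteIsValid = mb_check_encoding($fourByte, 'UTF-8');

var_dump($reportedIsValid); // bool(false)
var_dump($fourByteIsValid); // bool(true)
```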
 [2007-12-13 12:54 UTC] brito_victor at yahoo dot fr
I think it is a bug because the same script returns true when run under version 5.2.4.
 [2008-01-13 15:11 UTC] nlopess@php.net
In PHP 5.2.5 we upgraded PCRE to version 7.3. This version has stricter UTF-8 string validation, and that's why the string is now rejected.
For the record this bug has nothing to do with the mbstring extension.
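
A minimal sketch of how a script might cope with the stricter validation without relying on mbstring: let PCRE itself validate the string, and fall back to a byte-wise match when it is not valid UTF-8. The helper name is_valid_utf8_for_pcre() is hypothetical, and the fallback strategy is an assumption about what the caller wants, not a recommendation from the PHP developers.

```php
<?php
// Hypothetical helper: true when PCRE accepts $s as valid UTF-8.
function is_valid_utf8_for_pcre($s)
{
    // An empty pattern with the u modifier matches any valid UTF-8 subject
    // (returns 1); for an invalid subject preg_match() returns false.
    return preg_match('//u', $s) === 1;
}

$subject = "\xf8\xa1\xa1\xa1\xa1";
$quoted  = preg_quote($subject, '/');

if (is_valid_utf8_for_pcre($subject)) {
    // Safe to compile and match in UTF-8 mode.
    $result = preg_match('/' . $quoted . '/u', $subject);
} else {
    // Not valid UTF-8: compare raw bytes, without the u modifier.
    $result = preg_match('/' . $quoted . '/', $subject);
}

var_dump($result); // int(1), reached through the byte-wise branch
```

Checking the string before using it in a pattern avoids the compile-time warning, because the u modifier is only applied to input that PCRE has already accepted.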
 