Bug #44923 ereg functions are not unicode aware: provide wrapper functions in PCRE
Submitted: 2008-05-06 03:59 UTC
Modified: 2009-10-19 15:01 UTC
From: tokul at users dot sourceforge dot net
Assigned:
Status: Wont fix
Package: Regexps related
PHP Version: 6CVS-2008-05-06 (snap)
OS: Linux Debian Etch
Private report: No
CVE-ID: None
 [2008-05-06 03:59 UTC] tokul at users dot sourceforge dot net
Description:
------------
Expressions that work in older versions fail on PHP 6 with unicode.semantics=on.

Compared 5.2-dev, 5.3-dev and 6.0-dev snapshots

Reproduce code:
---------------
<?php
$line = "* 469 EXISTS\r\n";
if (ereg("[^ ]+ +([^ ]+) +EXISTS", $line, $match)) {
    var_dump($match[1]);
} else {
    var_dump(false);
}

$line = "* 469 FETCH (UID 508 BODY[1]<0> {154}\r\n";
if (ereg('\\{([^\\}]*)\\}', $line, $match)) {
    var_dump($match[1]);
} else {
    var_dump(false);
}

Expected result:
----------------
string(3) "469"
string(3) "154"

Actual result:
--------------
bool(false)

Warning: ereg(): REG_BADRPT in /home/tomas/testbeds/test/php60/bin/ereg.php on line 10
bool(false)

 [2008-05-06 12:06 UTC] felipe@php.net
This is expected: the code isn't prepared to work with Unicode strings.
Actually, it only uses REG_EXTENDED with binary strings, and converts the Unicode string to a normal string.
 [2008-08-12 16:38 UTC] jani@php.net
For Unicode-aware regexps, use PCRE. The old ereg functions should be provided as wrapper functions that use PCRE underneath, though.
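A minimal sketch of what the PCRE route could look like for the two patterns in the report. The helper name first_group is hypothetical (it is not part of PHP or of this report), the type declarations assume PHP 7.1 or later, and the u modifier assumes the input is valid UTF-8.

```php
<?php
// Hypothetical helper (not part of PHP or the report): returns the first
// capture group of a PCRE match, or null when the pattern does not match.
function first_group(string $pattern, string $subject): ?string
{
    // preg_match() returns 1 on a match, 0 on no match, false on error.
    return preg_match($pattern, $subject, $m) === 1 ? $m[1] : null;
}

// PCRE patterns need delimiters ('/' here); the u modifier makes PCRE
// treat both pattern and subject as UTF-8.
var_dump(first_group('/[^ ]+ +([^ ]+) +EXISTS/u', "* 469 EXISTS\r\n"));
var_dump(first_group('/\{([^}]*)\}/u', "* 469 FETCH (UID 508 BODY[1]<0> {154}\r\n"));
```

With these rewritten patterns the two calls should print the values listed under "Expected result" in the report.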
 [2008-08-14 14:42 UTC] nlopess@php.net
PCRE and ereg_* use different syntaxes, so wrapping ereg around PCRE would break most existing regexes.
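Two of the syntax differences can be sketched with a deliberately naive wrapper. The function naive_ereg is hypothetical and only wraps the pattern in '/' delimiters; the comments about POSIX behaviour assume standard POSIX ERE rules, under which a backslash inside a bracket expression is a literal character.

```php
<?php
// Hypothetical naive wrapper: passes a POSIX-style pattern to PCRE by
// adding '/' delimiters. The @ hides the warning PCRE raises on a bad pattern.
function naive_ereg(string $pattern, string $subject): bool
{
    return @preg_match('/' . $pattern . '/', $subject) === 1;
}

// POSIX ERE: inside [...] a backslash is literal, so [\d] means "\" or "d".
// PCRE: \d inside [...] is the digit class, so the meaning changes.
var_dump(naive_ereg('[\d]', 'd'));  // bool(false); POSIX ERE would match
var_dump(naive_ereg('[\d]', '7'));  // bool(true); POSIX ERE would not match

// An unescaped '/' in the pattern collides with the chosen delimiter,
// so preg_match() fails instead of matching.
var_dump(naive_ereg('a/b', 'a/b')); // bool(false)
```

A real wrapper would have to translate patterns rather than just delimit them, which is where the incompatibility comes from.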
 