Bug #29427 preg_match offset not equivalent to substr
Submitted: 2004-07-28 13:29 UTC Modified: 2004-12-06 16:00 UTC
Votes:1
Avg. Score:5.0 ± 0.0
Reproduced:1 of 1 (100.0%)
Same Version:0 (0.0%)
Same OS:0 (0.0%)
From: x-g at monkeyblah dot com Assigned:
Status: Not a bug Package: PCRE related
PHP Version: 4.3.8 OS: Windows XP
Private report: No CVE-ID: None
 [2004-07-28 13:29 UTC] x-g at monkeyblah dot com
Description:
------------
According to the manual, passing an offset to preg_match is equivalent to passing substr($string, $offset) to the function. This is not the case, however; regular expressions that match on beginning-of-string will not match if an offset is specified, but work fine if substr() is used in a supposedly equivalent manner.

Either this is a problem with regular expressions giving unexpected behaviour, or perhaps the manual just needs to be changed to reflect the difference.


Reproduce code:
---------------
	$string = "abc def";
	if (preg_match("/^[a-zA-Z]+/", $string, $matches, 0, 4))
		echo "Matches\n";
	else
		echo "Does not match\n";
	if (preg_match("/^[a-zA-Z]+/", substr($string, 4), $matches, 0))
		echo "Matches\n";
	else
		echo "Does not match\n";

Expected result:
----------------
Matches
Matches


Actual result:
--------------
Does not match
Matches

History

 [2004-08-16 05:35 UTC] skissane at iips dot mq dot edu dot au
The behaviour the manual implies is far more useful for my application. This is because I need to repeatedly match the beginning of substrings, but using substr(...) results in excessive memory usage, especially when the strings are very large.

I would encourage this bug to be fixed ASAP.
 [2004-08-16 05:36 UTC] skissane at iips dot mq dot edu dot au
Also, I have confirmed this bug's existence in PHP 5.0.0.
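
For the use case described above (repeatedly matching at the start of successive substrings without calling substr()), one possible approach is PCRE's \G assertion, which ties the match to the offset passed to preg_match. The sketch below is illustrative only: tokenize() is a hypothetical helper that is not part of the report, it uses PHP 7+ type declarations, and any memory savings over substr() are an assumption that would need measuring.

```php
<?php
// Hypothetical helper: split $input into word and whitespace tokens by
// matching at a moving offset instead of copying the tail with substr().
function tokenize(string $input): array
{
    $tokens = [];
    $offset = 0;
    $length = strlen($input);
    while ($offset < $length) {
        // \G pins the match to $offset, so the pattern can only match
        // starting exactly at the current position.
        if (!preg_match('/\G(?:[a-zA-Z]+|\s+)/', $input, $m, 0, $offset)) {
            throw new RuntimeException("Unexpected character at offset $offset");
        }
        $tokens[] = $m[0];
        $offset += strlen($m[0]);
    }
    return $tokens;
}

print_r(tokenize("abc def"));
```

The loop advances $offset by the length of each match, so the original string is never sliced in user code.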
 [2004-12-06 16:00 UTC] tony2001@php.net
Thank you for taking the time to write to us, but this is not
a bug. Please double-check the documentation available at
http://www.php.net/manual/ and the instructions on how to report
a bug at http://bugs.php.net/how-to-report.php

That's expected behaviour, because you're trying to match from the beginning of the string and using an offset at the same time.
Take a look at the example below.

Both of the following lines match:
preg_match("/[a-zA-Z]+/", $string, $matches, 0, 1)
preg_match("/[a-zA-Z]+/", substr($string, 1), $matches, 0)

and here only the second line will match:
preg_match("/^[a-zA-Z]+/", $string, $matches, 0, 1)
preg_match("/^[a-zA-Z]+/", substr($string, 1), $matches, 0)
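
The difference comes down to what ^ refers to: it anchors to the start of the whole subject string, not to the offset. As a minimal sketch of how to get substr()-like behaviour with an offset, the example below uses the \G assertion and the A (anchored) pattern modifier, both standard PCRE features. The variable names are illustrative and not taken from the report.

```php
<?php
$string = "abc def";

// ^ anchors to the start of the whole subject, so with offset 4 it cannot match.
$caret = preg_match('/^[a-zA-Z]+/', $string, $m1, 0, 4);

// \G anchors to the position where matching starts, i.e. the given offset.
$atOffset = preg_match('/\G[a-zA-Z]+/', $string, $m2, 0, 4);

// The A modifier likewise forces the match to begin at the start position.
$anchored = preg_match('/[a-zA-Z]+/A', $string, $m3, 0, 4);

echo $caret ? "Matches\n" : "Does not match\n";                 // Does not match
echo $atOffset ? "Matches: {$m2[0]}\n" : "Does not match\n";    // Matches: def
echo $anchored ? "Matches: {$m3[0]}\n" : "Does not match\n";    // Matches: def
```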
 