Bug #52118 preg_replace gives bad output if matching group equals whole string
Submitted: 2010-06-18 11:25 UTC Modified: 2010-06-18 12:04 UTC
From: tomasz dot slominski at gmail dot com Assigned:
Status: Not a bug Package: *Regular Expressions
PHP Version: Irrelevant OS: WIN XP SP3
Private report: No CVE-ID: None
 [2010-06-18 11:25 UTC] tomasz dot slominski at gmail dot com
Description:
------------
preg_replace goes mad when the matching group is (.*). It seems that the 
substitution is made twice instead of once.

Test script:
---------------
var_dump(preg_replace(array("/(.*)/"), array('!$1'),'test'));
var_dump(preg_replace(array("/(.*)/"), array('$1!'),'test'));
var_dump(preg_replace(array("/(.*)/"), array('!$1!'),'test'));

Expected result:
----------------
string '!test' (length=5)
string 'test!' (length=5)
string '!test!' (length=6)


Actual result:
--------------
string '!test!' (length=6)
string 'test!!' (length=6)
string '!test!!!' (length=8)

 [2010-06-18 11:40 UTC] tomasz dot slominski at gmail dot com
Quick workaround:   var_dump(preg_replace(array("/(.+)(.*)/"), array('!$1$2'),'test')); 
gives the expected output (!test)
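
Some other ways to get a single replacement, sketched as alternatives to the two-group pattern above. The variable names are illustrative only, and which variant fits depends on how empty subjects should be treated:

```php
<?php
// 1. Require at least one character, so the trailing empty match cannot occur.
$plus = preg_replace('/(.+)/', '!$1', 'test');

// 2. Anchor the pattern at the start of the subject; the second attempt
//    at the end of the string then fails because ^ does not match there.
$anchored = preg_replace('/^(.*)/', '!$1', 'test');

// 3. Keep the original pattern but cap the number of replacements at one
//    via the fourth ($limit) argument.
$limited = preg_replace('/(.*)/', '!$1', 'test', 1);

var_dump($plus, $anchored, $limited); // each is "!test"
```

The variants differ on an empty subject: (.+) finds no match and leaves '' unchanged, while the anchored and limited versions still match once and return '!'.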
 [2010-06-18 12:04 UTC] salathe@php.net
-Status: Open +Status: Bogus
 [2010-06-18 12:04 UTC] salathe@php.net
Thank you for taking the time to write to us, but this is not
a bug. Please double-check the documentation available at
http://www.php.net/manual/ and the instructions on how to report
a bug at http://bugs.php.net/how-to-report.php

This is expected behaviour. When a match is found, PCRE checks for the next 
possible match starting at the point immediately after the previous match.  In 
your case, the .* first matches the entire subject string "test", then 
matches again at the very end of the string, since the * quantifier allows 
matching nothing.
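
The two matches described above can be observed directly with preg_match_all and PREG_OFFSET_CAPTURE, which reports each match together with its byte offset. This is a minimal sketch; the variable names are illustrative only:

```php
<?php
$matches = [];
// PREG_OFFSET_CAPTURE turns every match into a [text, offset] pair.
$count = preg_match_all('/(.*)/', 'test', $matches, PREG_OFFSET_CAPTURE);

var_dump($count);          // int(2)
var_dump($matches[0][0]);  // ["test", 0]: the whole subject
var_dump($matches[0][1]);  // ["", 4]: the empty match at the end
```

preg_replace substitutes once per match, so the replacement template is applied to 'test' and then again to the empty string at offset 4.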
 [2010-06-18 13:36 UTC] tomasz dot slominski at gmail dot com
OK, but shouldn't a greedy .* consume the whole string? 

 var_dump(preg_replace("/(.*)/U", '$1!','test'));
 gives
 string '!t!!e!!s!!t!!' (length=13)

and that's OK, but why does 

  var_dump(preg_replace("/(.*)/", '$1!','test'));

produce 

 string 'test!!' (matching 'test' - nothing)

instead of 

 string '!test!!'  (matching nothing - 'test' - nothing)  or string 'test!'   
(matching 'test')?

It's counter-intuitive, at the very least.
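
One way to see why there is no empty match before 'test' is to log every match in the order it is found. The sketch below uses preg_replace_callback for that; traceMatches is a hypothetical helper written for this illustration, not part of PHP:

```php
<?php
// Hypothetical helper: returns the matched strings in the order they are found.
function traceMatches(string $pattern, string $subject): array
{
    $seen = [];
    preg_replace_callback($pattern, function (array $m) use (&$seen) {
        $seen[] = $m[0];   // record the full match
        return $m[0];      // leave the subject unchanged
    }, $subject);
    return $seen;
}

$greedy = traceMatches('/(.*)/', 'test');   // ['test', '']
$lazy   = traceMatches('/(.*)/U', 'test');  // ['', 't', '', 'e', '', 's', '', 't', '']

var_dump($greedy, $lazy);
```

At offset 0 the greedy pattern yields a single match, and that match is 'test', so nothing is left to match in front of it. The scan then resumes at offset 4, where only the empty string can match. With the U modifier the first match at each offset is empty, and the retry at the same offset must be non-empty, which produces the alternating sequence and the 13-character result quoted above.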
 