|
php.net | support | documentation | report a bug | advanced search | search howto | statistics | random bug | login |
[2003-06-01 22:38 UTC] stuge-phpbugs at cdy dot org
Verified with 4.3.0 and 4.3.2rc2 but haven't tried 4.3.2. NEWS for 4.3.2 do not mention any *reg_replace changes. Reproducible with eregi_replace().
<?php
$f="xyz";
$g=ereg_replace("z?$","a",$f);
$h=preg_replace("/z?$/","a",$f);
print $f; /* xyz */
print " ".$g; /* xyaa, I want xya */
print " ".$h; /* xyaa, I want xya */
?>
At least in CVS php4/ext/standard/reg.c I believe the problem is that regexec() on line 306 will be called a second time after the first iteration has already matched to the end of the string.
Matching from the end of the string should only be allowed on the very first iteration.
If this analysis is correct, the following patch might work, but I haven't tried it.
--- reg.c.org 2003-06-02 05:23:51.000000000 +0200
+++ reg.c 2003-06-02 05:26:06.000000000 +0200
@@ -302,7 +302,7 @@
err = pos = 0;
buf[0] = '\0';
- while (!err) {
+ do {
err = regexec(&re, &string[pos], re.re_nsub+1, subs, (pos ? REG_NOTBOL : 0));
if (err && err != REG_NOMATCH) {
@@ -396,7 +396,7 @@
/* stick that last bit of string on our output */
strcat(buf, &string[pos]);
}
- }
+ } while(!err && string[pos]);
/* don't want to leak memory .. */
efree(subs);
PatchesPull RequestsHistoryAllCommentsChangesGit/SVN commits
|
|||||||||||||||||||||||||||||||||||||
Copyright © 2001-2025 The PHP GroupAll rights reserved. |
Last updated: Fri Nov 07 06:00:01 2025 UTC |
$ echo xyz|perl -e 'while(<STDIN>) { $_=~ s/z?$/a/;print}' xya $ echo xyz|perl -e 'while(<STDIN>) { $_=~ s/z?$/a/g;print}' xyaa a$ echo xyz|sed -e 's/z\?$/a/g' xya $ echo xyz|sed -e 's/z\?$/a/' xya perl and sed both behave the way I would expect. [ep]reg_replace() don't. :) A comment from php.net/manual/en/pcre.pattern.modifiers.php: /g is default in PHP. You don't need to set it. preg_replace() similarity with perl may not be desired, I only included it because it showed the exact same behaviour while using a different regex code base, indicating the same algorithm. I believe Andrei's comment in 23903 is a bit quick, at least if this is a dupe, which I would say it is. Again, I believe the error to be that the regexp is "ran" one extra time after all of the input string has already been matched and processed inside [ep]reg_replace(). Because of this, any regexp that matches the empty string will cause one extra replacement to occur at the end of the input string. At least inside ereg_replace() the loop should just quit if the entire string has been processed after the first iteration. I'll try my patch and recomment.Ok. This patch works in that it gives me the expected results. I modified it slightly to care less about bytes in the string and more about the length variables, that way it's at least widechar-prepared. Woo-hoo! :) Currently running make test.. All reg tests PASS, I've also added a test for this bug. Here's the patch against current CVS: ---8<--- diff -ur php4.cvs/ext/pcre/php_pcre.c php4.new/ext/pcre/php_pcre.c --- php4.cvs/ext/pcre/php_pcre.c 2003-06-07 03:51:13.000000000 +0200 +++ php4.new/ext/pcre/php_pcre.c 2003-06-07 03:52:02.000000000 +0200 @@ -797,7 +797,7 @@ *result_len = 0; start_offset = 0; - while (1) { + do { /* Execute the regular expression. */ count = pcre_exec(re, extra, subject, subject_len, start_offset, exoptions|g_notempty, offsets, size_offsets); @@ -933,7 +933,7 @@ /* Advance to the next piece. */ start_offset = offsets[1]; - } + } while(start_offset < subject_len); efree(offsets); diff -ur php4.cvs/ext/standard/reg.c php4.new/ext/standard/reg.c --- php4.cvs/ext/standard/reg.c 2003-06-07 03:49:19.000000000 +0200 +++ php4.new/ext/standard/reg.c 2003-06-07 03:50:43.000000000 +0200 @@ -302,7 +302,7 @@ err = pos = 0; buf[0] = '\0'; - while (!err) { + do { err = regexec(&re, &string[pos], re.re_nsub+1, subs, (pos ? REG_ NOTBOL : 0)); if (err && err != REG_NOMATCH) { @@ -396,7 +396,7 @@ /* stick that last bit of string on our output */ strcat(buf, &string[pos]); } - } + } while(!err && pos < string_len); /* don't want to leak memory .. */ efree(subs); --->8--- And ext/standard/tests/reg/017.phpt: ---8<--- --TEST-- empty expression replaced extra time at end-of-string --POST-- --GET-- --FILE-- <?php echo ereg_replace('.?$',"a","xyz")?> --EXPECT-- xya --->8--- I make no warranties whatsoever that this will not break things, although I don't think it should. :)