Bug #34947 repeated character in output of preg_replace
Submitted: 2005-10-21 16:41 UTC Modified: 2005-10-23 15:25 UTC
From: cold_candor at hotmail dot com Assigned:
Status: Not a bug Package: PCRE related
PHP Version: 5.0.5 OS: windows XP Pro
Private report: No CVE-ID: None
 [2005-10-21 16:41 UTC] cold_candor at hotmail dot com
Description:
------------
I can't imagine any reason why this should happen, and all the documentation I've read says it shouldn't, so here you go:

If, when using the preg_replace function, I attempt to match against an entire line to remove leading and ending whitespace (using ^ and $ but no modifiers, like D or m), the function does what I'd expect.  However, if I neglect to use the leading ^, the function still matches what I'd expect it to, but the replacement string is entered twice!  Even if I don't use the backreference in the replacement, all non-reference related characters are still repeated!

If I split the replacement to deal with leading and trailing whitespace separately, it always repeats the replacement string!

I did not configure anything special when I downloaded PHP. I simply grabbed the Windows zip file provided on the website (www.php.net), extracted it, added the path to my environment variables, and started using it.

I know of nothing special about my setup.

I have made no changes to the php.ini file.

If it turns out that this is not a bug, don't tell me to use the damn support page, everything on there I either can't do or have tried with no result or have no entries relating to my problem or (what I really want) have no way to actually ask a question.  Since you have to figure it out anyway, just tell me what went wrong!

Reproduce code:
---------------
<?php
$text = "   a b c   ";
$newText = preg_replace("/\s*(.*?)\s*$/", "$1\n", $text);
for($i = 0; $i < strlen($newText); $i++) {
  echo ord($newText[$i]), '~';
} // End for loop
echo "\n$newText";
$newText = preg_replace("/\s*(.*?)\s*$/", "  $1\n gh ", $text);
for($i = 0; $i < strlen($newText); $i++) {
  echo ord($newText[$i]), '~';
} // End for loop
echo "\n$newText";
$newText = preg_replace("/^\s*/", "", $text);
$newText = preg_replace("/\s*$/", "\n", $newText);
for($i = 0; $i < strlen($newText); $i++) {
  echo ord($newText[$i]), '~';
} // End for loop
echo "\n$newText";
?>

Expected result:
----------------
The following three outputs should be produced (labels added for readability):

ASCII values:  97~32~98~32~99~10~
Viewed output:  "a b c
"

ASCII values:  32~32~97~32~98~32~99~10~32~103~104~32~
Viewed output:  "  a b c
 gh "

ASCII values:  97~32~98~32~99~10~
Viewed output:  "a b c
"

Actual result:
--------------
The following three outputs are what was actually produced (labels added for readability):

ASCII values:  97~32~98~32~99~10~10~
Viewed output:  "a b c

"

ASCII values:  32~32~97~32~98~32~99~10~32~103~104~32~32~32~10~32~103~104~32~
Viewed output:  "  a b c
 gh   
 gh "

ASCII values:  97~32~98~32~99~10~10~
Viewed output:  "a b c

"

 [2005-10-23 15:25 UTC] tony2001@php.net
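* stands for "0 or more", so "\s*(.*?)\s*" matches "" (the empty string) too. After the first match consumes the subject up to its end, the pattern matches a second time as an empty string at the very end, and preg_replace inserts the replacement again. A leading ^ prevents that second match because it can only match at the start of the subject.
This is not a PHP problem.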
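
The explanation above can be checked with a short script. This is only a sketch: the variable names ($doubled, $anchored, $limited, $trimmed) are made up for illustration, and it assumes the same subject string as the reproduce code. It lists every match the unanchored pattern finds, then shows three possible ways to get a single replacement.

```php
<?php
$text = "   a b c   ";

// List every match of the unanchored pattern, with its byte offset.
preg_match_all('/\s*(.*?)\s*$/', $text, $matches, PREG_OFFSET_CAPTURE);
foreach ($matches[0] as $m) {
    // $m[0] is the matched text, $m[1] is its offset in $text.
    printf("offset %d, length %d\n", $m[1], strlen($m[0]));
}
// offset 0, length 11
// offset 11, length 0   <- the empty match at the end of the subject

// Unanchored: both matches are replaced, so the newline appears twice.
$doubled = preg_replace('/\s*(.*?)\s*$/', '$1' . "\n", $text);

// Option 1: anchor the pattern at the start, so it cannot match again at the end.
$anchored = preg_replace('/^\s*(.*?)\s*$/', '$1' . "\n", $text);

// Option 2: keep the pattern, but cap preg_replace at one replacement (4th argument).
$limited = preg_replace('/\s*(.*?)\s*$/', '$1' . "\n", $text, 1);

// Option 3: skip regular expressions and let trim() strip the whitespace.
$trimmed = trim($text) . "\n";

var_dump($doubled === "a b c\n\n");  // bool(true)
var_dump($anchored === "a b c\n");   // bool(true)
var_dump($limited === "a b c\n");    // bool(true)
var_dump($trimmed === "a b c\n");    // bool(true)
```

Which option fits depends on the input. trim() strips a fixed set of characters that is not identical to \s, and the limit argument only helps when a single replacement per subject is what is wanted.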
 