Bug #29802 Glitch using preg_match and preg_replace
Submitted: 2004-08-23 20:34 UTC Modified: 2004-09-13 04:31 UTC
Votes:3
Avg. Score:4.3 ± 0.9
Reproduced:2 of 2 (100.0%)
Same Version:2 (100.0%)
Same OS:2 (100.0%)
From: wesleygoku at yahoo dot com dot br Assigned:
Status: Not a bug Package: PCRE related
PHP Version: 4.3.9RC2 OS: Windows 2000 SP4
Private report: No CVE-ID: None
 [2004-08-23 20:34 UTC] wesleygoku at yahoo dot com dot br
Description:
------------
I'm trying to use preg_replace to evaluate my own XHTML tags, passing their arguments and contents to another function. There are two forms: the simple one, without content (like <wstag x="y" />), and the complex one, with content (like <wstag x="y">x</wstag>). Each form is parsed by its own function, and to evaluate ALL the complex tags (even tags nested inside other tags) I'm using a while loop.

The problem is very odd. It happens when more than one complex tag, preceded by some simple tags, sits inside another complex tag (the code makes this clearer). With only a few complex lines, or with no simple tags before them, the results are good. Otherwise the while condition just doesn't work, and the Apache thread takes much longer to process the script (I don't know why, as the while body runs only twice).

See the URL below for more details. Note that the first code sample should work (try removing the second one), but the second does not!

Reproduce code:
---------------
http://wstec.net/tmp/php_bug_pcre.html
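
The code behind the link above is not reproduced in this report. As a minimal sketch of the loop described in the Description, assuming a hypothetical callback name, a hypothetical pattern, and made-up tag names (none of which are the reporter's actual code), the approach might look like this:

```php
<?php
// Hypothetical callback: formats one matched complex tag in the
// NEO_HTML( "name", "args", "content" ) shape seen in this report's output.
function neo_html_callback($m)
{
    return 'NEO_HTML( "' . $m[1] . '", "' . trim($m[2]) . '", "' . $m[3] . '" )';
}

// Expands complex tags (<wsNAME ...>content</wsNAME>) from the inside out.
function expand_complex_tags($text)
{
    // Assumed pattern: an opening tag that is not self-closing, followed by
    // content that contains no other non-self-closing <ws...> opening tag,
    // followed by the matching closing tag.
    $pattern = '~<ws(\w+)([^>]*)(?<!/)>((?:(?!<ws\w+[^>]*(?<!/)>).)*?)</ws\1>~s';

    // Each pass replaces the innermost complex tags; repeat until none match.
    while (preg_match($pattern, $text) === 1) {
        $next = preg_replace_callback($pattern, 'neo_html_callback', $text);
        if ($next === null) {
            // The replacement itself failed; stop rather than lose the text.
            break;
        }
        $text = $next;
    }
    return $text;
}
```

Called on `<wslist a="1"><wsimg src="" /><wsitem id="x">hi</wsitem></wslist>`, this sketch converts the inner item on the first pass and the outer list on the second, and leaves the simple `<wsimg ... />` tag for a separate step.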

Expected result:
----------------
<pre>NEO_HTML( &quot;list&quot;, &quot;action=&quot;/?e=2&amp;d=lay&quot;&quot;, &quot;
	&lt;wsimg src=&quot;&quot; /&gt;
	&lt;wsimg src=&quot;&quot; /&gt;
	NEO_HTML( &quot;item&quot;, &quot;id=&quot;Diret?rio anterior&quot;&quot;, &quot;NEO_HTML( &quot;col&quot;, &quot;width=&quot;284&quot; href=&quot;&quot; target=&quot;_blank&quot;&quot;, &quot; &lt;img width=&quot;10&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; /&gt; &quot; )NEO_HTML( &quot;col&quot;, &quot;width=&quot;10&quot;&quot;, &quot;&lt;b&gt;Diret?rio anterior&lt;/b&gt;&quot; )NEO_HTML( &quot;col&quot;, &quot;href=&quot;&quot; target=&quot;_blank&quot;&quot;, &quot;Diret?rio&quot; )NEO_HTML( &quot;col&quot;, &quot;href=&quot;&quot; target=&quot;_blank&quot; align=&quot;right&quot;&quot;, &quot;&quot; )&quot; )
	NEO_HTML( &quot;item&quot;, &quot;id=&quot;css.css&quot;&quot;, &quot;NEO_HTML( &quot;col&quot;, &quot;width=&quot;284&quot; href=&quot;/css.css&quot; target=&quot;_blank&quot;&quot;, &quot; &lt;img width=&quot;10&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; /&gt; &quot; )NEO_HTML( &quot;col&quot;, &quot;width=&quot;10&quot;&quot;, &quot;css.css&quot; )NEO_HTML( &quot;col&quot;, &quot;href=&quot;/css.css&quot; target=&quot;_blank&quot;&quot;, &quot;Estilo CSS&quot; )NEO_HTML( &quot;col&quot;, &quot;href=&quot;/css.css&quot; target=&quot;_blank&quot; align=&quot;right&quot;&quot;, &quot;3,23 KB&quot; )&quot; )
	NEO_HTML( &quot;item&quot;, &quot;id=&quot;ico.gif&quot;&quot;, &quot;NEO_HTML( &quot;col&quot;, &quot;width=&quot;284&quot; href=&quot;/ico.gif&quot; target=&quot;_blank&quot;&quot;, &quot; &lt;img width=&quot;10&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; /&gt; &quot; )NEO_HTML( &quot;col&quot;, &quot;width=&quot;10&quot;&quot;, &quot;ico.gif&quot; )NEO_HTML( &quot;col&quot;, &quot;href=&quot;/ico.gif&quot; target=&quot;_blank&quot;&quot;, &quot;Imagem&quot; )NEO_HTML( &quot;col&quot;, &quot;href=&quot;/ico.gif&quot; target=&quot;_blank&quot; align=&quot;right&quot;&quot;, &quot;66 Bytes&quot; )&quot; )
	NEO_HTML( &quot;item&quot;, &quot;id=&quot;list_order_asc.gif&quot;&quot;, &quot;NEO_HTML( &quot;col&quot;, &quot;width=&quot;284&quot; href=&quot;/list_order_asc.gif&quot; target=&quot;_blank&quot;&quot;, &quot; &lt;img width=&quot;10&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; /&gt; &quot; )NEO_HTML( &quot;col&quot;, &quot;width=&quot;10&quot;&quot;, &quot;list_order_asc.gif&quot; )NEO_HTML( &quot;col&quot;, &quot;href=&quot;/list_order_asc.gif&quot; target=&quot;_blank&quot;&quot;, &quot;Imagem&quot; )NEO_HTML( &quot;col&quot;, &quot;href=&quot;/list_order_asc.gif&quot; target=&quot;_blank&quot; align=&quot;right&quot;&quot;, &quot;61 Bytes&quot; )&quot; )
	NEO_HTML( &quot;item&quot;, &quot;id=&quot;list_order_desc.gif&quot;&quot;, &quot;NEO_HTML( &quot;col&quot;, &quot;width=&quot;284&quot; href=&quot;/list_order_desc.gif&quot; target=&quot;_blank&quot;&quot;, &quot; &lt;img width=&quot;10&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; /&gt; &quot; )NEO_HTML( &quot;col&quot;, &quot;width=&quot;10&quot;&quot;, &quot;list_order_desc.gif&quot; )NEO_HTML( &quot;col&quot;, &quot;href=&quot;/list_order_desc.gif&quot; target=&quot;_blank&quot;&quot;, &quot;Imagem&quot; )NEO_HTML( &quot;col&quot;, &quot;href=&quot;/list_order_desc.gif&quot; target=&quot;_blank&quot; align=&quot;right&quot;&quot;, &quot;61 Bytes&quot; )&quot; )
	NEO_HTML( &quot;item&quot;, &quot;id=&quot;top_admin.gif&quot;&quot;, &quot;NEO_HTML( &quot;col&quot;, &quot;width=&quot;284&quot; href=&quot;/top_admin.gif&quot; target=&quot;_blank&quot;&quot;, &quot; &lt;img width=&quot;10&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; /&gt; &quot; )NEO_HTML( &quot;col&quot;, &quot;width=&quot;10&quot;&quot;, &quot;top_admin.gif&quot; )NEO_HTML( &quot;col&quot;, &quot;href=&quot;/top_admin.gif&quot; target=&quot;_blank&quot;&quot;, &quot;Imagem&quot; )NEO_HTML( &quot;col&quot;, &quot;href=&quot;/top_admin.gif&quot; target=&quot;_blank&quot; align=&quot;right&quot;&quot;, &quot;1,17 KB&quot; )&quot; )
	NEO_HTML( &quot;item&quot;, &quot;id=&quot;top_bg.gif&quot;&quot;, &quot;NEO_HTML( &quot;col&quot;, &quot;width=&quot;284&quot; href=&quot;/top_bg.gif&quot; target=&quot;_blank&quot;&quot;, &quot; &lt;img width=&quot;10&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; /&gt; &quot; )NEO_HTML( &quot;col&quot;, &quot;width=&quot;10&quot;&quot;, &quot;top_bg.gif&quot; )NEO_HTML( &quot;col&quot;, &quot;href=&quot;/top_bg.gif&quot; target=&quot;_blank&quot;&quot;, &quot;Imagem&quot; )NEO_HTML( &quot;col&quot;, &quot;href=&quot;/top_bg.gif&quot; target=&quot;_blank&quot; align=&quot;right&quot;&quot;, &quot;149 Bytes&quot; )&quot; )
	NEO_HTML( &quot;item&quot;, &quot;id=&quot;top_logo.gif&quot;&quot;, &quot;NEO_HTML( &quot;col&quot;, &quot;width=&quot;284&quot; href=&quot;/top_logo.gif&quot; target=&quot;_blank&quot;&quot;, &quot; &lt;img width=&quot;10&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; /&gt; &quot; )NEO_HTML( &quot;col&quot;, &quot;width=&quot;10&quot;&quot;, &quot;top_logo.gif&quot; )NEO_HTML( &quot;col&quot;, &quot;href=&quot;/top_logo.gif&quot; target=&quot;_blank&quot;&quot;, &quot;Imagem&quot; )NEO_HTML( &quot;col&quot;, &quot;href=&quot;/top_logo.gif&quot; target=&quot;_blank&quot; align=&quot;right&quot;&quot;, &quot;2,08 KB&quot; )&quot; )
	&lt;wsfooter search=&quot;&quot; action=&quot;move|Mover&quot; action=&quot;copy|Copiar&quot; action=&quot;remove|Remover|3&quot; /&gt;
&quot; )</pre>

Actual result:
--------------
<pre>NEO_HTML( &quot;list&quot;, &quot;action=&quot;/?e=2&amp;d=lay&quot;&quot;, &quot;
	&lt;wsimg src=&quot;&quot; /&gt;
	&lt;wsimg src=&quot;&quot; /&gt;
	NEO_HTML( &quot;item&quot;, &quot;id=&quot;Diret?rio anterior&quot;&quot;, &quot;NEO_HTML( &quot;col&quot;, &quot;width=&quot;284&quot; href=&quot;&quot; target=&quot;_blank&quot;&quot;, &quot; &lt;img width=&quot;10&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; /&gt; &quot; )NEO_HTML( &quot;col&quot;, &quot;width=&quot;10&quot;&quot;, &quot;&lt;b&gt;Diret?rio anterior&lt;/b&gt;&quot; )NEO_HTML( &quot;col&quot;, &quot;href=&quot;&quot; target=&quot;_blank&quot;&quot;, &quot;Diret?rio&quot; )NEO_HTML( &quot;col&quot;, &quot;href=&quot;&quot; target=&quot;_blank&quot; align=&quot;right&quot;&quot;, &quot;&quot; )&quot; )
	NEO_HTML( &quot;item&quot;, &quot;id=&quot;css.css&quot;&quot;, &quot;NEO_HTML( &quot;col&quot;, &quot;width=&quot;284&quot; href=&quot;/css.css&quot; target=&quot;_blank&quot;&quot;, &quot; &lt;img width=&quot;10&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; /&gt; &quot; )NEO_HTML( &quot;col&quot;, &quot;width=&quot;10&quot;&quot;, &quot;css.css&quot; )NEO_HTML( &quot;col&quot;, &quot;href=&quot;/css.css&quot; target=&quot;_blank&quot;&quot;, &quot;Estilo CSS&quot; )NEO_HTML( &quot;col&quot;, &quot;href=&quot;/css.css&quot; target=&quot;_blank&quot; align=&quot;right&quot;&quot;, &quot;3,23 KB&quot; )&quot; )
	NEO_HTML( &quot;item&quot;, &quot;id=&quot;ico.gif&quot;&quot;, &quot;NEO_HTML( &quot;col&quot;, &quot;width=&quot;284&quot; href=&quot;/ico.gif&quot; target=&quot;_blank&quot;&quot;, &quot; &lt;img width=&quot;10&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; /&gt; &quot; )NEO_HTML( &quot;col&quot;, &quot;width=&quot;10&quot;&quot;, &quot;ico.gif&quot; )NEO_HTML( &quot;col&quot;, &quot;href=&quot;/ico.gif&quot; target=&quot;_blank&quot;&quot;, &quot;Imagem&quot; )NEO_HTML( &quot;col&quot;, &quot;href=&quot;/ico.gif&quot; target=&quot;_blank&quot; align=&quot;right&quot;&quot;, &quot;66 Bytes&quot; )&quot; )
	NEO_HTML( &quot;item&quot;, &quot;id=&quot;list_order_asc.gif&quot;&quot;, &quot;NEO_HTML( &quot;col&quot;, &quot;width=&quot;284&quot; href=&quot;/list_order_asc.gif&quot; target=&quot;_blank&quot;&quot;, &quot; &lt;img width=&quot;10&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; /&gt; &quot; )NEO_HTML( &quot;col&quot;, &quot;width=&quot;10&quot;&quot;, &quot;list_order_asc.gif&quot; )NEO_HTML( &quot;col&quot;, &quot;href=&quot;/list_order_asc.gif&quot; target=&quot;_blank&quot;&quot;, &quot;Imagem&quot; )NEO_HTML( &quot;col&quot;, &quot;href=&quot;/list_order_asc.gif&quot; target=&quot;_blank&quot; align=&quot;right&quot;&quot;, &quot;61 Bytes&quot; )&quot; )
	NEO_HTML( &quot;item&quot;, &quot;id=&quot;list_order_desc.gif&quot;&quot;, &quot;NEO_HTML( &quot;col&quot;, &quot;width=&quot;284&quot; href=&quot;/list_order_desc.gif&quot; target=&quot;_blank&quot;&quot;, &quot; &lt;img width=&quot;10&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; /&gt; &quot; )NEO_HTML( &quot;col&quot;, &quot;width=&quot;10&quot;&quot;, &quot;list_order_desc.gif&quot; )NEO_HTML( &quot;col&quot;, &quot;href=&quot;/list_order_desc.gif&quot; target=&quot;_blank&quot;&quot;, &quot;Imagem&quot; )NEO_HTML( &quot;col&quot;, &quot;href=&quot;/list_order_desc.gif&quot; target=&quot;_blank&quot; align=&quot;right&quot;&quot;, &quot;61 Bytes&quot; )&quot; )
	NEO_HTML( &quot;item&quot;, &quot;id=&quot;top_admin.gif&quot;&quot;, &quot;NEO_HTML( &quot;col&quot;, &quot;width=&quot;284&quot; href=&quot;/top_admin.gif&quot; target=&quot;_blank&quot;&quot;, &quot; &lt;img width=&quot;10&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; /&gt; &quot; )NEO_HTML( &quot;col&quot;, &quot;width=&quot;10&quot;&quot;, &quot;top_admin.gif&quot; )NEO_HTML( &quot;col&quot;, &quot;href=&quot;/top_admin.gif&quot; target=&quot;_blank&quot;&quot;, &quot;Imagem&quot; )NEO_HTML( &quot;col&quot;, &quot;href=&quot;/top_admin.gif&quot; target=&quot;_blank&quot; align=&quot;right&quot;&quot;, &quot;1,17 KB&quot; )&quot; )
	NEO_HTML( &quot;item&quot;, &quot;id=&quot;top_bg.gif&quot;&quot;, &quot;NEO_HTML( &quot;col&quot;, &quot;width=&quot;284&quot; href=&quot;/top_bg.gif&quot; target=&quot;_blank&quot;&quot;, &quot; &lt;img width=&quot;10&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; /&gt; &quot; )NEO_HTML( &quot;col&quot;, &quot;width=&quot;10&quot;&quot;, &quot;top_bg.gif&quot; )NEO_HTML( &quot;col&quot;, &quot;href=&quot;/top_bg.gif&quot; target=&quot;_blank&quot;&quot;, &quot;Imagem&quot; )NEO_HTML( &quot;col&quot;, &quot;href=&quot;/top_bg.gif&quot; target=&quot;_blank&quot; align=&quot;right&quot;&quot;, &quot;149 Bytes&quot; )&quot; )
	NEO_HTML( &quot;item&quot;, &quot;id=&quot;top_logo.gif&quot;&quot;, &quot;NEO_HTML( &quot;col&quot;, &quot;width=&quot;284&quot; href=&quot;/top_logo.gif&quot; target=&quot;_blank&quot;&quot;, &quot; &lt;img width=&quot;10&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; a=&quot;1&quot; /&gt; &quot; )NEO_HTML( &quot;col&quot;, &quot;width=&quot;10&quot;&quot;, &quot;top_logo.gif&quot; )NEO_HTML( &quot;col&quot;, &quot;href=&quot;/top_logo.gif&quot; target=&quot;_blank&quot;&quot;, &quot;Imagem&quot; )NEO_HTML( &quot;col&quot;, &quot;href=&quot;/top_logo.gif&quot; target=&quot;_blank&quot; align=&quot;right&quot;&quot;, &quot;2,08 KB&quot; )&quot; )
	&lt;wsfooter search=&quot;&quot; action=&quot;move|Mover&quot; action=&quot;copy|Copiar&quot; action=&quot;remove|Remover|3&quot; /&gt;
&quot; )</pre>

History

 [2004-08-24 08:38 UTC] derick@php.net
Please provide a very short example without endless pieces of HTML/XML code.
 [2004-08-24 18:08 UTC] wesleygoku at yahoo dot com dot br
That is the problem: if I reduce the code, it works. I've checked the XHTML code and the regular expression, and everything looks fine.
 [2004-08-26 22:25 UTC] wesleygoku at yahoo dot com dot br
To clarify the bug, let me explain it better: I'm using preg_replace to replace some code. If this code is longer than some unknown length, something very weird happens.

I'm using a while loop to replace ALL of the occurrences in a string, but the preg_match call used to check whether the text still appears reports that the code DOESN'T match, although it does!
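
One thing a while condition like this cannot tell apart is a subject that does not match and a match attempt that failed: preg_match() returns 0 in the first case and FALSE in the second, and both end the loop. The thread does not establish which of the two happened here. As a sketch only, with a hypothetical helper name and sample patterns, the two cases could be separated like this (preg_last_error() exists only from PHP 5.2.0 on, so on the PHP 4.3.9 used in this report only the FALSE check would apply):

```php
<?php
// Hypothetical helper: reports whether preg_match() found a match, found
// none, or failed outright, instead of treating 0 and FALSE alike.
function check_match($pattern, $subject)
{
    // @ suppresses the warning PHP emits for an invalid pattern.
    $result = @preg_match($pattern, $subject);

    if ($result === false) {
        // An error code is only available where preg_last_error() exists.
        $code = function_exists('preg_last_error') ? preg_last_error() : null;
        return array('status' => 'error', 'code' => $code);
    }

    return array(
        'status' => $result === 1 ? 'match' : 'no match',
        'code'   => null,
    );
}
```

A loop that stops on 'error' and logs the code, rather than silently treating it as 'no match', would make a failed match attempt visible.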

Now, see the problem:

This doesn't work; note that many "<ws..." tags are still there:
Source code: http://wstec.net/tmp/php_bug_pcre/code_01.html
Results: http://wstec.net/tmp/php_bug_pcre/code_01.php

Now, without the first two <wsimg> tags, it works as it should:
Source code: http://wstec.net/tmp/php_bug_pcre/code_02.html
Results: http://wstec.net/tmp/php_bug_pcre/code_02.php

I ask you to take a good look at this bug, because I need it resolved to keep working on my script. Also note that the problem isn't in my code, but in PHP's preg_match and preg_replace functions.
 [2004-09-06 15:48 UTC] wesleygoku at yahoo dot com dot br
Updated to the latest PHP build, but the problem remains...
 [2004-09-10 13:10 UTC] nlopess@php.net
From what I could read, the problem is in your regex.
 [2004-09-13 00:06 UTC] wesleygoku at yahoo dot com dot br
The problem ISN'T in my regex. If you read my post from 26 Aug 10:25pm CEST, you'll see that the problem ONLY happens with many "tags"; if I reduce the number of tags (which I can't do in the original script), it doesn't happen! How can this kind of behaviour be caused by MY regex?!
 [2004-09-13 01:28 UTC] wesleygoku at yahoo dot com dot br
Please remove this bug as nobody wants to understand it.
I'll try to explain it better in another bug report...
 