Bug #30071 Many problems with Perl-Compatible Regular Expressions
Submitted: 2004-09-13 01:57 UTC Modified: 2004-09-13 06:37 UTC
From: wesleywex at gmail dot com Assigned:
Status: Not a bug Package: PCRE related
PHP Version: 4.3.9RC2 OS: Windows 2000 SP4
Private report: No CVE-ID: None


 [2004-09-13 01:57 UTC] wesleywex at gmail dot com
I'm trying to replace multiple tags, such as <wsb></wsb> and <wsimg>, with a single regex. With only a few tags I saw no problem (which suggests the regex is valid), but once the number of tags passes some unknown limit, the system does very odd things, like:

- Not parsing the text correctly
- Taking much more time to process the script if just one small new line is inserted

You can see these glitches in the 3 files I've prepared. All 3 have similar contents; I changed just one line between them.

Reproduce code:

Actual result:
Pay attention to the line with the <wsx />

- Took more than 1 second to parse the text just 2 times
- Parsed everything as it should be parsed
- Note that there are no spaces in the text between <!-- and -->

- Took more than 1 second to parse the text just 1 time
- Not everything that should be parsed was parsed
- Note that the only difference is the spaces in the text between <!-- and -->

- Took less than 1 second to parse the text the same 2 times as the first script
- Parsed everything as it should be parsed
- Note that the only difference from the previous scripts is the line starting with <wsx />

I hope you understand the issue now.


 [2004-09-13 04:29 UTC]
It's not that hard to write recursive regular expressions that bring the regex parser to its knees. As far as I can tell, this is what you have done. But it is hard to tell, as you don't even explain what you are expecting from a simple case like <wsb>abc</wsb>.
Plugging that into your regex gets me:
NEO_HTML( "b", "", "abc" )
Is that really what you are trying to get?
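
For reference, the transformation in the simple case can be sketched as follows. This is only an illustration: the reporter's actual regex is not shown in this report, so the pattern, the `to_neo_html` helper, and the use of Python's `re` module (rather than PHP's preg functions) are all assumptions made to show the expected input and output.

```python
import re

# Hypothetical pattern, NOT the reporter's regex (which is not shown here).
# It matches a non-nested <wsNAME attrs>body</wsNAME> element.
TAG = re.compile(r"<ws(\w+)([^>]*)>(.*?)</ws\1>", re.DOTALL)


def to_neo_html(text):
    """Replace each <wsNAME ...>body</wsNAME> with a NEO_HTML( ... ) call string."""
    def repl(match):
        name = match.group(1)           # tag name without the "ws" prefix
        attrs = match.group(2).strip()  # raw attribute text, possibly empty
        body = match.group(3)           # text between the opening and closing tag
        return 'NEO_HTML( "%s", "%s", "%s" )' % (name, attrs, body)
    return TAG.sub(repl, text)


if __name__ == "__main__":
    print(to_neo_html("<wsb>abc</wsb>"))  # NEO_HTML( "b", "", "abc" )
```

Stating the expected output for a small input like this makes it much easier to tell whether a larger input is being parsed incorrectly or merely slowly.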

If you really want to get to the bottom of it, grab PCRE from and plug your regular expression into PCRE directly using the provided pcredemo.c.  This way you won't be using any PHP code.  If it works perfectly, come back and say so.  If it still shows problems, file a bug with PCRE.
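
To illustrate how a small change in the input can multiply the matching time, here is a minimal sketch of catastrophic backtracking. It assumes Python's `re` module, which is a backtracking engine like PCRE but is not PCRE itself, and it uses a textbook nested-quantifier pattern, not the regex from this report. The names `SLOW`, `FAST`, and `time_match` are made up for the example.

```python
import re
import time

# Nested quantifiers: (a+)+ can split a run of "a" in exponentially many ways,
# and a backtracking engine tries them all before giving up.
SLOW = re.compile(r"(a+)+$")
# Accepts the same strings without nesting, so a failed match is rejected quickly.
FAST = re.compile(r"a+$")


def time_match(pattern, n):
    """Return (matched, seconds) for n 'a' characters followed by one 'b'."""
    subject = "a" * n + "b"  # the trailing "b" forces the match to fail
    start = time.perf_counter()
    matched = pattern.match(subject) is not None
    return matched, time.perf_counter() - start


if __name__ == "__main__":
    # Each extra character roughly doubles the work for the nested pattern.
    for n in (8, 12, 16, 20):
        _, slow = time_match(SLOW, n)
        _, fast = time_match(FAST, n)
        print(f"n={n:2d}  nested={slow:.6f}s  flat={fast:.6f}s")
```

If the regex in question shows the same kind of growth when run in isolation, the slowdown comes from the pattern and the matching engine rather than from the PHP code around it.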
 [2004-09-13 05:39 UTC] wesleywex at gmail dot com
Yes, the real regex is a function (in fact it's $this->output_neohtml( ... )), and that IS what I expect, but this is just what I DON'T get in this example.

I thought PHP would help me solve this bug, but, as you told me, I must report it to the PCRE creators. Can you say where? Because I didn't find anything on the link you sent.
 [2004-09-13 06:37 UTC]
Report PCRE bugs here (I guess):

Since it wasn't a bug in PHP, I'm marking this bogus.