PHP :: Bug #61001 :: Corruption of "=0a" but not "=a0"

Bug #61001	Corruption of "=0a" but not "=a0"
Submitted:	2012-02-07 10:10 UTC	Modified:	2012-02-07 19:39 UTC
From:	mike at eastghost dot com	Assigned:
Status:	Closed	Package:	PCRE related
PHP Version:	5.3.10	OS:	Ubuntu LAMP
Private report:	No	CVE-ID:	None

[2012-02-07 10:10 UTC] mike at eastghost dot com

Description:
------------
Passing the following UTF-8 text through the third line of the test script
(i.e., the preg_replace() call) causes an error in preg_replace():

[post=0a /]

Whereas passing the following UTF-8 text the same way causes no error:

[post=a0 /]

The problem seems to occur only when the "=" is followed by a digit and then
by a letter.  I briefly tried other combinations without causing an error.

The workaround is to replace the third line of the test script with this line
(i.e., use preg_replace_callback() instead of preg_replace()):

$out = preg_replace_callback( '@\[p(?:ost){0,1}=(.{1,24})\ {0,}\/\]@Uiu', 
'debdcode_post', $i_html );




Test script:
---------------
$html_ours[0] = '@\[p(?:ost){0,1}=(.{1,24})\ {0,}\/\]@Uieu';

$html_oursr[0] = 'debdcode_post( $1 )'; // irrelevant, use any misc func that looks up post id in db

$out = preg_replace( $html_ours, $html_oursr, $i_html );

Expected result:
----------------
The general use is in a BBCode-like parser for a forums app.

What should happen:

In the source text (in UTF-8 format), the string "[post=4ablahblah /]" should
be picked out of any given arbitrary input by preg_replace() and then
translated to a hyperlink by debdcode_post().

What is happening instead is an error in preg_replace(), presumably from
malformed UTF-8 or possibly a bug inside preg_replace() when dealing with the
particular character sequence "=<digit><letter(s) and/or digit(s)>".  Note that
it's the "=" followed by a digit and then by at least one letter and/or more
digits that triggers the error.  I hope this helps; thank you for looking.

Actual result:
--------------
Parse error: syntax error, unexpected T_STRING in /apath/Class/common_functions.inc(1405) : regexp code on line 1

Fatal error: preg_replace() [<a href='function.preg-replace'>function.preg-replace</a>]: Failed evaluating code: debdcode_post( 4f30abfddc79595474000020 ) in <file:line>

[2012-02-07 10:50 UTC] anon at anon dot anon

Not a bug and it has nothing to do with UTF8. The error message says why it's not working: the eval'd code has a syntax error, because you forgot to wrap the argument to debdcode_post in quotes. It should be:

$html_oursr[0] = 'debdcode_post(\'$1\')';

It works for `debdcode_post(a0)` because a0 is parsed as a constant (if you do `error_reporting(-1);` you will see the notice about the use of the undefined constant), but `debdcode_post(0a)` is always a syntax error.
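
The difference between the two inputs can be seen at the lexer level. The following is only a rough sketch, assuming the tokenizer extension is available; token_names() is a hypothetical helper written for this illustration, not part of the reporter's code. It lists the tokens PHP produces for each version of the eval'd replacement string:

```php
<?php
// Hypothetical helper: returns the significant tokens of a PHP snippet
// as "TOKEN_NAME(text)" strings, skipping the open tag, whitespace and
// single-character tokens such as "(" and ";".
function token_names($code) {
    $names = array();
    foreach (token_get_all('<?php ' . $code) as $tok) {
        if (is_array($tok) && $tok[0] !== T_OPEN_TAG && $tok[0] !== T_WHITESPACE) {
            $names[] = token_name($tok[0]) . '(' . $tok[1] . ')';
        }
    }
    return $names;
}

// "0a" splits into a number followed by a bare word: a syntax error.
$bad = token_names('debdcode_post( 0a );');

// "a0" is a single bare word, which PHP treats as a constant name.
$ok = token_names('debdcode_post( a0 );');

// Quoting the argument turns it into one string literal.
$quoted = token_names("debdcode_post( '0a' );");

print_r($bad);
print_r($ok);
print_r($quoted);
```

The first list contains a T_LNUMBER immediately followed by a T_STRING, which matches the "unexpected T_STRING" in the reported parse error.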

But the better (faster) solution is to use preg_replace_callback.
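
A minimal sketch of the preg_replace_callback approach, using the pattern from the report without the `e` modifier. The body of debdcode_post() and the `/post/...` link format are assumptions made up for this example; the real function looks the post id up in a database. The `e` modifier was later deprecated in PHP 5.5 and removed in PHP 7.0, so the callback form is also the one that keeps working.

```php
<?php
// Hypothetical stand-in for the reporter's debdcode_post(); the real
// one does a database lookup. The URL scheme here is invented.
function debdcode_post($id) {
    $safe = htmlspecialchars($id, ENT_QUOTES);
    return '<a href="/post/' . $safe . '">#' . $safe . '</a>';
}

// Same pattern as in the report, minus the "e" (eval) modifier.
$pattern = '@\[p(?:ost){0,1}=(.{1,24})\ {0,}\/\]@Uiu';

$i_html = 'See [post=0a /] and [post=a0 /].';

// The captured id reaches the callback as a plain string in $m[1],
// so no PHP code is built or evaluated from user input.
$out = preg_replace_callback(
    $pattern,
    function ($m) {
        return debdcode_post($m[1]);
    },
    $i_html
);

echo $out, "\n";
```

Because the match is passed as data rather than spliced into code, "0a" and "a0" are handled identically.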

[2012-02-07 19:38 UTC] mike at eastghost dot com

I tried your suggested fix and agree you are correct.  This is not a bug, just 
humantax_error.  BTW, I already changed the code to use preg_replace_callback() 
(vs what was an array of subject-regexs and replacement-strings passed to 
preg_replace) and also agree it is faster to use preg_replace_callback().  Thank 
you for looking.

[2012-02-07 19:39 UTC] mike at eastghost dot com

-Status: Open
+Status: Closed

[2012-02-07 19:39 UTC] mike at eastghost dot com

closed
