Bug #61001 Corruption of "=0a" but not "=a0"
Submitted: 2012-02-07 10:10 UTC
Modified: 2012-02-07 19:39 UTC
From: mike at eastghost dot com
Assigned:
Status: Closed
Package: PCRE related
PHP Version: 5.3.10
OS: Ubuntu LAMP
Private report: No
CVE-ID: None

 [2012-02-07 10:10 UTC] mike at eastghost dot com
Description:
------------
Passing the following UTF-8 text through the third line of the test script
(i.e., the preg_replace() call) causes an error in preg_replace():

[post=0a /]

Whereas passing the following UTF-8 text the same way causes no error:

[post=a0 /]

The problem seems to occur only when the "=" is followed by a digit and then
by a letter.  I briefly tried other combinations without triggering the error.

The workaround is to replace the third line of the test script with this line
(i.e., use preg_replace_callback() instead of preg_replace()):

$out = preg_replace_callback( '@\[p(?:ost){0,1}=(.{1,24})\ {0,}\/\]@Uiu', 'debdcode_post', $i_html );
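
For reference, a minimal sketch of how the callback form might look. preg_replace_callback() hands the callback an array of matches rather than a substituted code string, so the captured id arrives as $m[1] and is never evaluated as PHP. The body of debdcode_post() below is a hypothetical stand-in (the real function looks the post id up in a database), and the URL format is an assumption.

```php
<?php
// Hypothetical stand-in for the reporter's debdcode_post(): the real one
// looks the post id up in a database; this one only builds a link.
function debdcode_post(array $m): string
{
    // $m[0] is the whole match, $m[1] is the captured post id.
    $id = htmlspecialchars($m[1], ENT_QUOTES, 'UTF-8');
    return '<a href="/post/' . $id . '">post ' . $id . '</a>';
}

$i_html = 'See [post=0a /] and [post=a0 /].';

// Same pattern as in the report, minus the "e" modifier.
$out = preg_replace_callback(
    '@\[p(?:ost){0,1}=(.{1,24})\ {0,}\/\]@Uiu',
    'debdcode_post',
    $i_html
);

echo $out, "\n";
```

Both ids are handled the same way here, because the captured text stays a plain string from start to finish.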




Test script:
---------------
$html_ours[0] = '@\[p(?:ost){0,1}=(.{1,24})\ {0,}\/\]@Uieu';

$html_oursr[0] = 'debdcode_post( $1 )'; // irrelevant, use any misc func that looks up post id in db

$out = preg_replace( $html_ours, $html_oursr, $i_html );

Expected result:
----------------
The general use is in a BBCode-like parser for a forums app.

What should happen: in the source text (UTF-8), the string
"[post=4ablahblah /]" should be picked out of any arbitrary input by
preg_replace() and then translated to a hyperlink by debdcode_post().

What happens instead is an error in preg_replace(), presumably from malformed
UTF-8 or possibly a bug inside preg_replace() when dealing with the particular
character sequence "=<integer><letter(s) and/or integer(s)>".  Note that it is
the "=" followed by an integer and then by at least one letter and/or more
integers that triggers the error.  I hope this helps; thank you for looking.

Actual result:
--------------
Parse error: syntax error, unexpected T_STRING in /apath/Class/common_functions.inc(1405) : regexp code on line 1

Fatal error: preg_replace() [<a href='function.preg-replace'>function.preg-replace</a>]: Failed evaluating code: debdcode_post( 4f30abfddc79595474000020 ) in <file:line>

 [2012-02-07 10:50 UTC] anon at anon dot anon
Not a bug and it has nothing to do with UTF8. The error message says why it's not working: the eval'd code has a syntax error, because you forgot to wrap the argument to debdcode_post in quotes. It should be:

$html_oursr[0] = 'debdcode_post(\'$1\')';

It works for `debdcode_post(a0)` because a0 is parsed as a constant (if you do `error_reporting(-1);` you will see the notice about the use of the undefined constant), but `debdcode_post(0a)` is always a syntax error.
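
To make the difference concrete, here is a rough sketch that rebuilds the code string the "e" modifier would have handed to eval for each replacement template. The helper name build_eval_code() is made up for illustration, and str_replace() stands in for the backreference substitution that preg_replace() did internally (which also escaped the captured text, approximated here with addslashes()). Nothing is evaluated; the modifier itself was deprecated in PHP 5.5 and removed in PHP 7.0.

```php
<?php
// Hypothetical helper: mimics how "$1" in an /e replacement template was
// filled in with the captured text before the result was evaluated.
function build_eval_code(string $template, string $captured): string
{
    return str_replace('$1', addslashes($captured), $template);
}

$unquoted = 'debdcode_post( $1 )';    // template from the report
$quoted   = 'debdcode_post(\'$1\')';  // template with the suggested fix

foreach (['a0', '0a'] as $id) {
    echo build_eval_code($unquoted, $id), "\n";
    echo build_eval_code($quoted, $id), "\n";
}
// debdcode_post( a0 )  -> a0 is read as a bare constant name
// debdcode_post('a0')  -> string literal
// debdcode_post( 0a )  -> the number 0 followed by a stray "a": parse error
// debdcode_post('0a')  -> string literal
```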

But the better (faster) solution is to use preg_replace_callback.
 [2012-02-07 19:38 UTC] mike at eastghost dot com
I tried your suggested fix and agree you are correct.  This is not a bug, just 
humantax_error.  BTW, I already changed the code to use preg_replace_callback() 
(vs what was an array of subject-regexs and replacement-strings passed to 
preg_replace) and also agree it is faster to use preg_replace_callback().  Thank 
you for looking.
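
For the array-of-patterns case, one possible shape on PHP 7.0 or later is preg_replace_callback_array(), which maps each pattern to its own callback. This is only a sketch: the second pattern ([b]...[/b]), both callback bodies, and the URL format are invented for illustration and are not taken from the code in this report.

```php
<?php
$i_html = '[b]Hi[/b], see [post=4f30ab /]';

// Patterns are applied in order, each with its own callback.
$out = preg_replace_callback_array(
    [
        // Pattern from the report, without the "e" modifier.
        '@\[p(?:ost){0,1}=(.{1,24})\ {0,}\/\]@Uiu' => function (array $m): string {
            $id = htmlspecialchars($m[1], ENT_QUOTES, 'UTF-8');
            return '<a href="/post/' . $id . '">post ' . $id . '</a>';
        },
        // Hypothetical second tag, only here to show the array form.
        '@\[b\](.+)\[/b\]@Uiu' => function (array $m): string {
            return '<b>' . htmlspecialchars($m[1], ENT_QUOTES, 'UTF-8') . '</b>';
        },
    ],
    $i_html
);

echo $out, "\n";
```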
 [2012-02-07 19:39 UTC] mike at eastghost dot com
-Status: Open +Status: Closed
 [2012-02-07 19:39 UTC] mike at eastghost dot com
closed
 