Bug #41749 Reproducible segfault in PCRE lib
Submitted: 2007-06-20 14:40 UTC Modified: 2007-06-21 09:40 UTC
From: joe at emomentum dot co dot uk Assigned:
Status: Not a bug Package: PCRE related
PHP Version: 5.2.0 OS: Debian Etch (Debian 4.0 Stable)
Private report: No CVE-ID: None
 [2007-06-20 14:40 UTC] joe at emomentum dot co dot uk
Description:
------------
I couldn't find this reported anywhere else (there are similar reports, but none close enough).

I've found an apparent bug in the PCRE library, although it might relate to the way PHP calls the library (I'll post this to the PCRE list as well).

A reproducible, if slightly random, crash occurs when using regexes with certain hex ranges on longish random strings.

Oddly, the length of the string directly affects the chance of a segfault, and the segfault only occurs with certain hex ranges (specifically, ONLY when the upper bound of the range is \x7A or above, and ONLY with text strings of 4843 bytes or longer).

Note that using the regex /^([\x00-\x7A])*$/ causes a segfault, whereas /^([\x00-\x71])*$/ or /^([\x00-\x79])*$/ does not.
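
One reading of why the cut-off sits at \x7A, offered as an assumption rather than anything confirmed in this report: 'z' is byte 0x7A, so a class ending at \x79 rejects the first 'z' in the subject and the match fails early, while a class ending at \x7A accepts every lowercase letter and has to walk the whole string. The short subject below is made up for illustration.

```php
<?php

/* 'z' is the highest lowercase letter and sits exactly at 0x7A */
printf("z = 0x%02X\n", ord('z'));   /* prints: z = 0x7A */

/* illustrative subject, much shorter than the one in the report */
$sample = 'hello' . 'z' . 'world';

/* upper bound \x79 excludes 'z', so the match fails at the first 'z' */
$narrow = preg_match('/^([\x00-\x79])*$/', $sample);

/* upper bound \x7A includes 'z', so every character is consumed */
$wide = preg_match('/^([\x00-\x7A])*$/', $sample);

var_dump($narrow);   /* int(0) */
var_dump($wide);     /* int(1) */
```

With a random a-z subject a 'z' is likely to appear within the first few dozen characters, so the narrower ranges would rarely recurse deeply.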

Running on Debian Etch 64-bit (amd64) with the latest stable PHP and libpcre3_6.7-1_amd64 installed.

Regards,

Joe Harris
Senior Developer
eMomentum Limited

Reproduce code:
---------------
<?php

/* the length of the string determines the chance of a segfault. */
$strlen = 4846;   /* almost total segfault, roughly 100% segfaults */
//$strlen = 4845; /* almost always segfault, roughly 95% segfaults */
//$strlen = 4844; /* mostly segfault, roughly 80% segfaults */
//$strlen = 4843; /* regularly segfault, roughly 30% segfaults */
//$strlen = 4842; /* run without error, roughly 0% segfaults */

$alphabet = range('a', 'z');  /* range of lowercase letters */

$str = null;                  /* generate the random string */
for($i = 0; $i < $strlen; $i++) { $str .= $alphabet[rand(0,25)]; }

/* perform our regex of doom */
$result = preg_match('/^([\x00-\x7A])*$/', $str);

/* dump the result: preg_match() returns int(1), int(0), or false on error */
var_dump($result);

?>


Expected result:
----------------
int(1)

(a match: every character of a random a-z string falls within \x00-\x7A)

Actual result:
--------------
Segmentation fault (core dumped)

-----

when running in gdb:

This GDB was configured as "x86_64-linux-gnu"...(no debugging symbols found)
Using host libthread_db library "/lib/libthread_db.so.1".

(gdb) run test.php
Starting program: /usr/bin/php test.php
(no debugging symbols found)
     [snip - lots of these]
(no debugging symbols found)
[Thread debugging using libthread_db enabled]
[New Thread 47782002024432 (LWP 11134)]
(no debugging symbols found)
     [snip - lots of these]
(no debugging symbols found)
testing a string of 4846 bytes in length
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 47782002024432 (LWP 11134)]
0x00002b751be871a8 in pcre_dfa_exec () from /usr/lib/libpcre.so.3


History

 [2007-06-20 21:06 UTC] judas dot iscariote at gmail dot com
Your code produces a stack overflow in the PCRE library, and there is almost nothing that PHP can do to avoid that.
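
A minimal sketch of one way to sidestep the deep recursion, assuming the goal is only to check that every byte lies in \x00-\x7A. The helper name only_low_bytes() is hypothetical. It drops the capturing group and uses a possessive quantifier, so PCRE has no per-character backtracking state to keep. Whether this avoids the crash on libpcre 6.7 specifically has not been verified here.

```php
<?php

/* Hypothetical helper (not from the report): true when every byte of
   $subject is in the range \x00-\x7A. */
function only_low_bytes($subject)
{
    /* no capturing group, possessive quantifier (*+) */
    $result = preg_match('/^[\x00-\x7A]*+$/', $subject);

    if ($result === false) {
        /* preg_match() returns false on error; preg_last_error() says which */
        throw new RuntimeException('PCRE error code ' . preg_last_error());
    }

    return $result === 1;
}

/* 6000 bytes, longer than the 4846 used in the report */
$str = str_repeat('abcxyz', 1000);

var_dump(only_low_bytes($str));     /* bool(true) */
var_dump(only_low_bytes('abc{'));   /* bool(false), '{' is 0x7B */
```

If the original pattern has to stay, lowering the pcre.recursion_limit ini setting is another option: PCRE then gives up with an error that preg_last_error() reports, instead of exhausting the stack.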
 [2007-06-20 21:44 UTC] nlopess@php.net
We don't use the pcre_dfa_exec() function, so something is going wrong there. Still, this is not a PHP bug.
libpcre 6.7 is really old and you should consider upgrading to libpcre 7.2.
BTW, I couldn't reproduce the problem, albeit with a newer libpcre version.
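
To see which libpcre a given PHP build reports, something like the following sketch could be used. It assumes the PCRE_VERSION constant is available (it was added in PHP releases later than the 5.2.0 in this report) and that its value starts with the version number followed by a space and a date. The helper name pcre_at_least() is hypothetical.

```php
<?php

/* Hypothetical helper: compares the leading version number of a
   PCRE version string against a minimum such as '7.2'. */
function pcre_at_least($minimum, $reported = PCRE_VERSION)
{
    /* keep only the part before the first space, e.g. "7.2" */
    $number = strtok($reported, ' ');

    return version_compare($number, $minimum, '>=');
}

echo 'PCRE reported by this build: ', PCRE_VERSION, "\n";

var_dump(pcre_at_least('7.2', '6.7'));   /* bool(false) */
var_dump(pcre_at_least('7.2', '7.2'));   /* bool(true) */
var_dump(pcre_at_least('7.2'));          /* result depends on the local build */
```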
 [2007-06-21 09:40 UTC] joe at emomentum dot co dot uk
Thanks for the feedback; I was just making sure it wasn't anything PHP-related. I've submitted a Debian bug requesting a package upgrade.
 