Bug #26469 Reproducible crash on regexp
Submitted: 2003-11-29 20:13 UTC Modified: 2003-11-30 04:12 UTC
From: f23wop602 at sneakemail dot com Assigned:
Status: Not a bug Package: Reproducible crash
PHP Version: 4.3.4 OS: RH7 2.2.16-22
Private report: No CVE-ID: None
 [2003-11-29 20:13 UTC] f23wop602 at sneakemail dot com
Description:
------------
We had some crashes, and after some tracking we found a regexp that crashes on specific data. The data is contained in the script at the URL below; I cut it down as much as I could while still triggering the crash.

Reproduce code:
---------------
http://test.wikipedia.org/crash-php4.3.4.txt

Most of it is just data; the crash occurs when the regexp at the end is applied to that data.

Actual result:
--------------
(gdb) bt
#0  0x808225c in match (
    eptr=0x81b4ec0 "kuterat: Enligt ''Miller'' (53) kommer en Psi-liknande form (&Psi;) av kappa fr?n Proto-Kanaaneiska. Kappa stod troligtvis f?r /k/ s?v?l som /k_h/ i tidig grekisk ortografi och senare ?terinf?rdes den"..., ecode=0x81b5e56 "8????\177???????", '?' <repeats 20 times>, "=", offset_top=4, md=0xbfffca50, ims=0,
    eptrb=0xbf800198, flags=2) at /tmp/php-4.3.4/ext/pcre/pcrelib/pcre.c:4986
#1  0x808236e in match (
    eptr=0x81b4ec0 "kuterat: Enligt ''Miller'' (53) kommer en Psi-liknande form (&Psi;) av kappa fr?n Proto-Kanaaneiska. Kappa stod troligtvis f?r /k/ s?v?l som /k_h/ i tidig grekisk ortografi och senare ?terinf?rdes den"..., ecode=0x81b5e53 "M", offset_top=4, md=0xbfffca50, ims=0, eptrb=0xbf800198, flags=2)
    at /tmp/php-4.3.4/ext/pcre/pcrelib/pcre.c:5059
#2  0x8082eb0 in match (
    eptr=0x81b4ec0 "kuterat: Enligt ''Miller'' (53) kommer en Psi-liknande form (&Psi;) av kappa fr?n Proto-Kanaaneiska. Kappa stod troligtvis f?r /k/ s?v?l som /k_h/ i tidig grekisk ortografi och senare ?terinf?rdes den"..., ecode=0x81b5e7e "?", offset_top=4, md=0xbfffca50, ims=0, eptrb=0xbf800678, flags=2)
    at /tmp/php-4.3.4/ext/pcre/pcrelib/pcre.c:5583

[snip]

#11143 0x808236e in match (
    eptr=0x81b38fd ">\r\n\t<td>[[Rho]]</td>\r\n\t<td>[rO:]</td>\r\n\t<td>[ro]</td>\r\n\t<td>&nbsp;</td>\r\n\t<td>[r]</td>\r\n\t<td>[r]</td>\r\n\t<td>100</td>\r\n\t<td>&#x5e8; Resh</td>\r\n\t<td>&amp;rho;</td></tr>\r\n<tr><td>&Sigma; &sigma;</td>\r\n\t<"..., ecode=0x81b5e53 "M", offset_top=4, md=0xbfffca50, ims=0,
    eptrb=0xbfea1838, flags=2) at /tmp/php-4.3.4/ext/pcre/pcrelib/pcre.c:5059
#11144 0x8082eb0 in match (
    eptr=0x81b38fd ">\r\n\t<td>[[Rho]]</td>\r\n\t<td>[rO:]</td>\r\n\t<td>[ro]</td>\r\n\t<td>&nbsp;</td>\r\n\t<td>[r]</td>\r\n\t<td>[r]</td>\r\n\t<td>100</td>\r\n\t<td>&#x5e8; Resh</td>\r\n\t<td>&amp;rho;</td></tr>\r\n<tr><td>&Sigma; &sigma;</td>\r\n\t<"..., ecode=0x81b5e7e "?", offset_top=4, md=0xbfffca50, ims=0,
    eptrb=0xbfea1d18, flags=2) at /tmp/php-4.3.4/ext/pcre/pcrelib/pcre.c:5583
#11145 0x808236e in match (
    eptr=0x81b38fc "d>\r\n\t<td>[[Rho]]</td>\r\n\t<td>[rO:]</td>\r\n\t<td>[ro]</td>\r\n\t<td>&nbsp;</td>\r\n\t<td>[r]</td>\r\n\t<td>[r]</td>\r\n\t<td>100</td>\r\n\t<td>&#x5e8; Resh</td>\r\n\t<td>&amp;rho;</td></tr>\r\n<tr><td>&Sigma; &sigma;</td>\r\n\t"..., ecode=0x81b5e53 "M", offset_top=4, md=0xbfffca50, ims=0,
    eptrb=0xbfea1d18, flags=2) at /tmp/php-4.3.4/ext/pcre/pcrelib/pcre.c:5059
#11146 0x8082eb0 in match (
    eptr=0x81b38fc "d>\r\n\t<td>[[Rho]]</td>\r\n\t<td>[rO:]</td>\r\n\t<td>[ro]</td>\r\n\t<td>&nbsp;</td>\r\n\t<td>[r]</td>\r\n\t<td>[r]</td>\r\n\t<td>100</td>\r\n\t<td>&#x5e8; Resh</td>\r\n\t<td>&amp;rho;</td></tr>\r\n<tr><td>&Sigma; &sigma;</td>\r\n\t"..., ecode=0x81b5e7e "?", offset_top=4, md=0xbfffca50, ims=0,
    eptrb=0xbfea21f8, flags=2) at /tmp/php-4.3.4/ext/pcre/pcrelib/pcre.c:5583
#11147 0x808236e in match (
    eptr=0x81b38fb "td>\r\n\t<td>[[Rho]]</td>\r\n\t<td>[rO:]</td>\r\n\t<td>[ro]</td>\r\n\t<td>&nbsp;</td>\r\n\t<td>[r]</td>\r\n\t<td>[r]</td>\r\n\t<td>100</td>\r\n\t<td>&#x5e8; Resh</td>\r\n\t<td>&amp;rho;</td></tr>\r\n<tr><td>&Sigma; &sigma;</td>\r\n"..., ecode=0x81b5e53 "M", offset_top=4, md=0xbfffca50, ims=0, eptrb=0xbfea21f8,
    flags=2) at /tmp/php-4.3.4/ext/pcre/pcrelib/pcre.c:5059
#11148 0x8082eb0 in match (
    eptr=0x81b38fb "td>\r\n\t<td>[[Rho]]</td>\r\n\t<td>[rO:]</td>\r\n\t<td>[ro]</td>\r\n\t<td>&nbsp;</td>\r\n\t<td>[r]</td>\r\n\t<td>[r]</td>\r\n\t<td>100</td>\r\n\t<td>&#x5e8; Resh</td>\r\n\t<td>&amp;rho;</td></tr>\r\n<tr><td>&Sigma; &sigma;</td>\r\n"..., ecode=0x81b5e7e "?", offset_top=4, md=0xbfffca50, ims=0, eptrb=0xbfea26d8,

And so on... You get the idea

 [2003-11-30 04:12 UTC] sniper@php.net
Not a PHP bug. You just ran into the documented PCRE limitations. Check http://www.pcre.org/pcre.txt under the topic "LIMITATIONS".



 [2003-11-30 07:14 UTC] brion at pobox dot com
The given script works for me without crashing on 4.3.2 (Linux/x86, Red Hat 7.3 and Red Hat 9.0, and MacOS X 10.3.1). The original poster has told me that it works for him in 4.3.4 beta and crashes only in 4.3.4 release.

Anyway, it's okay for the regexp to _fail_ if you're pushing the limits, but returning an error code would be nicer than overflowing the stack and segfaulting. Web applications are PHP's primary use, and there the data is possibly untrusted, so this can be a security risk.

Might there be a way to get PCRE to handle such conditions gracefully?
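
To make the request concrete, here is a minimal sketch of the idea in Python. It is a toy model, not PCRE's match() code: the function names, the error code value, and the default limit of 200 are all made up for illustration. The matcher recurses once per subject character, much as the backtrace above shows match() calling itself for each step, but it carries a depth counter and returns an error code when the counter passes a limit instead of recursing until the stack runs out.

```python
# Toy model only: this is NOT PCRE's match() function.
# All names and values below are hypothetical, chosen for illustration.
MATCH = 1
NO_MATCH = 0
ERROR_RECURSION_LIMIT = -1  # made-up error code, not a real PCRE constant


def match_lazy(subject, pos, suffix, depth, limit):
    """Match the toy pattern `.*?<suffix>` starting at pos.

    One recursive call is made per consumed character, so the call depth
    grows with the length of the subject.
    """
    if depth > limit:
        # Give up with an error code instead of recursing further.
        return ERROR_RECURSION_LIMIT
    if subject.startswith(suffix, pos):
        return MATCH
    if pos >= len(subject):
        return NO_MATCH
    # Consume one character and try again one level deeper.
    return match_lazy(subject, pos + 1, suffix, depth + 1, limit)


def safe_match(subject, suffix, limit=200):
    """Run the toy matcher from the start of subject with a depth limit."""
    return match_lazy(subject, 0, suffix, 0, limit)
```

With this shape, the caller can tell "no match" apart from "gave up because the subject needed too much recursion" and handle the second case as an ordinary error. Later PHP releases expose a similar idea through the `pcre.recursion_limit` setting and `preg_last_error()`.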
 