Bug #71647 Recursive (?R) regexp crashes PHP7.0-FPM
Submitted: 2016-02-22 22:31 UTC
Modified: 2016-08-28 12:53 UTC
Votes:2
Avg. Score:5.0 ± 0.0
Reproduced:2 of 2 (100.0%)
Same Version:2 (100.0%)
Same OS:2 (100.0%)
From: adaur dot underground at gmail dot com
Assigned: cmb
Status: Closed
Package: PCRE related
PHP Version: 7.0.3
OS: Debian 8
Private report: No
CVE-ID: None
 [2016-02-22 22:31 UTC] adaur dot underground at gmail dot com
Description:
------------
Hello,

Webserver: nginx

I am facing the same bug as the one reported at https://wordpress.org/support/topic/swift-regexp-comments-caused-php70-fpm-to-crash

If I use a regexp that contains (?R), PHP7.0-FPM crashes.

PHP7-FPM log:

[22-Feb-2016 23:00:17] WARNING: [pool www] child 21157 said into stderr: "*** Error in `php-fpm: pool www': free(): invalid pointer: 0x00007f1dfb0aa29a ***"



 [2016-02-22 22:34 UTC] adaur dot underground at gmail dot com
This is the regex that causes the crash (it's long, but you can skip to the end):

    're_bbcode'             => '% # re_bbcode Rev:20110220_1200
# First, match opening tag of syntax: "[TAGNAME (= ("\')ATTRIBUTE("\') )]";
\[                              # Match opening bracket of outermost opening TAGNAME tag.
(?>(%taglist%)\s*+) # $1:
(?>                             # Atomically group remainder of opening tag.
  (?:                           # Optional attribute.
    (=)\s*+                     # $2: = Optional attribute\'s equals sign delimiter, ws.
    (?:                         # Group for 1-line attribute value alternatives.
      \'([^\'\r\n\\\\]*+(?:\\\\.[^\'\r\n\\\\]*+)*+)\'  # Either $3: == single quoted,
    | "([^"\r\n\\\\]*+(?:\\\\.[^"\r\n\\\\]*+)*+)"      # or     $4: == double quoted,
    | ( [^[\]\r\n]*+            # or $5: == un-or-any-quoted. "normal*" == non-"[]"
        (?:                     # Begin "(special normal*)*" "Unrolling-the-loop" construct.
          \[[^[\]\r\n]*+\]      # Allow matching [square brackets] 1 level deep. "special".
            [^[\]\r\n]*+        # More "normal*" any non-"[]", non-newline characters.
        )*+                     # End "(special normal*)*" "Unrolling-the-loop" construct.
      )                         # End $5: Un-or-any-quoted attribute value.
    )                           # End group of attribute values alternatives.
    \s*+                        # Optional whitespace following quoted values.
  )?                            # End optional attribute group.
  \]                            # Match closing bracket of outermost opening TAGNAME tag.
)                               # End atomic group with opening tag remainder.
# Second, match the contents of the tag.
(                               # $6: Non-trimmed contents of TAGNAME tag.
  (?>                           # Atomic group for contents alternatives.
    [^\[]++                     # Option 1: Match non-tag chars (starting with non-"[").
    (?:                         # Begin "(special normal*)*" "Unrolling-the-loop" construct.
      (?!\[/?+\1[\]=\s])\[      # "special" = "[" if not start of [TAGNAME*] or [/TAGNAME].
      [^\[]*+                   # More "normal*".
    )*+                         # Zero or more "special normal*"s allowed for option 1.
  | (?:                         # or Option 2: Match non-tag chars (starting with "[").
      (?!\[/?+\1[\]=\s])\[      # "special" = "[" if not start of [TAGNAME*] or [/TAGNAME].
      [^\[]*+                   # More "normal*".
    )++                         # One or more "special normal*"s required for option 2.
  | (?R)                        # Or option 3: recursively match nested [TAGNAME]..[/TAGNAME].
  )*+                           # One of these three options as many times as necessary.
)                               # End $6: Non-trimmed contents of TAGNAME tag.
# Finally, match the closing tag.
\[/\1\s*+\]                     # Match outermost closing [/  TAGNAME  ]
                                %ix',

If I remove the following line, it works:
  | (?R)                        # Or option 3: recursively match nested
 [2016-02-22 22:36 UTC] nikic@php.net
Can you please also provide the input you're matching against?
 [2016-02-22 22:57 UTC] adaur dot underground at gmail dot com
Thank you for your answer.

As you may have guessed, this is a regexp used to parse BBCode in a forum script. Every time the parser is called, the page crashes, hence any input seems to trigger the bug.

I also tried a single dot (.) as the input, and it crashes too.
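For anyone trying to narrow this down, a reduced test script along the following lines may help. This is only a sketch: the pattern is a hypothetical stand-in, not the re_bbcode regex from this report, and the function name matchNestedTag and the sample inputs are made up for illustration. It uses (?R) in the same way (recursing into a nested tag), but there is no guarantee that it triggers the same crash.

```php
<?php
// Hypothetical reduced pattern (NOT the forum's re_bbcode regex).
// It matches [b]...[/b] and uses (?R) to recurse into nested [b] tags,
// with the same %...%ix delimiters and flags as the reported pattern.
const NESTED_TAG_PATTERN = '%\[b\]((?>[^\[]++|(?R))*+)\[/b\]%ix';

// Returns 1 on match, 0 on no match, false on a PCRE error.
function matchNestedTag($input)
{
    return preg_match(NESTED_TAG_PATTERN, $input);
}

// Sample inputs, including the single dot mentioned above.
$inputs = ['.', '[b]bold[/b]', '[b]outer [b]inner[/b] text[/b]'];

foreach ($inputs as $input) {
    $result = matchNestedTag($input);
    if ($result === false) {
        // preg_last_error() reports why PCRE gave up (e.g. a limit was hit).
        echo $input, ' => PCRE error code ', preg_last_error(), "\n";
    } else {
        echo $input, ' => ', ($result === 1 ? 'match' : 'no match'), "\n";
    }
}
```

Running it through the same SAPI that crashes (php-fpm here), and then again from the CLI, would show whether the problem depends on the input or on the pattern compilation alone.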
 [2016-08-20 10:36 UTC] cmb@php.net
-Status: Open
+Status: Feedback
-Package: *Regular Expressions
+Package: PCRE related
-Assigned To:
+Assigned To: cmb
 [2016-08-20 10:36 UTC] cmb@php.net
This issue may be caused by pcre.jit=1. Please try with
pcre.jit=0. Also check PCRE_VERSION; 8.35 is known to cause some
issues regarding the JIT. Try with 8.38.
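
A short script such as the following could be used to check both points. It is a sketch, assuming the PCRE extension was built with JIT support (otherwise the pcre.jit directive does not exist and ini_get() returns false). The helper name pcreOlderThan is made up for this example; PCRE_VERSION, version_compare(), ini_get() and ini_set() are standard PHP.

```php
<?php
// PCRE_VERSION looks like "8.35 2014-04-04"; keep only the version number.
$pcreVersion = strtok(PCRE_VERSION, ' ');

// Hypothetical helper: true if $version is older than $minimum.
function pcreOlderThan($version, $minimum)
{
    return version_compare($version, $minimum, '<');
}

echo 'PCRE library version: ', $pcreVersion, "\n";
// ini_get() returns false when the directive is not available.
echo 'pcre.jit: ', var_export(ini_get('pcre.jit'), true), "\n";

if (pcreOlderThan($pcreVersion, '8.38')) {
    echo "PCRE is older than 8.38; JIT problems are possible.\n";
}

// Runtime equivalent of pcre.jit=0 in php.ini, for the current request only.
// Set it before the first preg_* call so no pattern is JIT-compiled first.
ini_set('pcre.jit', '0');
```

Setting pcre.jit=0 in php.ini (or in the FPM pool configuration) and restarting php-fpm is the more reliable way to test this, since it applies to every request.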
 [2016-08-28 04:22 UTC] php-bugs at lists dot php dot net
No feedback was provided. The bug is being suspended because
we assume that you are no longer experiencing the problem.
If this is not the case and you are able to provide the
information that was requested earlier, please do so and
change the status of the bug back to "Re-Opened". Thank you.
 [2016-08-28 09:41 UTC] adaur dot underground at gmail dot com
Hello,

Sorry for the delay.

Disabling the JIT option (pcre.jit=0) does the trick. Indeed, my PHP 7 package was bundled with PCRE 8.35. I'll ask the package maintainer to update it.
 [2016-08-28 12:53 UTC] cmb@php.net
-Status: No Feedback
+Status: Closed
 [2016-08-28 12:53 UTC] cmb@php.net
Okay, so I'm closing this ticket. Please re-open if you find
that PCRE 8.38 doesn't solve the JIT-related issue.
 