Bug #74675 Regex that worked fine on 5.x Fails Randomly on 7.x
Submitted: 2017-05-30 20:14 UTC Modified: 2017-05-30 20:41 UTC
From: donatj at gmail dot com Assigned:
Status: Not a bug Package: PCRE related
PHP Version: 7.1.5 OS: Darwin / Linux
Private report: No CVE-ID: None

 [2017-05-30 20:14 UTC] donatj at gmail dot com
Description:
------------
We use the regex (?(?=")"(\\"|[^"])*"|'(\\'|[^'])*') in our internal CI system as a step that clears quoted strings.

We were getting weird failures on 7.x builds, and I tracked them down to this regex randomly failing on some strings in 7.x. It works in most cases, but we have 4 files where it fails and the call returns NULL.

It has worked fine on 5.x for years, and it works fine on HHVM, so something must have changed in PHP 7.

Test script:
---------------
The issue can be seen here: https://3v4l.org/SEWZc / https://gist.github.com/donatj/b872de5cc8f755871500a82c483aa92f

I boiled the input string down as much as I could manage, but it's still very long.
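
A NULL return like the one described above can be told apart from "no match" by checking preg_last_error() straight after the call. The sketch below is illustrative only: it assumes the CI step uses preg_replace() (the report does not say which function is called), and strip_strings() is a hypothetical helper name. The sample subject is a short stand-in, not the long input from the report.

```php
<?php
// Hypothetical helper, not taken from the report: wraps preg_replace()
// and turns a NULL result into an exception carrying the PCRE error code.
function strip_strings(string $pattern, string $subject): string
{
    $result = preg_replace($pattern, '', $subject);
    if ($result === null) {
        // preg_last_error() returns one of the PREG_*_ERROR constants.
        throw new RuntimeException('PCRE error code: ' . preg_last_error());
    }
    return $result;
}

// The regex from the report, escaped for a single-quoted PHP string.
$pattern = '/(?(?=")"(\\\\"|[^"])*"|\'(\\\\\'|[^\'])*\')/';

// Short stand-in subject: a = "x\"y" + 'z';
echo strip_strings($pattern, 'a = "x\\"y" + \'z\';'), "\n";
```

Throwing on NULL keeps a silent failure from being mistaken for an empty result further down the pipeline.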


 [2017-05-30 20:41 UTC] requinix@php.net
-Status: Open +Status: Not a bug
 [2017-05-30 20:41 UTC] requinix@php.net
PHP 7 introduced PCRE's JIT feature. When enabled this changes how regexes are compiled and executed.
http://sljit.sourceforge.net/pcre.html

preg_last_error() returns PREG_JIT_STACKLIMIT_ERROR, which means what it sounds like: the JIT ran out of stack space. It is caused by an inefficient regex.
If I alter your regex a bit by tweaking the backslash treatment and adding + quantifiers
  '/(?(?=")"(\\\\.|[^\\\\"]+)*"|\'(\\\\.|[^\\\\\']+)*\')/'
then it executes fine on the test string (and also fixes what I assume was a bug with the original).
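
One way to explore the diagnosis above is to run the pattern with the JIT switched off and inspect the error code. This is a minimal sketch under assumptions, not a fix endorsed in this thread: it uses the pcre.jit ini setting, sets it before the pattern is first compiled, and runs on a short stand-in subject rather than the input from the report. With the JIT off, very long inputs may still run into the regular backtrack or recursion limits.

```php
<?php
// Assumption: JIT is turned off before the pattern is first used.
// The same setting can be placed in php.ini as pcre.jit=0.
ini_set('pcre.jit', '0');

// The altered regex quoted above, unchanged.
$pattern = '/(?(?=")"(\\\\.|[^\\\\"]+)*"|\'(\\\\.|[^\\\\\']+)*\')/';

// Short stand-in subject: echo "a\"b" . 'c';
$subject = 'echo "a\\"b" . \'c\';';

$result = preg_replace($pattern, '', $subject);

if ($result === null) {
    // Distinguish the JIT stack limit from other PCRE failures.
    $code = preg_last_error();
    $message = ($code === PREG_JIT_STACKLIMIT_ERROR)
        ? 'JIT stack limit exceeded'
        : 'PCRE error code ' . $code;
    echo $message, "\n";
} else {
    echo $result, "\n";
}
```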
 [2017-05-30 21:37 UTC] donatj at gmail dot com
Your suggested regex, while it works on this specific input, actually fails on more of my inputs than the original. Sigh. I'll accept the "not a bug" verdict, but I'm not at all happy about the JIT's seeming uselessness.