php.net |  support |  documentation |  report a bug |  advanced search |  search howto |  statistics |  random bug |  login
Request #75133 When preg_replace fails, please write to error_log
Submitted: 2017-08-28 18:33 UTC Modified: 2017-08-30 12:14 UTC
Votes:1
Avg. Score:5.0 ± 0.0
Reproduced:1 of 1 (100.0%)
Same Version:0 (0.0%)
Same OS:0 (0.0%)
From: php at richardneill dot org Assigned: cmb (profile)
Status: Duplicate Package: PCRE related
PHP Version: 7.0.22 OS: Linux
Private report: No CVE-ID: None
Welcome back! If you're the original bug submitter, here's where you can edit the bug or add additional notes.
If you forgot your password, you can retrieve your password here.
Password:
Status:
Package:
Bug Type:
Summary:
From: php at richardneill dot org
New email:
PHP Version: OS:

 

 [2017-08-28 18:33 UTC] php at richardneill dot org
Description:
------------
It's very rare that preg_replace() fails (i.e. returns null). But when it happens, it's really unexpected, and therefore particularly hard to debug. All the normal tools show nothing helpful.

Can I request that whenever preg_replace (and preg_replace_callback) return a  null, that preg_last_error() is printed to the error_log, at least at E_NOTICE?







Test script:
---------------
It may be helpful to give an example. This code generates a 
PREG_JIT_STACKLIMIT_ERROR, with only ~ 5kB of data in $contents

$contents = 
  preg_replace_callback ('/^\s*\%LOOP_(\d+|(ITR)(\d))\s*\n((((?!%LOOP_ITR).)*\n)*)\s*\%END_LOOP\s*\n/mU',
 'repeat_n_times', $contents);

Expected result:
----------------
It shouldn't be possible to write a 1-line regexp which hits resource-limits on one screenful of text. But if we do, it would be really helpful if there were some error message in the log files.

Actual result:
--------------
PHP Notice:  preg_replace experienced a PREG_JIT_STACKLIMIT_ERROR in (filename) on (line_number)



Patches

Pull Requests

History

AllCommentsChangesGit/SVN commitsRelated reports
 [2017-08-28 20:38 UTC] spam2 at rhsoft dot net
there is no reason to spit into the error log just because you don't properly check return values where you can calk error_log() at your own

proper production servers have E_ALL enabled and if something spits into my logs or enforce using @ would lead to a bug report to fix that broken behavior
 [2017-08-29 13:54 UTC] cmb@php.net
-Status: Open +Status: Duplicate -Assigned To: +Assigned To: cmb
 [2017-08-29 13:54 UTC] cmb@php.net
Duplicate of bug #51103.
 [2017-08-30 00:18 UTC] php at richardneill dot org
Thanks for your comments. I'd like to respond as to why I think this is important.

1. PCRE errors happen in 2 situations.
(a) The user has written a regex that "explodes". Something like this:
http://www.regular-expressions.info/catastrophic.html
This is what I did; it's a very hard to find error (and worse, it manifests intermittently at runtime dependent on the data). On the other hand, it is quite easy to hit this (it seems that the combination of a negative assertion with two ungreedy '*' will do it on fairly small data-sizes). 
IMHO, this should be considered in the same category as a RE syntax error (which does get into error_log). 

(b) PHP itself has "failed" (this was once the case when the PCRE_backtrack_limit defaulted to a tiny value, it can also happen with a segfault).

In both of these cases, an error in the logs is, I think, merited.


2. Please don't underestimate the sheer beauty of a helpful error message! It makes all the difference between a tool that is delightful to work with, and one that causes hours of frustration. Good error messages help the user, and make our job as (only-human) programmers fulfilling. Tools that fail silently, especially on really rare edge-cases, are not so fun to use, even if the programmer should have been responsible for carefully checking.


3. Finally, you say that with E_ALL, you wouldn't want to see this? I would suggest that any of the PCRE failures are worthy of a log - if they occur, it means a subtle and nasty bug is lurking in the code! This isn't like failing to check the return value of (say) file_get_contents(), which can "reasonably" fail. A PCRE error is tantamount to a crash - and there's no way to recover from it.

Thanks for your time and consideration.
 [2017-08-30 06:14 UTC] spam2 at rhsoft dot net
i do not underestimate the importance of error logs but I prefer to do a proper error handling at my own, that's why trigger_error() and error_log() exists and then I even make a full stop with a 500 response where php would just warn 

functions like file_exists() which bailout in case of one ../ too much when open_basedir is set when you just try to find on which of three possible locations a file exists let me puke each time I face it while is_file() has or at least had a different behavior
 [2017-08-30 12:14 UTC] cmb@php.net
> I'd like to respond as to why I think this is important. […]

I don't think that it makes much sense to discuss this issue in
the bug tracker. Actually, I made a PR[1] a while ago, but
emitting warnings appears to be too controversial, so an RFC[2]
would be needed, starting with a discussion on the internals
mailing list.

[1] <https://github.com/php/php-src/pull/2016>
[2] <https://wiki.php.net/rfc/howto>
 
PHP Copyright © 2001-2024 The PHP Group
All rights reserved.
Last updated: Fri Dec 27 00:01:30 2024 UTC