[2019-01-25 15:53 UTC] flip101 at gmail dot com
Description:
------------
At the moment there is poor support for validating a regex. Since its inception, when PHP was mainly used for creating personal home pages, it has come to be used in many more domains. Having the ability to validate regexes would therefore be useful.

As far as I know, a regex can be validated against the PCRE engine in the following way:

1. Make a custom error handler for PHP warnings.
2. Register the custom error handler.
3. Call preg_match with the invalid regex.
4. Unregister the custom error handler.
5. Throw an exception (or deal with the error in another way).

The motivation that led me here is the following library and issue:

https://hoa-project.net/En/Literature/Hack/Compiler.html#PP_language
https://github.com/hoaproject/Compiler/issues/15

But I believe there will be other projects, now and in the future, that would benefit from a change in the PHP PCRE module.

I propose the following changes, which should be backwards compatible:

1. Add a constant PREG_INVALID_PATTERN (representing int 7) to https://secure.php.net/manual/en/function.preg-last-error.php, which gets set when a pattern is an invalid regex.
2. Add a bool preg_validate_pattern($string) function that only calls the PCRE2 compile function and checks for successful compilation, skipping the usual step of actually trying to match a subject.

Test script:
---------------
<?php
$invalid_pattern = '/(\d+/';
$dummy_subject = '';
preg_match($invalid_pattern, $dummy_subject);
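For illustration, the five-step workaround listed in the description might look roughly like the sketch below. The function name assert_valid_pattern and the choice of InvalidArgumentException are hypothetical, picked only for this example; they are not part of PHP or of the linked library.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of PHP): throws if $pattern does not compile.
 */
function assert_valid_pattern(string $pattern): void
{
    $message = null;

    // Steps 1-2: register a temporary handler that captures warnings only.
    set_error_handler(
        static function (int $severity, string $text) use (&$message): bool {
            $message = $text;
            return true; // suppress PHP's default handling of this warning
        },
        E_WARNING
    );

    try {
        // Step 3: match against a dummy subject so that PCRE compiles the pattern.
        $result = preg_match($pattern, '');
    } finally {
        // Step 4: always restore whatever handler was active before.
        restore_error_handler();
    }

    // Step 5: turn the failure into an exception.
    if ($result === false) {
        throw new InvalidArgumentException($message ?? 'Invalid regular expression');
    }
}
```

Calling assert_valid_pattern('/(\d+/') should throw, while assert_valid_pattern('/\d+/') should return normally. The try/finally is there so the previous error handler is restored even if something unexpected is thrown in between.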
I believe using preg_last_error() and silencing the preg_match error gives a sane way to validate whether a regex is acceptable to PCRE.

$invalid_pattern = '/(\d+/';
$dummy_subject = '';
@preg_match($invalid_pattern, $dummy_subject);
$lastError = preg_last_error();
if ($lastError) {
    echo "something was wrong with the regex";
}

Does that not cover detecting whether a regex is valid? Admittedly, getting the exact position of the error would be more than a little useful...

By the way, preg_match also returns false when an error occurs. But neither the preg_last_error function nor the return value can differentiate between an invalid pattern and another general error. As you point out, danack, it would be good to have better information about the error. Perhaps preg_validate_pattern could return an array like:

$return = array('message' => 'missing closing parenthesis', 'offset' => 4);

This is more of an inconvenience than a major issue in practice - this can be abstracted into a composer library (which may already exist, I haven't checked).

- I personally want this feature
- The amount of setup and teardown involved is inefficient, and reinventing regex validation is error prone and misses edge cases

I slightly prefer preg_replace over preg_match, because preg_replace will warn about the `/e` modifier being removed and doing nothing, while preg_match doesn't.

$result = @\preg_replace($pattern, '', '');
if ($result === false || $result === null) {
    return \error_get_last() ?? [];
}
return null;

I ran into similar issues writing a static analysis plugin that would warn about invalid PCRE regexes passed to preg_match, etc. The code used is https://github.com/phan/phan/blob/1.2.3/.phan/plugins/PregRegexCheckerPlugin.php#L41-L58

I presume that providing a wrapper for pcre_get_compiled_regex_cache_ex() would do the trick; something like

preg_validate(string $pattern): int
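
Until something like this exists in the PCRE extension, the array-returning idea from earlier in the thread could be approximated in userland. The sketch below is only an illustration: the function name preg_validate_pattern_sketch is hypothetical, and it assumes the warning text has the form "Compilation failed: <reason> at offset <n>", which is not a guaranteed interface and may differ between PHP and PCRE versions. It also relies on error_get_last(), so it assumes no custom error handler is swallowing the warning.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical userland stand-in for the proposed preg_validate_pattern().
 * Returns null when the pattern compiles, otherwise an array with
 * 'message' and 'offset' keys ('offset' is null when none could be parsed).
 */
function preg_validate_pattern_sketch(string $pattern): ?array
{
    // Forget any earlier error so that error_get_last() refers to this call.
    error_clear_last();

    // Silence the warning; a false return value signals that the call failed.
    $result = @preg_match($pattern, '');
    if ($result !== false) {
        return null;
    }

    $raw = error_get_last()['message'] ?? 'Unknown PCRE error';

    // Assumed warning format: "... Compilation failed: <reason> at offset <n>".
    if (preg_match('/Compilation failed: (.+) at offset (\d+)$/', $raw, $m) === 1) {
        return ['message' => $m[1], 'offset' => (int) $m[2]];
    }

    // Other failures (for example a missing delimiter) carry no offset.
    return ['message' => $raw, 'offset' => null];
}
```

For the pattern from the test script, preg_validate_pattern_sketch('/(\d+/') should return an array describing the failure, and a valid pattern such as '/\d+/' should return null. Parsing warning text is fragile, which is part of why a native function would be preferable.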