Request #66792 add callback function to preg_match
Submitted: 2014-02-27 14:59 UTC
Modified: 2018-03-21 13:07 UTC
Votes:1
Avg. Score:3.0 ± 0.0
Reproduced:1 of 1 (100.0%)
Same Version:1 (100.0%)
Same OS:1 (100.0%)
From: crayonviolent at phpfreaks dot com
Assigned:
Status: Suspended
Package: PCRE related
PHP Version: Irrelevant
OS:
Private report: No
CVE-ID: None
 [2014-02-27 14:59 UTC] crayonviolent at phpfreaks dot com
Description:
------------
Pretty straightforward: add callback support to preg_match and preg_match_all. The callback must ultimately return true or false. If it returns true, the result counts as a match. If it returns false, it does not.

This would allow injection of arbitrary logic, and I feel it would both greatly enhance and simplify a lot of common scenarios that benefit from regex but are unnecessarily complex because of regex's limitations, e.g. number ranges and dates.



Test script:
---------------
/* 
  use case:
  I expect an integer range from user, and I want to 
  make sure the 'min' number is less than or equal to the 'max' num
*/
$string = "45-50";
if ( preg_match('~^(\d+)-(\d+)$~', $string, $match,
  // proposed argument: the callback decides whether the match counts
  function($m) { return ($m[1] <= $m[2]); }
) ) {
  // valid
} else {
  // invalid
}
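
For comparison, the requested behaviour can be approximated in userland today. The following is only a minimal sketch: `preg_match_with_callback` is a hypothetical helper name, not an existing PHP function, and it assumes the callback receives the same `$matches` array that preg_match fills in.

```php
<?php
// Hypothetical helper (not part of PHP): a match only counts
// if the regex matches AND the callback accepts the captures.
function preg_match_with_callback($pattern, $subject, &$matches, callable $accept)
{
    // preg_match() returns 1 on a match, 0 on no match, false on error.
    if (preg_match($pattern, $subject, $matches) !== 1) {
        return false;
    }
    if (!$accept($matches)) {
        $matches = []; // rejected by the callback: report no match
        return false;
    }
    return true;
}

$isRange = function (array $m) {
    return (int) $m[1] <= (int) $m[2];
};

var_dump(preg_match_with_callback('~^(\d+)-(\d+)$~', '45-50', $m, $isRange)); // bool(true)
var_dump(preg_match_with_callback('~^(\d+)-(\d+)$~', '50-45', $m, $isRange)); // bool(false)
```

The captures are cast to int before comparing so the check does not depend on how PHP compares numeric strings.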




 [2014-02-27 19:03 UTC] requinix@php.net
Maybe it's just the example but you can easily put that condition in your if. Besides, doesn't adding a callback make it "unnecessarily complex"?

preg_match('~^(\d+)-(\d+)$~', $string, $match) && $match[1] <= $match[2]

For preg_match_all() you can use PREG_SET_ORDER and array_filter().
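
A minimal sketch of that preg_match_all() suggestion, assuming a subject string with several ranges (the sample input and the variable names `$sets` and `$valid` are made up for illustration):

```php
<?php
$string = "45-50, 90-10, 3-3";

// PREG_SET_ORDER groups the results per match:
// $sets[0] = ['45-50', '45', '50'], $sets[1] = ['90-10', '90', '10'], ...
preg_match_all('~(\d+)-(\d+)~', $string, $sets, PREG_SET_ORDER);

// Keep only the sets where min <= max.
$valid = array_filter($sets, function (array $m) {
    return (int) $m[1] <= (int) $m[2];
});

// array_filter() preserves the original keys, so reindex from 0.
$valid = array_values($valid);

// $valid now holds the sets for "45-50" and "3-3".
```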
 [2014-02-27 20:51 UTC] crayonviolent at phpfreaks dot com
Well, I wasn't looking for a way to do it without the proposed feature. Your alternatives work well for my scenario, but they quickly break down once you start thinking about logic that can't just be stuffed into a condition like that.

Another alternative would be to use preg_replace_callback and build my own array for matches from the callback. 
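
A rough sketch of that preg_replace_callback approach, with made-up sample input and an illustrative `$accepted` array; the return value of preg_replace_callback is simply discarded here:

```php
<?php
$string = "45-50, 90-10, 3-3";
$accepted = [];

preg_replace_callback('~(\d+)-(\d+)~', function (array $m) use (&$accepted) {
    // Arbitrary PHP logic decides whether this match is kept.
    if ((int) $m[1] <= (int) $m[2]) {
        $accepted[] = $m;
    }
    return $m[0]; // hand back the original text, so nothing is replaced
}, $string);

// $accepted now holds the match arrays for "45-50" and "3-3".
```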

I think we can come up with lots of ways to skin that cat, just like there are lots of ways to skin the cat for virtually everything. Therefore, I do not think a feature request should be dismissed simply because there are other ways of doing it. If that were the case, then php (and other languages) could do away with 50% or more of their syntax and structure.

"Overly complex." That's a fair point. But considering the average programmer understands php syntax and callback functions a LOT better than regex rules and syntax, it seems the lesser evil would be to put the "complexity" in php's hands rather than the regex engine's.

Yes, my example was a bit simplistic. I invite people to think of all the complex patterns out there and how they could be simplified. It's really no different from what someone would do anyway; it only offers an alternative syntax.

I know you've seen many patterns for things like number ranges, dates, IP address ranges and the like. Those patterns are long and confusing because there isn't a straightforward way to express them: it takes multiple character classes and alternations to achieve it.
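
To illustrate the point about dates, here is a minimal sketch that keeps the pattern trivial and leaves the calendar rules to PHP. The helper name `is_valid_iso_date` and the YYYY-MM-DD format are assumptions made for the example; checkdate() is the existing PHP function.

```php
<?php
// Hypothetical helper: the regex only checks the shape,
// checkdate() handles month lengths and leap years.
function is_valid_iso_date($input)
{
    if (preg_match('~^(\d{4})-(\d{2})-(\d{2})$~', $input, $m) !== 1) {
        return false;
    }
    // checkdate() takes month, day, year, in that order.
    return checkdate((int) $m[2], (int) $m[3], (int) $m[1]);
}

var_dump(is_valid_iso_date('2014-02-27')); // bool(true)
var_dump(is_valid_iso_date('2014-02-30')); // bool(false)
```

A pure-regex version would have to encode the month lengths and leap-year rules through alternations, which is the kind of pattern described above.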

In any case, whatever arguments there are about whether or not it's "overly complex" are moot, because this is essentially the same thing that already exists in preg_replace_callback, except for the return value and end goal. Look at preg_replace_callback: it already has an arg for a callback function. The current match set is passed to it and you can execute arbitrary php code and return the value to be the replacement.  

So it seems to me that roughly 75-85% of the "work" involved in getting a callback to happen in the regex engine is already there, and it would simply be a matter of referencing or duplicating preg_replace_callback's internal code.

So, I'm not really suggesting that this brings some new mechanic to the table that php cannot currently solve.  It's more of an alternate syntax suggestion.
 [2018-03-21 13:07 UTC] cmb@php.net
-Status: Open +Status: Suspended
 [2018-03-21 13:07 UTC] cmb@php.net
This feature is obviously controversial, and as such would require
the RFC process[1].  Anybody is welcome to start it! For the time
being, I'm suspending this ticket.

[1] <https://wiki.php.net/rfc/howto>
 
Last updated: Sat Sep 19 10:01:24 2020 UTC