Request #77744 Add preg_match flag to return only offsets
Submitted: 2019-03-14 18:39 UTC Modified: 2019-03-21 11:36 UTC
Votes:2
Avg. Score:5.0 ± 0.0
Reproduced:1 of 1 (100.0%)
Same Version:0 (0.0%)
Same OS:0 (0.0%)
From: cananian at wikimedia dot org Assigned: nikic (profile)
Status: Assigned Package: PCRE related
PHP Version: 7.3.3 OS:
Private report: No CVE-ID: None

 [2019-03-14 18:39 UTC] cananian at wikimedia dot org
Description:
------------
When creating a high-performance lexer (for example, for an HTML5 parser), it is worthwhile to minimize the number of string copies made.  You can perform matches using offsets into your master source string.  However, preg_match copies a substring for the entire matched region ($matches[0]) as well as for every capture group.  This can get expensive if the matched region or the captured groups are very large.
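
To illustrate the current behaviour, here is a minimal sketch of matching at an offset into a master string with the existing PREG_OFFSET_CAPTURE flag. The sample input and pattern are made up for this example; the point is that every entry of $matches still carries a copied substring next to its offset.

```php
<?php
// A master source string that the lexer walks without slicing it up front.
// (Sample input, invented for this illustration.)
$source = '<div class="x">hello</div>';

// \G anchors the match at the start offset given as the fifth argument.
$pattern = '/\G<([a-z]+)/';

$matched = preg_match($pattern, $source, $matches, PREG_OFFSET_CAPTURE, 0);

// Each entry of $matches is [substring, byte offset]. The substrings are
// copies of parts of $source, made even if the caller only needs positions.
// Here $matches[0] is ['<div', 0] and $matches[1] is ['div', 1].
```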

It would be helpful if PHP's preg_match* functions offered a flag, say PREG_LENGTH_CAPTURE, that returned the numeric length of each match instead of the matched string.  In combination, PREG_OFFSET_CAPTURE|PREG_LENGTH_CAPTURE would return the numeric length in element 0 and the offset in element 1, avoiding the need to copy the matched substring.
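
The result shape requested here can be sketched in userland. PREG_LENGTH_CAPTURE does not exist in PHP, and preg_match_spans() below is a hypothetical helper named only for this illustration. It assumes the combined flags would yield [length, offset] pairs as described above.

```php
<?php
/**
 * Hypothetical helper (not part of PHP): emulates the [length, offset]
 * pairs that PREG_OFFSET_CAPTURE|PREG_LENGTH_CAPTURE is proposed to return.
 * Returns null when the pattern does not match or an error occurs.
 */
function preg_match_spans(string $pattern, string $subject, int $offset = 0): ?array
{
    $ok = preg_match($pattern, $subject, $m, PREG_OFFSET_CAPTURE, $offset);
    if ($ok !== 1) {
        return null;
    }

    $spans = [];
    foreach ($m as $key => [$text, $pos]) {
        // Replace the copied substring with its byte length.
        // Unmatched groups are reported by preg_match as ['', -1],
        // so they become [0, -1] here.
        $spans[$key] = [strlen($text), $pos];
    }
    return $spans;
}

$spans = preg_match_spans('/\G<([a-z]+)/', '<div class="x">hello</div>');
// $spans[0] is [4, 0] (whole match) and $spans[1] is [3, 1] (tag name).
```

Note that this emulation still lets preg_match copy the substrings internally before discarding them, so it only shows the proposed return format. The performance benefit would require the flag to be implemented inside the PCRE extension itself.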


 [2019-03-21 11:36 UTC] nikic@php.net
-Assigned To: +Assigned To: nikic
 
Last updated: Thu Nov 21 12:01:29 2024 UTC