Request #77744 Add preg_match flag to return only offsets
Submitted: 2019-03-14 18:39 UTC Modified: 2019-03-21 11:36 UTC
Votes:2
Avg. Score:5.0 ± 0.0
Reproduced:1 of 1 (100.0%)
Same Version:0 (0.0%)
Same OS:0 (0.0%)
From: cananian at wikimedia dot org
Assigned: nikic
Status: Assigned
Package: PCRE related
PHP Version: 7.3.3
OS:
Private report: No
CVE-ID: None

 [2019-03-14 18:39 UTC] cananian at wikimedia dot org
Description:
------------
When writing a high-performance lexer (for example, an HTML5 parser), it is worthwhile to minimize the number of string copies made.  You can perform matches at offsets into your master source string instead of slicing it.  However, preg_match still copies a substring for the entire matched region ($matches[0]) and for every captured group.  This gets expensive when the matched region or the captured groups are very large.
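
To illustrate the current behavior, here is a minimal sketch of offset-based matching with PREG_OFFSET_CAPTURE. The input string, the pattern, and the variable names are made up for illustration; only preg_match, PREG_OFFSET_CAPTURE, and the \G anchor come from PHP itself.

```php
<?php
// Illustrative input; the string, pattern and variable names are assumptions.
$source = '<div class="note">hello</div>';
$pos = 0;

// \G anchors the match at the offset given as the fifth argument,
// so the lexer does not have to slice $source before matching.
if (preg_match('/\G<([a-z]+)/', $source, $m, PREG_OFFSET_CAPTURE, $pos)) {
    // Each entry is [matched string, byte offset]; the string is a fresh copy.
    [$text, $offset] = $m[0];
    // Advance the lexer position past the match.
    $pos = $offset + strlen($text);
}
```

Even though only the offset and length are needed to advance $pos, $m[0][0] and $m[1][0] are still allocated as separate strings.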

It would be helpful if PHP's preg_match* functions offered a flag, say PREG_LENGTH_CAPTURE, that returned the numeric length of each match instead of the matched string.  Combined as PREG_OFFSET_CAPTURE|PREG_LENGTH_CAPTURE, each entry would hold the numeric length in element 0 and the offset in element 1, so the matched substring would never need to be copied.
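
The requested result shape could look roughly like the sketch below. PREG_LENGTH_CAPTURE does not exist in PHP; preg_match_lengths is a hypothetical userland helper that only emulates the [length, offset] layout and still pays for the substring copies the request wants to avoid.

```php
<?php
// Hypothetical helper (not part of PHP): returns [length, offset] pairs
// in place of the [string, offset] pairs produced by PREG_OFFSET_CAPTURE.
function preg_match_lengths(string $pattern, string $subject, int $offset = 0): ?array
{
    // Treat both "no match" (0) and "error" (false) as null.
    if (preg_match($pattern, $subject, $m, PREG_OFFSET_CAPTURE, $offset) !== 1) {
        return null;
    }
    $out = [];
    foreach ($m as $key => [$text, $start]) {
        // The copy has already been made here; a native flag would skip it.
        $out[$key] = [strlen($text), $start];
    }
    return $out;
}

$spans = preg_match_lengths('/\G<([a-z]+)/', '<div class="note">', 0);
// $spans[0] is [4, 0] for "<div"; $spans[1] is [3, 1] for "div".
```

A caller that does need the text of one group can recover it with substr($subject, $start, $length), copying only that group.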


History

 [2019-03-21 11:36 UTC] nikic@php.net
-Assigned To: +Assigned To: nikic
 