Request #77744 Add preg_match flag to return only offsets
Submitted: 2019-03-14 18:39 UTC Modified: 2019-03-21 11:36 UTC
Votes:2
Avg. Score:5.0 ± 0.0
Reproduced:1 of 1 (100.0%)
Same Version:0 (0.0%)
Same OS:0 (0.0%)
From: cananian at wikimedia dot org Assigned: nikic (profile)
Status: Assigned Package: PCRE related
PHP Version: 7.3.3 OS:
Private report: No CVE-ID: None
 [2019-03-14 18:39 UTC] cananian at wikimedia dot org
Description:
------------
When creating a high-performance lexer (for example, an HTML5 parser), it is worthwhile to minimize the number of string copies made. You can perform matches using offsets into your master source string. However, preg_match will copy a substring for the entire matched region ($matches[0]) as well as for every captured subpattern. This can get expensive if the matched region or the captured subpatterns are very large.
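To illustrate the current behavior, here is a minimal sketch of one lexer step that matches at an offset into the source string. The sample input, the pattern, and the variable names are assumptions made for this example and are not from the report. It uses only the existing PREG_OFFSET_CAPTURE flag and the offset argument of preg_match.

```php
<?php
// Hypothetical input; a real lexer would hold the whole document here.
$source = '<div class="a">text</div>';
$pos = 0; // current lexer position, as a byte offset into $source

// \G anchors the match at $pos, so no substr($source, $pos) copy is needed.
if (preg_match('/\G<([a-z]+)/', $source, $m, PREG_OFFSET_CAPTURE, $pos)) {
    // Each entry is [matched string, byte offset]; the matched strings
    // are the substring copies the report is concerned about.
    [$text, $start] = $m[0];
    // The string is used here only to learn how far to advance.
    $pos = $start + strlen($text);
}
```

In this sketch the lexer only needs the offset and the length of each match, yet every entry of $m still carries the matched string.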

It would be helpful if PHP's preg_match* functions offered a flag, say PREG_LENGTH_CAPTURE, which returned the numeric length of each match instead of the matched string. In combination, PREG_OFFSET_CAPTURE|PREG_LENGTH_CAPTURE would return the numeric length in element 0 and the offset in element 1, and avoid the need to copy the matched substring unnecessarily.
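The result shape being requested can be sketched in userland. PREG_LENGTH_CAPTURE does not exist in PHP; the helper below, preg_match_spans, is a hypothetical name invented for this example, and it only emulates the proposed [length, offset] layout on top of PREG_OFFSET_CAPTURE. It does not avoid the substring copies, which is exactly why a native flag is being requested.

```php
<?php
// Hypothetical helper: emulates the proposed
// PREG_OFFSET_CAPTURE|PREG_LENGTH_CAPTURE result layout.
// preg_match still copies each matched substring internally.
function preg_match_spans(string $pattern, string $subject, &$spans = null, int $offset = 0)
{
    $result = preg_match($pattern, $subject, $matches, PREG_OFFSET_CAPTURE, $offset);
    $spans = [];
    if ($result === 1) {
        foreach ($matches as $key => [$text, $start]) {
            // Element 0 is the length, element 1 the offset, as proposed.
            $spans[$key] = [strlen($text), $start];
        }
    }
    return $result;
}

$source = 'key = value';
preg_match_spans('/(\w+)\s*=\s*(\w+)/', $source, $spans);
// $spans is [[11, 0], [3, 0], [5, 6]]

// A caller copies a substring only when it actually needs the text.
[$length, $start] = $spans[2];
$value = substr($source, $start, $length); // 'value'
```

With a native flag, the caller would get the same pairs directly and could decide per token whether the substr() call is worth making.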


 [2019-03-21 11:36 UTC] nikic@php.net
-Assigned To: +Assigned To: nikic
 