|  support |  documentation |  report a bug |  advanced search |  search howto |  statistics |  random bug |  login
Request #77744 Add preg_match flag to return only offsets
Submitted: 2019-03-14 18:39 UTC Modified: 2019-03-21 11:36 UTC
Avg. Score:5.0 ± 0.0
Reproduced:1 of 1 (100.0%)
Same Version:0 (0.0%)
Same OS:0 (0.0%)
From: cananian at wikimedia dot org Assigned: nikic (profile)
Status: Assigned Package: PCRE related
PHP Version: 7.3.3 OS:
Private report: No CVE-ID: None
View Add Comment Developer Edit
Welcome! If you don't have a Git account, you can't do anything here.
You can add a comment by following this link or if you reported this bug, you can edit this bug over here.
Block user comment
Status: Assign to:
Bug Type:
From: cananian at wikimedia dot org
New email:
PHP Version: OS:


 [2019-03-14 18:39 UTC] cananian at wikimedia dot org
When creating a high-performance lexer (for example, an HTML5 parser), it is worthwhile to try to avoid the number of string copies made.  You can perform matches using offsets into your master source string.  However, preg_match will copy a substring for the entire matched region ($matches[0]) as well as for all captured patterns.  This can get expensive if the matched region/captured patterns are very large.

It would be helpful if PHP's preg_match* functions offered a flag, say PREG_LENGTH_CAPTURE, which returned the numeric length instead of the string.  In combination, PREG_OFFSET_CAPTURE|PREG_LENGTH_CAPTURE would return the numeric length in element 0 and the offset in element 1, and avoid the need to copy the matched substring unnecessarily.


Add a Patch

Pull Requests

Add a Pull Request


AllCommentsChangesGit/SVN commitsRelated reports
 [2019-03-21 11:36 UTC]
-Assigned To: +Assigned To: nikic
PHP Copyright © 2001-2024 The PHP Group
All rights reserved.
Last updated: Fri Apr 12 15:01:30 2024 UTC