Doc Bug #62360 Five valid PCRE escape sequences not documented
Submitted: 2012-06-19 01:28 UTC Modified: 2016-12-31 00:44 UTC
Votes:8
Avg. Score:4.0 ± 1.0
Reproduced:3 of 5 (60.0%)
Same Version:2 (66.7%)
Same OS:2 (66.7%)
From: danielklein at airpost dot net Assigned:
Status: Open Package: PCRE related
PHP Version: Irrelevant OS:
Private report: No CVE-ID: None
 [2012-06-19 01:28 UTC] danielklein at airpost dot net
Description:
------------
---
From manual page: http://www.php.net/regexp.reference.escape
---
There are currently five valid PCRE escape sequences not documented on this page: \C, \R, \X, \g & \k. Some of these are documented on other pages.

\g & \k - regexp.reference.back-references
\C - regexp.reference.dot
\R matches line break characters or combinations (see http://nikic.github.com/2011/12/10/PCRE-and-newlines.html)
\X matches Unicode graphemes (see http://www.regular-expressions.info/unicode.html)
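A minimal sketch of what \R and \X do, assuming PHP 7.0 or later (for the "\u{...}" string escape) and a PCRE build with Unicode support. The sample strings and variable names are illustrative only and are not part of the original report:

```php
<?php
// \R matches any newline sequence and treats "\r\n" as a single unit.
$lines = preg_split('/\R/', "one\r\ntwo\nthree\rfour");

// "e" followed by U+0301 (combining acute accent), then "a".
$text = "e\u{0301}a";

// \X consumes a whole grapheme cluster: a base character plus its combining marks.
$graphemes = preg_match_all('/\X/u', $text, $byCluster);

// The dot consumes one code point at a time, so the accent is matched separately.
$codePoints = preg_match_all('/./u', $text, $byCodePoint);

var_export($lines);
print("\n");
var_export([$graphemes, $codePoints]);
print("\n");
```

With this input, \X should report two clusters where the dot reports three code points, which is the distinction the manual page currently does not explain.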

Also, the description for \G is misleading. \G can also match in preg_match_all or preg_replace at the point where the previous match stopped (see example script).
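A smaller illustration of the same \G behaviour, as a sketch with a made-up subject string: in preg_match_all, \G ties each match to the position where the previous one ended, so the scan cannot skip ahead.

```php
<?php
$subject = '12a34';

// Without \G the engine may skip over "a" and keep finding digits.
preg_match_all('/\d/', $subject, $unanchored);

// With \G each match must begin exactly where the previous one ended,
// so matching stops as soon as "a" is reached.
preg_match_all('/\G\d/', $subject, $anchored);

var_export($unanchored[0]); // the four digits
print("\n");
var_export($anchored[0]);   // only '1' and '2'
print("\n");
```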

Test script:
---------------
preg_match_all('/\b([^\Wv]+)\s+/X', "In the beginning the universe was created", $matches, PREG_PATTERN_ORDER, 1); // Matches four words not containing 'v', then followed by a space
var_export($matches[1]);
print("\n");
preg_match_all('/\G\b([^\Wv]+)\s+/X', "In the beginning the universe was created", $matches, PREG_PATTERN_ORDER, 1); // No matches
var_export($matches[1]);
print("\n");
preg_match_all('/\G\b([^\Wv]+)\s+/X', "In the beginning the universe was created", $matches, PREG_PATTERN_ORDER, 3); // Able to match \b at offset 3, then two others. No word can have 'v' in it. \G prevents skipping to next word
var_export($matches[1]);


Actual result:
--------------
array (
  0 => 'the',
  1 => 'beginning',
  2 => 'the',
  3 => 'was',
)
array (
)
array (
  0 => 'the',
  1 => 'beginning',
  2 => 'the',
)

Patches

BUGS (last revision 2014-01-04 10:00 UTC by minktee at hotmail dot com)
patch_CHANGES (last revision 2014-01-04 09:42 UTC by minktee at hotmail dot com)
patch_BUGS (last revision 2014-01-04 09:41 UTC by minktee at hotmail dot com)
Patch_banner (last revision 2014-01-04 09:34 UTC by minktee at hotmail dot com)

History

 [2012-08-02 03:40 UTC] vovan-ve at yandex dot ru
Also \N is undocumented. It was introduced in PCRE/8.10 (PHP/5.3.4?).
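For reference, a short sketch of how \N differs from the dot, assuming a PCRE version that supports \N (8.10 or later, per the comment above). The subject string is illustrative:

```php
<?php
$subject = "a\nb";

// With the s (DOTALL) modifier the dot also matches a newline.
$dotMatches = preg_match('/a.b/s', $subject);

// \N never matches a newline, whether or not the s modifier is set.
$backslashNMatches = preg_match('/a\Nb/s', $subject);

var_export([$dotMatches, $backslashNMatches]);
print("\n");
```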

Also, some types of condition in a conditional subpattern, (?(condition)...), are undocumented. All available conditions are:

  n
  +n, -n         (PCRE >= 7.2, PHP >= 5.2.4)
  name           (PCRE >= 6.7, PHP >= 5.2.0)
  <name>, 'name' (PCRE >= 7.0, PHP >= 5.2.2)
  R
  Rn             (PCRE >= 7.0, PHP >= 5.2.2)
  R&name         (PCRE >= 7.0, PHP >= 5.2.2)
  ?assertion
  DEFINE         (PCRE >= 7.0, PHP >= 5.2.2)
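Three of the condition types listed above can be sketched as follows. The patterns, the group names "open" and "byte", and the test strings are hypothetical examples chosen for illustration, not taken from the manual:

```php
<?php
// Numeric condition: require ")" only if group 1 (the opening "(") matched.
$byNumber = '/^(\()?\d+(?(1)\))$/';

// Named condition using the <name> form.
$byName = '/^(?<open>\()?\d+(?(<open>)\))$/';

// DEFINE: the group is never matched in place; it only defines "byte"
// so that it can be called later with (?&byte).
$ipv4 = '/(?(DEFINE)(?<byte>25[0-5]|2[0-4]\d|1?\d?\d))^(?&byte)(?:\.(?&byte)){3}$/';

$results = [
    'number, balanced'   => preg_match($byNumber, '(12)'),
    'number, bare'       => preg_match($byNumber, '12'),
    'number, unbalanced' => preg_match($byNumber, '(12'),
    'name, balanced'     => preg_match($byName, '(12)'),
    'name, unbalanced'   => preg_match($byName, '12)'),
    'define, valid'      => preg_match($ipv4, '192.168.0.1'),
    'define, invalid'    => preg_match($ipv4, '256.1.1.1'),
];

var_export($results);
print("\n");
```

The balanced and bare inputs should match, and the unbalanced and out-of-range inputs should not.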

Also, (?C)/(?Cn) is neither documented nor implemented: the syntax is accepted but does nothing (PCRE >= 4.0).

The subject of the bug is "escape sequences not documented". Should it be renamed to "features not documented"?
 [2012-08-02 03:50 UTC] vovan-ve at yandex dot ru
Also recursive (?+n) and (?-n) are undocumented (PHP >= 5.2.4, PCRE >= 7.2). Back references in form \g<n>, \g<-n>, \g'n', \g'-n' (PHP >= 5.2.7, PCRE >= 7.7) are not mentioned.
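One point worth checking when documenting these forms: as far as the PCRE documentation describes them, \g{n} is a back reference (it must repeat the text already captured), whereas \g<n> and \g'n' are Oniguruma-style subroutine calls (they re-run the group's pattern, like (?n)). A sketch of the difference, using illustrative patterns:

```php
<?php
// Back reference: the second digit must equal the captured first digit.
$backReference = '/^(\d)\g{1}$/';

// Subroutine call: the group's pattern \d is applied again to new text.
$subroutineCall = '/^(\d)\g<1>$/';

// Relative subroutine call: (?-1) calls the most recently opened group.
$relativeCall = '/^(\d)(?-1)$/';

$results = [
    'backref, same digit'       => preg_match($backReference, '11'),
    'backref, different digit'  => preg_match($backReference, '12'),
    'subroutine, different'     => preg_match($subroutineCall, '12'),
    'relative call, different'  => preg_match($relativeCall, '12'),
];

var_export($results);
print("\n");
```

Only the back reference should reject '12', since the other two forms accept any second digit.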
 [2014-01-04 07:48 UTC] minktee at hotmail dot com
Document problem
 [2016-12-31 00:44 UTC] cmb@php.net
-Package: Documentation problem +Package: PCRE related
PHP Copyright © 2001-2024 The PHP Group
All rights reserved.
Last updated: Thu Nov 21 18:01:29 2024 UTC