Doc Bug #46904 preg_match() example #4 is wrong
Submitted: 2008-12-18 20:08 UTC Modified: 2009-09-12 06:57 UTC
Votes:2
Avg. Score:5.0 ± 0.0
Reproduced:2 of 2 (100.0%)
Same Version:1 (50.0%)
Same OS:2 (100.0%)
From: joe at digg dot com Assigned:
Status: Closed Package: Documentation problem
PHP Version: Irrelevant OS: Debian GNU/Linux
Private report: No CVE-ID: None
 [2008-12-18 20:08 UTC] joe at digg dot com
Description:
------------
On http://us.php.net/preg_match example #4 (Using named subpattern) is 
wrong. It shows:

<?php

$str = 'foobar: 2008';

preg_match('/(?<name>\w+): (?<digit>\d+)/', $str, $matches);

print_r($matches);

?>

The proper syntax for named expressions is (?P<foo>). 

Expected result:
----------------
<?php

$str = 'foobar: 2008';

preg_match('/(?P<name>\w+): (?P<digit>\d+)/', $str, $matches);

print_r($matches);

?>






 [2008-12-18 20:21 UTC] tobias382 at gmail dot com
Patch for /phpdoc/en/reference/pcre/functions/preg-match.xml:

278c278
< preg_match('/(?<name>\w+): (?<digit>\d+)/', $str, $matches);
---
> preg_match('/(?P<name>\w+): (?P<digit>\d+)/', $str, $matches);
 [2008-12-19 10:05 UTC] rquadling@php.net
I'm not so sure.

Using RegexBuddy to explain the two regexes ...

There seems to be no difference between the 2 forms.




(?<name>\w+): (?<digit>\d+)

Options: case insensitive; ^ and $ match at line breaks

Match the regular expression below and capture its match into 
backreference with name “name” «(?<name>\w+)»
   Match a single character that is a “word character” (letters, 
digits, etc.) «\w+»
      Between one and unlimited times, as many times as possible, 
giving back as needed (greedy) «+»
Match the characters “: ” literally «: »
Match the regular expression below and capture its match into 
backreference with name “digit” «(?<digit>\d+)»
   Match a single digit 0..9 «\d+»
      Between one and unlimited times, as many times as possible, 
giving back as needed (greedy) «+»







(?P<name>\w+): (?P<digit>\d+)

Options: case insensitive; ^ and $ match at line breaks

Match the regular expression below and capture its match into 
backreference with name “name” «(?P<name>\w+)»
   Match a single character that is a “word character” (letters, 
digits, etc.) «\w+»
      Between one and unlimited times, as many times as possible, 
giving back as needed (greedy) «+»
Match the characters “: ” literally «: »
Match the regular expression below and capture its match into 
backreference with name “digit” «(?P<digit>\d+)»
   Match a single digit 0..9 «\d+»
      Between one and unlimited times, as many times as possible, 
giving back as needed (greedy) «+»





 [2008-12-19 10:16 UTC] rquadling@php.net
According to the help for RegexBuddy ...

(?P<name>group) came from Python.

The PCRE followed Python's lead.

PHP offers the same functionality

So, initially you look correct.

But, again from the RegexBuddy help ...

"The regular expression classes of the .NET framework also support 
named capture. Unfortunately, the Microsoft developers decided to 
invent their own syntax, rather than follow the one pioneered by 
Python. Currently, no other regex flavor supports Microsoft's version 
of named capture.

Here is an example with two capturing groups in .NET style: 
(?<first>group)(?'second'group). As you can see, .NET offers two 
syntaxes to create a capturing group: one using sharp brackets, and 
the other using single quotes. The first syntax is preferable in 
strings, where single quotes may need to be escaped. The second 
syntax is preferable in ASP code, where the sharp brackets are used 
for HTML tags. You can use the pointy bracket flavor and the quoted 
flavors interchangeably.

To reference a capturing group inside the regex, use \k<name> or 
\k'name'. Again, you can use the two syntactic variations 
interchangeably."

This info is also available on 
http://www.regular-expressions.info/named.html

So, it seems PHP actually supports PCRE/Python's and Microsoft's 
mechanisms.

Ideally we should be reflecting the PCRE route but have a note that 
other mechanisms are supported.


Finally on this (from http://perldoc.perl.org/perlre.html - scroll 
down to "Capture Buffers").

"Additionally, as of Perl 5.10.0 you may use named capture buffers 
and named backreferences. The notation is (?<name>...) to declare and 
\k<name> to reference. You may also use apostrophes instead of angle 
brackets to delimit the name; and you may use the bracketed \g{name} 
backreference syntax. It's possible to refer to a named capture 
buffer by absolute and relative number as well. Outside the pattern, 
a named capture buffer is available via the %+ hash. When different 
buffers within the same pattern have the same name, $+{name} and 
\k<name> refer to the leftmost defined group. (Thus it's possible to 
do things with named capture buffers that would otherwise require 
(??{}) code to accomplish.)"


So, there is a differentiation between named captures and named 
backreferences.

(?<name>regex) is a named capture. You cannot use the name of the 
capture within the regex or the replacement (when searching and replacing).

So, technically and being ever so slightly picky, the documentation 
is correct.

But really it is incomplete. I'll try and put some more examples in 
differentiating between named captures and named backreferences.
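
As a rough sketch of that distinction (not the example that ended up in the manual), the snippet below uses PCRE's Python-style spellings: (?P<word>...) declares a named capture, and (?P=word) is a named backreference to it. The sample string and the variable names are made up for illustration.

```php
<?php
// Illustrative input, not taken from the manual.
$str = 'hello hello world';

// Named capture only: the name becomes a key in the match array.
preg_match('/(?P<word>\w+)/', $str, $capture);

// Named backreference: (?P=word) must match the same text that the
// group named "word" captured earlier in the pattern.
$hasRepeat = preg_match('/\b(?P<word>\w+) (?P=word)\b/', $str, $repeat);

print_r($capture);
print_r($repeat);
```

In the first call the name only labels the result; in the second it is also used inside the pattern to find a doubled word.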



 [2008-12-23 19:07 UTC] felipe@php.net
The PCRE documentation says:
"In PCRE, a subpattern can be named in one of three ways: (?<name>...) or (?'name'...) as in Perl, or (?P<name>...) as in Python."
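
A quick way to check this locally is to run all three spellings against the string from the manual example. This is only a sketch: it assumes a PHP build linked against PCRE 7.0 or later, and the array keys used as labels are made up for illustration.

```php
<?php
// Assumes PCRE >= 7.0; older versions reject the first two patterns.
$str = 'foobar: 2008';

// Hypothetical labels for the three ways of naming a subpattern.
$patterns = array(
    'perl-angle' => '/(?<name>\w+): (?<digit>\d+)/',
    'perl-quote' => "/(?'name'\\w+): (?'digit'\\d+)/",
    'python'     => '/(?P<name>\w+): (?P<digit>\d+)/',
);

$results = array();
foreach ($patterns as $label => $pattern) {
    // Keep the match array only when the pattern compiled and matched.
    if (preg_match($pattern, $str, $matches) === 1) {
        $results[$label] = $matches;
    }
}

print_r($results);
```

If the three entries in $results are identical, the installed PCRE treats the spellings as equivalent.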
 [2008-12-23 19:13 UTC] joe at digg dot com
This isn't bogus. At some point this was NOT valid, but now appears to 
be valid. In 5.2.6 it works fine, but in 5.2.0 it does NOT:

jstump@devwww25:~$ php -v && php -q foo.php 
PHP 5.2.0-8+etch1 (cli) (built: Mar  8 2007 09:15:48) 
Copyright (c) 1997-2006 The PHP Group
Zend Engine v2.2.0, Copyright (c) 1998-2006 Zend Technologies
PHP Warning:  preg_match(): Compilation failed: unrecognized character 
after (?< at offset 3 in /home/jstump/foo.php on line 5

Warning: preg_match(): Compilation failed: unrecognized character after 
(?< at offset 3 in /home/jstump/foo.php on line 5
 [2008-12-23 19:21 UTC] felipe@php.net
Yes, it's expected. But OK, we should specify a minimum PCRE version: 7.0 (bundled as of PHP 5.2.2).
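
Code that has to run on both sides of that cutoff could choose the syntax at runtime. The sketch below assumes the PCRE_VERSION constant is defined (as far as I know it only exists as of PHP 5.2.4, so older builds would need a different check); $open is a hypothetical helper variable.

```php
<?php
// PCRE_VERSION holds a version number followed by a release date;
// keep only the version number before the first space.
$pcreVersion = strtok(PCRE_VERSION, ' ');

// Hypothetical helper: use (?<name>...) on PCRE >= 7.0 and fall back
// to the older (?P<name>...) spelling otherwise.
$open = version_compare($pcreVersion, '7.0', '>=') ? '(?<' : '(?P<';
$pattern = '/' . $open . 'name>\w+): ' . $open . 'digit>\d+)/';

preg_match($pattern, 'foobar: 2008', $matches);
print_r($matches);
```

Using (?P<name>...) unconditionally is simpler and should work on both old and new PCRE versions; the check is only useful if the shorter syntax is wanted where available.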
 [2008-12-23 19:31 UTC] philip@php.net
And for good measure, here's a partial PCRE History in PHP (from changelogs):

5.0.0: 4.5
5.0.5: 5.0
5.1.0: 6.2
5.1.3: 6.6
5.2.0: 6.7
5.2.2: 7.0
5.2.4: 7.2
5.2.5: 7.3
5.2.6: 7.6
5.2.7: 7.8

4.4.9: 7.7
 [2009-09-12 06:55 UTC] svn@php.net
Automatic comment from SVN on behalf of torben
Revision: http://svn.php.net/viewvc/?view=revision&revision=288276
Log: Document that since PHP 5.2.2, named subpatterns accept the syntax
(?<name>) and (?'name') as well as the older (?P<name>).
Addresses bug #46904.
 [2009-09-12 06:57 UTC] torben@php.net
This bug has been fixed in the documentation's XML sources. Since the
online and downloadable versions of the documentation need some time
to get updated, we would like to ask you to be a bit patient.

Thank you for the report, and for helping us make our documentation better.


 [2009-12-17 12:28 UTC] svn@php.net
Automatic comment from SVN on behalf of seld
Revision: http://svn.php.net/viewvc/?view=revision&revision=292248
Log: Emphasize the use of ?P<> for named sub-patterns instead of the backwards incompatible ?<>. Fixes bug #50306 - refs bug #46904
 