Bug #49198 Incorrect result for .*
Submitted: 2009-08-08 20:08 UTC
Modified: 2009-08-10 16:04 UTC
Votes:2
Avg. Score:2.5 ± 1.5
Reproduced:1 of 1 (100.0%)
Same Version:0 (0.0%)
Same OS:1 (100.0%)
From: inf3rno dot hu at gmail dot com
Assigned:
Status: Not a bug
Package: PCRE related
PHP Version: 5.2.10
OS: *
Private report: No
CVE-ID: None
 [2009-08-08 20:08 UTC] inf3rno dot hu at gmail dot com
Description:
------------
For the pattern .* there is an extra, empty match at the end of the string.

Reproduce code:
---------------
$p1='/.*/';
$p2='/.*$/';
$p3='/^.*/';
$p4='/^.*$/';
$test='some text';

function test($p,$t)
{
	preg_match_all($p,$t,$m,PREG_SET_ORDER);
	echo $p.'<br />';
	if (count($m)==1)
		echo '<div style="color: green;">ok</div>';
	else
		echo '<div style="color: red;">bug</div>';
	echo '<pre>'.var_export($m,true).'</pre>';
	echo '<br /><br />';
}

test($p1,$test);
test($p2,$test);
test($p3,$test);
test($p4,$test);

Expected result:
----------------
I expect one match in the preg_match_all result array, but I get two. The second match is empty.

Actual result:
--------------
/.*/
bug

array (
  0 => 
  array (
    0 => 'some text',
  ),
  1 => 
  array (
    0 => '',
  ),
)



/.*$/
bug

array (
  0 => 
  array (
    0 => 'some text',
  ),
  1 => 
  array (
    0 => '',
  ),
)



/^.*/
ok

array (
  0 => 
  array (
    0 => 'some text',
  ),
)



/^.*$/
ok

array (
  0 => 
  array (
    0 => 'some text',
  ),
)

 [2009-08-08 21:04 UTC] pajoye@php.net
Not Windows-specific.
 [2009-08-09 10:49 UTC] inf3rno dot hu at gmail dot com
It's not specific to preg_match_all; the same bug occurs with every preg function.
 [2009-08-09 11:11 UTC] rasmus@php.net
If you change:
preg_match_all($p,$t,$m,PREG_SET_ORDER);
to:
preg_match($p,$t,$m);

There is no empty match.  I get this output:

/.*/<br /><div style="color: green;">ok</div><pre>array (
  0 => 'some text',
)</pre><br /><br />/.*$/<br /><div style="color: green;">ok</div><pre>array (
  0 => 'some text',
)</pre><br /><br />/^.*/<br /><div style="color: green;">ok</div><pre>array (
  0 => 'some text',
)</pre><br /><br />/^.*$/<br /><div style="color: green;">ok</div><pre>array (
  0 => 'some text',
)</pre><br /><br />

If you can get this effect with preg_match(), please show how.
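
The difference between the two calls can be checked directly. The following is a minimal sketch using the subject string from the report; the variable names are only illustrative. preg_match() stops after the first match, while preg_match_all() keeps searching from where the previous match ended.

```php
<?php
$subject = 'some text';

// preg_match() returns after the first match, so it never reaches
// the empty match at the end of the subject.
$single = preg_match('/.*/', $subject, $first);

// preg_match_all() resumes the search where the previous match ended.
$all = preg_match_all('/.*/', $subject, $sets, PREG_SET_ORDER);

var_dump($single);    // int(1)
var_dump($first[0]);  // string(9) "some text"
var_dump($all);       // int(2)
var_dump($sets[1][0]); // string(0) ""
```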
 [2009-08-09 11:55 UTC] rasmus@php.net
I am also pretty sure that this isn't actually a bug. Doing a match_all on a non-anchored pattern containing .* is going to match an empty string. Remember that * means 0 or more instances of the previous term. So you are doing a match_all for 0 or more characters: once the first match has consumed the whole string, the search resumes at the end of the string, and the zero characters left there still match. Change it to .+ (+ means 1 or more) and your patterns start to make sense; as you will see, the output is what you expect.
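
To see where each match starts, the offsets can be captured with PREG_OFFSET_CAPTURE. This is a sketch only; match_offsets() is a hypothetical helper written for this illustration, not part of PHP. It suggests that the empty match sits at offset 9, the end of the subject, that .*$ behaves the same way because $ also matches there, and that a leading ^ or the use of .+ rules the empty match out.

```php
<?php
// Hypothetical helper: returns array(offset, text) pairs for every match.
function match_offsets($pattern, $subject)
{
    preg_match_all($pattern, $subject, $m, PREG_SET_ORDER | PREG_OFFSET_CAPTURE);
    $out = array();
    foreach ($m as $set) {
        // With PREG_OFFSET_CAPTURE each entry is array(text, offset).
        list($text, $offset) = $set[0];
        $out[] = array($offset, $text);
    }
    return $out;
}

$subject = 'some text';
foreach (array('/.*/', '/.*$/', '/^.*/', '/.+/') as $p) {
    echo $p, ' => ', json_encode(match_offsets($p, $subject)), "\n";
}
// /.*/ => [[0,"some text"],[9,""]]
// /.*$/ => [[0,"some text"],[9,""]]
// /^.*/ => [[0,"some text"]]
// /.+/ => [[0,"some text"]]
```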
 [2009-08-10 15:42 UTC] inf3rno dot hu at gmail dot com
I don't agree. How do you explain the same behaviour with the .*$ pattern?
I think .* has to return a single string, not two; it's simple logic. One match for one string.
 [2009-08-10 16:04 UTC] inf3rno dot hu at gmail dot com
I tried it in JavaScript too and got the same result, so I was wrong :-) Sorry.
<body onload="init();">
php:<br />
<?php
$p1='/.*/';
$test='some text';

function test($m)
{
	echo '"'.$m[0].'"';
	echo '<br />';
	return $m[0];
}

preg_replace_callback($p1,'test',$test);
?><br />
javascript:<br />
<script>
function init()
{
	var p1=<?php echo $p1;?>g;
	var test="<?php echo $test;?>";
	test.replace(p1,function (m)
	{
		document.body.appendChild(document.createTextNode('"'+m+'"'));
		document.body.appendChild(document.createElement('br'));
		return m;
	});
}
</script>
</body>
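
The PHP half of that test can also be reduced to a standalone script. The sketch below assumes PHP 5.3 or later for the closure syntax (the report itself was filed against 5.2.10) and collects the matches in an illustrative $seen array instead of echoing them. The callback runs twice: once for the full string and once for the empty match at the end.

```php
<?php
$seen = array();

// Record every match the callback receives and return it unchanged.
$result = preg_replace_callback('/.*/', function ($m) use (&$seen) {
    $seen[] = $m[0];
    return $m[0];
}, 'some text');

var_dump(count($seen)); // int(2)
var_dump($seen[1]);     // string(0) ""
var_dump($result);      // string(9) "some text"
```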
 