Bug #41385 preg_match_all does not find all matches
Submitted: 2007-05-13 15:10 UTC
Modified: 2007-05-16 12:59 UTC
From: alexey at gmail dot com
Assigned:
Status: Not a bug
Package: *General Issues
PHP Version: 4.4.7
OS: linux
Private report: No
CVE-ID: None
 [2007-05-13 15:10 UTC] alexey at gmail dot com
Description:
------------
I cannot upgrade to 4.4.7 at this point. I am using 4.4.6, but the release notes do not list this problem as fixed, so I am reporting it now.

The code tries to find all tags, but it finds only a few of them after spending a lot of CPU cycles.



Reproduce code:
---------------
<?php
$data = "

<HTML xmlns=\"http://www.w3.org/1999/xhtml\" lang=\"en\">
<body>
<P>It's now possible to download whole Flash and WPF sites and run
them when you want. One example is the <EM>New York Times</EM> Reader,
for which I did some sketches, and which was reviewed on
TechnologyReview.com <EM>(see \"<A
href=http://www.technologyreview.com/Infotech/16760\"/>The </a> </em>
<a href=\"http://www.technologyreview.com/Infotech/16760/\">Times </a>
<em> <a href=\"http://www.technologyreview.com/Infotech/16760/\">
Emulates Print on the Web</a>,


</BODY>
</HTML>
";


preg_match_all("#</?([[:alpha:]]+)[[:space:]]*([^\">]*|(\"[^\">]*\"))*>#",
$data, $matches, PREG_OFFSET_CAPTURE);

print_r($matches[0]);


?>



Expected result:
----------------
It should include the following tags but it does not:


    [5] => Array
        (
            [0] => </a>
            [1] => 350
        )

    [6] => Array
        (
            [0] => </em>
            [1] => 355
        )

    [7] => Array
        (
            [0] => </a>
            [1] => 425
        )

    [8] => Array
        (
            [0] => <em>
            [1] => 430
        )

    [9] => Array
        (
            [0] => </a>
            [1] => 519
        )

    [10] => Array
        (
            [0] => </BODY>
            [1] => 527
        )

    [11] => Array
        (
            [0] => </HTML>
            [1] => 535
        )


Actual result:
--------------
    [0] => Array
        (
            [0] => <HTML xmlns="http://www.w3.org/1999/xhtml" lang="en">
            [1] => 2
        )

    [1] => Array
        (
            [0] => <body>
            [1] => 56
        )

    [2] => Array
        (
            [0] => <P>
            [1] => 63
        )

    [3] => Array
        (
            [0] => <EM>
            [1] => 169
        )

    [4] => Array
        (
            [0] => </EM>
            [1] => 187
        )

    [5] => Array
        (
            [0] => <EM>
            [1] => 279
        )



 [2007-05-16 12:59 UTC] tony2001@php.net
Sorry, but your problem does not imply a bug in PHP itself.  For a
list of more appropriate places to ask for help using PHP, please
visit http://www.php.net/support.php as this bug system is not the
appropriate forum for asking support questions.  Due to the volume
of reports we can not explain in detail here why your report is not
a bug.  The support channels will be able to provide an explanation
for you.

Thank you for your interest in PHP.
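
A possible explanation, which this thread itself does not give: the group ([^">]*|("[^">]*"))* nests one unbounded quantifier inside another. When a tag cannot match (here the first <A ...> tag, whose href value has a closing quote but no opening one), PCRE may try an exponential number of ways to split the attribute text before giving up, and the engine can abort on an internal limit, which would end the scan early. The sketch below illustrates that reading under stated assumptions: it uses a shortened sample of the reported data, it relies on preg_last_error() (PHP 5.2 and later, so not available on the 4.4.x versions in this report), and the possessive-quantifier variant is one hypothetical rewrite, not a fix endorsed by the PHP developers.

```php
<?php
// Shortened sample of the reported data. The first <A ...> tag is malformed:
// its href value ends with a quote that was never opened.
$data = '<EM>(see "<A href=http://www.technologyreview.com/Infotech/16760"/>The </a> </em>' . "\n"
      . '<a href="http://www.technologyreview.com/Infotech/16760/">Times </a></BODY>';

// Pattern from the report, unchanged.
$original = '#</?([[:alpha:]]+)[[:space:]]*([^">]*|("[^">]*"))*>#';

// Hypothetical rewrite: possessive quantifiers (++, *+) forbid backtracking
// into text that was already consumed, so a failing tag is rejected quickly.
$possessive = '#</?([[:alpha:]]+)[[:space:]]*(?:[^">]++|"[^">]*+")*+>#';

// Run the original pattern and ask PCRE whether it gave up early.
$count = preg_match_all($original, $data, $ignored);
$originalError = preg_last_error();
if ($count === false || $originalError !== PREG_NO_ERROR) {
    echo "original pattern stopped early, preg_last_error() = $originalError\n";
}

// Run the rewritten pattern and collect the matched tag strings.
$found = preg_match_all($possessive, $data, $matches, PREG_OFFSET_CAPTURE);
$tags = array();
foreach ($matches[0] as $hit) {
    $tags[] = $hit[0]; // $hit[1] holds the byte offset
}
print_r($tags);
```

Note that the rewritten pattern keeps only the tag name as a capture group, and it skips the malformed <A ...> tag rather than repairing it. Regular expressions are in any case a fragile way to process HTML of this quality.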


 