Bug #81424 PCRE2 10.35 JIT performance regression
Submitted: 2021-09-07 17:11 UTC Modified: 2021-10-25 16:35 UTC
From: nospam at briat dot org Assigned: cmb (profile)
Status: Closed Package: PCRE related
PHP Version: 7.4 OS: *
Private report: No CVE-ID: None
 [2021-09-07 17:11 UTC] nospam at briat dot org
Description:
------------
When debugging a sudden slowdown (about 20 times slower) in the https://www.drupal.org/project/advagg module for Drupal 7, which uses cssmin.php v2.4.8-4, I discovered that its heavy use of preg_replace was the core of the problem.

I noticed that the problem occurs with some versions of PHP:
PHP 7.3.27-9+0~20210227.82+debian9~1.gbpa4a3d6 (cli) (built: Feb 27 2021 15:51:31)  or php:7.4.12-cli-alpine 

but not with PHP 7.3.12 (cli) (built: Nov 22 2019 16:32:31) ( NTS ) or php:7.4.11-cli-alpine 

I made a small script that runs different versions of PHP with Docker, and the gap appears with php:7.4.12-cli-alpine and later.

Since it is not really tied to the PHP version, it could be related to PCRE library version 10.35 or later, or to #81243.


Test script:
---------------
# Create a huge CSS file with one selector of 15k characters on the first line,
# followed by some rules

composer require tubalmartin/cssmin

<?php
require './vendor/autoload.php';

use tubalmartin\CssMin\Minifier as CSSmin;

$my_cssmin = new CSSmin(true);
$data = file_get_contents('test.css');
$time_start = microtime(true);
$data = $my_cssmin->run($data, 4096);

echo PHP_EOL . strlen($data) . ' char. in ' . ( microtime(true) - $time_start ) . 's.' . PHP_EOL;
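
The test.css input is only described in words above, so here is a minimal sketch of how such a file could be generated. The function name `build_test_css` and the selector text are hypothetical; this is not the reporter's actual file, and it may not trigger the slowdown to the same degree.

```php
<?php
// Hypothetical generator for a test.css resembling the one described above:
// one very long selector on the first line, followed by a few short rules.
function build_test_css(int $selectorLength = 15000): string
{
    // Repeat a short class selector until the first line is long enough.
    $part = '.lorem-ipsum ';
    $selector = rtrim(str_repeat($part, intdiv($selectorLength, strlen($part)) + 1));
    $css = $selector . " { color: red; }\n";
    // A few ordinary rules, including an empty one that a minifier would drop.
    $css .= "h1 { margin: 0; }\n";
    $css .= ".empty {}\n";
    $css .= "p { padding: 1px 2px; }\n";
    return $css;
}

// Write the file next to test.php so the script above can read it.
file_put_contents('test.css', build_test_css());
```

The selector length is a parameter so that the input size can be varied when looking for the point at which the timing gap appears.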


Expected result:
----------------
The two Docker executions should have similar execution times.

Actual result:
--------------
docker run -it --rm --name my-running-script -v "$PWD":/usr/src/myapp -w /usr/src/myapp php:7.4.11-cli-alpine  php test.php
17091 char. in 0.0034229755401611s.
docker run -it --rm --name my-running-script -v "$PWD":/usr/src/myapp -w /usr/src/myapp php:7.4.11-cli-alpine  php  -i |egrep "(PCRE|reg)"
register_argc_argv => On => On
Multibyte (japanese) regex support => enabled
Multibyte regex (oniguruma) version => 6.9.5
mbstring.regex_retry_limit => 1000000 => 1000000
mbstring.regex_stack_limit => 100000 => 100000
PCRE (Perl Compatible Regular Expressions) Support => enabled
PCRE Library Version => 10.34 2019-11-21
PCRE Unicode Version => 12.1.0
PCRE JIT Support => enabled
PCRE JIT Target => x86 64bit (little endian + unaligned)
Phar fully realized by Gregory Beaver and Marcus Boerger.



docker run -it --rm --name my-running-script -v "$PWD":/usr/src/myapp -w /usr/src/myapp php:7.4.12-cli-alpine  php test.php
lorem: 17091char. in 0.20809006690979s.

docker run -it --rm --name my-running-script -v "$PWD":/usr/src/myapp -w /usr/src/myapp php:7.4.12-cli-alpine  php  -i |egrep "(PCRE|reg)"
register_argc_argv => On => On
Multibyte (japanese) regex support => enabled
Multibyte regex (oniguruma) version => 6.9.5
mbstring.regex_retry_limit => 1000000 => 1000000
mbstring.regex_stack_limit => 100000 => 100000
PCRE (Perl Compatible Regular Expressions) Support => enabled
PCRE Library Version => 10.35 2020-05-09
PCRE Unicode Version => 13.0.0
PCRE JIT Support => enabled
PCRE JIT Target => x86 64bit (little endian + unaligned)
Phar fully realized by Gregory Beaver and Marcus Boerger.
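
The same PCRE details that `php -i` prints above can also be read from inside a script, which is handy when comparing many images. This is a small sketch; the function name `pcre_info` and the array keys are made up for illustration, while `PCRE_VERSION` and the `pcre.jit` ini setting are standard PHP.

```php
<?php
// Report the PCRE2 library PHP is linked against and the pcre.jit setting.
function pcre_info(): array
{
    // PCRE_VERSION looks like "10.35 2020-05-09"; keep only the version number.
    $version = explode(' ', PCRE_VERSION)[0];
    return [
        'php'            => PHP_VERSION,
        'pcre'           => $version,
        // ini_get() returns false when the setting does not exist (no JIT support).
        'jit_setting'    => (bool) ini_get('pcre.jit'),
        // 10.35 is the first library version reported as slow in this report.
        'at_least_10_35' => version_compare($version, '10.35', '>='),
    ];
}

var_dump(pcre_info());
```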

 [2021-09-07 20:37 UTC] dharman@php.net
-Status: Open +Status: Feedback
 [2021-09-07 20:37 UTC] dharman@php.net
PHP 7.3 isn't actively maintained anymore. Do you see this problem with PHP 7.4.23 or with PHP 8.0.10? If the problem can't be reproduced with active PHP versions then I think this bug report should be closed. The issue could have been fixed in the meantime.
 [2021-09-07 21:02 UTC] nospam at briat dot org
-Status: Feedback +Status: Open
 [2021-09-07 21:02 UTC] nospam at briat dot org
Yes, sorry: the problem persists up to 7.4.23 and 8.0.8.
 [2021-09-08 08:50 UTC] nikic@php.net
> Create a huge CSS file with one selector of 15k characters on the first line,
> followed by some rules

Could you please provide the CSS file you used for testing?
 [2021-09-08 10:05 UTC] nospam at briat dot org
Here is a sample CSS file (not valid CSS):
https://pastebin.com/WVBR4f9T
Results on my computer:
❯ docker run -it --rm --name my-running-script -v "$PWD":/usr/src/myapp -w /usr/src/myapp php:7.4.11-cli-alpine php test.php
Size: 17709bytes, duration: 0.0039818286895752s.
❯ docker run -it --rm --name my-running-script -v "$PWD":/usr/src/myapp -w /usr/src/myapp php:7.4.12-cli-alpine php test.php
Size: 17709bytes, duration: 0.25268316268921s.
 [2021-09-08 10:07 UTC] cmb@php.net
-Status: Open +Status: Feedback -Assigned To: +Assigned To: cmb
 [2021-09-08 10:07 UTC] cmb@php.net
The regression can't be related to #81243, because that is only
fixed as of PHP 7.4.22.  It is more likely that it is related
to the PCRE version update.

> Could you please provide the CSS file you used for testing?

Yes, please.  And don't post it here in the bug tracker, but rather
put it on gist.github.com or so, and post the link here.
 [2021-09-08 10:08 UTC] cmb@php.net
-Status: Feedback +Status: Open
 [2021-09-08 11:09 UTC] cmb@php.net
-Summary: Since PCRE 10.35, pc +Summary: PCRE2 10.35 JIT performance regression -Status: Assigned +Status: Verified -PHP Version: 7.3Git-2021-09-07 (snap) +PHP Version: 7.4
 [2021-09-08 11:09 UTC] cmb@php.net
I can confirm the serious performance regression, and that it is
caused by the update to PCRE2 10.35.  The issue persists with
current master, which has PCRE2 10.37.  It is possible that the
issue has been fixed in PCRE2 10.38-RC1, but I haven't verified
that yet.  Just in case the issue has not been fixed upstream,
we'd need a simpler reproducer before reporting upstream.
 [2021-09-08 11:10 UTC] cmb@php.net
-Operating System: linux (centos,WSL docker alpine) +Operating System: *
 [2021-09-08 12:16 UTC] cmb@php.net
Simpler reproducer:

<?php
$data = file_get_contents("lorem.css.txt");
$time_start = hrtime(true);
var_dump(preg_match('/[^{};\/\n]+\{\}/S', $data));
var_dump(
    strlen($data),
    (hrtime(true) - $time_start) / 1e6 // elapsed milliseconds (hrtime() is in nanoseconds)
);
?>

Yields a performance regression by more than factor 100 for me.
I'm not sure, though, whether the PCRE2 maintainers would fix
this, since the regex appears to be suboptimal.  Using a (fixed-length)
lookbehind assertion

var_dump(preg_match('/(?<=[^{};\/\n])\{\}/S', $data));

instead makes almost no performance difference.
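
To check whether the JIT is what makes the difference on a given build, one option is to time the same pattern twice, once with `php -d pcre.jit=1` and once with `php -d pcre.jit=0`. Below is a rough sketch; the helper name `time_match` and the synthetic input are assumptions standing in for lorem.css.txt, so the absolute numbers will differ from the ones in this report.

```php
<?php
// Time a single preg_match() call and return the elapsed milliseconds.
function time_match(string $pattern, string $subject): float
{
    $start = hrtime(true);               // monotonic clock, nanoseconds
    preg_match($pattern, $subject);
    return (hrtime(true) - $start) / 1e6;
}

// Synthetic stand-in for lorem.css.txt: one very long selector line,
// followed by an empty rule that the pattern can match.
$data = str_repeat('.lorem-ipsum ', 1200) . "{ color: red; }\nh1 {}\n";

printf(
    "PCRE %s, pcre.jit=%s: %.3f ms\n",
    PCRE_VERSION,
    ini_get('pcre.jit'),
    time_match('/[^{};\/\n]+\{\}/S', $data)
);
```

Setting pcre.jit on the command line rather than with ini_set() avoids any doubt about patterns that were already compiled and cached earlier in the same process.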
 [2021-09-08 12:34 UTC] cmb@php.net
Anyhow, I filed <https://github.com/PhilipHazel/pcre2/issues/16>.
 [2021-09-08 13:47 UTC] nospam at briat dot org
Thanks for the analysis.
I will also file an issue at https://github.com/tubalmartin/YUI-CSS-compressor-PHP-port.
 [2021-09-08 14:41 UTC] nospam at briat dot org
Here is the issue: https://github.com/tubalmartin/YUI-CSS-compressor-PHP-port/issues/63

For the record, I think that your regex is maybe equivalent for matching but not for replacing: $css = preg_replace('/[^{};\/\n]+\{\}/S', '', $css); matches and replaces the whole CSS declaration.
 [2021-09-08 15:02 UTC] cmb@php.net
> […] I think that your regex is maybe equivalent for matching but
> not for replacing […]

Oh, right.  Maybe some other improvement can be made for that.
Anyway, let's wait on the upstream decision. :)
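
The difference between matching and replacing can be seen on a tiny input. This sketch assumes a fixed-length lookbehind, `(?<=[^{};\/\n])`, as the lookbehind variant; that is an illustrative choice, and the helper name `strip_with` is hypothetical.

```php
<?php
// Remove every match of $pattern from $css.
function strip_with(string $pattern, string $css): string
{
    return preg_replace($pattern, '', $css);
}

$css = "a{}b{color:red}";

// Pattern from cssmin: the selector is part of the match, so it is removed too.
$consuming = '/[^{};\/\n]+\{\}/S';
// Lookbehind variant: the selector is only asserted, so just "{}" is removed.
$lookbehind = '/(?<=[^{};\/\n])\{\}/S';

var_dump(strip_with($consuming, $css));  // string(12) "b{color:red}"
var_dump(strip_with($lookbehind, $css)); // string(13) "ab{color:red}"
```

Both patterns find the empty rule, but only the first removes its selector, which is what the minifier needs.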
 [2021-09-10 10:47 UTC] cmb@php.net
The following pull request has been associated:

Patch Name: Fix #81424: PCRE2 10.35 JIT performance regression
On GitHub:  https://github.com/php/php-src/pull/7484
Patch:      https://github.com/php/php-src/pull/7484.patch
 [2021-09-13 12:40 UTC] git@php.net
Automatic comment on behalf of cmb69
Revision: https://github.com/php/php-src/commit/a2471383fec332ae30827c7e3f4f9451420f1f0b
Log: Fix #81424: PCRE2 10.35 JIT performance regression
 [2021-09-13 12:40 UTC] git@php.net
-Status: Verified +Status: Closed
 [2021-10-05 09:55 UTC] cmb@php.net
Since this fix introduces a functional regression[1], it has been
reverted for now.

[1] <https://github.com/PhilipHazel/pcre2/issues/21>
 [2021-10-05 09:55 UTC] cmb@php.net
-Status: Closed +Status: Re-Opened
 [2021-10-12 10:07 UTC] cmb@php.net
The following pull request has been associated:

Patch Name: Fix #81424: PCRE2 10.35 JIT performance regression
On GitHub:  https://github.com/php/php-src/pull/7573
Patch:      https://github.com/php/php-src/pull/7573.patch
 [2021-10-12 12:23 UTC] git@php.net
Automatic comment on behalf of cmb69
Revision: https://github.com/php/php-src/commit/788a701e222c70823472ef13d20bbfc794ebd82c
Log: Fix #81424: PCRE2 10.35 JIT performance regression
 [2021-10-12 12:23 UTC] git@php.net
-Status: Re-Opened +Status: Closed
 [2021-10-25 16:09 UTC] hegyre at gmail dot com
Hello!
Does this mean it is fixed in PHP 7.4.25?
 [2021-10-25 16:35 UTC] cmb@php.net
> Does this mean it is fixed in PHP 7.4.25?

No, it was too late for that version, but it should be fixed as of
PHP 7.4.26.
 [2021-10-27 23:33 UTC] apostnikov at gmail dot com
Thank you for the test. It failed while I was building 8.1.0RC5 for Alpine Linux edge, and it helped to catch a regression in pcre2: https://github.com/PhilipHazel/pcre2/pull/22
 
Last updated: Sun Aug 14 04:05:44 2022 UTC