Doc Bug #80652 Leading whitespace in a regex is ignored
Submitted: 2021-01-21 13:13 UTC
Modified: 2021-01-21 13:30 UTC
From: greg at subaqua dot co dot uk
Assigned:
Status: Closed
Package: PCRE related
PHP Version: Irrelevant
OS: n/a
Private report: No
CVE-ID: None

 [2021-01-21 13:13 UTC] greg at subaqua dot co dot uk
Description:
------------
The documentation at https://www.php.net/manual/en/regexp.reference.delimiters.php says that:

When using the PCRE functions, it is required that the pattern is enclosed by delimiters. A delimiter can be any non-alphanumeric, non-backslash, non-whitespace character.

The validation for delimiters works in *most* cases and gives a warning such as:

Warning: preg_split(): Delimiter must not be alphanumeric or backslash in php shell code on line 1

However, it fails to detect the invalid delimiters in the string " *, *".

Instead, it runs without error/warning and gives output based on a substring(?) of the supplied regex.

According to https://3v4l.org/DdBp1 it affects all versions of PHP from 4.3 to 8.0.

Test script:
---------------
# Valid delimiters - PASS
php > var_export(preg_split('/ *, */', 'a , b'));
array (
  0 => 'a',
  1 => 'b',
)

# Invalid delimiters - PASS - gives warning as expected
php > var_dump(preg_split('X *, *Y', 'a , b'));

Warning: preg_split(): Delimiter must not be alphanumeric or backslash in php shell code on line 1
bool(false)

# Invalid delimiters - FAIL - should warn about either mismatched or invalid delimiters.
# Instead, it gives no warning and unexpected output.
php > var_export(preg_split(' *, *', 'a , b'));
array (
  0 => 'a ',
  1 => 'b',
)


Expected result:
----------------
I expect the invalid delimiters to trigger a warning.

Actual result:
--------------
The invalid delimiters are silently ignored.
The output does not correspond to the supplied regex.
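
Until such a warning exists, a caller could check for the condition before handing the pattern to a PCRE function. The following is only a sketch: `patternHasLeadingWhitespace()` is a hypothetical helper name, not part of PHP, and it assumes the ctype extension is available.

```php
<?php
// Hypothetical helper (not a PHP built-in): returns true when the pattern
// string begins with a whitespace character, i.e. when the character that
// looks like the opening delimiter is whitespace.
function patternHasLeadingWhitespace(string $pattern): bool
{
    return $pattern !== '' && ctype_space($pattern[0]);
}

$pattern = ' *, *';
if (patternHasLeadingWhitespace($pattern)) {
    // Raise a userland warning in place of the one the report expected.
    trigger_error('Pattern starts with whitespace', E_USER_WARNING);
}
```

The check only looks at the first character, so a correctly delimited pattern such as '/ *, */' passes untouched.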

 [2021-01-21 13:30 UTC] requinix@php.net
-Summary: Regex delimiters not validated correctly
+Summary: Leading whitespace in a regex is ignored
-Type: Bug
+Type: Documentation Problem
 [2021-01-21 13:30 UTC] requinix@php.net
Leading whitespace in a pattern is ignored. The delimiters thus are the asterisks, and the pattern splits on comma + space.

A quick comment in the docs regarding the leading whitespace sounds reasonable.
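
To illustrate the explanation above, here is a minimal sketch comparing the pattern from the report with the same regex written with explicit delimiters. The variable names are illustrative only, and the expected value is the one shown in the report's test script.

```php
<?php
$subject = 'a , b';

// The leading space is skipped, so '*' becomes the delimiter
// and the actual regex is ', ' (comma followed by a space).
$withSpace = preg_split(' *, *', $subject);

// The same regex with the '*' delimiters written without the leading space.
$explicit = preg_split('*, *', $subject);

// The same regex again with the conventional '/' delimiters.
$slashes = preg_split('/, /', $subject);

// All three calls split on ', ' and leave the space after 'a' in place.
var_export($withSpace === $explicit && $explicit === $slashes);
```

To get the result the reporter wanted, the pattern needs real delimiters around the whole regex, as in the first call of the test script: '/ *, */'.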
 [2021-03-16 16:38 UTC] cmb@php.net
Automatic comment on behalf of cmbecker69@gmx.de
Revision: http://git.php.net/?p=doc/en.git;a=commit;h=7962c3e6a2af22dbc568e3cd117287d8baec8e80
Log: Fix #80652: Leading whitespace in a regex is ignored
 [2021-03-16 16:38 UTC] cmb@php.net
-Status: Open
+Status: Closed
 [2021-03-18 17:25 UTC] mumumu@php.net
Automatic comment on behalf of mumumu@mumumu.org
Revision: http://git.php.net/?p=doc/ja.git;a=commit;h=8c734867b45e5b16fdcab7ae5d09bca89f40f2a9
Log: Fix #80652: Leading whitespace in a regex is ignored
 