Bug #50605 preg_split Wrong count of matches with PREG_SPLIT_DELIM_CAPTURE
Submitted: 2009-12-29 12:57 UTC Modified: 2009-12-29 13:57 UTC
Votes:2
Avg. Score:5.0 ± 0.0
Reproduced:2 of 2 (100.0%)
Same Version:2 (100.0%)
Same OS:2 (100.0%)
From: serovov at gmail dot com Assigned:
Status: Not a bug Package: PCRE related
PHP Version: 5.2.12 OS: *
Private report: No CVE-ID: None
 [2009-12-29 12:57 UTC] serovov at gmail dot com
Description:
------------
When preg_split is used with PREG_SPLIT_DELIM_CAPTURE, the number of captured elements returned for each delimiter differs from match to match.

Reproduce code:
---------------
<?php
$res1 = preg_split(
    '{((a|b)|c)}six',
    '--a--b--c--',
    0,
    PREG_SPLIT_DELIM_CAPTURE
);
var_export($res1);

Expected result:
----------------
array (
  0 => '--',
  1 => 'a',
  2 => '--',
  3 => 'b',
  4 => '--',
  5 => 'c',
  6 => '--',
)
OR:
array (
  0 => '--',
  1 => 'a', // Whole match
  2 => 'a', // First group
  3 => 'a', // Second group
  4 => '--',
  5 => 'b',
  6 => 'b',
  7 => 'b',
  8 => '--',
  9 => 'c', // Whole match
  10 => 'c', // First group
  11 => '', // Second group: no match
  12 => '--',
)

Actual result:
--------------
array (
  0 => '--',
  1 => 'a',
  2 => 'a',
  3 => '--',
  4 => 'b',
  5 => 'b',
  6 => '--',
  7 => 'c',
  8 => '--',
)
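
A likely explanation for the uneven counts, offered as a sketch rather than a confirmed diagnosis: PCRE does not report trailing groups that took no part in a match, so a 'c' delimiter yields one captured group while 'a' and 'b' yield two. The same effect can be seen with preg_match on single delimiters; the variable names $forA and $forC are illustrative only.

```php
<?php
// Same pattern as in the report, applied to one delimiter at a time.
$pattern = '{((a|b)|c)}six';

preg_match($pattern, 'a', $forA); // groups 1 and 2 both participate
preg_match($pattern, 'c', $forC); // group 2 does not participate

// Trailing groups that did not participate are omitted from the result,
// so the two arrays have different lengths.
var_export(count($forA)); // 3: whole match, group 1, group 2
echo "\n";
var_export(count($forC)); // 2: whole match, group 1
echo "\n";
```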

 [2009-12-29 13:51 UTC] jani@php.net
To get the expected (OR) result, you need one more set of parentheses:

{(((a|b)|c))}six

The outermost group then captures the whole match as well.
 [2009-12-29 13:57 UTC] serovov at gmail dot com
No, that pattern returns:
array (
  0 => '--',
  1 => 'a',
  2 => 'a',
  3 => 'a',
  4 => '--',
  5 => 'b',
  6 => 'b',
  7 => 'b',
  8 => '--',
  9 => 'c',
  10 => 'c',
  11 => '--',
)

So you never know how many elements each delimiter will add to the result, which makes the output hard to use reliably. How can I tell whether a given element is a delimiter or delimited text?
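
One possible workaround, sketched here on the assumption that only the delimiter text itself is needed: keep a single capturing group and make the inner group non-capturing with (?:...). Every delimiter then adds exactly one element, so delimited text and delimiters alternate. The variable name $parts is illustrative.

```php
<?php
// Only the outer group captures; the inner alternation is non-capturing.
$parts = preg_split(
    '{((?:a|b)|c)}six',
    '--a--b--c--',
    0,
    PREG_SPLIT_DELIM_CAPTURE
);

// Even indexes hold delimited text, odd indexes hold delimiters.
var_export($parts);
echo "\n";
```

With this input, the result has seven elements: '--', 'a', '--', 'b', '--', 'c', '--'.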