Bug #50605 preg_split Wrong count of matches with PREG_SPLIT_DELIM_CAPTURE
Submitted: 2009-12-29 12:57 UTC Modified: 2009-12-29 13:57 UTC
Votes:2
Avg. Score:5.0 ± 0.0
Reproduced:2 of 2 (100.0%)
Same Version:2 (100.0%)
Same OS:2 (100.0%)
From: serovov at gmail dot com Assigned:
Status: Not a bug Package: PCRE related
PHP Version: 5.2.12 OS: *
Private report: No CVE-ID: None
 [2009-12-29 12:57 UTC] serovov at gmail dot com
Description:
------------
When preg_split() is used with PREG_SPLIT_DELIM_CAPTURE, the number of captured delimiter entries returned per match is not constant: it differs from one match to the next.

Reproduce code:
---------------
<?php
$res1 = preg_split(
    '{((a|b)|c)}six',
    '--a--b--c--',
    0,
    PREG_SPLIT_DELIM_CAPTURE
);
var_export($res1);

Expected result:
----------------
array (
  0 => '--',
  1 => 'a',
  2 => '--',
  3 => 'b',
  4 => '--',
  5 => 'c',
  6 => '--',
)
OR:
array (
  0 => '--',
  1 => 'a', // Whole pattern (group 0)
  2 => 'a', // First group
  3 => 'a', // Second group
  4 => '--',
  5 => 'b',
  6 => 'b',
  7 => 'b',
  8 => '--',
  9 => 'c', // Whole pattern (group 0)
  10 => 'c', // First group
  11 => '', // Second group: no match
  12 => '--',
)

Actual result:
--------------
array (
  0 => '--',
  1 => 'a',
  2 => 'a',
  3 => '--',
  4 => 'b',
  5 => 'b',
  6 => '--',
  7 => 'c',
  8 => '--',
)
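
A possible explanation for the varying count, offered as a sketch rather than a statement about the internals: PCRE-based functions in PHP do not report trailing capture groups that took no part in the match. The snippet below checks this with plain preg_match() on the same pattern. The variable names are illustrative only.

```php
<?php
// Pattern from the report: group 1 is ((a|b)|c), group 2 is the inner (a|b).
$pattern = '{((a|b)|c)}six';

// 'a' matches through the inner group, so groups 1 and 2 are both set.
preg_match($pattern, 'a', $matchA);

// 'c' matches only the outer alternative. Group 2 is the last group and
// is unset, so it is left out of the result entirely.
preg_match($pattern, 'c', $matchC);

// count() includes index 0 (the whole match).
$countA = count($matchA);
$countC = count($matchC);

printf("a: %d entries, c: %d entries\n", $countA, $countC);
```

preg_split() with PREG_SPLIT_DELIM_CAPTURE returns only the capture groups, not the whole match, which would account for two delimiter entries after 'a' and 'b' but only one after 'c'.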

History

 [2009-12-29 13:51 UTC] jani@php.net
To get the expected (OR) result, you need one more set of parentheses:

{(((a|b)|c))}six

Then it includes all patterns.
 [2009-12-29 13:57 UTC] serovov at gmail dot com
No! It will return:
array (
  0 => '--',
  1 => 'a',
  2 => 'a',
  3 => 'a',
  4 => '--',
  5 => 'b',
  6 => 'b',
  7 => 'b',
  8 => '--',
  9 => 'c',
  10 => 'c',
  11 => '--',
)

So you never know how many elements each match will produce, and you cannot use the result reliably. How would I know whether an element is a delimiter or delimited text?
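
One way to get a predictable layout, sketched here as a workaround and not as anything proposed in this thread: keep exactly one capturing group and make the inner alternation non-capturing with (?:...). Assuming PREG_SPLIT_NO_EMPTY is not set, every match then contributes exactly one delimiter entry, so even indexes hold text and odd indexes hold delimiters. The rewritten pattern and the variable names are assumptions for illustration.

```php
<?php
// Hypothetical rewrite of the reported pattern: only the outer group
// captures; the inner (a|b) alternation becomes non-capturing.
$pattern = '{((?:a|b)|c)}six';

$parts = preg_split($pattern, '--a--b--c--', 0, PREG_SPLIT_DELIM_CAPTURE);

// With exactly one captured entry per match, the positions alternate:
// even index = delimited text, odd index = delimiter.
$text = array();
$delimiters = array();
foreach ($parts as $i => $part) {
    if ($i % 2 === 0) {
        $text[] = $part;
    } else {
        $delimiters[] = $part;
    }
}

var_export($parts);
```

The alternation of positions relies on empty pieces being kept, so adding PREG_SPLIT_NO_EMPTY would break the even/odd rule.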
 