Bug #9029 preg_split inconsistency
Submitted: 2001-01-31 08:13 UTC
Modified: 2001-02-03 07:42 UTC
From: anil at recoil dot org
Assigned:
Status: Closed
Package: PCRE related
PHP Version: 4.0.4pl1
OS: OpenBSD 2.8
Private report: No
CVE-ID: None
 [2001-01-31 08:13 UTC] anil at recoil dot org
While trying to write a simple Scheme parser in PHP, we encountered this problem:

In Perl, split() also returns the delimiters when the relevant portion of the regexp is enclosed in parentheses. This does not work in PHP.

For example, here is the Perl script and its output:

#!/usr/bin/perl
use Data::Dumper;

$_ = '(define heshe (if (equal gender "Male") "He" "She")) ';
@b = split /([\(\)])/;
print Dumper(@b);

Output:
$VAR1 = '';
$VAR2 = '(';
$VAR3 = 'define heshe ';
$VAR4 = '(';
$VAR5 = 'if ';
$VAR6 = '(';
$VAR7 = 'equal gender "Male"';
$VAR8 = ')';
$VAR9 = ' "He" "She"';
$VAR10 = ')';
$VAR11 = '';
$VAR12 = ')';
$VAR13 = ' ';

But in PHP:

<? $a = '(define heshe (if (equal gender "Male") "He" "She")) ';
$b = preg_split('/([\(\)])/', $a);
var_dump($b); ?>

Outputs:
array(7) {
  [0]=>
  string(0) ""
  [1]=>
  string(13) "define heshe "
  [2]=>
  string(3) "if "
  [3]=>
  string(19) "equal gender "Male""
  [4]=>
  string(11) " "He" "She""
  [5]=>
  string(0) ""
  [6]=>
  string(1) " "
}

The brackets aren't captured, which makes the Scheme parsing a bit difficult :-)  There are other ways to do this, of course, but preg_split() would be the simplest and most efficient way to do it, as far as we can see.

I noticed that the PCRE libraries are only at version 3.0 in PHP, and that 3.4 is available from the main site.  I'll try updating to that, and see if it solves this.

 [2001-01-31 08:43 UTC] derick@php.net
You have to escape the backslashes (as per a hint from Wico de Leeuw).
 [2001-01-31 09:17 UTC] anil at recoil dot org
I'm afraid that doesn't seem to work either (I tried that, but neglected to mention it in my original report)

[avsm@total avsm]$ php -v
4.0.4pl1
[avsm@total avsm]$ cat split.php
<?
$a = '(define heshe (if (equal gender "Male") "He" "She")) ';
$b = preg_split('/([\\(\\)])/', $a);
var_dump($b);
?>
[avsm@total avsm]$ php split.php
X-Powered-By: PHP/4.0.4pl1
Content-type: text/html

array(7) {
  [0]=>
  string(0) ""
  [1]=>
  string(13) "define heshe "
  [2]=>
  string(3) "if "
  [3]=>
  string(19) "equal gender "Male""
  [4]=>
  string(11) " "He" "She""
  [5]=>
  string(0) ""
  [6]=>
  string(1) " "
}
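
A note on why the extra escaping changes nothing: in a single-quoted PHP string, `\(` is left untouched and `\\` collapses to a single backslash, so both patterns tried so far reach PCRE as the same string. A minimal check (the variable names are illustrative only):

```php
<?php
// The two pattern literals from this report, both single-quoted.
$single = '/([\(\)])/';
$double = '/([\\(\\)])/';

// In single-quoted strings only \\ and \' are escape sequences,
// so both literals produce the same pattern string.
var_dump($single === $double);

// Show the pattern that preg_split() actually receives.
echo $single, "\n";
```

The comparison is true, which is consistent with the identical output from both attempts.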

 [2001-02-03 07:42 UTC] avsm@php.net
Functionality added in 4.0.5-dev by andrei@
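
The comment above does not name the new functionality. It is presumably the PREG_SPLIT_DELIM_CAPTURE flag, which the PHP manual lists as available from 4.0.5, so treat that link as an assumption. A sketch of the original reproducer with the flag set (`$tokens` and `$nonEmpty` are illustrative names):

```php
<?php
$a = '(define heshe (if (equal gender "Male") "He" "She")) ';

// A limit of -1 means "no limit"; the fourth argument asks preg_split()
// to return the text matched by the parenthesized group as well.
$tokens = preg_split('/([\(\)])/', $a, -1, PREG_SPLIT_DELIM_CAPTURE);
var_dump($tokens);

// Optionally also drop the empty strings, which a tokenizer rarely needs.
$nonEmpty = preg_split(
    '/([\(\)])/',
    $a,
    -1,
    PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY
);
var_dump($nonEmpty);
```

With the flag set, the first call returns the same 13 elements as the Perl output in the original report, parentheses included.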
 