Bug #9029 preg_split inconsistency
Submitted: 2001-01-31 08:13 UTC Modified: 2001-02-03 07:42 UTC
From: anil at recoil dot org Assigned:
Status: Closed Package: PCRE related
PHP Version: 4.0.4pl1 OS: OpenBSD 2.8
Private report: No CVE-ID: None
 [2001-01-31 08:13 UTC] anil at recoil dot org
While trying to write a simple Scheme parser in PHP, we encountered this problem:

In Perl, you can capture the delimiters passed to split() by enclosing that part of the regexp in parentheses.  This does not work in PHP.

For example, here is the Perl script and its output:

#!/usr/bin/perl
use Data::Dumper;

$_ = '(define heshe (if (equal gender "Male") "He" "She")) ';
@b = split /([\(\)])/;
print Dumper(@b);

Output:
$VAR1 = '';
$VAR2 = '(';
$VAR3 = 'define heshe ';
$VAR4 = '(';
$VAR5 = 'if ';
$VAR6 = '(';
$VAR7 = 'equal gender "Male"';
$VAR8 = ')';
$VAR9 = ' "He" "She"';
$VAR10 = ')';
$VAR11 = '';
$VAR12 = ')';
$VAR13 = ' ';

But in PHP:

<? $a = '(define heshe (if (equal gender "Male") "He" "She")) ';
$b = preg_split('/([\(\)])/', $a);
var_dump($b); ?>

Outputs:
array(7) {
  [0]=>
  string(0) ""
  [1]=>
  string(13) "define heshe "
  [2]=>
  string(3) "if "
  [3]=>
  string(19) "equal gender "Male""
  [4]=>
  string(11) " "He" "She""
  [5]=>
  string(0) ""
  [6]=>
  string(1) " "
}

The parentheses aren't captured, which makes the Scheme parsing a bit difficult :-)  There are other ways to do this, of course, but preg_split() would be the simplest and most efficient way to do it, as far as we can see.

I noticed that the PCRE library bundled with PHP is only at version 3.0, and that 3.4 is available from the main site.  I'll try updating to that and see if it solves this.

 [2001-01-31 08:43 UTC] derick@php.net
You have to escape the backslashes (as per a hint from Wico de Leeuw).
 [2001-01-31 09:17 UTC] anil at recoil dot org
I'm afraid that doesn't seem to work either (I tried that, but neglected to mention it in my original report).

[avsm@total avsm]$ php -v
4.0.4pl1
[avsm@total avsm]$ cat split.php
<?
$a = '(define heshe (if (equal gender "Male") "He" "She")) ';
$b = preg_split('/([\\(\\)])/', $a);
var_dump($b);
?>
[avsm@total avsm]$ php split.php
X-Powered-By: PHP/4.0.4pl1
Content-type: text/html

array(7) {
  [0]=>
  string(0) ""
  [1]=>
  string(13) "define heshe "
  [2]=>
  string(3) "if "
  [3]=>
  string(19) "equal gender "Male""
  [4]=>
  string(11) " "He" "She""
  [5]=>
  string(0) ""
  [6]=>
  string(1) " "
}
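
A short sketch of why doubling the backslashes makes no difference here: in a single-quoted PHP string only \\ and \' are escape sequences, so '\(' and '\\(' both produce a backslash followed by a parenthesis, and the two patterns tried above are the same string. The variable names are illustrative only.

```php
<?php
// Pattern as written in the original report.
$single = '/([\(\)])/';
// Pattern with doubled backslashes, as suggested in the reply.
$double = '/([\\(\\)])/';

// In single quotes, "\(" is kept verbatim and "\\" collapses to "\",
// so both literals yield exactly the same pattern string.
var_dump($single === $double);

// Both therefore split the input identically.
$a = '(define heshe (if (equal gender "Male") "He" "She")) ';
var_dump(preg_split($single, $a) === preg_split($double, $a));
```

Inside a character class the parentheses need no escaping at all, so '/([()])/' should behave the same way.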

 [2001-02-03 07:42 UTC] avsm@php.net
Functionality added in 4.0.5-dev by andrei@
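
The closing note does not name the feature. As far as I know, it is the PREG_SPLIT_DELIM_CAPTURE flag, documented as available from PHP 4.0.5. The following is a minimal sketch under that assumption, reusing the input string from the report:

```php
<?php
$a = '(define heshe (if (equal gender "Male") "He" "She")) ';

// Assumption: PREG_SPLIT_DELIM_CAPTURE is the functionality referred to above.
// With it, text matched by the parenthesised group in the pattern is
// returned alongside the pieces between delimiters.
// A limit of -1 means "no limit".
$parts = preg_split('/([()])/', $a, -1, PREG_SPLIT_DELIM_CAPTURE);
var_dump($parts);

// Optionally drop the empty strings as well, which is usually
// what a tokenizer wants.
$tokens = preg_split('/([()])/', $a, -1,
                     PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);
var_dump($tokens);
```

With the flag set, $parts should contain the same 13 elements as the Perl output in the original report, including the empty strings at the start and between the two consecutive closing parentheses.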