Bug #15413 preg_split drops characters
Submitted: 2002-02-06 19:27 UTC Modified: 2002-02-07 06:41 UTC
From: caribu at snafu dot de Assigned:
Status: Not a bug Package: PCRE related
PHP Version: 4.0.6 OS: win
Private report: No CVE-ID: None
 [2002-02-06 19:27 UTC] caribu at snafu dot de
$html = {some html containing multiple instances of "<!-- [some text] --> some text <!-- end -->"}

$array = preg_split("'(?=<!--)(?!<!-- end -->)|(?<=\<!-- ende -->)'si", $html, -1, PREG_SPLIT_NO_EMPTY)

preg_split should split the string so that the HTML part and the commented part are each assigned to one array entry.

preg_split does as expected, EXCEPT that the first character of the last array entry disappears.

Server API Apache Win2000
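
Since every branch of the pattern above is a zero-width assertion, one way to test the claim is to split and re-join, then compare with the input. The following is a minimal sketch only: the sample `$html` is made up (the report does not include the real input), and the pattern is copied as written, including the "end" / "ende" difference between its two branches.

```php
<?php
// Hypothetical sample input; the real HTML from the report is not available.
$html = 'intro <!-- a --> first <!-- end --> middle <!-- b --> second <!-- end --> tail';

// Pattern copied verbatim from the report (single quotes are the regex delimiters).
$pattern = "'(?=<!--)(?!<!-- end -->)|(?<=\<!-- ende -->)'si";

$parts = preg_split($pattern, $html, -1, PREG_SPLIT_NO_EMPTY);

// Lookaheads and lookbehinds consume nothing, so the pieces should
// concatenate back to the original string if no character is dropped.
$roundTrip = implode('', $parts);

echo ($roundTrip === $html) ? "no characters lost\n" : "characters lost\n";
print_r($parts);
```

If the round trip fails on a given build, the difference between `$roundTrip` and `$html` shows exactly which character went missing.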

 [2002-02-07 06:41 UTC] sander@php.net
Very likely to be an error in your regex. Ask support questions on the appropriate mailing list.
 [2002-11-18 19:20 UTC] ianm at judcom dot nsw dot gov dot au
I have also experienced dropped characters in preg_split, and it is not an error in my regex. I can demonstrate the problem with a very simple regex, and I get correct results from Perl using the equivalent code.

The following code splits and re-joins a string - it should produce the original string but doesn't.

$foo = '(#11/19/2002#)';   // MS Access date constant
$bar = preg_split('/\b/',$foo);
$baz = join('',$bar);
print $baz;

result is: (#11/19/2002)

The trailing hash character (#) is being dropped.

I tried the equivalent program in Perl (see below) and it produces the *correct* result - so it is not a problem with the regex itself.

(Perl source:)
$foo = '(#11/19/2002#)';   # MS Access date constant
@bar = split(/\b/,$foo);
$baz = join('',@bar);
print $baz;

result is: (#11/19/2002#)

I have experienced this problem on Win32 with PHP 4.2.3 and on Linux with PHP 4.04pl1 (RH7.1). The equivalent Perl code works correctly on both Win32 (Perl 5.6.1) and Linux (Perl 5.6.0).
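
The split-and-rejoin comparison described above can be turned into a small self-checking script, so the same file can be run on different PHP builds. This is a sketch under the assumption that `\b` consumes no characters; it does not claim what any particular PHP 4 release prints.

```php
<?php
$foo = '(#11/19/2002#)';   // MS Access date constant, as in the report

// \b is a zero-width word boundary, so splitting on it should only cut
// the string into pieces, never remove characters.
$parts  = preg_split('/\b/', $foo);
$joined = implode('', $parts);

if ($joined === $foo) {
    echo "round trip OK: {$joined}\n";
} else {
    echo "characters lost: expected {$foo}, got {$joined}\n";
}
```

Comparing `$joined` with `$foo` in code avoids having to spot a single missing character by eye.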
 