Bug #15413 preg_split drops characters
Submitted: 2002-02-06 19:27 UTC Modified: 2002-02-07 06:41 UTC
From: caribu at snafu dot de Assigned:
Status: Not a bug Package: PCRE related
PHP Version: 4.0.6 OS: win
Private report: No CVE-ID: None


 [2002-02-06 19:27 UTC] caribu at snafu dot de
$html = {some html containing multiple instances of "<!-- [some text] --> some text <-- end -->"}

$array = preg_split("'(?=<!--)(?!<!-- end -->)|(?<=\<!-- ende -->)'si", $html, -1, PREG_SPLIT_NO_EMPTY)

preg_split should split the string so that the html part and the commented part are each assigned to their own array entry.

preg_split does as expected, EXCEPT that the first character of the last array entry disappears.

Server API Apache Win2000


 [2002-02-07 06:41 UTC]
Very likely an error in your regex. Please ask support questions on the appropriate mailing list.
 [2002-11-18 19:20 UTC] ianm at judcom dot nsw dot gov dot au
I have also experienced dropped characters in preg_split, and it is not an error in my regex. I can demonstrate the problem with a very simple regex, and I get correct results from Perl using the equivalent code.

The following code splits and re-joins a string. It should produce the original string, but it doesn't.

$foo = '(#11/19/2002#)';   // MS Access date constant
$bar = preg_split('/\b/',$foo);
$baz = join('',$bar);
print $baz;

result is: (#11/19/2002)

The trailing hash character (#) is being dropped.

I tried the equivalent program in Perl (see below) and it produces the *correct* result - so it is not a problem with the regex itself.

(Perl source:)
$foo = '(#11/19/2002#)';   # MS Access date constant
@bar = split(/\b/,$foo);
$baz = join('',@bar);
print $baz;

result is: (#11/19/2002#)

I have experienced this problem on Win32 with PHP 4.2.3 and on Linux with PHP 4.0.4pl1 (RH7.1). The equivalent Perl code works correctly on both Win32 (Perl 5.6.1) and Linux (Perl 5.6.0).
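
The split-and-rejoin check described above can be wrapped in a small reusable test. This is only a sketch: the helper name split_round_trips is hypothetical (it is not part of PHP or of the original report), and the type declarations assume PHP 7 or later rather than the 4.x versions discussed here. Because \b is a zero-width assertion, the delimiter consumes no characters, so joining the pieces is expected to reproduce the input on a correctly behaving build.

```php
<?php
// Hypothetical helper (not from the original report): split $subject on
// $pattern, re-join the pieces, and report whether the input survived intact.
function split_round_trips(string $pattern, string $subject): bool
{
    $pieces = preg_split($pattern, $subject);
    if ($pieces === false) {
        // preg_split() returns false when the regex engine reports an error.
        throw new RuntimeException('preg_split failed, error code ' . preg_last_error());
    }
    // With a zero-width delimiter nothing should be lost in the round trip.
    return implode('', $pieces) === $subject;
}

$foo = '(#11/19/2002#)';   // MS Access date constant from the report
var_dump(split_round_trips('/\b/', $foo));
```

A result of bool(false) on a given PHP build would reproduce the behaviour reported in this comment.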