Bug #15413 preg_split drops characters
Submitted: 2002-02-06 19:27 UTC Modified: 2002-02-07 06:41 UTC
From: caribu at snafu dot de Assigned:
Status: Not a bug Package: PCRE related
PHP Version: 4.0.6 OS: win
Private report: No CVE-ID: None

 [2002-02-06 19:27 UTC] caribu at snafu dot de
$html = {some HTML containing multiple instances of "<!-- [some text] --> some text <!-- end -->"}

$array = preg_split("'(?=<!--)(?!<!-- end -->)|(?<=\<!-- ende -->)'si", $html, -1, PREG_SPLIT_NO_EMPTY);

preg_split should split the string so that the HTML part and the commented part are each assigned to their own array entry.

preg_split does as expected, EXCEPT that the first character of the last array entry disappears.

Server API: Apache (Win2000)
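
A self-contained test case would make this easier to check. The following is only a sketch: the sample HTML is made up, and it assumes the closing marker is spelled `<!-- end -->` in both alternatives of the pattern (the pattern in the report uses `ende` in the lookbehind). Since every split point is zero-width, re-joining the pieces should reproduce the input exactly if no character is dropped.

```php
<?php
// Hypothetical sample input; not the HTML from the original report.
$html = '<p>intro</p><!-- box -->boxed text<!-- end --><p>outro</p>';

// Split before each opening comment (but not before the end marker)
// and after each end marker. All split points are zero-width.
$pattern = '~(?=<!--)(?!<!-- end -->)|(?<=<!-- end -->)~si';
$pieces = preg_split($pattern, $html, -1, PREG_SPLIT_NO_EMPTY);

print_r($pieces);

// If no character was dropped, the round trip is lossless.
var_dump(implode('', $pieces) === $html);
```

If the final line prints bool(false), comparing the printed pieces with the input shows which character went missing.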


 [2002-02-07 06:41 UTC]
Very likely to be an error in your regex. Ask support questions on the appropriate mailing list.
 [2002-11-18 19:20 UTC] ianm at judcom dot nsw dot gov dot au
I have also experienced dropped characters in preg_split, and it is not an error in my regex. I can demonstrate the problem with a very simple regex, and I get correct results from Perl using the equivalent code.

The following code splits and re-joins a string - it should produce the original string but doesn't.

$foo = '(#11/19/2002#)';   // MS Access date constant
$bar = preg_split('/\b/',$foo);
$baz = join('',$bar);
print $baz;

result is: (#11/19/2002)

The trailing hash character (#) is being dropped.

I tried the equivalent program in Perl (see below) and it produces the *correct* result - so it is not a problem with the regex itself.

(Perl source:)
$foo = '(#11/19/2002#)';   # MS Access date constant
@bar = split(/\b/,$foo);
$baz = join('',@bar);
print $baz;

result is: (#11/19/2002#)

I have experienced this problem on Win32 with PHP 4.2.3 and on Linux with PHP 4.0.4pl1 (RH7.1). The equivalent Perl code works correctly on both Win32 (Perl 5.6.1) and Linux (Perl 5.6.0).
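
For anyone who wants to check another PHP version or platform, the split-and-rejoin test above can be written as a script that reports the result itself. This is a minimal sketch, not a statement about which builds are affected; the PREG_SPLIT_NO_EMPTY flag is an addition that only discards empty pieces and should not change the joined result.

```php
<?php
// Round-trip check: splitting on the zero-width pattern \b and
// re-joining the pieces should give back the input unchanged.
$foo = '(#11/19/2002#)';   // MS Access date constant from the report

// PREG_SPLIT_NO_EMPTY (an addition to the reported code) drops empty pieces.
$bar = preg_split('/\b/', $foo, -1, PREG_SPLIT_NO_EMPTY);
$baz = implode('', $bar);

print_r($bar);                // the individual pieces
var_dump($baz === $foo);      // bool(false) would mean a character was lost
```

Posting the PHP version together with the printed pieces makes it clear where in the string the character is lost.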
Last updated: Fri Mar 01 22:01:28 2024 UTC