Bug #55531 Vertical tabs ignored by XMLWriter
Submitted: 2011-08-29 15:52 UTC Modified: 2021-06-28 16:58 UTC
Avg. Score:5.0 ± 0.0
Reproduced:4 of 4 (100.0%)
Same Version:2 (50.0%)
Same OS:4 (100.0%)
From: Assigned: cmb (profile)
Status: Not a bug Package: XML Writer
PHP Version: 5.3.8 OS: Linux
Private report: No CVE-ID: None

 [2011-08-29 15:52 UTC]
When text contains vertical tabs, XMLWriter does not handle them and silently generates invalid XML. This is not a case of invalid UTF-8; the data is valid UTF-8. Vertical tabs are simply not allowed in XML. I would expect XMLWriter to encode them as it would any other character not allowed in XML. I suspect that 

Test script:

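    $xml = new XMLWriter();
    $xml->openMemory(); // needed so the output can be read back with outputMemory()
    $xml->startDocument('1.0', 'UTF-8');
    // \v is a vertical tab (U+000B), which is not a legal XML 1.0 character
    $xml->writeElement("test", "This data contains a \vvertical tab");
    $xml->endDocument();
    $data = $xml->outputMemory(true);

    // Try to parse the generated document
    $sxml = simplexml_load_string($data);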


Expected result:
Either an error or valid XML.

Actual result:
Invalid XML is silently created. For example, when a vertical tab is present, SimpleXMLElement::addChild() raises the warning "SimpleXMLElement::asXML(): xmlEscapeEntities : char out of range" and the node is not added.


 [2011-08-29 16:01 UTC]
In XML 1.0 this character is not valid. Not sure whether we should allow it; .NET 2.x
makes the check optional via settings.CheckCharacters.

But that's something libxml deals with; PHP's XML extensions only rely on it to
parse input or generate data.
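
Until the library handles this, a caller can filter the text before handing it to XMLWriter. The following is only a minimal sketch: `strip_invalid_xml10_chars()` is a hypothetical helper, not part of XMLWriter or libxml, and it assumes that silently dropping the offending characters is acceptable for the data at hand. The character class mirrors the `Char` production of XML 1.0.

```php
<?php
// Hypothetical helper (not an XMLWriter API): drop every character that the
// XML 1.0 "Char" production does not allow.
function strip_invalid_xml10_chars(string $text): string
{
    // Allowed: #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    $clean = preg_replace(
        '/[^\x{9}\x{A}\x{D}\x{20}-\x{D7FF}\x{E000}-\x{FFFD}\x{10000}-\x{10FFFF}]/u',
        '',
        $text
    );
    if ($clean === null) {
        // preg_replace() returns null on failure, e.g. when $text is not valid UTF-8.
        throw new InvalidArgumentException('Input is not valid UTF-8');
    }
    return $clean;
}

$xml = new XMLWriter();
$xml->openMemory();
$xml->startDocument('1.0', 'UTF-8');
// The vertical tab (\v) is removed before the text reaches XMLWriter.
$xml->writeElement('test', strip_invalid_xml10_chars("This data contains a \vvertical tab"));
$xml->endDocument();
$data = $xml->outputMemory(true);
```

Stripping loses information, so this only fits cases where the control characters carry no meaning.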
 [2016-10-17 13:36 UTC]
-Package: XML related +Package: XML Writer
 [2021-04-01 14:42 UTC]
While \v is indeed invalid for XML 1.0, it is valid for XML 1.1.
We could entity-encode such characters for 1.0 documents, and that
likely wouldn't cause a BC break, but we could also leave that to
libxml2 (to my knowledge, handling of invalid characters is not
even documented by them).  But if we manually entity-encode, we
should make sure to keep that consistent (at least for XMLWriter,
but likely for all bundled XML extensions as well).  Not sure.

And we should decide how to handle NUL bytes.  Currently, the
strings are truncated, which doesn't appear to be intentional.
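
For illustration, a userland guard could reject such input up front, which covers both the invalid characters and NUL bytes. XML 1.0 also disallows character references to these code points (`&#11;` is not well-formed there), so rejecting is one of the few consistent options for 1.0 documents. This is a sketch only: `require_xml10_text()` is a hypothetical name, not an existing function in any extension.

```php
<?php
// Hypothetical guard (not an XMLWriter API): refuse text that cannot be
// represented in an XML 1.0 document, including NUL bytes.
function require_xml10_text(string $text): string
{
    $result = preg_match(
        '/[^\x{9}\x{A}\x{D}\x{20}-\x{D7FF}\x{E000}-\x{FFFD}\x{10000}-\x{10FFFF}]/u',
        $text,
        $match,
        PREG_OFFSET_CAPTURE
    );
    if ($result === false) {
        // preg_match() fails with the /u modifier when $text is not valid UTF-8.
        throw new InvalidArgumentException('Text is not valid UTF-8');
    }
    if ($result === 1) {
        throw new InvalidArgumentException(sprintf(
            'Character with UTF-8 bytes 0x%s at byte offset %d is not allowed in XML 1.0',
            bin2hex($match[0][0]),
            $match[0][1]
        ));
    }
    return $text;
}
```

A caller would wrap each value, for example `$xml->writeElement('test', require_xml10_text($value));`, and handle the exception where the data originates.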
 [2021-06-28 16:58 UTC]
-Status: Open +Status: Not a bug -Assigned To: +Assigned To: cmb
 [2021-06-28 16:58 UTC]
This issue is fixed as of libxml2 2.9.11:

    Warning: simplexml_load_string(): Entity: line 2: parser error : PCDATA invalid Char value 11

    Warning: simplexml_load_string(): <test>This data contains a vertical tab</test>

    Warning: simplexml_load_string():                            ^
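
To handle this failure in code rather than through PHP warnings, the parser errors can be collected with libxml's internal error buffer. A minimal sketch, assuming the document is built by hand here instead of by XMLWriter so that the example stays self-contained:

```php
<?php
// Collect libxml errors in a buffer instead of emitting PHP warnings.
$previous = libxml_use_internal_errors(true);

// Raw vertical tab (U+000B) inside character data.
$doc = simplexml_load_string("<test>This data contains a \vvertical tab</test>");
$errors = libxml_get_errors();

// Restore the previous error handling state.
libxml_clear_errors();
libxml_use_internal_errors($previous);

if ($doc === false) {
    foreach ($errors as $error) {
        // Each entry is a LibXMLError with line, column, code and message.
        printf("line %d: %s\n", $error->line, trim($error->message));
    }
}
```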