Bug #24044 DOCTYPE breaks CDATA handling
Submitted: 2003-06-05 11:16 UTC Modified: 2003-06-06 05:59 UTC
From: theseer@php.net Assigned:
Status: Not a bug Package: DOM XML related
PHP Version: 4.3.2 OS: Linux
Private report: No CVE-ID: None
 [2003-06-05 11:16 UTC] theseer@php.net
Hi..!

The following code parses the same XML structure twice and dumps it right away. The only difference is the missing DOCTYPE in the latter one. Even so, the output differs a lot ;)

I'm not exactly sure whether this is a PHP bug or rather a libxml bug. It used to work on PHP 4.3.1 with libxml2 2.4.19, whereas it does not on 4.3.2 with libxml2 2.5.x.

<?PHP

 $xml=<<<EOF
<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
 <body>
  some markup...
  <script>//<![CDATA[ .. some js code ]]></script>
  some more markup
 </body>
</html>
EOF;

 $dom=domxml_open_mem($xml);
 echo $dom->dump_mem(true, 'UTF-8');

 $xml2=<<<EOF
<?xml version="1.0" encoding="iso-8859-1"?>
<html xmlns="http://www.w3.org/1999/xhtml">
 <body>
  some markup...
  <script>//<![CDATA[ .. some js code ]]></script>
  some more markup
 </body>
</html>
EOF;

 $dom=domxml_open_mem($xml2);
 echo $dom->dump_mem(true, 'UTF-8');


?>

History
 [2003-06-05 11:29 UTC] theseer@php.net
Forgot to mention what is actually 'wrong' :>

The // in the script block gets wrapped in a CDATA section of its own, <![CDATA[//]]>, which is not expected (automatic) behavior and is wrong.
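
To tell whether the extra wrapping happens at parse time or only when the document is dumped, it can help to look at the child nodes of the script element before serializing. The following is only a rough sketch under assumptions: it uses the DOM extension from PHP 5.4+ (DOMDocument) rather than the domxml functions from the report, and the $kinds array is a hypothetical helper added for illustration. It does not reproduce the bug itself.

```php
<?php
// Same markup as the second sample in the report (no DOCTYPE), trimmed.
$xml = <<<EOF
<?xml version="1.0" encoding="iso-8859-1"?>
<html xmlns="http://www.w3.org/1999/xhtml">
 <body>
  <script>//<![CDATA[ .. some js code ]]></script>
 </body>
</html>
EOF;

// Assumption: PHP 5+ DOM extension, not the PHP 4 domxml API used above.
$dom = new DOMDocument();
$dom->loadXML($xml);

$script = $dom->getElementsByTagName('script')->item(0);

// Hypothetical helper array: records the kind of each child node in order.
$kinds = [];
foreach ($script->childNodes as $child) {
    // DOMCdataSection extends DOMText, so test for it first.
    $kind = $child instanceof DOMCdataSection ? 'cdata' : 'text';
    $kinds[] = $kind;
    printf("%s: %s\n", $kind, var_export($child->nodeValue, true));
}
```

If the parsed tree holds the // as a plain text node followed by a separate CDATA node, then any <![CDATA[//]]> in the dump would have been introduced by the serializer rather than by the parser.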
 [2003-06-05 11:56 UTC] theseer@php.net
Just cross-checked with the DTD for XHTML 1.0 Transitional:

<!-- style info, which may include CDATA sections -->
<!ELEMENT style (#PCDATA)>
<!ATTLIST style
  %i18n;
  id          ID             #IMPLIED
  type        %ContentType;  #REQUIRED
  media       %MediaDesc;    #IMPLIED
  title       %Text;         #IMPLIED
  xml:space   (preserve)     #FIXED 'preserve'
  >

The <script> element MAY contain a CDATA section, but it is not required to. Thus the automatic addition is a bug (IMHO).
 [2003-06-05 11:56 UTC] theseer@php.net
Argls.. Wrong paste ;)

<!-- script statements, which may include CDATA sections -->
<!ELEMENT script (#PCDATA)>
<!ATTLIST script
  id          ID             #IMPLIED
  charset     %Charset;      #IMPLIED
  type        %ContentType;  #REQUIRED
  language    CDATA          #IMPLIED
  src         %URI;          #IMPLIED
  defer       (defer)        #IMPLIED
  xml:space   (preserve)     #FIXED 'preserve'
  >
 [2003-06-06 00:50 UTC] sniper@php.net
libxml bug. Report it to them..

 [2003-06-06 05:53 UTC] theseer@php.net
Moving the bug report to libxml. Closing.
 [2003-06-06 05:59 UTC] sniper@php.net
still not php bug..

 