Bug #41208 saveXML() fails with iso-8859-1 input
Submitted: 2007-04-26 21:53 UTC Modified: 2007-04-27 09:27 UTC
Votes:2
Avg. Score:5.0 ± 0.0
Reproduced:2 of 2 (100.0%)
Same Version:1 (50.0%)
Same OS:1 (50.0%)
From: scott at realorganized dot com
Assigned: rrichards
Status: Not a bug
Package: DOM XML related
PHP Version: 5.2.1
OS: Mac OS X, Linux
Private report: No
CVE-ID: None

 [2007-04-26 21:53 UTC] scott at realorganized dot com
Description:
------------
When the document encoding is iso-8859-1, DOMDocument fails to 
output byte 200 (0xC8, E grave in iso-8859-1) using saveXML().

Reproduce code:
---------------
Here's some example code that shows the problem:

<?php
  header("Content-Type: text/html; charset=iso-8859-1");
  error_reporting(E_ALL);
  $domfather = new domDocument('1.0', 'iso-8859-1');
  $node = $domfather->createElement("xxx", chr(200));
  $domfather->appendChild($node);
  echo "<pre>";
  echo htmlspecialchars($domfather->saveXML());
  $nodelist =  $domfather->getElementsByTagName("xxx");
  $data = $nodelist->item(0)->nodeValue;
  echo $data;
  echo strlen($data);
?>

Expected result:
----------------
I was expecting saveXML() to output the E grave symbol (byte 200, 
0xC8).

iso-8859-1 character mapping is here:

http://old.no/charmap/iso-8859-1.html

and shows that byte 200 is a completely valid iso-8859-1 character.

Please note that the data is returned correctly when I read it 
back from the node. The failure occurs only when using the 
saveXML() function.

Actual result:
--------------
The output from my server is below. When I retrieve 
the node's data back, it is as expected. It is the saveXML() 
call that seems to have a problem. I suspect the problem is with 
the utf-8 -> iso-8859-1 conversion before output.

Warning:  DOMDocument::saveXML() [function.DOMDocument-saveXML]: output conversion failed due to conv error in /Library/Tenon/WebServer/WebSites/realtyjuggler.com/subscription/test.php on line 23

Warning:  DOMDocument::saveXML() [function.DOMDocument-saveXML]: Bytes: 0xC8 0x3C 0x2F 0x78 in /Library/Tenon/WebServer/WebSites/realtyjuggler.com/subscription/test.php on line 23

<?xml version="1.0" encoding="iso-8859-1"?>
<xxx>?1

 [2007-04-26 22:17 UTC] scott at realorganized dot com
change:
$domfather->saveXML()
to:
$domfather->saveXML($node)

in the example code and run it: there is no output at all, no 
error, nothing. It's like it crashed or something.
 [2007-04-26 23:45 UTC] tony2001@php.net
Rob, please verify this.
 [2007-04-27 09:27 UTC] rrichards@php.net
Thank you for taking the time to write to us, but this is not
a bug. Please double-check the documentation available at
http://www.php.net/manual/ and the instructions on how to report
a bug at http://bugs.php.net/how-to-report.php

DOM expects and returns UTF-8 strings, so you need to UTF-8 encode data going in and decode data coming out: utf8_encode(chr(200)) / utf8_decode($data)
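
A minimal sketch of the round trip described above, assuming the iconv extension is available. It uses iconv() rather than utf8_encode()/utf8_decode() because recent PHP versions deprecate those two functions; for ISO-8859-1 input the conversion is equivalent. The variable names are illustrative only.

```php
<?php
// The encoding passed to the constructor only controls how saveXML()
// serializes the document. Strings handed to DOM must be UTF-8.
$latin1 = chr(200); // byte 0xC8, E grave in ISO-8859-1

$doc = new DOMDocument('1.0', 'iso-8859-1');

// Convert the ISO-8859-1 input to UTF-8 before giving it to DOM.
$node = $doc->createElement('xxx', iconv('ISO-8859-1', 'UTF-8', $latin1));
$doc->appendChild($node);

// saveXML() converts from UTF-8 to the document encoding on output.
$xml = $doc->saveXML();

// Values read back from DOM are UTF-8 (two bytes for this character),
// so convert them back if the rest of the page is ISO-8859-1.
$utf8Value = $doc->getElementsByTagName('xxx')->item(0)->nodeValue;
$roundTrip = iconv('UTF-8', 'ISO-8859-1', $utf8Value);
```

With the input converted first, saveXML() should no longer raise the "output conversion failed" warning, because the text node holds valid UTF-8.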
 [2010-09-13 23:04 UTC] r dot spliet at student dot tudelft dot nl
I'm suffering from the same "problem", with the exception that my ISO-8859-1 data comes from MySQL.
Using utf8_encode() does solve the issue, so I would like to turn this bug into a feature request. It seems a bit odd to me to encode my data in UTF-8 just to let DOMDocument decode it a moment later. I would personally like DOMDocument to know what encoding the data I'm giving it is in, instead of assuming UTF-8 (which is imho just wrong). Perhaps with a parameter in createElement, perhaps with an attribute. I'll leave that to the PHP engineers.
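
Until something like that exists, the conversion can be wrapped in userland. The helper below is a hypothetical sketch, not part of the DOM extension: the function name and its $fromEncoding parameter are made up for illustration, and it assumes the iconv extension and PHP 7 or later.

```php
<?php
// Hypothetical helper (not a DOM API): create an element whose text
// content is supplied in an arbitrary source encoding.
function createElementFromEncoding(
    DOMDocument $doc,
    string $name,
    string $value,
    string $fromEncoding
): DOMElement {
    // DOM works with UTF-8 internally, so convert first.
    $utf8 = iconv($fromEncoding, 'UTF-8', $value);
    if ($utf8 === false) {
        throw new InvalidArgumentException(
            "Cannot convert value from $fromEncoding to UTF-8"
        );
    }
    $element = $doc->createElement($name);
    // A text node escapes characters such as "&" on output.
    $element->appendChild($doc->createTextNode($utf8));
    return $element;
}

$doc = new DOMDocument('1.0', 'iso-8859-1');
$doc->appendChild(
    createElementFromEncoding($doc, 'xxx', chr(200) . ' & co', 'ISO-8859-1')
);
$out = $doc->saveXML();
```

The source encoding is passed per call here because data from different places (MySQL, a form post, a file) may not share one encoding.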