Bug #70450 DOM::loadHTML() --> ::saveXML() entity encodes CR
Submitted: 2015-09-07 14:00 UTC
Modified: 2015-09-07 14:24 UTC
From: flavio dot cambraia at yahoo dot com dot br
Assigned:
Status: Open
Package: DOM XML related
PHP Version: 5.5.29
OS: Windows
Private report: No
CVE-ID: None

 [2015-09-07 14:00 UTC] flavio dot cambraia at yahoo dot com dot br
Description:
------------
The output shows an &#13; entity for every line break in the $contents variable.
I am using PHP 5.5.8 build date Jan 8 2014 15:26:26


Test script:
---------------
<?php
$contents = '<!DOCTYPE html>
<html lang="pt"><head><meta charset="utf-8"></head>
<body>
<div class="div_entry">
<div class="div_imagem">
<a href="test.html">	<img src="test.png" alt="" />Link to imagem</a>
</div>
<div class="div_product">This is a title for a product</div>
<div class="div_price">R$50,00</div>
</div>';
$doc     = new DOMDocument();
$doc->loadHTML('<?xml encoding="UTF-8">' . $contents);
$xpath         = new DOMXpath($doc);
$xquery = '//div[@class="div_entry"]';
$articles  = $xpath->query($xquery);
$registros = array();
foreach ($articles as $i => $article) {
    $registros[] = $article->ownerDocument->saveXML($article);
}
echo "<pre>"; print_r($registros); echo "</pre>";
?>


History

 [2015-09-07 14:24 UTC] cmb@php.net
-Summary: DOMXpath adds &#13; to string
+Summary: DOM::loadHTML() --> ::saveXML() entity encodes CR
 [2015-09-07 14:24 UTC] cmb@php.net
This has nothing to do with XPath. Rewriting the code to use
"classic" DOM methods yields the same result.
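
For illustration, a minimal sketch of such a rewrite, not necessarily the exact code that was tested. The helper name dump_entries() is hypothetical, the markup is shortened from the test script, and the explicit "\r\n" stands in for the CRLF line endings of a script saved on Windows. The second call shows a possible workaround that is not part of the report: stripping CRs before parsing leaves nothing for saveXML() to encode.

```php
<?php
// Hypothetical helper: serialize every <div class="div_entry"> without XPath.
function dump_entries($html)
{
    $doc = new DOMDocument();
    $doc->loadHTML('<?xml encoding="UTF-8">' . $html);

    $out = array();
    foreach ($doc->getElementsByTagName('div') as $div) {
        if ($div->getAttribute('class') === 'div_entry') {
            $out[] = $doc->saveXML($div);
        }
    }
    return $out;
}

// Join the lines with CRLF to mimic a script file saved on Windows.
$eol = "\r\n";
$contents = implode($eol, array(
    '<!DOCTYPE html>',
    '<html lang="pt"><head><meta charset="utf-8"></head>',
    '<body>',
    '<div class="div_entry">',
    '<div class="div_price">R$50,00</div>',
    '</div>',
    '</body></html>',
));

// As reported, each CR is expected to show up as &#13; in this output.
print_r(dump_entries($contents));

// Possible workaround (assumption, not from the report): drop CRs first.
print_r(dump_entries(str_replace("\r", '', $contents)));
```

Whether the first call actually emits &#13; may depend on the PHP and libxml2 versions in use; the second call cannot, because no CR reaches the parser.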
 