Bug #50253 internal encoding
Submitted: 2009-11-20 23:01 UTC Modified: 2009-11-30 01:00 UTC
From: kanea at free dot fr Assigned:
Status: No Feedback Package: DOM XML related
PHP Version: 5.2.6-1+lenny3 OS: linux
Private report: No CVE-ID: None

 [2009-11-20 23:01 UTC] kanea at free dot fr
Description:
------------
I have the same problem with pages from Wikipedia.

It seems that loadHTML() treats the input as ISO-encoded characters internally.

This looks like the same bug as bug #32547.

Reproduce code:
---------------
This code works:
		$url="http://".$lang.".wikipedia.org/wiki/".$article;
		$this->dom=new DomDocument('1.0', 'UTF-8');
		$str=file_get_contents($url);
		$this->dom->loadXML($str);
		$this->contenu = $this->dom->saveXml();
This code does not work:
		$url="http://".$lang.".wikipedia.org/wiki/".$article;
		$this->dom=new DomDocument('1.0', 'UTF-8');
		$str=file_get_contents($url);
		$this->dom->loadHtml($str);
		$this->contenu = $this->dom->saveXml();
It seems that loadHTML() treats the input as ISO-encoded characters internally.
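
For readers hitting the same symptom, here is a minimal sketch of a commonly used workaround. It is not a fix confirmed in this report. It assumes the input really is UTF-8 and that libxml's HTML parser honours a charset declared in a meta tag, so the declaration is prepended before parsing. The helper name loadUtf8Html() and the sample string are hypothetical, and the type hints assume PHP 7 or later rather than the 5.2 series named above.

```php
<?php
// Hypothetical helper (not part of the original report): parse UTF-8 HTML
// by telling the HTML parser the charset up front.
function loadUtf8Html(string $html): DOMDocument
{
    // Collect parser warnings instead of printing them; real-world pages
    // are rarely valid HTML, and the prepended tag sits before <html>.
    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument('1.0', 'UTF-8');
    // Assumption: the input is UTF-8. The meta tag declares that to the parser.
    $dom->loadHTML(
        '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">' . $html
    );

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $dom;
}

// Sample input: "Café" written with explicit UTF-8 bytes (C3 A9 is "é"),
// with no charset declaration of its own.
$sample = "<html><body><p>Caf\xC3\xA9</p></body></html>";

$dom  = loadUtf8Html($sample);
$text = $dom->getElementsByTagName('p')->item(0)->textContent;

// True when the text survived the round trip as UTF-8.
var_dump($text === "Caf\xC3\xA9");
```

In the reproduce code above, this would mean passing $str through such a helper instead of calling loadHtml($str) directly. Whether that removes the bad characters on 5.2.6 has not been verified here.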

Expected result:
----------------
Output with UTF-8 encoded characters

Actual result:
--------------
Output with garbled characters

History

 [2009-11-20 23:23 UTC] kanea at free dot fr
I cannot test on another system.
 [2009-11-22 21:44 UTC] jani@php.net
Please try using this snapshot:

  http://snaps.php.net/php5.2-latest.tar.gz
 
For Windows:

  http://windows.php.net/snapshots/


 [2009-11-30 01:00 UTC] php-bugs at lists dot php dot net
No feedback was provided for this bug for over a week, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
 
Last updated: Wed Jan 26 15:03:33 2022 UTC