Bug #36138 utf8_decode fails ; UTF-8 headers work
Submitted: 2006-01-23 21:09 UTC Modified: 2006-01-23 21:18 UTC
From: ceo at l-i-e dot com Assigned:
Status: Not a bug Package: *Languages/Translation
PHP Version: 4.4.2 OS: FreeBSD 5.3
Private report: No CVE-ID: None
 [2006-01-23 21:09 UTC] ceo at l-i-e dot com
Description:
------------
I'm not an expert in multilingual character encodings, but...

This version uses PHP's utf8_decode, and I get funky useless characters:
http://acousticdemo.com/info.com/answers/answers.php?qkw=Catherine+de+Medici&utf8_decode=1

This version sends a charset UTF-8 header (Mozilla based) and a META charset UTF-8 (MS IE) and it "works":
http://acousticdemo.com/info.com/answers/answers.php?qkw=Catherine+de+Medici

I actually saw this in 4.4.0 and also in 5.0.4, so it MIGHT be fixed in CVS, but I have no control over the versions of PHP on the servers involved, and am unlikely to have such control in the future. Best I can do, sorry.


Reproduce code:
---------------
http://acousticdemo.com/info.com/answers/answers.phps

You will not, however, be able to run the code on your own box, because your IP isn't allowed to get that content...

I can maybe hook you up with Answers.com, or I can do whatever it takes to get you the raw data for this one page, or...


Expected result:
----------------
I expected PHP's UTF8 decoding function to work as well as the browser's.


Actual result:
--------------
I get a lot of funky characters that just aren't right.

I can't guarantee that it's not the data that is bad, but the browser is handling it, so...


 [2006-01-23 21:18 UTC] derick@php.net
Thank you for taking the time to write to us, but this is not
a bug. Please double-check the documentation available at
http://www.php.net/manual/ and the instructions on how to report
a bug at http://bugs.php.net/how-to-report.php

utf8_decode() converts UTF-8 characters to Latin-1 (ISO-8859-1). Many of the characters in that document do not exist in Latin-1, so PHP returns a question mark for each of them.
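
To illustrate, here is a minimal sketch of that behaviour. The two sample strings are made up for the example and are not taken from the page in the report; the bytes are written as hex escapes so the snippet does not depend on the encoding of the source file.

```php
<?php
// Hypothetical sample data, written as raw UTF-8 bytes.
$inLatin1  = "M\xC3\xA9dicis";                    // "Médicis": U+00E9 exists in ISO-8859-1
$notLatin1 = "\xE2\x80\x9CMedici\xE2\x80\x9D";    // curly quotes U+201C / U+201D do not

// utf8_decode() is deprecated as of PHP 8.2; the @ only silences that notice here.
$a = @utf8_decode($inLatin1);
$b = @utf8_decode($notLatin1);

// U+00E9 becomes the single Latin-1 byte 0xE9.
echo bin2hex($a), "\n";   // prints 4de96469636973

// Characters with no Latin-1 equivalent are replaced by "?".
echo $b, "\n";            // prints ?Medici?
```

If the text has to keep characters outside Latin-1, leaving the data as UTF-8 and declaring that charset to the browser, as the second URL in the report does, avoids this lossy conversion.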
 
PHP Copyright © 2001-2024 The PHP Group
All rights reserved.
Last updated: Fri Apr 19 01:01:28 2024 UTC