Bug #36138 utf8_decode fails; UTF-8 headers work
Submitted: 2006-01-23 21:09 UTC
Modified: 2006-01-23 21:18 UTC
From: ceo at l-i-e dot com
Assigned:
Status: Not a bug
Package: *Languages/Translation
PHP Version: 4.4.2
OS: FreeBSD 5.3
Private report: No
CVE-ID: None
 [2006-01-23 21:09 UTC] ceo at l-i-e dot com
Description:
------------
I'm not an expert in multilingual character encodings, but...

This version uses PHP's utf8_decode, and I get funky, useless characters:
http://acousticdemo.com/info.com/answers/answers.php?qkw=Catherine+de+Medici&utf8_decode=1

This version sends a UTF-8 charset header (for Mozilla-based browsers) and a META UTF-8 charset tag (for MS IE), and it "works":
http://acousticdemo.com/info.com/answers/answers.php?qkw=Catherine+de+Medici

This was actually observed in 4.4.0 and also in 5.0.4, so it MIGHT be fixed in CVS -- but I have no control over the versions of PHP on the servers involved, and am unlikely to have such control in the future... Best I can do, sorry.


Reproduce code:
---------------
http://acousticdemo.com/info.com/answers/answers.phps

You will not, however, be able to run the code on your own box, because your IP isn't allowed to get that content...

I can maybe hook you up with Answers.com, or I can do whatever it takes to get you the raw data for this one page, or...


Expected result:
----------------
I expected PHP's UTF-8 decoding function to work as well as the browser's.


Actual result:
--------------
I get a lot of funky characters that just aren't right.

I can't guarantee that it's not the data that is bad, but the browser is handling it, so...


 [2006-01-23 21:18 UTC] derick@php.net
Thank you for taking the time to write to us, but this is not
a bug. Please double-check the documentation available at
http://www.php.net/manual/ and the instructions on how to report
a bug at http://bugs.php.net/how-to-report.php

utf8_decode converts UTF-8 characters to Latin-1 (ISO-8859-1). Many of the characters in that document do not exist in Latin-1, so PHP returns question marks for them.
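
The behavior described above can be reproduced with a short sketch. The sample string is hypothetical (it is not taken from the reporter's page), and the sketch assumes a PHP 7+ runtime with the mbstring extension. It uses mb_convert_encoding() for the same UTF-8 to ISO-8859-1 conversion, since utf8_decode() is deprecated as of PHP 8.2.

```php
<?php
// Hypothetical sample: "é" and "ó" exist in ISO-8859-1; the curly quotes, "Ł" and "ź" do not.
$utf8 = "Caf\u{00E9} \u{201C}\u{0141}\u{00F3}d\u{017A}\u{201D}";

// Same UTF-8 -> ISO-8859-1 conversion that utf8_decode() performs.
// Characters with no Latin-1 equivalent become "?" (mbstring's default substitute).
$latin1 = mb_convert_encoding($utf8, 'ISO-8859-1', 'UTF-8');

// Print the raw bytes as hex, since a terminal may not render Latin-1 correctly.
echo bin2hex($latin1), "\n";

// Count code points above U+00FF, i.e. those the conversion cannot represent.
$lost = preg_match_all('/[^\x{00}-\x{FF}]/u', $utf8);
echo "characters lost: $lost\n";
```

In the hex output, each unmappable character shows up as 3f (the question mark), while "é" and "ó" survive as the single bytes e9 and f3. If the text must keep characters outside Latin-1, one option is to skip the conversion and declare the encoding instead, for example with header('Content-Type: text/html; charset=UTF-8'), which is presumably what the reporter's second URL does.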
 