Bug #25883 htmlentities($value, ..., "UTF-8") doesn't convert to UTF-8 correctly and completely
Submitted: 2003-10-15 10:21 UTC Modified: 2003-10-16 22:44 UTC
Votes:1
Avg. Score:4.0 ± 0.0
Reproduced:1 of 1 (100.0%)
Same Version:0 (0.0%)
Same OS:0 (0.0%)
From: thomas at haeber dot de Assigned:
Status: Not a bug Package: Strings related
PHP Version: 4.3.1 OS: SuSE Linux 8.2 (2.4.20)
Private report: No CVE-ID: None
 [2003-10-15 10:21 UTC] thomas at haeber dot de
Description:
------------
Hi guys,

I used the htmlentities function to convert values in different languages to UTF-8 code (see example 1).
But PHP doesn't convert these accented letters to UTF-8 code.

Furthermore, I've noticed that PHP doesn't convert the °-symbol to &#176; but to &deg; (see example 2).




Reproduce code:
---------------
// example 1:

$value = "LA RÉCRÉ"; // with accented letters of the French language
$value = htmlentities($value, ENT_QUOTES, "UTF-8");
echo $value;
// Output: LA R?CR? (not LA RÉCRÉ)



// example 2:

$value = "13°";
$value = htmlentities($value, ENT_QUOTES, "UTF-8");
echo $value;
// OUTPUT: 13&deg; (not 13&#176;)

Expected result:
----------------
LA R?CR?

13&#176;

Actual result:
--------------
htmlentities (UTF-8) wrongly converts ° to &deg;.
htmlentities doesn't convert French accented letters.

 [2003-10-15 19:08 UTC] iliaa@php.net
Thank you for taking the time to write to us, but this is not
a bug. Please double-check the documentation available at
http://www.php.net/manual/ and the instructions on how to report
a bug at http://bugs.php.net/how-to-report.php

Whenever possible, PHP will prefer named HTML entities over numeric &#[number]; references.
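
To illustrate the point above, here is a minimal sketch of what htmlentities() returns for UTF-8 input. The variable names are only illustrative, and the strings are written as explicit UTF-8 byte escapes so the result does not depend on how the script file itself is saved.

```php
<?php
// "\xC2\xB0" is the UTF-8 byte sequence for the degree sign (U+00B0).
$degrees = "13\xC2\xB0";
// "\xC3\x89" is the UTF-8 byte sequence for E with acute accent (U+00C9).
$french = "LA R\xC3\x89CR\xC3\x89";

// Both characters have named entities, so htmlentities() emits those
// rather than numeric references such as &#176; or &#201;.
$encodedDegrees = htmlentities($degrees, ENT_QUOTES, "UTF-8");
$encodedFrench  = htmlentities($french, ENT_QUOTES, "UTF-8");

echo $encodedDegrees, "\n"; // 13&deg;
echo $encodedFrench, "\n";  // LA R&Eacute;CR&Eacute;
```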
 [2003-10-16 11:09 UTC] thomas at haeber dot de
Hi,

I don't know why this shouldn't be a bug, but if you say so, then so it is.
Can I ask when I can count on PHP recognizing this issue, so that I can use htmlentities($String, ..., "UTF-8") to convert ° to &#176; instead of &deg;, or when this function will convert these French letters to their UTF-8 equivalents?
Otherwise I have to use an ugly workaround, which could become unnecessary in the future.

THX
Thomas
 [2003-10-16 17:40 UTC] wez@php.net
The "utf-8" parameter specifies that your input string is already UTF-8 encoded, so that htmlentities() doesn't incorrectly translate bytes to entities.
htmlentities() does not perform codeset conversions, and does not convert to numeric entities.
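
If codeset conversion or numeric entities are what is needed, both can be done as separate steps. The following is only a sketch under two assumptions that the report does not confirm: that the original input was ISO-8859-1 encoded, and that the mbstring extension is available. The variable names and the conversion map are illustrative.

```php
<?php
// Assumption: the input arrives as ISO-8859-1, where 0xC9 is E with acute accent.
$latin1 = "LA R\xC9CR\xC9";

// Step 1: do the codeset conversion explicitly; htmlentities() will not do it.
$utf8 = mb_convert_encoding($latin1, "UTF-8", "ISO-8859-1");

// Step 2a: named entities, now that the input really is UTF-8.
$named = htmlentities($utf8, ENT_QUOTES, "UTF-8");

// Step 2b: numeric entities instead. Escape the HTML special characters first,
// then turn every code point from U+0080 upwards into a &#NNN; reference.
// Map layout: [first code point, last code point, offset, mask].
$convmap = [0x80, 0x10FFFF, 0, 0x1FFFFF];
$escaped = htmlspecialchars($utf8, ENT_QUOTES, "UTF-8");
$numeric = mb_encode_numericentity($escaped, $convmap, "UTF-8");

echo $named, "\n";   // LA R&Eacute;CR&Eacute;
echo $numeric, "\n"; // LA R&#201;CR&#201;
```

Running htmlspecialchars() before mb_encode_numericentity() keeps &, <, > and quotes escaped without double-encoding the ampersands of the numeric references.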
 