[2012-08-19 04:14 UTC] soapergem at gmail dot com
Description:
------------
Doesn't UTF-8 include basic ASCII characters, too? Right now when I try to encode the copyright symbol (©) using htmlentities (it should encode to &copy;), it doesn't work. I discovered this since the default encoding for htmlentities() was switched from ISO-8859-1 to UTF-8 in version 5.4.
I have plenty of places where I rely on basic symbols, such as the copyright symbol, being encoded properly with htmlentities(). Having to go in and change all the instances of htmlentities($string) to htmlentities($string, ENT_COMPAT | ENT_HTML401, 'ISO-8859-1') is not practical (there are MANY). And with the whole output of the function being blank, it just makes my scripts completely unusable now.
Help!
Test script:
---------------
<?php
echo htmlentities('©', ENT_COMPAT | ENT_HTML401, 'UTF-8');
?>
Expected result:
----------------
&copy;
Actual result:
--------------
(Nothing - an empty string)
From my command line:

php > echo htmlentities('©', ENT_COMPAT | ENT_HTML401, 'UTF-8');
&copy;

It works fine. If you are actually providing the correct UTF-8 char it will work fine. You can verify that by doing this:

php > $a = chr(0xC2).chr(0xA9);
php > echo htmlentities($a, ENT_COMPAT | ENT_HTML401, 'UTF-8');
&copy;

Here I am explicitly passing C2A9 in and I get &copy; back out. So I have no idea what your Windows Notepad is doing. Look at the output with a hex editor and see what it is converting that copyright character to.
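
The byte-level check suggested above can also be done from PHP itself. The following is a minimal sketch, assuming the script file was saved in a single-byte encoding such as ISO-8859-1 or Windows-1252, which would store the copyright sign as the lone byte A9 rather than the UTF-8 sequence C2 A9. That assumption is not confirmed anywhere in this report. The variable names are illustrative only, and the bytes are written as escape sequences so the result does not depend on how this snippet is saved.

```php
<?php
// The copyright sign in two encodings, spelled out as raw bytes.
$utf8  = "\xC2\xA9"; // UTF-8: two bytes
$latin = "\xA9";     // ISO-8859-1 / Windows-1252: one byte (assumed editor output)

// bin2hex() shows exactly which bytes reach htmlentities().
echo bin2hex($utf8), "\n";
echo bin2hex($latin), "\n";

$flags = ENT_COMPAT | ENT_HTML401;

// Valid UTF-8 input is converted to the named entity.
$ok = htmlentities($utf8, $flags, 'UTF-8');

// A lone 0xA9 byte is not a valid UTF-8 sequence, so with these flags
// htmlentities() returns an empty string.
$bad = htmlentities($latin, $flags, 'UTF-8');

// Declaring the encoding the bytes are really in produces the entity again.
$fixed = htmlentities($latin, $flags, 'ISO-8859-1');

var_dump($ok, $bad, $fixed);
```

If bin2hex() on the real input prints a9 rather than c2a9, the string is not UTF-8, and either the file needs to be saved as UTF-8 or the actual encoding has to be passed to htmlentities().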