Request #46478 htmlentities() uses obsolete mapping table for character entity references
Submitted: 2008-11-04 12:56 UTC Modified: 2009-12-22 05:50 UTC
From: for-bugs at hnw dot jp Assigned: moriyoshi (profile)
Status: Closed Package: Feature/Change Request
PHP Version: 5.2.6 OS: *
Private report: No CVE-ID: None
 [2008-11-04 12:56 UTC] for-bugs at hnw dot jp
Description:
------------
ext/standard/html.c contains an incorrect mapping table, which htmlentities() uses.

html.c is based on http://www.unicode.org/Public/MAPPINGS/OBSOLETE/UNI2SGML.TXT, but that mapping table is obsolete and is not compatible with HTML 4.0 or XHTML 1.0. For example, U+2235 (encoded as "\xe2\x88\xb5" in UTF-8) is not defined in http://www.w3.org/TR/xhtml1/DTD/xhtml-symbol.ent, but htmlentities() returns "&becaus;" for it.

U+226A (≪) and U+226B (≫) are similar cases.

Reproduce code:
---------------
<?php var_dump(htmlentities("\xe2\x88\xb5", ENT_QUOTES, "utf-8"));

Expected result:
----------------
string(3) "∵"

Actual result:
--------------
string(8) "&becaus;"
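
A minimal sketch of how the table coverage could be checked from user code, assuming PHP 5.4 or later (the ENT_HTML401 flag and the variable names below are not part of the original report). It compares U+2235 with U+2234, which HTML 4.01 does define as &there4;. On a build whose table follows HTML 4.01, the first two checks would be expected to print false and true, and the third true.

```php
<?php
// Sketch only: assumes PHP 5.4+ so that ENT_HTML401 is available.
$flags = ENT_QUOTES | ENT_HTML401;

$because   = "\xe2\x88\xb5"; // U+2235 BECAUSE, not defined in HTML 4.01
$therefore = "\xe2\x88\xb4"; // U+2234 THEREFORE, defined as &there4;

// The translation table maps each character to its entity reference.
$table = get_html_translation_table(HTML_ENTITIES, $flags, "UTF-8");

// Is each character present in the table that htmlentities() uses?
var_dump(isset($table[$because]));
var_dump(isset($table[$therefore]));

// A character missing from the table should pass through unchanged.
var_dump(htmlentities($because, $flags, "UTF-8") === $because);

// A character in the table is replaced by its entity reference.
var_dump(htmlentities($therefore, $flags, "UTF-8")); // string(8) "&there4;"
```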

 [2008-11-09 16:39 UTC] moriyoshi@php.net
I think this is a bug, but correcting the table would break BC too.
 [2009-12-22 05:50 UTC] svn@php.net
Automatic comment from SVN on behalf of moriyoshi
Revision: http://svn.php.net/viewvc/?view=revision&revision=292467
Log: - Fix bug #46478 (htmlentities() uses obsolete mapping table for character
  entity references)
 [2009-12-22 05:50 UTC] moriyoshi@php.net
This bug has been fixed in SVN.

Snapshots of the sources are packaged every three hours; this change
will be in the next snapshot. You can grab the snapshot at
http://snaps.php.net/.
 
Thank you for the report, and for helping us make PHP better.


 [2009-12-24 09:32 UTC] svn@php.net
Automatic comment from SVN on behalf of moriyoshi
Revision: http://svn.php.net/viewvc/?view=revision&revision=292588
Log: - MFB: Fix bug #46478 (htmlentities() uses obsolete mapping table for character
  entity references)
  (this should be gone to r292467)
 
Last updated: Tue Oct 15 01:01:29 2024 UTC