[2003-09-30 14:52 UTC] Bjorn dot Victor at it dot uu dot se
Description:
------------
Symptom:
html_entity_decode("&amp;quot;") returns '"', while the expected value would be "&quot;". Corresponding (wrong) behaviour for "&amp;" followed by "lt;", "gt;" etc.
Another example is html_entity_decode(htmlentities("&lt;")) which returns "<" rather than "&lt;" as expected.
As a result, html_entity_decode can not be used as the inverse of htmlentities.
Diagnosis:
The function (php_unescape_html_entities in ext/standard/html.c) replaces each entity in basic_entities with its corresponding character, but starts by replacing "&amp;" with "&", the resulting string being "&quot;", which is then replaced by '"'.
Solution:
php_unescape_html_entities in ext/standard/html.c traverses the basic_entities from the wrong end; it must replace "&amp;" *last*, not *first*.
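To illustrate why the traversal order matters, here is a minimal C sketch. It is not the actual code from ext/standard/html.c: the table layout, the struct, and the helper names (replace_entity, decode_amp_first, decode_amp_last) are assumptions made for this example. It only mimics the pass-per-entity replacement described in the diagnosis.

```c
#include <string.h>

/* Hypothetical entity table for illustration; "&amp;" is listed first,
 * matching the order the report describes. Not the real PHP table. */
struct entity { const char *name; char ch; };

static const struct entity basic_entities[] = {
    { "&amp;",  '&' },
    { "&quot;", '"' },
    { "&lt;",   '<' },
    { "&gt;",   '>' },
};
#define N_ENTITIES (sizeof basic_entities / sizeof basic_entities[0])

/* Replace every occurrence of e->name in buf with e->ch, in place.
 * The string can only shrink, so the write pointer never passes the
 * read pointer. */
static void replace_entity(char *buf, const struct entity *e)
{
    size_t len = strlen(e->name);
    char *src = buf, *dst = buf;

    while (*src) {
        if (strncmp(src, e->name, len) == 0) {
            *dst++ = e->ch;   /* emit the decoded character */
            src += len;       /* skip the entity text */
        } else {
            *dst++ = *src++;  /* copy ordinary characters unchanged */
        }
    }
    *dst = '\0';
}

/* Table order: "&amp;" is decoded first, so "&amp;quot;" becomes
 * "&quot;" and is then decoded a second time (the reported behaviour). */
static void decode_amp_first(char *buf)
{
    for (size_t i = 0; i < N_ENTITIES; i++)
        replace_entity(buf, &basic_entities[i]);
}

/* Reverse order: "&amp;" is decoded last, so the "&" it produces is
 * never rescanned by a later pass (the fix proposed above). */
static void decode_amp_last(char *buf)
{
    for (size_t i = N_ENTITIES; i-- > 0; )
        replace_entity(buf, &basic_entities[i]);
}
```

Calling decode_amp_first on a writable buffer holding "&amp;quot;&amp;lt;&amp;gt;" leaves '"<>' in it, while decode_amp_last leaves "&quot;&lt;&gt;", which is the string htmlentities would have started from.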
Reproduce code:
---------------
print html_entity_decode("&amp;quot;&amp;lt;&amp;gt;");
Expected result:
----------------
&quot;&lt;&gt;
Actual result:
--------------
"<>
html_entity_decode(htmlentities("&lt;")) returns "<", but IMHO it should return the original "&lt;". The unhtmlentities() function given on http://www.php.net/html_entity_decode works like it should (in my eyes).