Bug #28357 The htmlentities functions should detect already converted characters
Submitted: 2004-05-11 05:47 UTC Modified: 2004-05-11 05:55 UTC
From: johnfivealive at hotmail dot com Assigned:
Status: Not a bug Package: Strings related
PHP Version: 5.0.0RC2 OS: Fedora Core 1
Private report: No CVE-ID: None


 [2004-05-11 05:47 UTC] johnfivealive at hotmail dot com
The htmlentities function does not attempt to detect whether a character is already part of a character entity. For example, the character '&' is usually represented as "&amp;" in valid XHTML and XML markup. If one calls htmlentities on the string "&amp;", the function returns "&amp;amp;". I believe this to be a bug. The function should look for cases like this when converting characters to their entity representations. I imagine the check would only be needed when the function encounters an '&' character.

This is annoying in the following situation. One properly escapes and converts all characters before inserting the XHTML/XML into a database, then pulls that data out to be displayed in an HTML <textarea></textarea> field. One would usually call htmlentities() on the content to be displayed in the textarea so that everything is rendered correctly by the browser and the markup is valid. In this case, if any of the content contains the "&amp;" entity, it suffers from the bug described above: every '&' character shows up as "&amp;". This is because the underlying HTML code looks something like this after htmlentities has been called:

&lt;em&gt;emphasis tags are escaped correctly&lt;/em&gt;
&lt;br /&gt;
&lt;br /&gt;
but, the &amp;amp; character is not

This would render in a textarea as:

<em>emphasis tags are escaped correctly</em>
<br />
<br />
but, the &amp; character is not

See my frustration?

Reproduce code:
echo htmlentities( "&amp;" );

Expected result:
The above code should detect the entity and return the correct string: "&amp;" instead of "&amp;amp;"

Actual result:
The above code returns "&amp;amp;" which it probably should not
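
For comparison, here is a minimal sketch of the two behaviors described above. It assumes a newer PHP release than the 5.0.0RC2 reported here: htmlentities() accepts a fourth `double_encode` argument from PHP 5.2.3 onward. The variable names and sample string are illustrative only.

```php
<?php
// Sample input that already contains an entity plus a raw tag (illustrative).
$input = '&amp; <em>';

// Default behavior: every '&' is converted, so the existing entity is encoded again.
$default = htmlentities($input, ENT_QUOTES, 'UTF-8');
echo $default, "\n";   // &amp;amp; &lt;em&gt;

// With double_encode = false (PHP 5.2.3+), existing entities are left untouched.
$noDouble = htmlentities($input, ENT_QUOTES, 'UTF-8', false);
echo $noDouble, "\n";  // &amp; &lt;em&gt;
```

Note that the second form only skips text that already looks like an entity; a bare '&' is still converted to "&amp;".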


 [2004-05-11 05:55 UTC] johnfivealive at hotmail dot com
Sorry, I just realized this is not a bug! If a character such as '&' is stored as "&amp;" in the database itself, then it should also appear as "&amp;" in the HTML textarea, so the current behavior is the correct behavior. Sorry for the false alarm.
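
The round trip behind this conclusion can be sketched as follows. This is only an illustration: html_entity_decode() stands in for the browser decoding the textarea content, and the sample string and variable names are made up.

```php
<?php
// Text exactly as stored in the database (it contains a literal "&amp;").
$stored = 'Fish &amp; chips <em>';

// Escape once on output, as one would before placing it inside <textarea>.
$inTextarea = htmlentities($stored, ENT_QUOTES, 'UTF-8');
echo $inTextarea, "\n";   // Fish &amp;amp; chips &lt;em&gt;

// The browser decodes one level of entities when rendering the field.
$seenByUser = html_entity_decode($inTextarea, ENT_QUOTES, 'UTF-8');
echo $seenByUser, "\n";   // Fish &amp; chips <em>

// The user sees exactly what was stored, so nothing is lost or altered.
var_dump($seenByUser === $stored);
```

Because encoding once and decoding once cancel out, the "&amp;amp;" in the page source is what keeps the stored "&amp;" intact on screen.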
Last updated: Mon May 20 16:01:35 2024 UTC