Request #9426 Form variables encoding problem
Submitted: 2001-02-23 12:45 UTC Modified: 2002-04-27 14:57 UTC
From: d dot sbragion at infotecna dot it Assigned:
Status: Duplicate Package: Feature/Change Request
PHP Version: 4.0.4pl1 OS: Linux 2.0
Private report: No CVE-ID: None
 [2001-02-23 12:45 UTC] d dot sbragion at infotecna dot it
When I enter some special characters in a text form field (either 'INPUT TYPE="text"' or 'TEXTAREA'), they get encoded as HTML entities. For example, '’' arrives as '&#8217;' in the variable of the form handling script (I hope this won't trigger the bug; the first char is like a '`' but "reversed", almost like a superscript small '/'). No encoding happens for a plainly typed '&#8217;', so the form handling script cannot distinguish between the two cases and therefore cannot safely reverse the encoding. Browser is IE 5.5 on Windows 98.

This can happen, for example, when cutting and pasting from WordPad, Word or existing web pages. I tried pasting the same thing into FrontPage Express. It encodes the character as a different entity than the one the browser sends, so maybe it's just the encoding that's wrong.
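The ambiguity described above can be reproduced with a small sketch. It assumes PHP 7 or later and UTF-8 input. Both function names are hypothetical, and `simulate_latin1_form_submit()` only imitates the browser behavior reported here (characters outside ISO-8859-1 replaced by numeric references); it is not how any particular browser is implemented.

```php
<?php
// Hypothetical helper: code point of a single UTF-8 encoded character.
function utf8_code_point(string $ch): int
{
    $b = array_values(unpack('C*', $ch));
    switch (count($b)) {
        case 1:
            return $b[0];
        case 2:
            return (($b[0] & 0x1F) << 6) | ($b[1] & 0x3F);
        case 3:
            return (($b[0] & 0x0F) << 12) | (($b[1] & 0x3F) << 6) | ($b[2] & 0x3F);
        default:
            return (($b[0] & 0x07) << 18) | (($b[1] & 0x3F) << 12)
                 | (($b[2] & 0x3F) << 6) | ($b[3] & 0x3F);
    }
}

// Hypothetical simulation of the reported behavior: every character that
// ISO-8859-1 cannot represent (code point above 0xFF) is sent as &#N;.
// The result is left in UTF-8 for simplicity; no byte conversion is done.
function simulate_latin1_form_submit(string $utf8): string
{
    return preg_replace_callback(
        '/[^\x{00}-\x{FF}]/u',
        function (array $m): string {
            return '&#' . utf8_code_point($m[0]) . ';';
        },
        $utf8
    );
}

$pasted = "It\u{2019}s";  // user pasted a curly apostrophe (U+2019)
$typed  = "It&#8217;s";   // user literally typed the entity text

$a = simulate_latin1_form_submit($pasted);
$b = simulate_latin1_form_submit($typed);

// Both submissions reach the script as the same string, so the script
// cannot tell which one the user meant.
var_dump($a === $b);
```

Because the two inputs collapse to one value, any decoding step on the server side would also rewrite entity text that the user typed on purpose.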

P.S. Sorry for my poor English

 [2001-02-23 12:51 UTC] d dot sbragion at infotecna dot it
Sorry, but everything got garbled because of the mixture of HTML entities and real characters. The character that causes the problem is '’', the corresponding HTML entity is &#8217;, and the HTML entity produced by FrontPage is a different one. Looking directly at the HTML source makes it easier to understand what's going on.
 [2001-02-24 04:57 UTC] d dot sbragion at infotecna dot it
It turned out to be a problem with a:

<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">

header, which caused the browser to encode the data before sending it to PHP. Now there's another problem: '’' doesn't get encoded by the htmlentities() function. This character, and others, is illegal according to the WDG HTML validator and should be encoded. I think an extended version of the htmlentities() function, which encodes every character that needs encoding, not only the ones in the get_html_translation_table(HTML_ENTITIES) table, should be considered. Of course, encoding should be performed in the '&#XXXX;' form.
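A minimal sketch of what such an extended function might look like, assuming PHP 7 or later and UTF-8 input. The names `code_point_of()` and `encode_all_entities()` are hypothetical and not part of PHP. The sketch escapes the markup characters with htmlspecialchars() and then turns every remaining non-ASCII character into a decimal '&#N;' reference.

```php
<?php
// Hypothetical helper: code point of a single UTF-8 encoded character.
function code_point_of(string $ch): int
{
    $b = array_values(unpack('C*', $ch));
    switch (count($b)) {
        case 1:
            return $b[0];
        case 2:
            return (($b[0] & 0x1F) << 6) | ($b[1] & 0x3F);
        case 3:
            return (($b[0] & 0x0F) << 12) | (($b[1] & 0x3F) << 6) | ($b[2] & 0x3F);
        default:
            return (($b[0] & 0x07) << 18) | (($b[1] & 0x3F) << 12)
                 | (($b[2] & 0x3F) << 6) | ($b[3] & 0x3F);
    }
}

// Hypothetical "extended htmlentities": not a built-in function.
function encode_all_entities(string $utf8): string
{
    // Step 1: escape &, <, > and both quote styles.
    $escaped = htmlspecialchars($utf8, ENT_QUOTES, 'UTF-8');

    // Step 2: replace every character above U+007F with a decimal
    // numeric character reference, so the output is pure ASCII.
    return preg_replace_callback(
        '/[^\x{00}-\x{7F}]/u',
        function (array $m): string {
            return '&#' . code_point_of($m[0]) . ';';
        },
        $escaped
    );
}

$out = encode_all_entities("It\u{2019}s <b>caf\u{E9}</b>");
echo $out, "\n"; // It&#8217;s &lt;b&gt;caf&#233;&lt;/b&gt;
```

Where the mbstring extension is available, mb_encode_numericentity() covers similar ground without the hand-written decoding step.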
 [2001-04-29 11:31 UTC] jmoore@php.net
feature/change request.


 [2002-04-27 14:57 UTC] jimw@php.net
duplicate of #7535.
 
Last updated: Wed Jul 18 06:01:24 2018 UTC