Bug #20929 Problem in handling big5 characters from HTML form
Submitted: 2002-12-10 21:43 UTC
Modified: 2003-02-26 19:22 UTC
From: xp2002 at hkedcity dot net
Assigned:
Status: Not a bug
Package: Apache2 related
PHP Version: 4.2.3
OS: Redhat Linux 7.2
Private report: No
CVE-ID: None
 [2002-12-10 21:43 UTC] xp2002 at hkedcity dot net
1. When I use $_REQUEST, $_POST, or $_GET to retrieve the data of an HTML form, all "big5" characters are changed to HTML codes (e.g. "&#20806;"). Although these HTML codes are displayed correctly in browsers, I cannot convert them back to "big5" code.

With PHP4.2.3 on Apache 1.3.27, there is no problem. PHP can retrieve "big5" characters from HTML form.


2. The "htmlentities" function does not correctly convert "big5" characters to html codes. Wrong html codes are generated and cannot be correctly displayed in browser.

 [2002-12-10 22:40 UTC] moriyoshi@php.net
Please try using this CVS snapshot:

  http://snaps.php.net/php4-latest.tar.gz
 
For Windows:
 
  http://snaps.php.net/win32/php4-win32-latest.zip
 [2002-12-11 19:36 UTC] xp2002 at hkedcity dot net
I have tried the latest PHP CVS 4.4.0-dev(200212120030).
The problem still exists. All "big 5" characters from HTML form POST/GET are always converted to HTML codes. No function can be used to convert the HTML codes back to "big 5" characters.

Thanks.
 [2002-12-11 19:57 UTC] alan_k@php.net
2) htmlentities has an extra optional argument for the character set.

Can you double-check what the browser is sending to the server (karpski), and see if there's any difference between your Apache 1.3 and Apache 2.0 setups?
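
As a rough illustration of the character set argument mentioned above, the sketch below passes 'BIG5' as the third argument to htmlentities(). The sample string and variable names are illustrative only, the file is assumed to be saved as UTF-8, and the mbstring extension is assumed to be available to build the Big5 test bytes. Without the argument, PHP of that era defaulted to ISO-8859-1, so each byte of a double-byte character would be treated as a separate Latin-1 character.

```php
<?php
// Sketch only: the sample string and variable names are illustrative.
// Build the Big5 byte string for "保良" from a UTF-8 literal (needs mbstring).
$big5 = mb_convert_encoding('保良', 'BIG-5', 'UTF-8');

// The third argument names the input charset, so each double-byte Big5
// sequence is handled as one character rather than as two single bytes.
$escaped = htmlentities('<b>' . $big5 . '</b>', ENT_QUOTES, 'BIG5');

// The markup characters are escaped; the Big5 bytes are left untouched.
echo bin2hex($big5), "\n";
echo $escaped === '&lt;b&gt;' . $big5 . '&lt;/b&gt;'
    ? "Big5 bytes preserved\n"
    : "Big5 bytes altered\n";
```

Note that with a charset such as Big5 the function only escapes the markup-significant characters; it does not produce numeric codes for the Chinese characters.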

 [2002-12-11 21:40 UTC] xp2002 at hkedcity dot net
I use the same client (Windows2000 + IE6), the same server OS (Redhat7.2) and the same php version (4.2.3 and 4.4.0-dev). The only difference is Apache version, Apache1.3.27 and Apache2.0.40. 

For testing, I use the "big 5" characters "保良".
From Apache1.3.27:
$_REQUEST = "保良"

From Apache2.0.40:
$_REQUEST = "&#20445;&#33391;"


The setup of both Apache servers is the same.
Apache1.3.27:
'./configure' '--with-mysql' '--with-apxs=/usr/local/apache/bin/apxs' '--with-imap' '--with-kerberos' '--with-imap-ssl' '--with-gettext' '--with-xml' '--with-ldap' '--enable-ftp'

Apache2.0.40:
'./configure' '--with-mysql' '--with-apxs2=/usr/local/apache2/bin/apxs' '--with-imap' '--with-kerberos' '--with-imap-ssl' '--with-gettext' '--with-xml' '--with-ldap' '--enable-ftp' 


Thanks for help.
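
One way to compare the two results independently of how a browser renders them is to dump the raw bytes. The sketch below is only an illustration: describe_bytes() is a hypothetical helper, and the two sample values are stand-ins for what each server is reported to deliver (raw Big5 bytes versus numeric character references for the same two characters). It assumes a UTF-8 source file and the mbstring extension.

```php
<?php
// Hypothetical helper: report a value's length and its bytes in hex.
function describe_bytes(string $value): string
{
    return sprintf('%d bytes: %s', strlen($value), bin2hex($value));
}

// Stand-ins for the two reported results (assumed, not captured traffic).
$rawBig5    = mb_convert_encoding('保良', 'BIG-5', 'UTF-8'); // raw Big5 bytes
$references = '&#20445;&#33391;';                           // numeric character references

echo describe_bytes($rawBig5), "\n";
echo describe_bytes($references), "\n";
```

In a real script the same helper could be applied to each element of $_POST or $_GET to see exactly what arrived.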
 [2002-12-11 22:13 UTC] alan at akbkhome dot com
It appears that Apache 2 is correctly encoding the input. I would suggest having a look at the multibyte extension, to see if there is:
a) a way of configuring PHP to automatically decode these for you
b) a routine to manually decode them.
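
A minimal sketch of option (b), written against current PHP rather than 4.2.3 (html_entity_decode() only arrived in PHP 4.3.0). The helper name entities_to_big5() is hypothetical, and the mbstring extension is assumed to be loaded. It decodes the references to UTF-8 first and then transcodes to Big5.

```php
<?php
// Hypothetical helper, not part of PHP: turn numeric character references
// such as "&#20445;" back into Big5 bytes.
function entities_to_big5(string $input): string
{
    // Decode the references to UTF-8 first, since UTF-8 can represent
    // every code point a reference may name.
    $utf8 = html_entity_decode($input, ENT_QUOTES | ENT_HTML401, 'UTF-8');

    // Then transcode to Big5 with the multibyte extension (mbstring).
    return mb_convert_encoding($utf8, 'BIG-5', 'UTF-8');
}

$decoded = entities_to_big5('&#20445;&#33391;');
echo bin2hex($decoded), "\n";
```

This also decodes named entities such as &amp;amp;, so text that a user typed literally as a reference would be converted as well; whether that is acceptable depends on the application.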
 [2002-12-21 10:17 UTC] moriyoshi@php.net
I suspect that some external input filter module automatically converts multibyte characters to HTML entities before they reach PHP's input handler.

If so, this is not an issue for the PHP developers.

What modules are enabled in Apache2? You can get the list of built-in modules with the following option:

$ httpd -l


 [2002-12-22 19:01 UTC] xp2002 at hkedcity dot net
I have checked the built-in modules of both Apache 1.3 and Apache 2. They have the same modules:
Compiled in modules:
  core.c
  http_core.c
  mod_access.c
  mod_actions.c
  mod_alias.c
  mod_asis.c
  mod_auth.c
  mod_autoindex.c
  mod_cgi.c
  mod_dir.c
  mod_env.c
  mod_imap.c
  mod_include.c
  mod_log_config.c
  mod_mime.c
  mod_negotiation.c
  mod_setenvif.c
  mod_so.c
  mod_status.c
  mod_userdir.c
  prefork.c

Thanks.
 [2003-02-26 19:22 UTC] iliaa@php.net
Sorry, but your problem does not imply a bug in PHP itself.  For a
list of more appropriate places to ask for help using PHP, please
visit http://www.php.net/support.php as this bug system is not the
appropriate forum for asking support questions. 

Thank you for your interest in PHP.

External applications will often convert non-ASCII characters to their HTML entity equivalents when data is pasted into a web browser form. This has nothing to do with PHP itself.
 