Bug #18874 Certain Japanese characters break PHP's $_GET and $_POST
Submitted: 2002-08-13 00:44 UTC Modified: 2002-08-14 02:58 UTC
From: erica at simpli dot biz Assigned:
Status: Wont fix Package: Output Control
PHP Version: 4.1.2 OS: Windows XP
Private report: No CVE-ID: None
 [2002-08-13 00:44 UTC] erica at simpli dot biz
I've put an example script here:

http://clients.simpli.biz/contact.php

and the .phps is

http://clients.simpli.biz/contact.phps

Basically, there is a specific Japanese character whose encoded bytes show up as ?[ when read as ASCII. Since the second byte is a [, PHP treats it as the beginning of an array, and since it sees no closing delimiter, it ignores that form field. Any form input whose name contains that character is therefore broken in PHP. Since it is a fairly common character, this has a large impact.
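
The report does not name the character. As an assumed stand-in, the Python sketch below uses the katakana long-vowel mark ー (U+30FC), whose Shift_JIS encoding is the byte pair 0x81 0x5B; the second byte is the ASCII code for `[`. The function `looks_like_unclosed_array` is hypothetical and only mimics the behaviour described above; it is not PHP's actual parser.

```python
# Assumption: "ー" (U+30FC) stands in for the unnamed character in the report.
CHAR = "\u30fc"

sjis_bytes = CHAR.encode("shift_jis")


def looks_like_unclosed_array(name: bytes) -> bool:
    """Hypothetical byte-wise check: a '[' with no ']' anywhere after it."""
    start = name.find(b"[")
    return start != -1 and name.find(b"]", start) == -1


# "メール" (mail) is an assumed example of a field name containing the character.
field_name = ("メ" + CHAR + "ル").encode("shift_jis")

print(sjis_bytes.hex(" "))                    # bytes of the single character
print(looks_like_unclosed_array(field_name))  # does the name trip the check?
```

The same field name encoded as EUC-JP contains no byte in the ASCII range, so a byte-wise check like this one does not fire on it.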

Go to the script above and change your character encoding, and you'll see exactly what I mean. When you hit Submit, the script just prints out all submitted variables. Any form inputs with a name containing that character do not submit at all.

Thanks,
Erica
(SlashChick on irc.openprojects.net #php)

History

 [2002-08-13 01:05 UTC] kalowsky@php.net
Does this still happen with a newer version of PHP?
 [2002-08-13 02:07 UTC] erica at simpli dot biz
Still a bug in a fresh compile of 4.2.2 (on a friend's system).

His phpinfo():
http://redline.worldforge.org/en/info.php

The script:
http://redline.worldforge.org/hottie/contact2.php

Thanks.
 [2002-08-13 02:15 UTC] rasmus@php.net
Actually, he probably meant the latest snapshot from snaps.php.net
 [2002-08-13 02:40 UTC] fujimoto@php.net
The specification of Shift_JIS (one of the Japanese encodings)
causes this problem, so snapshots will probably not solve it.

As mentioned in http://www.php.net/manual/en/ref.mbstring.php,
using Shift_JIS as the script encoding or internal encoding is NOT recommended.

Anyway, there are currently two workarounds.

1. Use EUC-JP as the script encoding.
2. Use snapshots with the following settings:
   - Enable the Zend Engine to parse Shift_JIS:
   ./configure --enable-zend-multibyte [and so on]

   - Enable encoding translation by
   adding the following lines to your php.ini:

   [mbstring]
   mbstring.internal_encoding=euc-jp
   mbstring.output_encoding=shift_jis
   mbstring.script_encoding=shift_jis
   mbstring.encoding_translation=1
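
Why converting to EUC-JP helps can be illustrated outside PHP. The Python sketch below re-encodes Shift_JIS input as EUC-JP and lists the ASCII-range bytes found in each form. It illustrates only the byte-level difference between the two encodings, not how mbstring performs the translation; `to_internal`, `ascii_bytes`, and the sample string "メール" are hypothetical names and data chosen for the example.

```python
def to_internal(raw: bytes, source: str = "shift_jis",
                internal: str = "euc_jp") -> bytes:
    """Hypothetical helper: re-encode request bytes into the internal encoding."""
    return raw.decode(source).encode(internal)


def ascii_bytes(data: bytes) -> bytes:
    """Return only the bytes of `data` that fall in the ASCII range (< 0x80)."""
    return bytes(b for b in data if b < 0x80)


raw = "メール".encode("shift_jis")  # as a Shift_JIS page would submit it
internal = to_internal(raw)

print(ascii_bytes(raw))       # ASCII-range bytes hidden inside the Shift_JIS form
print(ascii_bytes(internal))  # same check after conversion to EUC-JP
```

In EUC-JP every byte of a multibyte character has its high bit set, so none of them can be mistaken for `[` or any other ASCII delimiter.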

If you have questions about Japanese encodings, please ask me
directly (this is not a suitable place for that).

 [2002-08-13 05:20 UTC] wez@php.net
Also note that (for IE at least), you can force a more
sensible encoding to be used by adding an accept-charset
attribute to your form:

<form accept-charset="utf-8" ...>

This will cause the data to be utf-8 encoded.
And you should definitely be running with the mbstring
module if you are dealing with Japanese text.
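
As a rough illustration of why UTF-8 submissions avoid the problem, the Python sketch below percent-encodes the same field name under both charsets and checks the decoded bytes for a raw bracket. The field name "メール" and the helper `has_bracket` are assumptions made for the example; the exact bytes a given browser sends may differ.

```python
from urllib.parse import quote, unquote_to_bytes

name = "メール"  # assumed example field name

# Percent-encoded form of the name under each charset.
as_sjis = quote(name, encoding="shift_jis")
as_utf8 = quote(name, encoding="utf-8")


def has_bracket(encoded: str) -> bool:
    """True if the percent-decoded bytes contain a raw '[' or ']'."""
    raw = unquote_to_bytes(encoded)
    return b"[" in raw or b"]" in raw


print(as_sjis, has_bracket(as_sjis))
print(as_utf8, has_bracket(as_utf8))
```

Every byte of a multibyte UTF-8 sequence is 0x80 or above, so a non-ASCII character can never contribute a stray `[` to the decoded name.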
 [2002-08-14 02:58 UTC] yohgaki@php.net
Also note that show_source() is a much better way to show sources.

.phps has other issues as well.
Use show_source() or the like.
