Bug #27018 urlencode should not do non-ascii characters
Submitted: 2004-01-23 07:21 UTC Modified: 2004-01-24 08:43 UTC
From: vesely at tana dot it Assigned:
Status: Not a bug Package: URL related
PHP Version: Irrelevant OS: Irrelevant
Private report: No CVE-ID: None

 [2004-01-23 07:21 UTC] vesely at tana dot it
Description:
------------
Hi,
is it possible to reopen bug 6173?

Briefly, national characters are not field
separators in any URL scheme. If they are
urlencoded, they may be translated the wrong
way by users with incompatible code tables.

The answer to bug 6173 cites rfc1738, which is 10
years old and also says that

"   A mailto URL takes the form:
"      mailto:<rfc822-addr-spec>

The bug is relevant for urls like
mailto:user@example.com?Subject=use+national+chars+here
that already violate rfc1738.

I prepared a test page in
http://www.tana.it/urlencoded.html

The problem could be solved by adding a function that
supports RFC 1342 and must be called before rawurlencode.

Thank you for your patience
Ale

Reproduce code:
---------------
rawurlencode("è is not e")

Expected result:
----------------
%3D%3Fiso-8859-1%3FQ%3F%3DE8%3F%3D%20is%20not%20e

Actual result:
--------------
%E8%20is%20not%20e
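
The expected result above is what rawurlencode() returns once the "è" has been wrapped in a MIME encoded-word. As a minimal sketch of the function the report asks for, assuming ISO-8859-1 input and Q encoding: q_encode_words() is a hypothetical helper, not an existing PHP function, and it handles each space-separated word on its own.

```php
<?php
// Hypothetical helper (not part of PHP): Q-encode each space-separated
// word that contains a byte outside printable ASCII, leaving plain words
// untouched. The charset label is an assumption about the input bytes.
function q_encode_words(string $text, string $charset = 'iso-8859-1'): string
{
    $words = explode(' ', $text);
    foreach ($words as $i => $word) {
        // Byte-wise match (no /u flag): anything outside 0x21-0x7E.
        if (preg_match('/[^\x21-\x7E]/', $word)) {
            // Keep letters and digits, write every other byte as =XX.
            $q = preg_replace_callback(
                '/[^A-Za-z0-9]/',
                function (array $m): string {
                    return sprintf('=%02X', ord($m[0]));
                },
                $word
            );
            $words[$i] = "=?{$charset}?Q?{$q}?=";
        }
    }
    return implode(' ', $words);
}

// "\xE8" is the ISO-8859-1 byte for "è".
echo rawurlencode(q_encode_words("\xE8 is not e")), "\n";
// prints %3D%3Fiso-8859-1%3FQ%3F%3DE8%3F%3D%20is%20not%20e
```

This is only an illustration of the call order (encode first, then rawurlencode). Decoders ignore whitespace between two adjacent encoded-words, so text in which consecutive words both contain national characters would need the space carried inside an encoded-word, and long words would need splitting; the sketch does neither.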

 [2004-01-23 17:05 UTC] moriyoshi@php.net
Use mb_encode_mimeheader() / iconv_mime_encode()

http://www.php.net/mb_encode_mimeheader
http://www.php.net/iconv_mime_encode

just RTFM.


 [2004-01-24 08:43 UTC] vesely at tana dot it
iconv_mime_encode would be nearly fine for me
(as long as I don't use multi-byte) except that it
writes "Subject: blah" a la SMTP. I will have
to remove the leading "Subject: ", [raw]urlencode
the "blah" and append the result to the URL,
after an "&amp;Subject=". And can I trust
substr($iconverted,9), or should I use a regex
to match the colon?

Please... :-) Nasty as national chars in headers are,
if at least they could be used correctly life might
be better. And since much HTML is created using
PHP and its URL functions, a well-documented dedicated
function may improve overall conformance. In fact,
many programmers, I for one, are not sure which of
the three encodings of a mailto tag on my test page
is the correct one.

BTW, why doesn't configure include iconv automatically?

Thanks
Ale
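
On the substr($iconverted,9) question, one possible approach is to split at the first colon instead of hardcoding the length of "Subject: ". The sketch below assumes the iconv extension is loaded; header_value() and mailto_with_subject() are hypothetical names chosen for illustration, and the address is assumed to be already safe to place in a URL.

```php
<?php
// Hypothetical helper: return the value part of a "Name: value" line.
function header_value(string $headerLine): string
{
    $colon = strpos($headerLine, ':');
    if ($colon === false) {
        return $headerLine; // no field name present, nothing to strip
    }
    // Drop the name, the colon, and any spaces or tabs that follow it.
    return ltrim(substr($headerLine, $colon + 1), " \t");
}

// Hypothetical helper: build a mailto URL whose subject is MIME-encoded
// first and urlencoded second. Requires the iconv extension.
function mailto_with_subject(
    string $address,
    string $subject,
    string $charset = 'ISO-8859-1'
): string {
    $header = iconv_mime_encode('Subject', $subject, [
        'scheme'         => 'Q',
        'input-charset'  => $charset,
        'output-charset' => $charset,
    ]);
    if ($header === false) {
        throw new RuntimeException('iconv_mime_encode failed');
    }
    return 'mailto:' . $address . '?Subject='
        . rawurlencode(header_value($header));
}
```

Splitting at the colon keeps working if the field name changes, which a fixed offset of 9 would not. Note that iconv_mime_encode folds long values across lines by default, so a long subject may pick up line breaks that end up urlencoded in the result; the 'line-length' option may need adjusting for that case.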
 