Bug #6173 urlencode() doesn't respect locale
Submitted: 2000-08-15 12:21 UTC Modified: 2001-01-04 08:55 UTC
From: mbravo at acm dot org Assigned:
Status: Closed Package: Misbehaving function
PHP Version: 3.0.16 OS: FreeBSD 4.1
Private report: No CVE-ID: None

 [2000-08-15 12:21 UTC] mbravo at acm dot org
By definition, urlencode() should leave alphanumeric characters unencoded. However, in my case it doesn't respect locale settings, and it encodes non-ASCII characters (alphanumeric per the locale definition) when it shouldn't. The locale is correctly set up, and Apache does have the correct LANG and LC_ALL environment variables in its runtime environment (checked with phpinfo()). I even tried an explicit setlocale() call within a script, but this doesn't change anything (which is probably correct, as the locale is already set systemwide).

I don't know if this problem is peculiar to FreeBSD installations; perhaps someone should check this. It might be, since judging by the source code, the system isalnum() is used to determine whether a character should be encoded. However, FreeBSD in general handles locales very responsibly, and that wouldn't be possible if fundamental checks like isalpha() were broken.
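
A minimal reproduction of the behavior described above might look like the following sketch. The locale name `ru_RU.KOI8-R` and the test byte 0xC1 are assumptions chosen for illustration; neither appears in the original report. The result shown is what current PHP versions produce.

```php
<?php
// Assumed locale name, for illustration only.
// setlocale() returns false if this locale is not installed.
$locale = setlocale(LC_ALL, 'ru_RU.KOI8-R');

// "abc" followed by one byte above 0x7F (assumed to be a letter
// in the chosen 8-bit locale).
$input = "abc\xC1";

$encoded = urlencode($input);
echo $encoded, "\n"; // prints "abc%C1" whether or not the locale was applied
```

The ASCII letters pass through unchanged, while the high-bit byte is percent-encoded regardless of the locale.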

 [2001-01-04 08:55 UTC] hholzgra@php.net
this is intended behavior

see RFC 1738: Uniform Resource Locators (URL), Section 2.2:

   [...]
   No corresponding graphic US-ASCII:

   URLs are written only with the graphic printable characters of the
   US-ASCII coded character set. The octets 80-FF hexadecimal are not
   used in US-ASCII, and the octets 00-1F and 7F hexadecimal represent
   control characters; these must be encoded.
   [...]
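
The rule quoted above can be sketched as an encoder that tests byte values against fixed US-ASCII ranges instead of asking the locale. The function name `encode_ascii_only` is hypothetical and this is not PHP's actual implementation; it only assumes the same conventions as urlencode() (space becomes `+`, and `-`, `_`, `.` are left as-is).

```php
<?php
// Hypothetical helper for illustration; not PHP's real urlencode() source.
function encode_ascii_only(string $s): string
{
    $out = '';
    $len = strlen($s);
    for ($i = 0; $i < $len; $i++) {
        $c = $s[$i];
        $o = ord($c);
        // Fixed US-ASCII ranges: 0-9, A-Z, a-z. No locale lookup involved.
        $isAsciiAlnum = ($o >= 0x30 && $o <= 0x39)
            || ($o >= 0x41 && $o <= 0x5A)
            || ($o >= 0x61 && $o <= 0x7A);
        if ($isAsciiAlnum || $c === '-' || $c === '_' || $c === '.') {
            $out .= $c;                      // safe character, copied unchanged
        } elseif ($c === ' ') {
            $out .= '+';                     // urlencode() convention for spaces
        } else {
            $out .= sprintf('%%%02X', $o);   // everything else, including 80-FF
        }
    }
    return $out;
}

echo encode_ascii_only("a b\xC1\x7F"), "\n"; // prints "a+b%C1%7F"
```

Because the check never consults isalnum(), octets 80-FF and the control characters are always encoded, whatever the locale says about them.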

 