Bug #80277 parse_url & parse_str problems with values with percentage
Submitted: 2020-10-23 14:37 UTC Modified: 2020-10-23 19:45 UTC
Votes:1
Avg. Score:5.0 ± 0.0
Reproduced:0 of 1 (0.0%)
From: nnikolay at gmail dot com Assigned:
Status: Not a bug Package: URL related
PHP Version: 7.4.11 OS: Debian 10
Private report: No CVE-ID: None
 [2020-10-23 14:37 UTC] nnikolay at gmail dot com
Description:
------------
parse_url & parse_str have problems when the URL contains a value wrapped in percent signs, in the form:

domain.com/?campaignid=%campaignid%&adid=%bannerid%

Then both of them convert the "%" into some Unicode character, which breaks the code:

array(4) {
  ["scheme"]=>
  string(5) "https"
  ["host"]=>
  string(19) "domain"
  ["path"]=>
  string(16) "/path/"
  ["query"]=>
  array(6) {
    ["campaignid"]=>
    string(10) "´┐Żmpaignid%"
    ["adid"]=>
    string(8) "´┐Żnnerid%"
  }
} 


 [2020-10-23 15:09 UTC] cmb@php.net
-Status: Open
+Status: Not a bug
-Package: *General Issues
+Package: URL related
-Assigned To:
+Assigned To: cmb
 [2020-10-23 15:09 UTC] cmb@php.net
Percent characters in a URI introduce a percent-encoded[1]
character, and parse_url() and parse_str() interpret them as such.
A literal percent character would have to be written as %25 in a
URI, see <https://3v4l.org/2C3Bo>.


[1] <https://en.wikipedia.org/wiki/Percent-encoding>
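A minimal sketch of the %25 rule, assuming the query string is assembled in PHP before it goes into the URL. The parameter names come from the report; the `$params` array itself is illustrative:

```php
<?php
// Illustrative values that contain literal percent signs.
$params = [
    'campaignid' => '%campaignid%',
    'adid'       => '%bannerid%',
];

// http_build_query() percent-encodes each value, so "%" becomes "%25".
// The separator is passed explicitly so the result does not depend on ini settings.
$query = http_build_query($params, '', '&');
echo $query, PHP_EOL; // campaignid=%25campaignid%25&adid=%25bannerid%25

// parse_str() decodes the query string back into the original values.
parse_str($query, $decoded);
var_dump($decoded === $params); // bool(true)
```

For a single value, rawurlencode('%campaignid%') gives the same "%25campaignid%25".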
 [2020-10-23 15:22 UTC] nnikolay at gmail dot com
But this makes nonsense! Why should I convert the "%" to "%25" before I give it to the function parse_url? Why don't parse_url & parse_str do it themselves? If this is so, why shouldn't I convert "&" to "%26", and all the other characters too?
 [2020-10-23 15:33 UTC] cmb@php.net
-Status: Not a bug
+Status: Open
-Assigned To: cmb
+Assigned To:
 [2020-10-23 15:33 UTC] cmb@php.net
> But this makes nonsense!

If you say so.
 [2020-10-23 15:42 UTC] nnikolay at gmail dot com
Of course it is nonsense if you need to convert "%" but nothing else. Where is this in the documentation of parse_url? Or should we have a crystal ball to predict it? For me, it is nonsensical behavior for a function and of course a BUG!
 [2020-10-23 18:03 UTC] eveg at dbdh dot com
what about trying to understand how the basics work before posting stubborn, clueless comments?

the % has a special meaning - period
 [2020-10-23 18:22 UTC] nnikolay at gmail dot com
Then do something and document it on your docs page before calling other people clueless. If it's in the docs, it is OK and makes sense; if not, it is unexpected behavior, and in any language that is a BUG. Just because you think you know it doesn't mean that other people should have to try every variation of a function to see what the results would be.
 [2020-10-23 18:35 UTC] girgias@php.net
-Status: Open
+Status: Not a bug
 [2020-10-23 18:35 UTC] girgias@php.net
% has special semantics in URLs, see:

>>>    In addition, octets may be encoded by a character triplet consisting
   of the character "%" followed by the two hexadecimal digits (from
   "0123456789ABCDEF") which forming the hexadecimal value of the octet.
   (The characters "abcdef" may also be used in hexadecimal encodings.)

Source: Section 2.2. URL Character Encoding Issues of RFC 1738 - Uniform Resource Locators (URL)
http://www.faqs.org/rfcs/rfc1738.html
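
A short sketch of how the triplet rule plays out on the value from the report, using PHP's rawurldecode(). The observation that a "%" without two hex digits after it is left untouched describes PHP's decoder, not something the RFC text above requires:

```php
<?php
// "%ca" is a valid triplet, so it decodes to the single byte 0xCA.
$byte = rawurldecode('%ca');
echo strlen($byte), ' ', bin2hex($byte), PHP_EOL; // 1 ca

// Upper- and lowercase hex digits are equivalent.
var_dump(rawurldecode('%CA') === $byte); // bool(true)

// In "%campaignid%" only the leading "%ca" is a triplet. The trailing "%"
// has no hex digits after it, and PHP leaves it as a literal character.
$value = rawurldecode('%campaignid%');
echo strlen($value), ' ', bin2hex($value[0]), ' ', substr($value, 1), PHP_EOL; // 10 ca mpaignid%
```

The lone byte 0xCA is not valid UTF-8 on its own, which is why the dump in the report shows a replacement character in front of "mpaignid%".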
 [2020-10-23 18:40 UTC] nnikolay at gmail dot com
Super! Then please add it here: https://www.php.net/manual/en/function.parse-url.php and save the next person half a day of looking for a solution. If this is expected, it should be documented in the right place.
 [2020-10-23 18:58 UTC] requinix@php.net
parse_url does not parse the query string. It will return
"campaignid=%campaignid%&adid=%bannerid%".

parse_str *does* parse the query string. And the docs already say
> Note:
> All variables created (or values returned into array if second parameter is set)
> are already urldecode()d.
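
The difference between the two functions can be sketched as follows. The full URL is an assumption pieced together from the report (scheme https, path /path/); only the query string is quoted verbatim:

```php
<?php
// Assumed URL; the report only shows the query string and a redacted host.
$url = 'https://domain.com/path/?campaignid=%campaignid%&adid=%bannerid%';

// parse_url() only splits the URL; the query component comes back undecoded.
$parts = parse_url($url);
echo $parts['query'], PHP_EOL; // campaignid=%campaignid%&adid=%bannerid%

// parse_str() splits the query string and urldecode()s every value,
// so "%ca" and "%ba" each collapse into a single byte.
parse_str($parts['query'], $query);
echo strlen($query['campaignid']), ' ', strlen($query['adid']), PHP_EOL; // 10 8
```

The lengths 10 and 8 match the string(10) and string(8) entries in the reporter's dump.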
 [2020-10-23 19:07 UTC] eveg at ldbdh dot com
it's not the job of the docs for a programming language to educate you about absolute BASICS of the data you work with
 [2020-10-23 19:22 UTC] nnikolay at gmail dot com
Guys, I don't want to argue with the whole php.net team about this unexpected behavior. If only "%" is a special character, and everyone who starts developing with PHP needs to learn every one of the RFC standards, then give me a break. The function parse_str() expects an $encoded_string, but only "%" is supposed to be encoded; how is that helpful? How should someone know this without experimenting and losing hours?
Here is the example: https://3v4l.org/oPCf5 using the signature from your docs: parse_str ( string $encoded_string [, array &$result ] ) : void
You can tell me whatever you like, but the documentation is not clear, and it is not clear that we need to encode only "%" as "%25".
 [2020-10-23 19:43 UTC] djrhr at rgegez dot com
it's only unexpected for YOU, the same way others don't realize that they have to encode & as &amp; in an HTML link, and just because everything works most of the time doesn't mean that specifications don't matter

stop whining and do your homework
 [2020-10-23 19:45 UTC] requinix@php.net
-Block user comment: No
+Block user comment: Yes
 
PHP Copyright © 2001-2021 The PHP Group
All rights reserved.
Last updated: Wed Jan 20 14:01:23 2021 UTC