Bug #30049 URL rewriting uses raw value of arg_separator.output, not HTML-escaped value
Submitted: 2004-09-10 11:00 UTC Modified: 2004-09-10 12:37 UTC
From: ibrash at gmail dot com Assigned:
Status: Not a bug Package: Session related
PHP Version: 5.0.1 OS: N/A
Private report: No CVE-ID: None
 [2004-09-10 11:00 UTC] ibrash at gmail dot com
The behavior described in the title is familiar to anyone who uses the session extension with transparent SID URL rewriting.  I'm aware that, until now, the most commonly recommended solution for those seeking (X)HTML validity has been to set arg_separator.output to &amp;.  However, URL rewriting is no longer the only feature that uses this directive, and the directive is not being used consistently.

In PHP 5, http_build_query uses this directive to create URL query strings.  The problem is that while the HTML representation of a URL separates its arguments with &amp;, the URL itself separates them with a plain &.

It makes sense for the URL rewriting to use an HTML-escaped version of arg_separator.output (it operates in an HTML context) while http_build_query uses the raw version (it creates a generic URL query string).  Unfortunately, this represents a minor BC break for those who have set arg_separator.output to &amp;, because the HTML-escaped version of that value is &amp;amp;.  By far the better workaround to have given these people would have been to change arg_separator.output to ; and arg_separator.input to &;.  Still, the PHP 5 line is young, so this is probably the best time to fix it and make URL rewriting use the HTML representation of arg_separator.output instead of the raw value.
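To make the distinction concrete, here is a minimal sketch of the two representations. The helper names raw_query() and html_url() are hypothetical and exist only for illustration. It also assumes a PHP version in which http_build_query accepts an explicit separator as its third argument (added after the 5.0.1 release this report targets), so the result does not depend on arg_separator.output.

```php
<?php
// Hypothetical helper: build the raw query string with an explicit '&'
// separator, independent of the arg_separator.output setting.
function raw_query(array $data)
{
    return http_build_query($data, '', '&');
}

// Hypothetical helper: escape a raw URL for use inside an HTML attribute.
function html_url($url)
{
    return htmlspecialchars($url, ENT_QUOTES);
}

$data = array('bar' => 42, 'baz' => '6x9');
$url  = 'foo.php?' . raw_query($data);

echo $url, "\n";           // raw URL: foo.php?bar=42&baz=6x9
echo html_url($url), "\n"; // HTML form: foo.php?bar=42&amp;baz=6x9
```

The raw form is what belongs in a Location header or a plain-text e-mail; the escaped form is what belongs in an href attribute.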

Reproduce code:
<?php
ini_set('arg_separator.output', '&');
ini_set('session.use_only_cookies', 0);
ini_set('session.use_cookies', 0);
ini_set('session.use_trans_sid', 1);
ini_set('url_rewriter.tags', 'a=href,area=href,frame=src,input=src');
// Assumed here: a session must be started for the SID to be appended to links.
session_start();
?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "">
<html xmlns="" xml:lang="en" lang="en">
<p><a href="foo.php?<?php
$data = array('bar' => 42, 'baz' => '6x9');
print htmlentities(http_build_query($data));
?>">Sample Link</a></p>

Expected result:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "">
<html xmlns="" xml:lang="en" lang="en">
<p><a href="foo.php?bar=42&amp;baz=6x9&amp;PHPSESSID=qgdt7l0pef5ra4mrmuosth42ks88k77t">Sample Link</a></p>

Actual result:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "">
<html xmlns="" xml:lang="en" lang="en">
<p><a href="foo.php?bar=42&amp;baz=6x9&PHPSESSID=qgdt7l0pef5ra4mrmuosth42ks88k77t">Sample Link</a></p>

(Note the &PHPSESSID)


 [2004-09-10 12:37 UTC]
There is no bug here.

If you set arg_separator.output to &, PHP will use &. If you then run htmlentities() or htmlspecialchars() on the result, each & is converted to &amp;. The separator in front of the session ID is not converted, because the session ID is appended later.
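
As a rough illustration of why the escaping has to match the separator setting, the sketch below sets arg_separator.output to &amp; (the commonly recommended value mentioned in the report) and then escapes the output of http_build_query a second time. The variable names are arbitrary; nothing here is taken from the PHP source itself.

```php
<?php
// With an already-escaped separator, http_build_query emits "&amp;" itself.
ini_set('arg_separator.output', '&amp;');

$data  = array('bar' => 42, 'baz' => '6x9');
$query = http_build_query($data);

// Escaping again double-encodes the separator.
$escaped = htmlentities($query);

echo $query, "\n";   // bar=42&amp;baz=6x9
echo $escaped, "\n"; // bar=42&amp;amp;baz=6x9
```

In other words, either the directive holds the raw & and the page escapes the query string itself, or the directive holds &amp; and the query string is emitted unescaped; mixing the two gives invalid or double-encoded output.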