Bug #51785 No way to escape quotes for XPath
Submitted: 2010-05-10 18:43 UTC Modified: 2010-06-18 18:53 UTC
Votes:1
Avg. Score:4.0 ± 0.0
Reproduced:1 of 1 (100.0%)
Same Version:0 (0.0%)
Same OS:0 (0.0%)
From: pecoes at gmail dot com Assigned: rrichards (profile)
Status: Not a bug Package: *XML functions
PHP Version: 5.3.2 OS: WinXP
Private report: No CVE-ID: None
 [2010-05-10 18:43 UTC] pecoes at gmail dot com
Description:
------------
There seems to be no way to escape single or double quotes for XPath-Queries.

given: <test>"</test>

/test[text()="\""] produces an error message
/test[text()="\\""] ditto
/test[text()="&quot;"] finds no match

This is not a PHP bug, I suppose. It may be a bug in libxml2. It might even be a bug in the XPath spec itself. But regardless of where the blame lies: this is serious! How is one supposed to use user input in an XPath expression if it cannot be escaped?

I found a work-around, but it's fugly:

$dom = new DOMDocument;
$dom->loadXML('<test>"</test>');
$xpath = new DOMXPath($dom);

function xquote ($str)
{
    if (strpos($str, '"') === FALSE) {
        return '"'.$str.'"';
    }
    if (strpos($str, "'") === FALSE) {
        return "'".$str."'";
    }
    $parts = preg_split('/(")/', $str, 0, PREG_SPLIT_DELIM_CAPTURE|PREG_SPLIT_NO_EMPTY);
    array_walk($parts,
        function (&$val) {
            if ($val == '"') $val = "'\"'";
            else $val = '"'.$val.'"';
        }
    );
    return 'concat('.implode(',', $parts).')';
}

$q = sprintf('/test[text()=%s]', xquote('"'));
if ($xpath->evaluate($q)->item(0)) {
    echo 'found'; // works!
} else {
    echo 'not found';
}

Test script:
---------------
$dom = new DOMDocument;
$dom->loadXML('<test>"</test>');
$xpath = new DOMXPath($dom);

$q = '/test[text()="&quot;"]';
if ($xpath->evaluate($q)->item(0)) {
    echo "found\r\n";
} else {
    echo "not found\r\n";
}

$q = '/test[text()="\\""]';
if ($xpath->evaluate($q)->item(0)) {
    echo "found\r\n";
} else {
    echo "not found\r\n";
}

Expected result:
----------------
found
found

Actual result:
--------------
not found
Warning: DOMXPath::evaluate(): Invalid predicate...
Warning: DOMXPath::evaluate(): Invalid expression...
Fatal error: Call to a member function item() on non-object...

History

 [2010-05-12 08:22 UTC] mike@php.net
-Status: Open +Status: Assigned -Assigned To: +Assigned To: rrichards
 [2010-06-18 16:22 UTC] rrichards@php.net
-Status: Assigned +Status: Bogus
 [2010-06-18 16:22 UTC] rrichards@php.net
Thank you for taking the time to write to us, but this is not
a bug. Please double-check the documentation available at
http://www.php.net/manual/ and the instructions on how to report
a bug at http://bugs.php.net/how-to-report.php

You need to take into account PHP string escaping too.
$q = "/test[text()='\"']";
For more complex situations with mixed quote types, it's a general issue
with XPath, not a PHP bug.
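
The two layers of quoting mentioned above can be made visible by printing the string that the XPath engine actually receives. This is only a minimal sketch that reuses the document from the test script; the variable names ($bad, $good and so on) are illustrative and not part of the report.

```php
<?php
$dom = new DOMDocument;
$dom->loadXML('<test>"</test>');
$xpath = new DOMXPath($dom);

// PHP single-quoted string: \\ collapses to one backslash, so XPath receives
//   /test[text()="\""]
// XPath 1.0 has no backslash escape, so the literal already ends at the second ".
$bad = '/test[text()="\\""]';

// PHP double-quoted string: \" is consumed by PHP, so XPath receives
//   /test[text()='"']
// which is a double quote inside a single-quoted XPath literal.
$good = "/test[text()='\"']";

echo $bad, "\n";
echo $good, "\n";

// @ hides the warnings that the malformed expression is expected to raise.
$badResult = @$xpath->evaluate($bad);
$goodResult = $xpath->evaluate($good);

var_dump($badResult === false);   // the malformed expression is rejected
var_dump($goodResult->length);    // number of matching <test> nodes
```

Printing the query before evaluating it is usually the quickest way to tell a PHP escaping mistake apart from an XPath limitation.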
 [2010-06-18 16:33 UTC] pecoes at gmail dot com
Alright. It's not a PHP bug. So... what now? How do I deal with it in PHP? Just because PHP is innocent, doesn't mean there's no need for a fix. It's still a bug! Classifying it as "bogus" won't do a thing.
 [2010-06-18 16:50 UTC] rrichards@php.net
Jeez. Learn to properly escape strings then. I even gave you the proper code for
your test to work. It's not a PHP bug nor a libxml2 bug, so it's bogus. Regardless
of the language you use, you will hit escaping issues. If you really think it's a
bug somewhere, you need to take it to the W3C.
 [2010-06-18 17:05 UTC] pecoes at gmail dot com
We seem to misunderstand each other...

As long as there's only one type of quote - single or double - there's no problem, but how do I escape a string with mixed quotes? How do I quote that, so that the XPath-engine won't reject it?
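
For reference, one common answer to the mixed-quote question is XPath's own concat() function, which is also what the xquote() workaround in the original report relies on. The following is a shorter variant of that idea, offered as a sketch only; the helper name xpathLiteral is hypothetical and not part of any PHP API.

```php
<?php
// Hypothetical helper: turn an arbitrary PHP string into an XPath 1.0
// string expression. The name xpathLiteral is an assumption for this sketch.
function xpathLiteral(string $str): string
{
    // No double quote inside: a double-quoted XPath literal is enough.
    if (strpos($str, '"') === false) {
        return '"' . $str . '"';
    }
    // No single quote inside: a single-quoted XPath literal is enough.
    if (strpos($str, "'") === false) {
        return "'" . $str . "'";
    }
    // Both quote types present: split on ", wrap every piece in double
    // quotes, and join the pieces with a single-quoted " literal.
    $pieces = array_map(
        function ($piece) { return '"' . $piece . '"'; },
        explode('"', $str)
    );
    return 'concat(' . implode(", '\"', ", $pieces) . ')';
}

$dom = new DOMDocument;
$dom->loadXML("<test>double quote: \", single quote: '</test>");
$xpath = new DOMXPath($dom);

$q = sprintf('/test[text()=%s]', xpathLiteral("double quote: \", single quote: '"));
echo $q, "\n";
echo $xpath->evaluate($q)->length > 0 ? "found\n" : "not found\n";
```

Because explode() always yields at least two pieces when a double quote is present, concat() receives the minimum of two arguments it requires.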
 [2010-06-18 18:08 UTC] rrichards@php.net
The simplest way is to use PHP functions for the comparison, for example by
comparing htmlspecialchars-escaped strings:

$dom = new DOMDocument;
$domstr = "<test>double quote: \", single quote: '</test>";
$dom->loadXML($domstr);
$xpath = new DOMXPath($dom);

$xpath->registerNamespace("php", "http://php.net/xpath");
$xpath->registerPHPFunctions();

$check_string = htmlspecialchars("double quote: \", single quote: '", ENT_QUOTES);

$q = "/test[php:functionString('htmlspecialchars', ., 3) = '$check_string']";

echo $q."\n";
if ($xpath->evaluate($q)->item(0)) {
    echo "found\r\n";
} else {
    echo "not found\r\n";
}

There is currently no plan to support XPath 2.0, although XPath variables may
possibly be supported in a future PHP version.
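
The same idea of letting PHP do the comparison can also be sketched without registerPHPFunctions(): select candidate nodes with a fixed XPath expression and compare their string value in plain PHP, so the untrusted string never becomes part of the query. This is a different technique from the one shown above, and the variable names are illustrative.

```php
<?php
$needle = "double quote: \", single quote: '";

$dom = new DOMDocument;
$dom->loadXML("<test>double quote: \", single quote: '</test>");
$xpath = new DOMXPath($dom);

// The XPath expression is a constant; no user input is interpolated into it.
$matches = [];
foreach ($xpath->query('/test') as $node) {
    // textContent is the node's string value, the same value that "."
    // yields inside an XPath predicate.
    if ($node->textContent === $needle) {
        $matches[] = $node;
    }
}

echo count($matches) > 0 ? "found\n" : "not found\n";
```

The trade-off is that the filtering happens in PHP rather than in libxml2, which may matter for very large node sets.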
 [2010-06-18 18:53 UTC] pecoes at gmail dot com
Nice! Your work-around is certainly better than mine. :)

It's still a work-around, though. :(

XPath variables would certainly be useful.

My suggestion would have been to take unilateral action and improve the XPath standard by introducing escape sequences: \' \" and \\
I realize that amending a standard isn't exactly elegant, but it certainly would make things easy on the PHP side of things. Simply treat your input with addslashes and you're good. From a user perspective that would be the most desirable solution, I suppose.
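
For context on why addslashes does not help today: in XPath 1.0 a backslash inside a string literal is an ordinary character, not an escape. A minimal sketch that shows this with string-length(); the one-element document is only there to give the evaluator a context.

```php
<?php
$dom = new DOMDocument;
$dom->loadXML('<test/>');
$xpath = new DOMXPath($dom);

// Single-quoted PHP string, so XPath receives: string-length("a\b")
// The backslash is counted as a character of its own.
$len = $xpath->evaluate('string-length("a\b")');
var_dump($len);

// addslashes() only prefixes quotes with a backslash; the quote itself is
// still there and would still terminate an XPath literal.
echo addslashes("it's"), "\n";
```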
 
PHP Copyright © 2001-2022 The PHP Group
All rights reserved.
Last updated: Sat Dec 03 02:05:54 2022 UTC