Bug #51785 No way to escape quotes for XPath
Submitted: 2010-05-10 18:43 UTC
Modified: 2010-06-18 18:53 UTC
Votes:1
Avg. Score:4.0 ± 0.0
Reproduced:1 of 1 (100.0%)
Same Version:0 (0.0%)
Same OS:0 (0.0%)
From: pecoes at gmail dot com
Assigned: rrichards
Status: Not a bug
Package: *XML functions
PHP Version: 5.3.2
OS: WinXP
Private report: No
CVE-ID: None
 [2010-05-10 18:43 UTC] pecoes at gmail dot com
Description:
------------
There seems to be no way to escape single or double quotes in XPath queries.

given: <test>"</test>

/test[text()="\""] produces an error message
/test[text()="\\""] ditto
/test[text()="&quot;"] finds no match

This is not a PHP bug, I suppose. It may be a bug in libxml2. It might even be a bug in the XPath spec itself. But regardless of where the blame lies: this is serious! How is one supposed to use user input in an XPath expression if it cannot be escaped?

I found a work-around, but it's fugly:

$dom = new DOMDocument;
$dom->loadXML('<test>"</test>');
$xpath = new DOMXPath($dom);

function xquote ($str)
{
    if (strpos($str, '"') === FALSE) {
        return '"'.$str.'"';
    }
    if (strpos($str, "'") === FALSE) {
        return "'".$str."'";
    }
    $parts = preg_split('/(")/', $str, 0, PREG_SPLIT_DELIM_CAPTURE|PREG_SPLIT_NO_EMPTY);
    array_walk($parts,
        function (&$val) {
            if ($val == '"') $val = "'\"'";
            else $val = '"'.$val.'"';
        }
    );
    return 'concat('.implode(',', $parts).')';
}

$q = sprintf('/test[text()=%s]', xquote('"'));
if ($xpath->evaluate($q)->item(0)) {
    echo 'found'; // works!
} else {
    echo 'not found';
}
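
A more compact variant of the same concat() idea, offered as a sketch rather than a definitive fix: split only on single quotes and let concat() supply each one as a double-quoted literal. The name xpath_literal is hypothetical and not part of the DOM extension.

```php
<?php
// xpath_literal() is a hypothetical helper: it turns an arbitrary PHP string
// into an XPath 1.0 expression that evaluates to that string.
function xpath_literal(string $value): string
{
    // No single quote inside: a plain single-quoted literal is enough.
    if (strpos($value, "'") === false) {
        return "'" . $value . "'";
    }
    // Otherwise close the literal at every single quote and splice in "'".
    return "concat('" . str_replace("'", "',\"'\",'", $value) . "')";
}

$dom = new DOMDocument();
$dom->loadXML("<test>double quote: \", single quote: '</test>");
$xpath = new DOMXPath($dom);

$needle = "double quote: \", single quote: '";
$query = '/test[text()=' . xpath_literal($needle) . ']';
$found = $xpath->query($query)->length;

echo $found === 1 ? "found\n" : "not found\n";
```

The early return matters because concat() requires at least two arguments, so a value without single quotes cannot go through the concat() branch.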

Test script:
---------------
$dom = new DOMDocument;
$dom->loadXML('<test>"</test>');
$xpath = new DOMXPath($dom);

$q = '/test[text()="&quot;"]';
if ($xpath->evaluate($q)->item(0)) {
    echo "found\r\n";
} else {
    echo "not found\r\n";
}

$q = '/test[text()="\\""]';
if ($xpath->evaluate($q)->item(0)) {
    echo "found\r\n";
} else {
    echo "not found\r\n";
}

Expected result:
----------------
found
found

Actual result:
--------------
not found
Warning: DOMXPath::evaluate(): Invalid predicate...
Warning: DOMXPath::evaluate(): Invalid expression...
Fatal error: Call to a member function item() on non-object...
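
A hedged reading of the results above: XPath 1.0 string literals have no escape character, so in "\"" the literal already ends at the second quote and the third quote is a syntax error, while &quot; is treated as six ordinary characters because the XPath engine does not decode XML entities. The sketch below illustrates both points; count_matches is a hypothetical helper name, not something from the original report.

```php
<?php
// count_matches() is a hypothetical helper: it loads $xml and returns how
// many nodes the XPath expression $query selects (0 if the query is invalid).
function count_matches(string $xml, string $query): int
{
    $dom = new DOMDocument();
    $dom->loadXML($xml);
    $xpath = new DOMXPath($dom);
    $nodes = $xpath->query($query);
    return $nodes === false ? 0 : $nodes->length;
}

// "\" is a complete literal holding one backslash; it matches <test>\</test>.
$backslash = count_matches('<test>\\</test>', '/test[text()="\\"]');

// "&quot;" is compared character by character; it matches the text &quot;
// (written as &amp;quot; in the XML source), not a double quote.
$entity = count_matches('<test>&amp;quot;</test>', '/test[text()="&quot;"]');

// A double quote is matched by switching to the other delimiter.
$quote = count_matches('<test>"</test>', "/test[text()='\"']");

echo "backslash: $backslash, entity: $entity, quote: $quote\n";
```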

History

 [2010-05-12 08:22 UTC] mike@php.net
-Status: Open +Status: Assigned -Assigned To: +Assigned To: rrichards
 [2010-06-18 16:22 UTC] rrichards@php.net
-Status: Assigned +Status: Bogus
 [2010-06-18 16:22 UTC] rrichards@php.net
Thank you for taking the time to write to us, but this is not
a bug. Please double-check the documentation available at
http://www.php.net/manual/ and the instructions on how to report
a bug at http://bugs.php.net/how-to-report.php

You need to take into account PHP string escaping too.
$q = "/test[text()='\"']";
For more complex situations with mixed quote types, it's a general issue
with XPath, not a PHP bug.
 [2010-06-18 16:33 UTC] pecoes at gmail dot com
Alright. It's not a PHP bug. So... what now? How do I deal with it in PHP? Just because PHP is innocent, doesn't mean there's no need for a fix. It's still a bug! Classifying it as "bogus" won't do a thing.
 [2010-06-18 16:50 UTC] rrichards@php.net
Jeez. Learn to properly escape strings then. I even gave you the proper code for
your test to work. It's neither a PHP bug nor a libxml2 bug, so it's bogus. Regardless
of the language you use, you will hit escaping issues. If you really think it's a
bug somewhere, you need to take it to the W3C.
 [2010-06-18 17:05 UTC] pecoes at gmail dot com
We seem to misunderstand each other...

As long as there's only one type of quote - single or double - there's no problem, but how do I escape a string with mixed quotes? How do I quote that, so that the XPath-engine won't reject it?
 [2010-06-18 18:08 UTC] rrichards@php.net
The simplest way is to use PHP functions for the comparison, for example by comparing
htmlspecialchars-escaped strings:

$dom = new DOMDocument;
$domstr = "<test>double quote: \", single quote: '</test>";
$dom->loadXML($domstr);
$xpath = new DOMXPath($dom);

$xpath->registerNamespace("php", "http://php.net/xpath");
$xpath->registerPHPFunctions();

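$check_string = htmlspecialchars("double quote: \", single quote: '", ENT_QUOTES);

// 3 is the integer value of ENT_QUOTES; the constant name is not available inside the XPath string.
$q = "/test[php:functionString('htmlspecialchars', ., 3) = '$check_string']";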

echo $q."\n";
if ($xpath->evaluate($q)->item(0)) {
    echo "found\r\n";
} else {
    echo "not found\r\n";
}

There is no current plan to support XPath 2.0, although supporting
XPath variables in a future PHP version is a possibility.
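
Another option that sidesteps quoting entirely, sketched under the assumption that the candidate set is small enough to iterate: let XPath select the candidate nodes with a fixed expression and do the string comparison in PHP. The helper name find_by_text is hypothetical. Note that it compares textContent (the node's full string value), which is closer to `. = "..."` than to `text() = "..."`.

```php
<?php
// find_by_text() is a hypothetical helper, not part of the DOM extension.
// It returns the nodes selected by $path whose text equals $needle exactly.
function find_by_text(DOMXPath $xpath, string $path, string $needle): array
{
    $matches = [];
    $nodes = $xpath->query($path);
    if ($nodes === false) {
        return $matches; // the path expression itself was invalid
    }
    foreach ($nodes as $node) {
        // The user input never becomes part of the XPath expression,
        // so no quoting is needed.
        if ($node->textContent === $needle) {
            $matches[] = $node;
        }
    }
    return $matches;
}

$dom = new DOMDocument();
$dom->loadXML("<root><test>double quote: \", single quote: '</test><test>other</test></root>");
$xpath = new DOMXPath($dom);

$user_input = "double quote: \", single quote: '";
$hits = find_by_text($xpath, '/root/test', $user_input);

echo count($hits) === 1 ? "found\n" : "not found\n";
```

The trade-off is that the filtering happens in PHP rather than in libxml2, so the predicate cannot be combined with further XPath steps in the same expression.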
 [2010-06-18 18:53 UTC] pecoes at gmail dot com
Nice! Your work-around is certainly better than mine. :)

It's still a work-around, though. :(

XPath variables would certainly be useful.

My suggestion would have been to take unilateral action and improve the XPath standard by introducing escape sequences: \' \" and \\
I realize that amending a standard isn't exactly elegant, but it certainly would make things easy on the PHP side. Simply treat your input with addslashes and you're good. From a user perspective that would be the most desirable solution, I suppose.
 