php.net |  support |  documentation |  report a bug |  advanced search |  search howto |  statistics |  random bug |  login
Bug #78534 Query that contains '?' and 'string\' returns less records than must be
Submitted: 2019-09-13 12:54 UTC Modified: 2024-06-18 15:59 UTC
From: mnvx at yandex dot ru Assigned: mbeccati (profile)
Status: Closed Package: PDO PgSQL
PHP Version: 7.3.9 OS: Ubuntu 18.04
Private report: No CVE-ID: None
View Developer Edit
Welcome! If you don't have a Git account, you can't do anything here.
If you reported this bug, you can edit this bug over here.
(description)
Block user comment
Status: Assign to:
Package:
Bug Type:
Summary:
From: mnvx at yandex dot ru
New email:
PHP Version: OS:

 

 [2019-09-13 12:54 UTC] mnvx at yandex dot ru
Description:
------------
Script below returns one record (incorrect) instead of two (expected). This script executed directly in PostgreSQL console returns two rows (correct).

For thinking. After replacing 'CHACTECHNOLOGICO\\' to 'CHACTECHNOLOGICO' OR "?" to "-" in both places script returns two records (it is correct).

Discussion on stackoverflow: https://stackoverflow.com/questions/57858915


Test script:
---------------
<?php
$pdo = new \PDO('pgsql:host=localhost;dbname=postgres', 'postgres', 'postgres');
$sql = <<<SQL
SELECT *
FROM (
  SELECT 'CHAC TECHNOLOG*' as alias
  UNION SELECT 'KINDERY LIGHTING SALES DE?T*'
) m 
JOIN (
  SELECT 'CHACTECHNOLOGICO\\' as ie_clean
  UNION SELECT 'KINDERYLIGHTINGSALESDEPT' as ie_clean
) t ON t.ie_clean ILIKE REPLACE(REPLACE(REPLACE(m.alias, '*', '%'), '?', '_'), ' ', '') 
ORDER BY ie_clean;
SQL;

echo $sql . PHP_EOL . PHP_EOL;

$stmt = $pdo->query($sql);
print_r($stmt->fetchAll(PDO::FETCH_ASSOC));

Expected result:
----------------
Array
(
    [0] => Array
        (
            [alias] => CHAC TECHNOLOG*
            [ie_clean] => CHACTECHNOLOGICO\
        )

    [1] => Array
        (
            [alias] => KINDERY LIGHTING SALES DE?T*
            [ie_clean] => KINDERYLIGHTINGSALESDEPT
        )

)


Actual result:
--------------
Array
(
    [0] => Array
        (
            [alias] => CHAC TECHNOLOG*
            [ie_clean] => CHACTECHNOLOGICO\
        )

)


Patches

Pull Requests

History

AllCommentsChangesGit/SVN commitsRelated reports
 [2019-09-13 15:04 UTC] requinix@php.net
-Status: Open +Status: Analyzed
 [2019-09-13 15:04 UTC] requinix@php.net
PDO::query() uses the prepared statement system. When pdo_pgsql prepares the statement, it has the standard PDO query parser [1] transform the query to use named ($X) placeholders instead of positional (?). Problem is that PDO assumes backslashes escape quotes, so when it finds 'CHACTECHNOLOGICO\' it thinks the second quote was escaped. That effectively "inverts" what are strings and what are not in the rest of the query, thus the quoted '?' looks unquoted and it gets replaced to '$1'.

The query doesn't come with any parameters so PHP doesn't try to send them to the server, and the rewritten SQL is still valid as far as PostgreSQL is concerned, so there's nothing to trigger any errors to indicate a problem.

Illia's comment on SO about the question mark is right, however it's just a symptom of the real problem because PDO is smart enough to not transform question marks in strings  - or at least not in what it thinks are strings. The real problem is the backslash+quote that tricks the query parser into the wrong state, so removing the backslash makes the query work.

A second option is to break the backslash+quote apart, which works in this case because of the nature of the query:
  SELECT 'CHACTECHNOLOGICO\\ ' as ie_clean

A third option usable in the general case is to add a comment with another quote:
  SELECT 'CHACTECHNOLOGICO\\' as ie_clean -- '

The parser knows about and normally skips comments, however it doesn't recognize that one as such because it thinks it's parsing inside a string; the quote ends the string and the parser continues as it originally should have.


[1] https://github.com/php/php-src/blob/php-7.4.0beta2/ext/pdo/pdo_sql_parser.re#L44
 [2019-09-15 15:19 UTC] mnvx at yandex dot ru
Interesting observation from @John: https://stackoverflow.com/a/57943446/1920758
 [2021-03-25 13:29 UTC] mbeccati@php.net
-Assigned To: +Assigned To: mbeccati
 [2021-03-25 13:29 UTC] mbeccati@php.net
The bug is there, but returning less rows is just a side effect of the complex query with replace.

I think this smaller query better highlights the problem:

SELECT 'foo\' AS a, '?' AS b

PDO has a single parser for all the drivers, which detects "'foo\' AS a, '" being a string, thus the ? is outside and replaced as $1 placeholder, so the result is:

Array
(
    [0] => Array
        (
            [x] => foo\
            [y] => $1
        )

)

FYI I have a prototype fix in the works, but it requires allowing drivers to use custom parsers, which most likely will require an RFC.
 [2021-03-25 13:43 UTC] mbeccati@php.net
-Status: Analyzed +Status: Assigned
 [2024-05-20 08:15 UTC] mbeccati@php.net
This one will be fixed in https://wiki.php.net/rfc/pdo_driver_specific_parsers

I expect to open voting soon.
 [2024-06-18 15:59 UTC] mbeccati@php.net
-Status: Assigned +Status: Closed
 [2024-06-18 15:59 UTC] mbeccati@php.net
The fix for this bug has been committed.
If you are still experiencing this bug, try to check out latest source from https://github.com/php/php-src and re-test.
Thank you for the report, and for helping us make PHP better.

Fixed in https://github.com/php/php-src/commit/fbe317bf2179c65b750cc945c3530b28db1670e0

The fix will be part of 8.4
 
PHP Copyright © 2001-2024 The PHP Group
All rights reserved.
Last updated: Sat Dec 07 13:01:27 2024 UTC