Request #79359: Literal checking
Submitted: 2020-03-09 13:30 UTC
Modified: 2020-03-09 13:43 UTC
From: craig at craigfrancis dot co dot uk
Assigned:
Status: Suspended
Package: *General Issues
PHP Version: Next Major Version
OS:
Private report: No
CVE-ID: None
 [2020-03-09 13:30 UTC] craig at craigfrancis dot co dot uk
Description:
------------
This follows up on a discussion on the PHP Internals mailing list[1], and a similar idea by Matt Tait[2].

PHP should allow developers to check that a variable was created only from literals.

Checking a variable with `is_literal()` would allow us to enforce the use of parameterised SQL queries at run time.

It would also help ORMs ensure they don't introduce issues[3].

This is not the same as Taint Checking[4], as that allows you to use untaint(), and does not protect against issues like missing quotes:

  $sql = 'DELETE FROM ... WHERE id = ' . mysqli_real_escape_string($db, $_GET['id']);

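  /delete.php?id=id

Here the escaped value is the bare word id, so the condition becomes `WHERE id = id`, which is true for every row.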
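
To make the missing-quotes problem concrete, here is a minimal sketch that builds the same query string without a database connection. `example_escape()` and `build_delete_sql()` are hypothetical names, and `example_escape()` is only a stand-in built on `addslashes()`; it is not a substitute for `mysqli_real_escape_string()`. The assumption is simply that both leave a plain word such as `id` untouched.

```php
<?php

// Hypothetical stand-in for mysqli_real_escape_string(), so the sketch runs
// without a database connection. Assumption: like the real function, it only
// escapes special characters and leaves plain words unchanged.
function example_escape(string $value): string
{
    return addslashes($value);
}

// Builds the query the same way as the example above: escaped, but unquoted.
// The "..." is kept verbatim from the original example.
function build_delete_sql(string $id): string
{
    return 'DELETE FROM ... WHERE id = ' . example_escape($id);
}

echo build_delete_sql('id'), "\n"; // DELETE FROM ... WHERE id = id
```

Escaping did its job here, yet the query is still wrong, because the value was never quoted. Supplying the value separately through a parameterised query avoids the problem.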

Note that string escaping is only "theoretically safe"[5] - typically due to character encoding issues.

While SQL injection is the easiest case to demonstrate, this could also protect against Command Line Injection and, to a certain extent, HTML Injection. Each of these would benefit from having a string known to be safe (made from literals), with user values supplied separately.

Internally, this would need a flag on every variable, plus a single `is_literal()` function to check whether a given variable has been created only from literals. Unlike the taint extension, there should be no way to override this. Certain functions (e.g. mysqli_query) might use this information to generate an error, warning, or notice in the future.
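
As a rough illustration of how such a flag might propagate, the sketch below models it in userland. Everything in it is an assumption made for illustration: the `FlaggedString` class, its methods, and `require_literal_sql()` are hypothetical. A userland class cannot verify that a string really came from source code, and anyone can pass `true` to the constructor, which is exactly why the proposal asks for the engine to track this with no override.

```php
<?php

// Hypothetical userland model of the per-variable flag. In the proposal the
// engine would set this automatically; here the caller has to be trusted.
final class FlaggedString
{
    private string $value;
    private bool $literal;

    public function __construct(string $value, bool $literal)
    {
        $this->value = $value;
        $this->literal = $literal;
    }

    public function value(): string
    {
        return $this->value;
    }

    public function isLiteral(): bool
    {
        return $this->literal;
    }

    // The result is literal only if both operands are literal.
    public function append(FlaggedString $other): FlaggedString
    {
        return new FlaggedString(
            $this->value . $other->value,
            $this->literal && $other->literal
        );
    }
}

// Hypothetical gatekeeper, standing in for a function such as mysqli_query()
// that might one day reject SQL that was not built from literals.
function require_literal_sql(FlaggedString $sql): string
{
    if (!$sql->isLiteral()) {
        throw new InvalidArgumentException('SQL must be built from literals only.');
    }
    return $sql->value();
}

$sql = (new FlaggedString('SELECT * FROM example WHERE id = ', true))
    ->append(new FlaggedString('?', true));

// A value that did not come from source code, e.g. from $_GET.
$unsafe = $sql->append(new FlaggedString('1 OR 1=1', false));

var_dump($sql->isLiteral());    // bool(true)
var_dump($unsafe->isLiteral()); // bool(false)
```

Passing `$unsafe` to `require_literal_sql()` throws, while `$sql` is accepted and its user-supplied value would be bound separately.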

This is being discussed for JavaScript, via TC39 [6], to support the introduction of Trusted Types.

[1] https://news-web.php.net/php.internals/108537
    https://news-web.php.net/php.internals/106625
    https://news-web.php.net/php.internals/106631

[2] https://wiki.php.net/rfc/sql_injection_protection

[3] https://framework.zend.com/security/advisory/ZF2014-04
    https://framework.zend.com/security/advisory/ZF2016-03

[4] https://github.com/laruence/taint

[5] https://www.php.net/manual/en/pdo.quote.php

[6] https://github.com/tc39/proposal-array-is-template-object
    https://github.com/mikewest/tc39-proposal-literals

Test script:
---------------
<?php

    define('TABLE', 'example');

    // $ids (a non-empty array) and $db (a mysqli connection) are assumed to exist.
    $in_sql = substr(str_repeat('?,', count($ids)), 0, -1); // To create '?,?,?'

    $sql = 'SELECT * FROM ' . TABLE . ' WHERE id IN (' . $in_sql . ')';

    is_literal($sql); // Returns true

    $sql .= ' AND id = ' . mysqli_real_escape_string($db, $_GET['id']);

    is_literal($sql); // Returns false

?>
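
The `'?,?,?'` placeholder list in the test script can also be built with `array_fill()` and `implode()`, which avoids trimming a trailing comma. This is only a sketch: `in_placeholders()` is a hypothetical helper and the `$ids` values are made up. Whether the result of these functions would count as a literal depends on how the feature ends up being specified.

```php
<?php

// Hypothetical helper: returns '?,?,?' for $count = 3.
// Throws for counts below 1, because many databases reject an empty "IN ()".
function in_placeholders(int $count): string
{
    if ($count < 1) {
        throw new InvalidArgumentException('At least one value is required.');
    }
    return implode(',', array_fill(0, $count, '?'));
}

$ids = [4, 8, 15]; // Example values; in practice these may come from user input.

$sql = 'SELECT * FROM example WHERE id IN (' . in_placeholders(count($ids)) . ')';

echo $sql, "\n"; // SELECT * FROM example WHERE id IN (?,?,?)
```

The IDs themselves never enter the SQL string; they would be passed separately when the prepared statement is executed.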


 [2020-03-09 13:33 UTC] nikic@php.net
-Status: Open +Status: Suspended
 [2020-03-09 13:33 UTC] nikic@php.net
Please continue this discussion on the internals list; this bug tracker is not suitable for extended discussions that require going through the RFC process.
 [2020-03-09 13:40 UTC] craig at craigfrancis dot co dot uk
As for how this could be used for HTML:

Start with the Template defined as a literal, and variables supplied separately:

<?php

  $template_html = '
    <p>Hello <span id="username">?</span></p>
    <p><a>Website</a></p>';

  $values = [
    '//span[@id="username"]' => [
      NULL      => 'Name', // NULL key: sets the element's text content
      'class'   => 'admin',
      'data-id' => '123',
    ],
    '//a' => [
      'href' => 'https://example.com',
    ],
  ];

?>

Then the templating engine can do the necessary checks, and be certain that the HTML string itself is safe:

<?php

  function template_parse($html, $values) {

    if (!is_literal($html)) {
      throw new Exception('Invalid Template HTML.');
    }

    $dom = new DOMDocument();
    $dom->loadHTML('<?xml encoding="UTF-8">' . $html);

    $xpath = new DOMXPath($dom);

    foreach ($values as $query => $attributes) {

      if (!is_literal($query)) {
        throw new Exception('Invalid Template XPath.');
      }

      foreach ($xpath->query($query) as $element) {
        foreach ($attributes as $attribute => $value) {

          if (!is_literal($attribute)) {
            throw new Exception('Invalid Template Attribute.');
          }

          if ($attribute) {
            $safe = false;
            if ($attribute == 'href') {
              if (preg_match('/^https?:\/\//', $value)) {
                $safe = true; // Not "javascript:..."
              }
            } else if ($attribute == 'class') {
              if (in_array($value, ['admin', 'important'])) {
                $safe = true; // Only allow specific classes?
              }
            } else if (preg_match('/^data-[a-z]+$/', $attribute)) {
              if (preg_match('/^[a-z0-9 ]+$/i', $value)) {
                $safe = true;
              }
            }
            if ($safe) {
              $element->setAttribute($attribute, $value);
            }
          } else {
            $element->textContent = $value;
          }

        }
      }

    }

    $html = '';
    $body = $dom->documentElement->firstChild;
    if ($body->hasChildNodes()) {
      foreach ($body->childNodes as $node) {
        $html .= $dom->saveXML($node);
      }
    }

    return $html;

  }

  echo template_parse($template_html, $values);

?>
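
The attribute checks inside the loop above could also be pulled out into a standalone function, which makes the allow-list easier to read and test. This is a sketch under assumptions: `is_safe_attribute()` is a hypothetical name, and the rules are copied from the example rather than being a complete policy.

```php
<?php

// Hypothetical helper mirroring the attribute checks in template_parse().
function is_safe_attribute(string $attribute, string $value): bool
{
    if ($attribute === 'href') {
        // Only http(s) URLs, so "javascript:..." is rejected.
        return preg_match('/^https?:\/\//', $value) === 1;
    }
    if ($attribute === 'class') {
        // Example allow-list, taken from the code above.
        return in_array($value, ['admin', 'important'], true);
    }
    if (preg_match('/^data-[a-z]+$/', $attribute) === 1) {
        // data-* attributes: letters, digits and spaces only.
        return preg_match('/^[a-z0-9 ]+$/i', $value) === 1;
    }
    return false; // Unknown attributes are rejected by default.
}

var_dump(is_safe_attribute('href', 'https://example.com')); // bool(true)
var_dump(is_safe_attribute('href', 'javascript:alert(1)')); // bool(false)
var_dump(is_safe_attribute('onclick', 'alert(1)'));         // bool(false)
```

Rejecting by default means a new attribute has to be added to the allow-list deliberately before a template can set it.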
 [2020-03-09 13:43 UTC] requinix@php.net
-Package: PHP Language Specification +Package: *General Issues
 [2020-03-23 16:49 UTC] craig at craigfrancis dot co dot uk
I've written this up as an RFC:

https://wiki.php.net/rfc/is_literal

And mentioned it on the internals list.
 