Bug #53628 Lack of support for character references
Submitted: 2010-12-29 17:58 UTC Modified: 2010-12-29 23:18 UTC
From: alexander dot grimalovsky at gmail dot com Assigned:
Status: Not a bug Package: DOM XML related
PHP Version: 5.3.4 OS: All
Private report: No CVE-ID: None
 [2010-12-29 17:58 UTC] alexander dot grimalovsky at gmail dot com
Description:
------------
The DOM extension for PHP supports XML entity references through the DOMEntityReference class. However, because of overly strict entity name validation, this class only works with named entity references, not character references.

libxml2, which the DOM extension uses as its backend for XML operations, has two functions for creating references:
xmlNewReference() - for entity references; this is the one the DOM extension uses
xmlNewCharRef() - for character references; the DOM extension does not use it, which is why the extension lacks support for this kind of reference

Moreover, the implementation of DOMEntityReference::__construct() in ext/dom/entityreference.c validates the entity name with the libxml2 function xmlValidateName(), which checks for the Name production (see http://www.w3.org/TR/REC-xml/#NT-EntityRef). This check fails for character references (see http://www.w3.org/TR/REC-xml/#NT-CharRef), so an exception is thrown or a warning is raised.

A correct implementation should check for a "#" character in the first position of the given entity name and call xmlNewCharRef() if it is present, or xmlNewReference() otherwise.
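
The dispatch described here can be approximated in userland without touching the extension. The following is only a sketch: createReferenceNode() is a hypothetical helper, not part of the DOM API, and it assumes that a character reference can be represented by a text node holding the decoded character rather than by a dedicated node type. It relies on html_entity_decode() to decode the numeric reference, so any code point that function declines to decode would be left as literal text.

```php
<?php
// Hypothetical helper (not part of ext/dom): dispatch on a leading "#".
function createReferenceNode(DOMDocument $doc, $name)
{
    if (strlen($name) > 0 && $name[0] === '#') {
        // Accept only the CharRef forms "#<decimal>" and "#x<hex>".
        if (!preg_match('/^#(x[0-9A-Fa-f]+|[0-9]+)$/', $name)) {
            throw new InvalidArgumentException("Malformed character reference: $name");
        }
        // Decode e.g. "&#xAA;" to the UTF-8 character and wrap it in a text node.
        $char = html_entity_decode('&' . $name . ';', ENT_QUOTES, 'UTF-8');
        return $doc->createTextNode($char);
    }
    // Anything else is treated as a named entity reference.
    return $doc->createEntityReference($name);
}

$doc = new DOMDocument('1.0', 'utf-8');
$root = $doc->appendChild($doc->createElement('test'));
$root->appendChild(createReferenceNode($doc, 'entity'));
$root->appendChild(createReferenceNode($doc, '#xAA'));
echo $doc->saveXML();
```

With this approach the serializer decides how the character is written out, so the output is not guaranteed to contain the literal text "&#xAA;".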

PHP 5.2.x is also affected by this problem.

Test script:
---------------
<?php
$xml = new DOMDocument('1.0','utf-8');
$node = $xml->createElement('test');
$xml->appendChild($node);
$named = $xml->createEntityReference('entity');     // Create named entity, works
$node->appendChild($named);
$char = $xml->createEntityReference('#xAA');        // Create character reference, doesn't work
$node->appendChild($char);
echo $xml->saveXML();

Expected result:
----------------
<?xml version="1.0" encoding="utf-8"?>
<test>&entity;&#xAA;</test>

Actual result:
--------------
Fatal error: Uncaught exception 'DOMException' with message 'Invalid Character Error' in test.php:7
Stack trace:
#0 test.php(7): DOMDocument->createEntityReference('#xAA')
#1 {main}
  thrown in test.php on line 7

 [2010-12-29 23:18 UTC] rrichards@php.net
-Status: Open +Status: Bogus
 [2010-12-29 23:18 UTC] rrichards@php.net
Thank you for taking the time to write to us, but this is not
a bug. Please double-check the documentation available at
http://www.php.net/manual/ and the instructions on how to report
a bug at http://bugs.php.net/how-to-report.php

createEntityReference works per spec. It is only supposed to support entity references - not character references. You typically use a text node with escaped data to add characters.
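
A minimal sketch of the text-node approach mentioned above, assuming the source file is handled as UTF-8. The sample string is made up for illustration. The character U+00AA is written as its UTF-8 byte sequence, and the markup-significant characters are escaped by the serializer rather than by the caller.

```php
<?php
$xml = new DOMDocument('1.0', 'utf-8');
$node = $xml->createElement('test');
$xml->appendChild($node);

// "\xC2\xAA" is the UTF-8 encoding of U+00AA; pass raw text, not "&#xAA;".
$text = $xml->createTextNode("a < b & \xC2\xAA");
$node->appendChild($text);

// saveXML() escapes "<" and "&" in the text node when serializing.
$out = $xml->saveXML();
echo $out;
```

Whether U+00AA appears in the output as a literal character or as a numeric reference depends on the document's output encoding; either form parses back to the same character.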
 