Bug #49325 Error in domdocument->schemaValidate
Submitted: 2009-08-21 19:47 UTC Modified: 2009-10-05 18:40 UTC
From: bernardo at datamex dot com dot br Assigned: iekpo (profile)
Status: Not a bug Package: DOM XML related
PHP Version: 5.2.10 OS: Freebsd 7
Private report: No CVE-ID: None
 [2009-08-21 19:47 UTC] bernardo at datamex dot com dot br
Description:
------------
Errors in XSD validation, for example:

Error: Element '{http://www.portalfiscal.inf.br/nfe}IE': [facet 'pattern'] The value 'ISENTO' is not accepted by the pattern '[0-9]{0,14}|ISENTO|PR[0-9]{4,8}'.

PHP is using ISO-8859-1.
The XML is in UTF-8.

xsds in http://www.bernardosilva.com.br/NFe.rar
xml in http://www.bernardosilva.com.br/43090803116611000198550010000000010700000127.xml

Reproduce code:
---------------
$xml = new DOMDocument();
$xml->load('43090803116611000198550010000000010700000127.xml');

$tempDom = new DOMDocument();
$tempDom->loadXML(utf8_encode($xml->saveXML()));

if ($tempDom->schemaValidate('nfe_v1.10.xsd'))
 echo "ok";
else
 echo "erro";
 

Expected result:
----------------
ok

Actual result:
--------------
erro
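
The reproduce script only prints "erro", which hides the reason the schema rejected the document. Below is a minimal sketch of how the libxml messages could be collected instead. It assumes a tiny inline schema and instance (hypothetical stand-ins for the NFe files, reduced to the IE element and its pattern) and a helper name, validateAndCollect(), invented for illustration. It is written for current PHP versions rather than 5.2. Whether this reduced case validates may depend on the libxml2 version PHP is linked against, so no expected output is given.

```php
<?php
// Hypothetical helper: validate an XML string against an XSD string and
// return the libxml messages instead of letting them surface as warnings.
function validateAndCollect(string $xml, string $xsd): array
{
    $previous = libxml_use_internal_errors(true); // buffer errors internally
    libxml_clear_errors();

    $dom = new DOMDocument();
    $valid = $dom->loadXML($xml) && $dom->schemaValidateSource($xsd);

    $messages = [];
    foreach (libxml_get_errors() as $error) {
        $messages[] = sprintf('line %d: %s', $error->line, trim($error->message));
    }

    libxml_clear_errors();
    libxml_use_internal_errors($previous); // restore the caller's setting

    return ['valid' => $valid, 'messages' => $messages];
}

// Reduced stand-in for the NFe schema: only the IE element and its pattern.
$xsd = <<<'XSD'
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://www.portalfiscal.inf.br/nfe"
           elementFormDefault="qualified">
  <xs:element name="IE">
    <xs:simpleType>
      <xs:restriction base="xs:string">
        <xs:pattern value="[0-9]{0,14}|ISENTO|PR[0-9]{4,8}"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:element>
</xs:schema>
XSD;

$xml = '<IE xmlns="http://www.portalfiscal.inf.br/nfe">ISENTO</IE>';

$result = validateAndCollect($xml, $xsd);
echo $result['valid'] ? "ok\n" : "erro\n";
foreach ($result['messages'] as $message) {
    echo $message, "\n";
}
```

Restoring the previous libxml_use_internal_errors() setting keeps the helper from changing error handling for the rest of the script.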

 [2009-10-05 15:18 UTC] iekpo@php.net
I am going to take on this one.

I will provide feedback later in the day.
 [2009-10-05 16:07 UTC] iekpo@php.net
This was actually not a bug with the PHP code.

So this bug should be closed.

The error was in the instance XML document. 

The contents of the element node did not conform to what is specified in the XSD.

The original file is here :

http://israelekpo.com/php_bugs/NFe/43090803116611000198550010000000010700000127.xml

The corrected version is here :

http://israelekpo.com/php_bugs/NFe/43090803116611000198550010000000010700000127.correct.xml

PHP Code to verify success :

http://israelekpo.com/php_bugs/NFe/bug_49325.phps

<?php

$xml = new DomDocument();
$xml->load('43090803116611000198550010000000010700000127.correct.xml');

$tempDom = new DOMDocument();

$tempDom->loadXML(utf8_encode($xml->saveXML()));

if ($tempDom->schemaValidate('nfe_v1.10.xsd'))
{
    echo "ok";
    
} else {

    echo "erro";
}

?>

Expected result:
----------------
ok

Actual result:
--------------
ok
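
A side note on the utf8_encode() call that appears in both scripts: the instance document is reported to be UTF-8 already, and utf8_encode() treats its input as ISO-8859-1, so any non-ASCII character in the serialized XML would be encoded a second time. Pure ASCII values such as 'ISENTO' pass through unchanged, so this does not by itself explain the pattern error. The sketch below illustrates the effect. It uses mb_convert_encoding() as a stand-in for utf8_encode() (deprecated as of PHP 8.2), assumes the mbstring extension is available, and the helper name latin1ToUtf8() is hypothetical.

```php
<?php
// Hypothetical helper: same conversion utf8_encode() performs,
// i.e. interpret the bytes as ISO-8859-1 and re-encode them as UTF-8.
function latin1ToUtf8(string $bytes): string
{
    return mb_convert_encoding($bytes, 'UTF-8', 'ISO-8859-1');
}

$alreadyUtf8 = "Jos\xC3\xA9";            // "José", already encoded as UTF-8
$doubleEncoded = latin1ToUtf8($alreadyUtf8);

echo bin2hex($alreadyUtf8), "\n";        // 4a6f73c3a9
echo bin2hex($doubleEncoded), "\n";      // 4a6f73c383c2a9 (each byte of "é" re-encoded)

// ASCII-only text is identical in both encodings, so it is left untouched.
var_dump(latin1ToUtf8('ISENTO') === 'ISENTO');
```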

 [2009-10-05 16:56 UTC] bernardo at datamex dot com dot br
Please look at the message below (generated by PHP's XSD validation):

Error: Element '{http://www.portalfiscal.inf.br/nfe}IE': [facet
'pattern'] The value 'ISENTO' is not accepted by the pattern
'[0-9]{0,14}|ISENTO|PR[0-9]{4,8}'.

It says that "ISENTO" is not valid for the expression '[0-9]{0,14}|ISENTO|PR[0-9]{4,8}'.

If I use the same XSD and the same XML in Java, the document is valid.

The problem is not with the files.
 [2009-10-05 18:16 UTC] iekpo@php.net
This looks like there is a problem with the regular expression in the XSD.

I took the regular expression '[0-9]{0,14}|ISENTO|PR[0-9]{4,8}' and attempted to validate it against the string "ISENTO" and it still did not work. However, 123456789 works.

I also attempted to validate the XML file directly with libxml2 and it still failed.

The Java library you are using is possibly using POSIX regular expression syntax.

The libxml2 library used in the validation process is very likely using PCRE.

PCRE is not necessarily compatible with POSIX.

This implies that even though the expression may work with the Java library, it does not necessarily have to work with libxml2, which is what the DOM extension uses internally.

I would recommend that you tweak your regex to work with both regular expression types, if that is possible.

Test it first with preg_match() and if it works with both POSIX and PCRE then update your XSD with the new regex.

I will update the documentation to make a note of this after I conclude my findings with the Java library and what regular expression syntax it uses in parsing the regex in XSD.

This is definitely not a bug in PHP.

Thank you for filing this bug report though and thank you for using PHP.

 [2009-10-05 18:40 UTC] bernardo at datamex dot com dot br
I agree with everything you wrote, my intention is not to prove that it is a PHP error, but to prove that this is a bug.

I do not know exactly how the DOM extension works, but I assumed that libxml is responsible for the validation, and that the error could be there.

Just to finish: the expression is valid for both POSIX and PCRE.

<?php

if (preg_match('/^[0-9]{0,14}|ISENTO|PR[0-9]{4,8}$/', 'ISENTO')){
	echo 'ok<br />';
}else{
	echo 'erro<br />';
}

if (preg_match('/[0-9]{0,14}|ISENTO|PR[0-9]{4,8}/', 'ISENTO')){
	echo 'ok<br />';
}else{
	echo 'erro<br />';
}

if (ereg('^[0-9]{0,14}|ISENTO|PR[0-9]{4,8}$', 'ISENTO')){
	echo 'ok<br />';
}else{
	echo 'erro<br />';
}

if (ereg('[0-9]{0,14}|ISENTO|PR[0-9]{4,8}', 'ISENTO')){
	echo 'ok<br />';
}else{
	echo 'erro<br />';
}

?>

Result
-------------------------------
ok
ok
ok
ok

So the error must be in libxml itself, if you want to report it to the libxml team ...

Besides, changing the regular expression may not be an option, since the XSDs are part of a Brazilian government project.
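
One caveat about the four checks in this comment: in PCRE and POSIX syntax, alternation binds more loosely than ^ and $, and [0-9]{0,14} can match the empty string, so all four calls succeed for any input at all, 'XYZ' included. An XSD pattern facet has to match the whole value, so a closer approximation wraps the alternation in a group and anchors the group. The sketch below assumes a hypothetical helper name, matchesWholeValue(), and only approximates XSD semantics with PCRE for this particular expression. It is not how libxml2 evaluates the facet.

```php
<?php
// Hypothetical helper: approximate an XSD pattern facet with PCRE by
// requiring the whole value to match one of the alternatives.
function matchesWholeValue(string $xsdPattern, string $value): bool
{
    // \A and \z anchor to the absolute start and end of the subject; the
    // non-capturing group keeps both anchors outside the alternation.
    // Assumes the pattern contains no '~', which is used as the delimiter.
    $regex = '~\A(?:' . $xsdPattern . ')\z~';
    return preg_match($regex, $value) === 1;
}

$pattern = '[0-9]{0,14}|ISENTO|PR[0-9]{4,8}';

foreach (['ISENTO', '123456789', 'PR1234', 'XYZ', 'ISENTO '] as $value) {
    echo var_export($value, true), ' => ',
        matchesWholeValue($pattern, $value) ? 'ok' : 'erro', "\n";
}

// The unanchored form accepts anything, because [0-9]{0,14} matches the
// empty string at the start of the subject.
var_dump(preg_match('/[0-9]{0,14}|ISENTO|PR[0-9]{4,8}/', 'XYZ') === 1);
```

With the anchored form, 'ISENTO' is still accepted while 'XYZ' and 'ISENTO ' are rejected, so the conclusion that the expression itself allows the value holds under this stricter test as well.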
 