Bug #78877 LIBXML_PARSEHUGE output formatting
Submitted: 2019-11-28 13:55 UTC
Modified: 2021-09-15 10:27 UTC
From: kieran at miami-nice dot co dot uk
Assigned: cmb
Status: Closed
Package: DOM XML related
PHP Version: 7.3.12
OS:
Private report: No
CVE-ID: None
 [2019-11-28 13:55 UTC] kieran at miami-nice dot co dot uk
Description:
------------
I added LIBXML_PARSEHUGE to a rather large test suite and it instantly changed the formatting of a lot of tests. 

I thought it could be related to https://bugs.php.net/bug.php?id=76738 but cmb linked me to https://bugs.php.net/bug.php?id=76738

I think it's a possible PHP bug because the output doesn't seem to change on Python https://repl.it/repls/InfantileMiserableDemo but I'm not sure.

The output with LIBXML_PARSEHUGE actually looks like the more correct one, as it matches the input string.

Test script:
---------------
https://3v4l.org/d4RSn
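
For readers who cannot open the link, here is a minimal sketch of this kind of comparison. It is not the linked script: the sample XML and the `roundTrip()` helper are hypothetical, and whether the two outputs differ will depend on the libxml2 version in use.

```php
<?php
// Hypothetical input; the original test script on 3v4l.org is not reproduced here.
$xml = "<root>\n  <item>text</item>\n</root>\n";

// Parse the string with the given libxml options and serialize it again.
function roundTrip(string $xml, int $options): string
{
    $doc = new DOMDocument();
    $doc->loadXML($xml, $options);
    return (string) $doc->saveXML();
}

$plain = roundTrip($xml, 0);
$huge  = roundTrip($xml, LIBXML_PARSEHUGE);

// Report whether the flag changed the serialized output at all.
echo $plain === $huge ? "identical output\n" : "output differs\n";
```

Dumping both strings with `var_dump()` makes whitespace differences easier to spot than `echo`.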


History

 [2019-11-28 14:12 UTC] cmb@php.net
Ah, I thought it was about the differences between PHP < 7.3.0 and
>= 7.3.0!  If it's about the different output with and without
LIBXML_PARSEHUGE, it can't be related to the mentioned bug fixes,
since these have only been applied to PHP 7.3+.
 [2019-11-28 15:57 UTC] kieran at miami-nice dot co dot uk
> it's about the different output with and without LIBXML_PARSEHUGE

Correct
 [2019-12-14 16:49 UTC] drtechno at mail dot com
I would try uninstalling it and making sure it has a path assigned, then reinstall it.

like this:
Centos:
pip uninstall lxml
Ubuntu:
sudo apt-get purge lxml
add the parsers to the PATH:
export PATH=/home/hsundara/packages_installed/libxslt-1.1.28/bin:/home/hsundara/packages_installed/libxml2-2.9.1/bin:$PATH
then install:
Centos:
pip install lxml --install-option="--auto-rpath"
Ubuntu:
sudo apt-get install lxml --install-option="--auto-rpath"
 [2021-09-15 09:12 UTC] cmb@php.net
-Status: Open +Status: Feedback -Assigned To: +Assigned To: cmb
 [2021-09-15 09:12 UTC] cmb@php.net
I just ran the test script with PHP-7.4 on Windows (libxml2
2.9.10), and there LIBXML_PARSEHUGE does not make a difference
(the output matches the input for either).  So this looks like a
libxml2 issue to me.  Can you check that with a more recent
version than libxml2 2.9.4?
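
A quick way to see which libxml2 version a given PHP build reports is the `LIBXML_DOTTED_VERSION` constant. The sketch below assumes that constant reflects the library actually in use, which may not hold if PHP was built against one version and loads another. The 2.9.4 threshold is simply the version named in this thread.

```php
<?php
// Version string as reported by PHP's libxml extension, e.g. "2.9.10".
$version = LIBXML_DOTTED_VERSION;

// 2.9.4 is the version mentioned in this report; the comparison is illustrative only.
$newerThanReported = version_compare($version, '2.9.4', '>');

printf(
    "libxml2 %s (%s than 2.9.4)\n",
    $version,
    $newerThanReported ? 'newer' : 'not newer'
);
```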
 [2021-09-15 10:08 UTC] kieran at supportpal dot com
@cmb yeah this looks OK on Docker 7.4.23-cli-bullseye which is using libxml 2.9.10

Must have been a bug in libxml <2.9.10

Happy for you to close.
 [2021-09-15 10:27 UTC] cmb@php.net
-Status: Feedback +Status: Closed
 [2021-09-15 10:27 UTC] cmb@php.net
Thanks for the swift reply!  Closing then.
 