Bug #77894 DOMNode::C14N() very slow on generated DOMDocuments even after normalisation
Submitted: 2019-04-15 10:19 UTC Modified: 2023-06-08 17:50 UTC
Votes:1
Avg. Score:5.0 ± 0.0
Reproduced:1 of 1 (100.0%)
Same Version:1 (100.0%)
Same OS:1 (100.0%)
From: luxian dot m at gmail dot com Assigned: nielsdos (profile)
Status: Closed Package: DOM XML related
PHP Version: 7.3.4 OS:
Private report: No CVE-ID: None
 [2019-04-15 10:19 UTC] luxian dot m at gmail dot com
Description:
------------
Calling DOMNode::C14N() is much slower on DOMDocument objects that are created on the fly than on DOMDocuments loaded from a string or file.

Calling DOMDocument::normalizeDocument() doesn't make a difference despite the documentation stating: "This method acts as if you saved and then loaded the document, putting the document in a "normal" form."


But in the end it's still much faster to get the XML string and load it into a new DOMDocument, which is counterintuitive.

Code to demonstrate this can be found here:
https://gist.github.com/Luxian/1c732d13c12ca03835828a1553c39e4f 
https://3v4l.org/1fFB3 (limited to 200 items to not abuse the platform)

If you run the example code with 500 items you should get something like this:

Testing with 500 items
Generated DOM… 2.56458 seconds
Generated DOM with normalizeDocument()… 2.60349 seconds
Export and re-import DOM… 0.06695 seconds
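
A minimal sketch of a benchmark along these lines is shown below. It is not the content of the linked gist: the element names, the document shape, and the helpers buildDocument(), timeC14N() and runBenchmark() are assumptions made for illustration, and the timings it prints depend on the PHP version, so it may not reproduce the gap above.

```php
<?php
// Hypothetical benchmark sketch; document shape and helper names are assumptions.

// Build a DOMDocument node by node ("generated on the fly").
function buildDocument(int $items): DOMDocument
{
    $dom = new DOMDocument('1.0', 'UTF-8');
    $root = $dom->createElement('items');
    $dom->appendChild($root);
    for ($i = 0; $i < $items; $i++) {
        $item = $dom->createElement('item');
        $item->setAttribute('id', (string) $i);
        $item->appendChild($dom->createElement('name', "Item $i"));
        $root->appendChild($item);
    }
    return $dom;
}

// Return the canonical XML and the seconds spent inside C14N().
function timeC14N(DOMDocument $dom): array
{
    $start = microtime(true);
    $xml = $dom->C14N();
    return ['xml' => $xml, 'seconds' => microtime(true) - $start];
}

// Time the three strategies from the report on documents of the same size.
function runBenchmark(int $items): array
{
    $generated = buildDocument($items);

    $normalized = buildDocument($items);
    $normalized->normalizeDocument();

    $reimported = new DOMDocument();
    $reimported->loadXML(buildDocument($items)->saveXML());

    return [
        'Generated DOM' => timeC14N($generated),
        'Generated DOM with normalizeDocument()' => timeC14N($normalized),
        'Export and re-import DOM' => timeC14N($reimported),
    ];
}

$items = 500;
echo "Testing with $items items\n";
foreach (runBenchmark($items) as $label => $result) {
    printf("%s... %.5f seconds\n", $label, $result['seconds']);
}
```

All three strategies operate on the same tree, so only the time spent in C14N() should differ, not the canonical output.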


Test script:
---------------
https://gist.github.com/Luxian/1c732d13c12ca03835828a1553c39e4f 

Expected result:
----------------
$oldDom->normalizeDocument() 
// should be the same as
$newDom->loadXML($oldDom->saveXML());
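
Until a fixed version is available, the export and re-import step can be wrapped in a small helper. This is only a sketch of the workaround described in this report: reimportDocument() is a hypothetical name, and it assumes the document serialises to well-formed XML that loadXML() accepts.

```php
<?php
// Hypothetical workaround helper: serialise the generated document and
// parse it again, so C14N() runs on a freshly loaded DOMDocument.
function reimportDocument(DOMDocument $oldDom): DOMDocument
{
    $newDom = new DOMDocument();
    if ($newDom->loadXML($oldDom->saveXML()) === false) {
        throw new RuntimeException('Re-importing the document failed');
    }
    return $newDom;
}

// Usage: build a small document on the fly, then canonicalise the copy.
$oldDom = new DOMDocument('1.0', 'UTF-8');
$root = $oldDom->createElement('root');
$oldDom->appendChild($root);
$root->appendChild($oldDom->createElement('child', 'text'));

$newDom = reimportDocument($oldDom);
echo $newDom->C14N(), "\n";
```

The copy is a separate object, so later changes to $oldDom are not reflected in $newDom.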

Actual result:
--------------
Roughly 40x worse performance when calling C14N() on the generated document than on the re-imported one.

 [2023-06-08 17:50 UTC] nielsdos@php.net
-Status: Open
+Status: Closed
-Assigned To:
+Assigned To: nielsdos
 [2023-06-08 17:50 UTC] nielsdos@php.net
The fix for this bug has been committed.
If you are still experiencing this bug, try to check out latest source from https://github.com/php/php-src and re-test.
Thank you for the report, and for helping us make PHP better.

The main bottleneck was the namespace management. With the fix, the test code shows almost identical performance for the different strategies. There is still room for improvement in the C14N code itself, but that wasn't the main underlying issue here. I'll leave the C14N improvement itself for issue #53655, which already hints at an idea to improve performance.
The fix will be in the 8.3 release.
 
Last updated: Sat Dec 21 14:01:32 2024 UTC