[2019-02-24 16:07 UTC] nicolas dot roeser at uni-ulm dot de
Description:
------------
---
From manual page: https://php.net/function.htmlspecialchars
---

I have found that the function htmlspecialchars behaves differently than its documentation says. I do not know what the intended behavior is. Either the function or the documentation must be fixed.

There are two main issues:

1) When neither ENT_COMPAT nor ENT_QUOTES nor ENT_NOQUOTES is set, the function is documented to default to ENT_COMPAT. It seems that it defaults to ENT_NOQUOTES instead.

2) When ENT_HTML401 and ENT_XML1 are set, the function is documented to give precedence to ENT_HTML401. It seems that it gives precedence to ENT_XML1 instead.

I have created suitable and readable test scripts and would like to add them to this bug report, but as the test script input is limited to 20 lines, I had to convert my shorter script to a less readable version and add that.

This bug report is not really related to #61498: this bug report here is about documented vs. actual behavior.

Test script:
---------------
<?php
// The 20-line limit is crap.
// Super short test script which only tests the default flags and failing
// combinations (and only one per quot/apos group).
function hs_expect($test_num, $expected, $str, $flags=null) {
    // No flags given: rely on the function's own default.
    if ($flags === null) { $result = htmlspecialchars($str); }
    else { $result = htmlspecialchars($str, $flags, 'UTF-8', TRUE); }
    echo "Test number $test_num: ";
    if ($expected === $result) { echo 'OK'; }
    else { echo "FAIL: expected: >$expected<, result: >$result<"; }
    echo "\n";
}
// ===== tests for quoting behavior =====
$s = '"';
hs_expect(-1, '&quot;', $s);
hs_expect( 0, '&quot;', $s, 0);
hs_expect( 8, '&quot;', $s, ENT_HTML401);
hs_expect(16, '&quot;', $s, ENT_XML1);
hs_expect(32, '&quot;', $s, ENT_XHTML);
hs_expect(64, '&quot;', $s, ENT_HTML5);
// ===== tests for markup language selection =====
$s = "'";
hs_expect(-1, "'", $s);
hs_expect(26, "&#039;", $s, ENT_QUOTES | ENT_HTML401 | ENT_XML1);

Expected result:
----------------
I expect documentation and actual behaviour to be in sync.
I do not care which one is fixed (or both).

Actual result:
--------------
See description above.
Last updated: Tue Oct 28 10:00:01 2025 UTC
ENT_NOQUOTES=0 and ENT_HTML401=0, so their presence in $flags cannot be detected; thus any mention of behavior when those are missing cannot be correct. I suspect the note was trying to explain the default if $flags is omitted.

$ php -r 'print_r(get_defined_constants());' | egrep '\bENT_'
    [ENT_COMPAT] => 2
    [ENT_QUOTES] => 3
    [ENT_NOQUOTES] => 0
    [ENT_IGNORE] => 4
    [ENT_SUBSTITUTE] => 8
    [ENT_DISALLOWED] => 128
    [ENT_HTML401] => 0
    [ENT_XML1] => 16
    [ENT_XHTML] => 32
    [ENT_HTML5] => 48

Note:
- All flags except QUOTES (3) and HTML5 (48 = XML1 | XHTML) are single bits
- NOQUOTES is implicitly the default quoting behavior flag
- HTML401 is implicitly the default doctype flag

When combining ENT_COMPAT/QUOTES/NOQUOTES,
- NOQUOTES(0) | COMPAT(2) = COMPAT(2)
- NOQUOTES(0) | QUOTES(3) = QUOTES(3)
- COMPAT(2) | QUOTES(3) = QUOTES(3)

In effect, QUOTES (double+single) has precedence over COMPAT (double), which has precedence over NOQUOTES (neither). That should make sense.

For the doctype flags, XML1/XHTML/HTML5 (named entity) have precedence over HTML401 (numeric entity).

When $flags is not provided, the default is COMPAT | HTML401, and the documentation is correct.

When $flags is provided,

> When neither of ENT_COMPAT, ENT_QUOTES, ENT_NOQUOTES is present, the default is ENT_COMPAT.

Incorrect. Default is ENT_NOQUOTES.

> When more than one of ENT_COMPAT, ENT_QUOTES, ENT_NOQUOTES is present, ENT_QUOTES takes the highest precedence,
> followed by ENT_COMPAT.

Correct.

> When neither of ENT_HTML401, ENT_HTML5, ENT_XHTML, ENT_XML1 is present, the default is ENT_HTML401.

Correct.

> When more than one of ENT_HTML401, ENT_HTML5, ENT_XHTML, ENT_XML1 is present, ENT_HTML5 takes the highest
> precedence, followed by ENT_XHTML, ENT_HTML401.

Incorrect. ENT_XML1 is not included in this list, which suggests that it has the lowest precedence and that HTML401 ranks above it; in fact HTML401 has the lowest precedence. This may be just a simple accidental omission, as XHTML/XML1 are very closely related.
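The flag arithmetic described above can be checked with a short sketch. The helper names effective_quote_flag and effective_doctype_flag are hypothetical (they are not part of PHP), and the masking only mirrors the constant values listed in this comment; it is not a copy of PHP's internal implementation.

```php
<?php
// Hypothetical helpers, for illustration only. They assume the constant
// values shown above: COMPAT=2, QUOTES=3, XML1=16, XHTML=32, HTML5=48.

function effective_quote_flag(int $flags): string
{
    // ENT_QUOTES (3) covers both quote bits, so it doubles as the mask.
    switch ($flags & ENT_QUOTES) {
        case ENT_QUOTES:   return 'ENT_QUOTES';
        case ENT_COMPAT:   return 'ENT_COMPAT';
        case ENT_NOQUOTES: return 'ENT_NOQUOTES';
        default:           return 'single-quote bit only (no named constant)';
    }
}

function effective_doctype_flag(int $flags): string
{
    // ENT_HTML5 (48) equals ENT_XML1 (16) | ENT_XHTML (32), so it is the mask.
    switch ($flags & ENT_HTML5) {
        case ENT_HTML5: return 'ENT_HTML5';
        case ENT_XHTML: return 'ENT_XHTML';
        case ENT_XML1:  return 'ENT_XML1';
        default:        return 'ENT_HTML401'; // value 0: nothing to detect
    }
}

echo effective_quote_flag(0), "\n";                         // ENT_NOQUOTES
echo effective_quote_flag(ENT_COMPAT | ENT_QUOTES), "\n";   // ENT_QUOTES
echo effective_doctype_flag(ENT_HTML401 | ENT_XML1), "\n";  // ENT_XML1
echo effective_doctype_flag(ENT_XML1 | ENT_XHTML), "\n";    // ENT_HTML5

// An explicit $flags of 0 selects neither quote bit, so '"' passes through.
echo htmlspecialchars('"', 0, 'UTF-8'), "\n";               // "
```

Because ENT_NOQUOTES and ENT_HTML401 are both 0, OR-ing them into $flags changes nothing, which is why they only ever appear here as the fall-through case.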
Note that as far as htmlspecialchars is concerned, HTML5/XHTML/XML1 are equivalent as they all encode apostrophes the same way.
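To see that equivalence directly, here is a minimal sketch that encodes a single apostrophe under each doctype flag. The helper encode_apostrophe is a hypothetical name introduced for this example; ENT_QUOTES is passed explicitly so the result does not depend on the default value of $flags.

```php
<?php
// Hypothetical helper: encode one apostrophe with ENT_QUOTES plus the
// given doctype flag, always passing $flags explicitly.
function encode_apostrophe(int $doctype): string
{
    return htmlspecialchars("'", ENT_QUOTES | $doctype, 'UTF-8');
}

$doctypes = [
    'ENT_HTML401' => ENT_HTML401,
    'ENT_XML1'    => ENT_XML1,
    'ENT_XHTML'   => ENT_XHTML,
    'ENT_HTML5'   => ENT_HTML5,
];

foreach ($doctypes as $name => $doctype) {
    // HTML401 yields the numeric entity &#039;, the other three yield &apos;
    printf("%-11s %s\n", $name, encode_apostrophe($doctype));
}

// Combining HTML401 (0) with XML1 (16) is indistinguishable from XML1 alone.
echo encode_apostrophe(ENT_HTML401 | ENT_XML1), "\n"; // &apos;
```

The last call corresponds to test 26 in the report: the result is &apos;, not the &#039; that an HTML401-first reading of the documentation would predict.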