Bug #75236 infinite loop when printing an error-message
Submitted: 2017-09-20 16:27 UTC Modified: 2017-09-21 11:12 UTC
From: lzsiga at freemail dot c3 dot hu Assigned: ajf (profile)
Status: Closed Package: Reproducible crash
PHP Version: 7.1.9 OS: AIX, Linux
Private report: No CVE-ID: None
 [2017-09-20 16:27 UTC] lzsiga at freemail dot c3 dot hu
Description:
------------
Printing an error message with the settings 'html_errors=true' and 'default_charset=ISO-8859-2' leads to an infinite loop (unbounded recursion that ends in a segmentation fault).

Components of the loop: 
determine_charset -> php_error_docref0 -> php_verror -> php_escape_html_entities -> php_escape_html_entities_ex -> determine_charset

This bug was introduced in version 7.1.9, main/main.c line 765

before: replace_buffer = php_escape_html_entities((unsigned char*)buffer, buffer_len, 0, ENT_COMPAT, NULL);
after:  php_escape_html_entities((unsigned char*)buffer, buffer_len, 0, ENT_COMPAT, SG(default_charset));

(A note: The problem could be solved if htmlentities didn't verify UTF8-validity and silently ignored 'charset' parameter. See also: https://bugs.php.net/bug.php?id=47494  )
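
The cycle listed above can be modeled in a few lines of C. This is only a sketch, not the php-src code: every name ending in `_model`, the two-entry list of supported charsets, and `DEPTH_CAP` are assumptions made for illustration. The real code has no depth cap, which is why the stack eventually overflows.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for the settings from the report. */
const char *default_charset_model = "ISO-8859-2";
int html_errors_model = 1;

int depth_model = 0;          /* current nesting of determine_charset_model() */
int max_depth_seen_model = 0; /* deepest nesting observed so far */
enum { DEPTH_CAP = 5 };       /* artificial cap so the sketch terminates */

/* Illustrative subset only; not the real list of supported charsets. */
int charset_supported_model(const char *name) {
    return strcmp(name, "UTF-8") == 0 || strcmp(name, "ISO-8859-1") == 0;
}

void report_warning_model(const char *msg);

/* An unsupported charset triggers a warning... */
void determine_charset_model(const char *hint) {
    depth_model++;
    if (depth_model > max_depth_seen_model) {
        max_depth_seen_model = depth_model;
    }
    if (!charset_supported_model(hint) && depth_model < DEPTH_CAP) {
        report_warning_model("charset not supported, assuming utf-8");
    }
    depth_model--;
}

/* ...and with html_errors on, printing the warning escapes the message
 * using the default charset, which re-enters the charset lookup. */
void report_warning_model(const char *msg) {
    if (html_errors_model) {
        determine_charset_model(default_charset_model);
    }
    fprintf(stderr, "Warning: %s\n", msg);
}
```

Calling `determine_charset_model("ISO-8859-2")` with the settings above recurses until it hits the artificial cap; with `default_charset_model` set to a supported charset the nesting stops after one extra level.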

Test script:
---------------
#!/usr/local/bin/php
<?php

    ini_set('html_errors', true);
    ini_set('default_charset', 'ISO-8859-2');

    printf ("before getfilecontent\n");
    file_get_contents ('no/suchfile');
    printf ("after getfilecontent\n");

?>


Expected result:
----------------
before getfilecontent
PHP Warning:  file_get_contents(no/suchfile): failed to open stream: No such file or directory in /local/home/projects/devel/phptest/loopy.php on line 8
<br />
<b>Warning</b>:  file_get_contents(no/suchfile): failed to open stream: No such file or directory in <b>/local/home/projects/devel/phptest/loopy.php</b> on line <b>8</b><br />
after getfilecontent


Actual result:
--------------
before getfilecontent
Segmentation fault


 [2017-09-20 17:01 UTC] cmb@php.net
-Status: Open +Status: Verified
 [2017-09-20 17:01 UTC] cmb@php.net
I can confirm this issue. It happens for all unsupported "charsets", i.e.
for those for which htmlentities() would raise a corresponding warning. See
<https://github.com/php/php-src/blob/php-7.1.9/ext/standard/html.c#L448-L467>.

> The problem could be solved if htmlentities didn't verify UTF8-validity and
> silently ignored 'charset' parameter.

That would cause other issues, though.
 [2017-09-20 20:23 UTC] lzsiga at freemail dot c3 dot hu
Well, yes, my fault; what I should have suggested is that 'htmlspecialchars' should be called instead of 'htmlentities', with a hardcoded charset of 'ISO-8859-1' (or 'ASCII', if there is such an option; htmlspecialchars only deals with ASCII characters like < > & ' " )
 [2017-09-20 22:00 UTC] cmb@php.net
-Assigned To: +Assigned To: ajf
 [2017-09-20 22:00 UTC] cmb@php.net
> instead of 'htmlentities' 'htmlspecialchars' should be called

That can cause issues if the string is encoded differently than what the output
expects.

> htmlspecialchars only deals with ASCII characters like < > & ' "

Applying htmlspecialchars() on an arbitrary encoding assuming it would be
ASCII compatible causes issues. Consider a UTF-16 encoded ļ (U+013C).

Anyhow, the behavioral change introduced by fixing bug #74725 appears to be a
severe bug, since htmlspecialchars() segfaults due to infinite recursion for any
unsupported default_charset, if html_errors is enabled:

    <?php
    ini_set('html_errors', true);
    ini_set('default_charset', 'ISO-8859-2');
    htmlentities('foo', ENT_COMPAT, 'ISO-8859-2');

A possible solution might be to temporarily change the default_charset to UTF-8
while calling php_error_docref()[1], and to remove charset_hint from the
message, to ensure that we're really dealing with UTF-8 (actually, ASCII) here.

Andrea, could you please have a look at this issue?

[1] <https://github.com/php/php-src/blob/php-7.1.9/ext/standard/html.c#L463-L464>
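
The "temporarily change the default_charset" idea can be sketched as a save, override, restore sequence around the error call. The sketch below is a hedged illustration in plain C, not the php-src implementation and not necessarily what the eventual fix does; all `model_` names are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the configured default charset. */
const char *model_default_charset = "ISO-8859-2";

/* Records which charset the error path saw, for inspection. */
const char *model_charset_seen_by_handler = NULL;

/* Stand-in for the error reporting path, which reads the default charset. */
void model_error_handler(const char *msg) {
    model_charset_seen_by_handler = model_default_charset;
    fprintf(stderr, "Warning: %s\n", msg);
}

/* Emit a warning while the default charset is forced to UTF-8, then
 * restore the previous value so the caller's setting is unchanged. */
void model_warn_with_safe_charset(const char *msg) {
    const char *saved = model_default_charset;
    model_default_charset = "UTF-8";
    model_error_handler(msg);
    model_default_charset = saved;
}
```

Because the message itself is plain ASCII once the charset name is removed from it, escaping it as UTF-8 is safe in this model regardless of the configured charset.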
 [2017-09-20 22:18 UTC] ajf@php.net
Argh, I'm sorry about this, I probably should have tested that fix more than I did.

I'm looking into this now.
 [2017-09-20 23:06 UTC] ajf@php.net
Automatic comment on behalf of ajf@ajf.me
Revision: http://git.php.net/?p=php-src.git;a=commit;h=418f97443aa44644bdf81b96fb726518754724f5
Log: Fix bug #75236
 [2017-09-20 23:06 UTC] ajf@php.net
-Status: Verified +Status: Closed
 [2017-09-21 09:00 UTC] lzsiga at freemail dot c3 dot hu
Hi, thank you all for the quick fix!

Speaking of character encodings, I'd like to ask something: does PHP support any character sets that aren't ASCII-compatible (i.e. where a byte between 0x00 and 0x7f might not be an ASCII code but part of a multi-byte sequence), like UCS2 or UTF-16? (I really hope the answer is a 'no';)
 [2017-09-21 10:00 UTC] cmb@php.net
PHP strings are unaware of the character encoding, so basically any character
encoding is supported. How these are handled exactly depends on the respective
functions/methods.  For instance, the character ļ (U+013C, of course!) is not
handled well when given in UTF-16BE encoding to htmlentities():
 
    <?php
    var_dump(htmlentities("\x01\x3c", ENT_COMPAT, 'UTF-16BE'));
    ?>

outputs something like:

    Warning: htmlentities(): charset `UTF-16BE' not supported, assuming utf-8
    string(5) "&lt;"

If the $encoding parameter is not set, or simply wrong, the result may be
identical, but there may not even be a warning.

It is the developer's responsibility to properly deal with character encodings.
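
To make the byte-level problem concrete, here is a small C sketch (not PHP's implementation; both function names are made up for illustration). It encodes a BMP code point as UTF-16BE and checks whether any resulting byte looks like an HTML special character to an escaper that assumes ASCII compatibility.

```c
#include <stddef.h>
#include <stdint.h>

/* Encode a BMP code point (below U+10000, not a surrogate) as UTF-16BE:
 * high byte first, then low byte. */
void utf16be_encode_bmp(uint16_t cp, unsigned char out[2]) {
    out[0] = (unsigned char)(cp >> 8);
    out[1] = (unsigned char)(cp & 0xFF);
}

/* Return 1 if a byte-wise, ASCII-assuming escaper would rewrite any byte
 * of the buffer, 0 otherwise. */
int contains_html_special_byte(const unsigned char *buf, size_t len) {
    for (size_t i = 0; i < len; i++) {
        switch (buf[i]) {
        case '<': case '>': case '&': case '"': case '\'':
            return 1;
        default:
            break;
        }
    }
    return 0;
}
```

For U+013C the encoded bytes are 0x01 0x3C, and 0x3C is the ASCII code for `<`, so a byte-wise escaper would corrupt the character by replacing its second byte with `&lt;`.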
 [2017-09-21 10:28 UTC] lzsiga at freemail dot c3 dot hu
Thank you for your answer. Here is my next question: if I work out how to add support for ISO-8859-2, could it be merged into PHP?
 [2017-09-21 11:12 UTC] cmb@php.net
Adding htmlentities() support for ISO-8859-2 shouldn't be hard[1], but I'm not
sure whether that would be welcome. I suggest asking on internals@lists.php.net
first.

[1] <https://lxr.room11.org/xref/php-src%40master/ext/standard/html_tables.h#44>
 