Bug #75236 infinite loop when printing an error-message
Submitted: 2017-09-20 16:27 UTC Modified: 2017-09-21 11:12 UTC
From: lzsiga at freemail dot c3 dot hu Assigned: ajf (profile)
Status: Closed Package: Reproducible crash
PHP Version: 7.1.9 OS: AIX, Linux
Private report: No CVE-ID: None
 [2017-09-20 16:27 UTC] lzsiga at freemail dot c3 dot hu
Description:
------------
Printing an error message with the settings 'html_errors=true' and 'default_charset=ISO-8859-2' leads to an infinite loop.

Components of the loop: 
determine_charset -> php_error_docref0 -> php_verror -> php_escape_html_entities -> php_escape_html_entities_ex -> determine_charset

This bug was introduced in version 7.1.9, main/main.c line 765

before: replace_buffer = php_escape_html_entities((unsigned char*)buffer, buffer_len, 0, ENT_COMPAT, NULL);
after:  php_escape_html_entities((unsigned char*)buffer, buffer_len, 0, ENT_COMPAT, SG(default_charset));

(A note: The problem could be solved if htmlentities didn't verify UTF8-validity and silently ignored 'charset' parameter. See also: https://bugs.php.net/bug.php?id=47494  )
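
The cycle listed above can be modelled in a few lines of C. This is a toy sketch, not php-src code: the `model_*` names, the two-entry list of supported charsets, and the `DEPTH_LIMIT` cap are all assumptions made for illustration. The real interpreter has no such cap, so the recursion only ends when the stack is exhausted.

```c
#include <string.h>

/* Hypothetical cap so that the model terminates; PHP 7.1.9 has no such limit. */
enum { DEPTH_LIMIT = 1000 };

static int warnings_raised;

/* Toy subset of the charsets the HTML escaping code accepts. */
static int model_charset_is_supported(const char *name)
{
    return strcmp(name, "UTF-8") == 0 || strcmp(name, "ISO-8859-1") == 0;
}

static void model_verror(const char *default_charset);

/* Stands in for determine_charset(): an unknown hint triggers a warning. */
static void model_determine_charset(const char *hint)
{
    if (!model_charset_is_supported(hint)) {
        model_verror(hint);
    }
}

/* Stands in for php_verror() with html_errors enabled: the message is
 * HTML-escaped, and the escaping step looks up the default charset again. */
static void model_verror(const char *default_charset)
{
    if (++warnings_raised >= DEPTH_LIMIT) {
        return; /* artificial stop; the real code keeps recursing */
    }
    model_determine_charset(default_charset);
}

/* Returns how many warnings one initial warning ends up producing. */
int model_count_warnings(const char *default_charset)
{
    warnings_raised = 0;
    model_verror(default_charset);
    return warnings_raised;
}
```

In this model a supported charset such as "UTF-8" yields a single warning, while "ISO-8859-2" keeps re-entering `model_verror()` until the artificial limit is reached.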

Test script:
---------------
#!/usr/local/bin/php
<?php

    ini_set('html_errors', true);
    ini_set('default_charset', 'ISO-8859-2');

    printf ("before getfilecontent\n");
    file_get_contents ('no/suchfile');
    printf ("after getfilecontent\n");

?>


Expected result:
----------------
before getfilecontent
PHP Warning:  file_get_contents(no/suchfile): failed to open stream: No such file or directory in /local/home/projects/devel/phptest/loopy.php on line 8
<br />
<b>Warning</b>:  file_get_contents(no/suchfile): failed to open stream: No such file or directory in <b>/local/home/projects/devel/phptest/loopy.php</b> on line <b>8</b><br />
after getfilecontent


Actual result:
--------------
before getfilecontent
Segmentation fault


History

 [2017-09-20 17:01 UTC] cmb@php.net
-Status: Open +Status: Verified
 [2017-09-20 17:01 UTC] cmb@php.net
I can confirm this issue. It happens for all unsupported "charsets", i.e.
for those for which htmlentities() would raise the corresponding warning. See
<https://github.com/php/php-src/blob/php-7.1.9/ext/standard/html.c#L448-L467>.

> The problem could be solved if htmlentities didn't verify UTF8-validity and
> silently ignored 'charset' parameter.

That would cause other issues, though.
 [2017-09-20 20:23 UTC] lzsiga at freemail dot c3 dot hu
Well, yes, my fault; what I should have suggested is that instead of 'htmlentities' 'htmlspecialchars' should be called, with a hardcoded charset of 'ISO-8859-1' (or 'ASCII', if there is such an option; htmlspecialchars only deals with ASCII characters like < > & ' ")
 [2017-09-20 22:00 UTC] cmb@php.net
-Assigned To: +Assigned To: ajf
 [2017-09-20 22:00 UTC] cmb@php.net
> instead of 'htmlentities' 'htmlspecialchars' should be called

That can cause issues if the string is encoded differently than what the output
expects.

> htmlspecialchars only deals with ASCII characters like < > & ' "

Applying htmlspecialchars() to an arbitrary encoding on the assumption that it
is ASCII compatible causes issues. Consider a UTF-16 encoded ļ (U+013C).

Anyhow, the behavioral change introduced by fixing bug #74725 appears to be a
severe bug, since htmlspecialchars() segfaults due to infinite recursion for any
unsupported default_charset, if html_errors is enabled:

    <?php
    ini_set('html_errors', true);
    ini_set('default_charset', 'ISO-8859-2');
    htmlentities('foo', ENT_COMPAT, 'ISO-8859-2');

A possible solution might be to temporarily change the default_charset to UTF-8
while calling php_error_docref()[1], and to remove charset_hint from the
message, to ensure that we're really dealing with UTF-8 (actually, ASCII) here.
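
The save-and-restore idea in the paragraph above could look roughly like the following C sketch. It is only a model under stated assumptions: the `guarded_*` names, the global `current_default_charset`, and the two-entry supported list are hypothetical, and the sketch does not claim to match what any actual php-src commit does.

```c
#include <string.h>

/* Hypothetical stand-in for the default_charset ini setting. */
static const char *current_default_charset = "UTF-8";
static int warning_depth;
static int max_warning_depth;

static int guarded_charset_is_supported(const char *name)
{
    return strcmp(name, "UTF-8") == 0 || strcmp(name, "ISO-8859-1") == 0;
}

static void guarded_warn(void);

/* Before warning about an unsupported charset, switch to UTF-8 so that
 * escaping the (plain ASCII) warning text cannot trigger the warning again. */
static void guarded_determine_charset(void)
{
    if (!guarded_charset_is_supported(current_default_charset)) {
        const char *saved = current_default_charset;
        current_default_charset = "UTF-8";
        guarded_warn();
        current_default_charset = saved; /* restore the user's setting */
    }
}

/* Emitting a warning escapes its text, which consults the charset. */
static void guarded_warn(void)
{
    warning_depth++;
    if (warning_depth > max_warning_depth) {
        max_warning_depth = warning_depth;
    }
    guarded_determine_charset();
    warning_depth--;
}

/* Returns the deepest warning nesting reached for the given setting. */
int guarded_max_depth(const char *charset)
{
    current_default_charset = charset;
    warning_depth = 0;
    max_warning_depth = 0;
    guarded_warn();
    return max_warning_depth;
}

const char *guarded_current_charset(void)
{
    return current_default_charset;
}
```

With the guard in place, an unsupported setting nests exactly one extra warning (depth 2) and the configured charset is back in place afterwards.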

Andrea, could you please have a look at this issue?

[1] <https://github.com/php/php-src/blob/php-7.1.9/ext/standard/html.c#L463-L464>
 [2017-09-20 22:18 UTC] ajf@php.net
Argh, I'm sorry about this, I probably should have tested that fix more than I did.

I'm looking into this now.
 [2017-09-20 23:06 UTC] ajf@php.net
Automatic comment on behalf of ajf@ajf.me
Revision: http://git.php.net/?p=php-src.git;a=commit;h=418f97443aa44644bdf81b96fb726518754724f5
Log: Fix bug #75236
 [2017-09-20 23:06 UTC] ajf@php.net
-Status: Verified +Status: Closed
 [2017-09-21 09:00 UTC] lzsiga at freemail dot c3 dot hu
Hi, thank you all for the quick fix!

Speaking of character encodings, I'd like to ask something: does PHP support any character sets that aren't ASCII-compatible (i.e. where a byte between 0x00 and 0x7f might not be an ASCII code but part of a multi-byte sequence), like UCS-2 or UTF-16? (I really hope the answer is a 'no';)
 [2017-09-21 10:00 UTC] cmb@php.net
PHP strings are unaware of the character encoding, so basically any character
encoding is supported. How these are handled exactly depends on the respective
functions/methods.  For instance, the character ļ (U+013C, of course!) is not
handled well when given in UTF-16BE encoding to htmlentities():

    <?php
    var_dump(htmlentities("\x01\x3c", ENT_COMPAT, 'UTF-16BE'));
    ?>

outputs something like:

    Warning: htmlentities(): charset `UTF-16BE' not supported, assuming utf-8
    string(5) "&lt;"

If the $encoding parameter is not set, or simply wrong, the result may be
identical, but there may not even be a warning.

It is the developer's responsibility to properly deal with character encodings.
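
To illustrate why the UTF-16BE example above goes wrong, here is a small C sketch of a byte-wise escaper. The function `naive_escape_lt` is hypothetical and far simpler than PHP's real escaping code; it only shows that scanning bytes for 0x3C cannot tell a real '<' from the low byte of U+013C.

```c
#include <stddef.h>
#include <string.h>

/* Escapes every 0x3C byte as "&lt;", ignoring the actual encoding.
 * Writes a NUL-terminated result to out and returns its length;
 * input that does not fit in out_cap bytes is truncated. */
size_t naive_escape_lt(const unsigned char *in, size_t in_len,
                       char *out, size_t out_cap)
{
    size_t n = 0;

    if (out_cap == 0) {
        return 0;
    }
    for (size_t i = 0; i < in_len; i++) {
        if (in[i] == 0x3C) {
            if (n + 4 >= out_cap) {
                break; /* no room for "&lt;" plus the terminator */
            }
            memcpy(out + n, "&lt;", 4);
            n += 4;
        } else {
            if (n + 1 >= out_cap) {
                break; /* no room for the byte plus the terminator */
            }
            out[n++] = (char)in[i];
        }
    }
    out[n] = '\0';
    return n;
}
```

Feeding it the two bytes 0x01 0x3C (ļ in UTF-16BE) yields five bytes: the stray 0x01 followed by "&lt;", which is consistent with the string(5) shown in the output above.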
 [2017-09-21 10:28 UTC] lzsiga at freemail dot c3 dot hu
Thank you for your answer. Here is my next question: if I work out how to add support for ISO-8859-2, could it be merged into PHP?
 [2017-09-21 11:12 UTC] cmb@php.net
Adding htmlentities() support for ISO-8859-2 shouldn't be hard[1], but I'm not
sure whether that would be welcome. I suggest asking on internals@lists.php.net
first.

[1] <https://lxr.room11.org/xref/php-src%40master/ext/standard/html_tables.h#44>
 