Bug #61713 ext\standard\tests\strings\htmlentities10.phpt fails
Submitted: 2012-04-13 09:01 UTC
Modified: 2012-05-08 17:39 UTC
From: ab@php.net
Assigned: ab
Status: Closed
Package: *General Issues
PHP Version: 5.3.10
OS: windows
Private report: No
CVE-ID:
 [2012-04-13 09:01 UTC] ab@php.net
Description:
------------
Test diff:

002+ string(4) ",+TY"
003+ string(27) "?¢£¤¥"
002- string(28) "‚†™Ÿ"
003- string(32) "€¢£¤¥"

Expected result:
----------------
test pass

Actual result:
--------------
test fail

 [2012-04-13 16:29 UTC] ab@php.net
The PHP manual at http://de2.php.net/htmlentities currently says something wrong about the charset parameter:

An empty string activates detection from script encoding (Zend multibyte), default_charset and current locale (see nl_langinfo() and setlocale()), in this order.

I can see only the following in the code of htmlentities:

* detection of the passed charset
* detection of the mbstring.internal_encoding
* detection of the current locale

The test relies on the default_charset value, which never affects htmlentities. This is the reason for bug 61714 and bug 61715 as well.
 [2012-04-13 17:25 UTC] ab@php.net
No, the previous comment was wrong: default_charset does get read. Still investigating.
 [2012-05-08 15:37 UTC] ab@php.net
Consider the following lines in ext\standard\html.c

========== BEGIN ===========
        ZVAL_STRING(&nm_mb_internal_encoding, "mb_internal_encoding", 0);

        if (call_user_function_ex(CG(function_table), NULL, &nm_mb_internal_encoding, &uf_result, 0, NULL, 1, NULL TSRMLS_CC) != FAILURE) {

            charset_hint = Z_STRVAL_P(uf_result);
            len = Z_STRLEN_P(uf_result);

            if (len == 4) { /* sizeof(none|auto|pass)-1 */
                if (!memcmp("pass", charset_hint, sizeof("pass") - 1) ||
                    !memcmp("auto", charset_hint, sizeof("auto") - 1) ||
                    !memcmp("none", charset_hint, sizeof("none") - 1)) {

                    charset_hint = NULL;
                    len = 0;
                }
            }
            goto det_charset;
        }
    }
#endif
#endif

    charset_hint = SG(default_charset);
    if (charset_hint != NULL && (len=strlen(charset_hint)) != 0) {
        goto det_charset;
    }
========== END ===========

As you can see, when mbstring.internal_encoding is "pass", charset_hint is reset to NULL and execution jumps to det_charset, omitting the SAPI globals check (default_charset). As a result, iso-8859-1 is chosen. This happens only when mbstring is compiled as a shared extension.
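
To make the control flow easier to follow, here is a minimal standalone sketch of it. It is not the code from html.c and not the committed fix: the function names (is_placeholder, pick_charset_buggy, pick_charset_fallthrough) are hypothetical, the locale step is left out, and a NULL mb_enc stands in for "mbstring could not be called". The second function only illustrates one plausible way to fall through to default_charset.

```c
#include <string.h>

/* Hypothetical helper: 1 if the mbstring internal encoding is one of the
 * placeholder values that carry no charset information. */
static int is_placeholder(const char *enc)
{
    return enc != NULL && strlen(enc) == 4 &&
           (!memcmp("pass", enc, 4) ||
            !memcmp("auto", enc, 4) ||
            !memcmp("none", enc, 4));
}

/* Models the quoted flow: a placeholder clears the hint, then the jump
 * skips the default_charset check, so ISO-8859-1 ends up being used. */
static const char *pick_charset_buggy(const char *mb_enc,
                                      const char *default_charset)
{
    const char *hint = mb_enc;

    if (hint != NULL) {
        if (is_placeholder(hint)) {
            hint = NULL;
        }
        goto det_charset; /* default_charset is never consulted */
    }

    hint = default_charset;

det_charset:
    return (hint != NULL && *hint != '\0') ? hint : "ISO-8859-1";
}

/* Illustrative alternative (an assumption, not the actual commit): only
 * accept the mbstring value when it names a real charset, otherwise fall
 * through to default_charset. */
static const char *pick_charset_fallthrough(const char *mb_enc,
                                            const char *default_charset)
{
    if (mb_enc != NULL && *mb_enc != '\0' && !is_placeholder(mb_enc)) {
        return mb_enc;
    }
    if (default_charset != NULL && *default_charset != '\0') {
        return default_charset;
    }
    return "ISO-8859-1";
}
```

With an assumed example value of "cp1252" for default_charset, pick_charset_buggy("pass", "cp1252") yields "ISO-8859-1", while pick_charset_fallthrough("pass", "cp1252") yields "cp1252".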

Fix follows.
 [2012-05-08 16:19 UTC] ab@php.net
Automatic comment on behalf of ab
Revision: http://git.php.net/?p=php-src.git;a=commit;h=3a4a25358fe3f389c434f68e59bfd70b25b93b29
Log: Fix bug #61713 ext\standard\tests\strings\htmlentities10.phpt fails
 [2012-05-08 17:39 UTC] ab@php.net
-Status: Open
+Status: Closed
-Assigned To:
+Assigned To: ab
 [2012-05-08 17:39 UTC] ab@php.net
This bug has been fixed in SVN.

Snapshots of the sources are packaged every three hours; this change
will be in the next snapshot. You can grab the snapshot at
http://snaps.php.net/.

 For Windows:

http://windows.php.net/snapshots/
 
Thank you for the report, and for helping us make PHP better.


 