[2007-11-14 14:39 UTC] tallyce at gmail dot com
Description:
------------
A string that includes the † (dagger) symbol and is passed through htmlentities() with UTF-8 as the encoding is discarded entirely, so the result is blank.
This is definitely a change in PHP 5.2.5. Tested on both Windows and Linux machines.
Reproduce code:
---------------
<?php echo htmlentities ('Test †', ENT_COMPAT, 'UTF-8') . '<br />' . htmlentities ('Test', ENT_COMPAT, 'UTF-8'); ?>
Expected result:
----------------
Test †
Test
[This is indeed the result as expected, on PHP v.5.2.4]
Actual result:
--------------
Test
[Blank line at start]
Copyright © 2001-2025 The PHP GroupAll rights reserved. |
Last updated: Fri Nov 07 13:00:01 2025 UTC |
I've been spending further time trying to work out what's happening, and am convinced something is definitely not right. I've also found another character where the presence of the character results in the whole string disappearing, and there may be others.

Using this reproduce code:

<?php echo htmlentities ('Test › †', ENT_COMPAT, 'UTF-8') . '<br />' . preg_replace('/[^\x00-\x7F]/e', '"&#".ord("$0").";"', 'Test › †') . '<br />' . htmlentities ('Test', ENT_COMPAT, 'UTF-8') . '<br />'; ?>

I get different results for machines running SUSE Linux/PHP5.2.4, Linux Ubuntu/PHP 5.2.3 and WinXP/PHP 5.2.5. Only the second gives the result I would expect.

1. From a linux machine terminal:

Firstly doing

less t.php

gives

<?php echo htmlentities ('Test 233 206', ENT_COMPAT, 'UTF-8') . '<br />' . preg_replace('/[^\x00-\x7F]/e', '"&#".ord("$0").";"', 'Test 233 206') . '< br />' . htmlentities ('Test', ENT_COMPAT, 'UTF-8') . '<br />'; ?>

with the 233 and 206 background-highlighted.

php -v
PHP 5.2.4 (cli) (built: Sep 12 2007 15:23:24)
Copyright (c) 1997-2007 The PHP Group
Zend Engine v2.2.0, Copyright (c) 1998-2007 Zend Technologies

Test <br />Test › †<br />Test<br />

2. From the same machine but viewing with a web browser (FF2.0.0.11/WinXP), i.e. example.com/t.php (which is serving up UTF-8 pages as confirmed by web-sniffer.net):

Test ? ?<br />Test › †<br />Test<br />

[two symbols appear as ? in diamond]

3. On another machine, with the putty terminal set to UTF-8:

less t.php

gives:

<?php echo htmlentities ('Test › †', ENT_COMPAT, 'UTF-8') . '<br />' . preg_replace('/[^\x00-\x7F]/e', '"&#".ord("$0").";"', 'Test › †') . '<br />' . htmlentities ('Test', ENT_COMPAT, 'UTF-8') . '<br />'; ?>

exactly as first entered.

php -v
PHP 5.2.3-1ubuntu6.2 (cli) (built: Dec 3 2007 19:59:42)
Copyright (c) 1997-2007 The PHP Group
Zend Engine v2.2.0, Copyright (c) 1998-2007 Zend Technologies

php t.php
Test › †<br />Test › †<br />Test<br />

4. Same machine as (3) but via web browser:

Test › †<br />Test › †<br />Test<br />

5. On a Windows machine:

C:\Documents and Settings\username>php -v
PHP 5.2.5 (cli) (built: Nov 8 2007 23:18:51)
Copyright (c) 1997-2007 The PHP Group
Zend Engine v2.2.0, Copyright (c) 1998-2007 Zend Technologies

H:\>php t.php
PHP Warning: htmlentities(): Invalid multibyte sequence in argument in H:\t.php on line 1
<br />Test › †<br />Test<br />

6. Same machine as (5) but via web browser:

<br />Test › †<br />Test<br />
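
One possible reading of the results above, not confirmed anywhere in this report: if the 233 and 206 that less highlighted in case 1 are octal byte values, the file holds the single bytes 0x9B and 0x86 (› and † in Windows-1252) rather than UTF-8 sequences, which would fit the "Invalid multibyte sequence" warning in case 5. The sketch below checks a string for valid UTF-8 before calling htmlentities() and converts it first if the check fails. The byte values and the Windows-1252 source encoding are assumptions, and the mbstring extension must be loaded.

```php
<?php
// Assumption: these are the bytes behind the "233 206" shown by less
// (octal 233 = 0x9B, octal 206 = 0x86), i.e. a Windows-1252 encoded file.
$raw = "Test \x9B \x86";

// mb_check_encoding() reports whether the bytes form valid UTF-8.
$isUtf8 = mb_check_encoding($raw, 'UTF-8');   // false for the bytes above

// Convert to UTF-8 when the check fails; Windows-1252 is assumed here.
$utf8 = $isUtf8 ? $raw : mb_convert_encoding($raw, 'UTF-8', 'Windows-1252');

// Only now encode entities, with the input known to be valid UTF-8.
$html = htmlentities($utf8, ENT_COMPAT, 'UTF-8');

echo bin2hex($raw), "\n";   // hex dump of the raw bytes, for inspection
echo $html, "\n";
```

Running the same check against the t.php files on each machine would show whether the differing results come from the files' encodings rather than from the PHP versions alone.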