Bug #44703 htmlspecialchars() does not detect bad character set argument
Submitted: 2008-04-11 18:30 UTC Modified: 2008-04-11 19:02 UTC
From: wharmby at uk dot ibm dot com Assigned:
Status: Closed Package: Scripting Engine problem
PHP Version: 5.2.6RC5 OS: Windows XP
Private report: No CVE-ID: None

 [2008-04-11 18:30 UTC] wharmby at uk dot ibm dot com
Description:
------------
htmlspecialchars() does not always detect a bad character set argument.

The problem is in the following code, around line 850 of ext/standard/html.c:

det_charset:

	if (charset_hint) {
		int found = 0;

		/* now walk the charset map and look for the codeset */
		for (i = 0; charset_map[i].codeset; i++) {
			if (strncasecmp(charset_hint, charset_map[i].codeset, len) == 0) {
				charset = charset_map[i].charset;
				found = 1;
				break;
			}
		}

This passes "len", the length of the input charset hint, as the maximum
comparison length. If the hint happens to match the first few characters
of a VALID charset name, the code fails to detect the bad charset. For
example, a charset_hint of "125" is accepted because it matches the first
3 characters of a valid charset, namely "1252".

If the code is changed as follows, so that it first checks that the two
lengths are equal, the problem is resolved.

		for (i = 0; charset_map[i].codeset; i++) {
			if (len == strlen(charset_map[i].codeset) && strncasecmp(charset_hint, charset_map[i].codeset, len) == 0) {
				charset = charset_map[i].charset;
				found = 1;
				break;
			}
		}
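
The difference between the two loops can be reproduced outside PHP with a small standalone sketch. The table demo_codesets and the functions lookup_prefix and lookup_exact are hypothetical names made up for this illustration; they are not part of ext/standard/html.c, and the table holds only a few sample entries rather than the real charset_map. It assumes a POSIX system that provides strncasecmp in <strings.h>.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <strings.h> /* strncasecmp (POSIX) */

/* Hypothetical, trimmed-down stand-in for charset_map (assumption, not the real table). */
static const char *const demo_codesets[] = { "ISO-8859-1", "1252", "UTF-8", NULL };

/* Mirrors the original loop: only the first strlen(hint) bytes are compared,
 * so any prefix of a valid codeset name is treated as a match. */
static const char *lookup_prefix(const char *hint)
{
    assert(hint != NULL);
    size_t len = strlen(hint);

    for (size_t i = 0; demo_codesets[i] != NULL; i++) {
        if (strncasecmp(hint, demo_codesets[i], len) == 0) {
            return demo_codesets[i];
        }
    }
    return NULL;
}

/* Mirrors the proposed loop: the lengths must be equal before the
 * case-insensitive comparison is allowed to succeed. */
static const char *lookup_exact(const char *hint)
{
    assert(hint != NULL);
    size_t len = strlen(hint);

    for (size_t i = 0; demo_codesets[i] != NULL; i++) {
        if (len == strlen(demo_codesets[i]) &&
            strncasecmp(hint, demo_codesets[i], len) == 0) {
            return demo_codesets[i];
        }
    }
    return NULL;
}
```

With this table, lookup_prefix("125") returns the "1252" entry while lookup_exact("125") returns NULL. Both functions reject "12526", which is why only the last call in the reproduce script triggers the warning.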

Reproduce code:
---------------
<?php
var_dump( htmlspecialchars("<a href='test'>Test</a>", ENT_COMPAT, 1) );
var_dump( htmlspecialchars("<a href='test'>Test</a>", ENT_COMPAT, 12) );
var_dump( htmlspecialchars("<a href='test'>Test</a>", ENT_COMPAT, 125) );
var_dump( htmlspecialchars("<a href='test'>Test</a>", ENT_COMPAT, 1252) );
var_dump( htmlspecialchars("<a href='test'>Test</a>", ENT_COMPAT, 12526) );

?>
===Done===

Expected result:
----------------
PHP Warning:  htmlspecialchars(): charset `1' not supported, assuming iso-8859-1 in  <path to t/c> 
string(35) "&lt;a href='test'&gt;Test&lt;/a&gt;"
PHP Warning:  htmlspecialchars(): charset `12' not supported, assuming iso-8859-1 in <path to t/c> 
string(35) "&lt;a href='test'&gt;Test&lt;/a&gt;"
PHP Warning:  htmlspecialchars(): charset `125' not supported, assuming iso-8859-1 in <path to t/c> 
string(35) "&lt;a href='test'&gt;Test&lt;/a&gt;"
string(35) "&lt;a href='test'&gt;Test&lt;/a&gt;"
PHP Warning:  htmlspecialchars(): charset `12526' not supported, assuming iso-8859-1 in <path to t/c> 
string(35) "&lt;a href='test'&gt;Test&lt;/a&gt;"
===Done===

Actual result:
--------------
string(35) "&lt;a href='test'&gt;Test&lt;/a&gt;"
string(35) "&lt;a href='test'&gt;Test&lt;/a&gt;"
string(35) "&lt;a href='test'&gt;Test&lt;/a&gt;"
string(35) "&lt;a href='test'&gt;Test&lt;/a&gt;"
PHP Warning:  htmlspecialchars(): charset `12526' not supported, assuming iso-8859-1 in  <path to t/c> 
string(35) "&lt;a href='test'&gt;Test&lt;/a&gt;"
===Done===

 [2008-04-11 19:02 UTC] felipe@php.net
This bug has been fixed in CVS.

Snapshots of the sources are packaged every three hours; this change
will be in the next snapshot. You can grab the snapshot at
http://snaps.php.net/.
 
Thank you for the report, and for helping us make PHP better.


 