PHP :: Bug #27505 :: htmlentities fail to escape BIG5 characters correctly

Bug #27505	htmlentities fail to escape BIG5 characters correctly
Submitted:	2004-03-05 03:43 UTC	Modified:	2004-03-06 13:27 UTC
From:	ywliu at hotmail dot com	Assigned:
Status:	Closed	Package:	*Languages/Translation
PHP Version:	4.3.4	OS:	linux
Private report:	No	CVE-ID:	None


[2004-03-05 03:43 UTC] ywliu at hotmail dot com

Description:
------------
In ext/standard/html.c, htmlentities() fails to identify BIG5 Chinese characters correctly.

I have checked CVS version 1.87; the bug is still there.

Reproduce code:
---------------
In html.c, look for this piece of code:

case cs_big5:
case cs_gb2312:
case cs_big5hkscs:
	{
		/* check if this is the first of a 2-byte sequence */
		if (this_char >= 0xa1 && this_char <= 0xf9) {
			/* peek at the next char */
			unsigned char next_char = str[pos];
			if ((next_char >= 0x40 && next_char <= 0x73) || (next_char >= 0xa1 && next_char <= 0xfe)) {

Expected result:
----------------
In fact, the first byte should range from 0xa1 to 0xfe, and the second byte should range from 0x40 to 0x7e or from 0xa1 to 0xfe.

(From page 88 of "Understanding Japanese Information Processing" by Ken Lunde, O'Reilly.)

Actual result:
--------------
So the checks should be:

	if (this_char >= 0xa1 && this_char <= 0xfe) {

and 

		if ((next_char >= 0x40 && next_char <= 0x7e) ||(next_char >= 0xa1 && next_char <= 0xfe)) {
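
To make the proposed ranges easier to check in isolation, here is a minimal standalone sketch. The helper name big5_is_two_byte is hypothetical and does not exist in html.c; it only restates the two range checks proposed above and is not the committed fix.

```c
#include <stdbool.h>

/*
 * Hypothetical helper (not part of html.c): returns true when the pair
 * (lead, trail) falls inside the two-byte ranges proposed in this report.
 */
static bool big5_is_two_byte(unsigned char lead, unsigned char trail)
{
	/* lead byte: 0xa1-0xfe (the existing check stops at 0xf9) */
	if (lead < 0xa1 || lead > 0xfe) {
		return false;
	}
	/* trail byte: 0x40-0x7e (the existing check stops at 0x73) or 0xa1-0xfe */
	return (trail >= 0x40 && trail <= 0x7e) ||
	       (trail >= 0xa1 && trail <= 0xfe);
}
```

For example, a trail byte of 0x74 after a lead byte of 0xa4 passes this check, whereas the existing 0x40-0x73 range rejects it.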


[2004-03-06 13:27 UTC] iliaa@php.net

This bug has been fixed in CVS.

Snapshots of the sources are packaged every three hours; this change
will be in the next snapshot. You can grab the snapshot at
http://snaps.php.net/.
 
Thank you for the report, and for helping us make PHP better.
