Request #47798 html_entity_decode() not covering Z/z with caron for Windows-1252
Submitted: 2009-03-27 01:05 UTC Modified: 2010-10-11 02:33 UTC
Votes:1
Avg. Score:5.0 ± 0.0
Reproduced:1 of 1 (100.0%)
Same Version:0 (0.0%)
Same OS:0 (0.0%)
From: f4ckm5 at web dot de Assigned: cataphract (profile)
Status: Closed Package: Strings related
PHP Version: 5.2.9 OS: Windows
Private report: No CVE-ID: None

 [2009-03-27 01:05 UTC] f4ckm5 at web dot de
Description:
------------
html_entity_decode should decode "LATIN CAPITAL LETTER Z WITH CARON" &Zcaron; &#381; &#x17D; to (int)142, hex(8E) for Windows-1252
html_entity_decode should decode "LATIN SMALL LETTER Z WITH CARON" &zcaron; &#382; &#x17E; to (int)158, hex(9E) for Windows-1252

htmlentities should encode the respective characters to &#381; and &#382; (&zcaron; and &Zcaron; are not well supported by most browsers)
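
The byte values quoted above can be cross-checked independently of html_entity_decode(). The following is a minimal sketch, assuming the mbstring extension is loaded and PHP 7 or later (for the "\u{...}" escape syntax); it only confirms the Windows-1252 positions of U+017D and U+017E and is not part of the original report.

```php
<?php
// Sketch only: assumes ext/mbstring and PHP >= 7.0.
// Convert the two caron letters from UTF-8 to Windows-1252.
$upper = mb_convert_encoding("\u{017D}", 'Windows-1252', 'UTF-8'); // Z with caron
$lower = mb_convert_encoding("\u{017E}", 'Windows-1252', 'UTF-8'); // z with caron

// Each result is a single byte; print it as decimal and hex.
printf("%d 0x%02X\n", ord($upper), ord($upper)); // 142 0x8E
printf("%d 0x%02X\n", ord($lower), ord($lower)); // 158 0x9E
```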

Reproduce code:
---------------
var_dump(html_entity_decode("&Zcaron;", ENT_QUOTES, "Windows-1252"));
var_dump(html_entity_decode("&#381;", ENT_QUOTES, "Windows-1252"));
var_dump(html_entity_decode("&#x17D;", ENT_QUOTES, "Windows-1252"));
var_dump(html_entity_decode("&zcaron;", ENT_QUOTES, "Windows-1252"));
var_dump(html_entity_decode("&#382;", ENT_QUOTES, "Windows-1252"));
var_dump(html_entity_decode("&#x17E;", ENT_QUOTES, "Windows-1252"));
var_dump(htmlentities(chr(142), ENT_QUOTES, "Windows-1252", true));
var_dump(htmlentities(chr(158), ENT_QUOTES, "Windows-1252", true));

Expected result:
----------------
string(1) "?"
string(1) "?"
string(1) "?"
string(1) "?"
string(1) "?"
string(1) "?"
string(6) "&#381;"
string(6) "&#382;"

Actual result:
--------------
string(8) "&Zcaron;"
string(6) "&#381;"
string(7) "&#x17D;"
string(8) "&zcaron;"
string(6) "&#382;"
string(7) "&#x17E;"
string(1) "?"
string(1) "?"


 [2010-10-11 02:33 UTC] cataphract@php.net
-Status: Open
+Status: Closed
-Package: Feature/Change Request
+Package: *General Issues
-Assigned To:
+Assigned To: cataphract
 [2010-10-11 02:33 UTC] cataphract@php.net
This has been partially fixed in trunk. The result is:

string(8) "&Zcaron;"
string(1) "�"
string(1) "�"
string(8) "&zcaron;"
string(1) "�"
string(1) "�"
string(1) "�"
string(1) "�" 

&zcaron; and &Zcaron; are not supported because I can't find those in the entities for HTML or XHTML:

See http://www.w3.org/TR/2002/REC-xhtml1-20020801/dtds.html#h-A2 (XHTML)

HTML 4.01:
http://www.w3.org/TR/html4/HTMLlat1.ent
http://www.w3.org/TR/html4/HTMLsymbol.ent
http://www.w3.org/TR/html4/HTMLspecial.ent
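
For input that still contains the two named entities, one possible workaround is to translate them by hand before calling html_entity_decode(). This is a minimal sketch, not anything shipped with PHP: the helper name decode_cp1252_with_carons() is hypothetical, the scalar type hints assume PHP 7 or later, and the byte values are the Windows-1252 positions given in the report (0x8E and 0x9E).

```php
<?php
// Hypothetical helper, not part of PHP: handles the two named entities
// that are missing from the HTML 4.01 entity sets, then defers to
// html_entity_decode() for everything else.
function decode_cp1252_with_carons(string $html): string
{
    // Named entity => single Windows-1252 byte.
    $extra = [
        '&Zcaron;' => "\x8E",
        '&zcaron;' => "\x9E",
    ];

    // Replace first, decode second: doing it the other way round would
    // turn "&amp;Zcaron;" into "&Zcaron;" and then decode it a second time.
    return html_entity_decode(strtr($html, $extra), ENT_QUOTES, 'Windows-1252');
}

var_dump(bin2hex(decode_cp1252_with_carons('&Zcaron;&zcaron;'))); // string(4) "8e9e"
```

The result is dumped through bin2hex() because the raw bytes 0x8E and 0x9E are not valid UTF-8 on their own and would otherwise show up as "?" or a replacement character in most terminals.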
 [2010-10-11 02:33 UTC] cataphract@php.net
-Package: *General Issues
+Package: Strings related