Bug #21616 encoding error in attributes with result_dump_*
Submitted: 2003-01-13 07:59 UTC Modified: 2003-01-16 00:50 UTC
From: anton at jclub dot ru Assigned:
Status: Not a bug Package: DOM XML related
PHP Version: 4.3.0 OS: linux
Private report: No CVE-ID: None

 

 [2003-01-13 07:59 UTC] anton at jclub dot ru
Encoding error in URI attributes with result_dump_* when the stylesheet uses <xsl:output method="html" encoding="windows-1251"/>.

Characters in windows-1251 encoding that appear in href, src, and similar HTML attributes are converted into %HEX sequences incorrectly.


 [2003-01-15 02:17 UTC] chregu@php.net
Please provide the shortest possible example for reproducing your error.

thanks

chregu
 [2003-01-15 05:06 UTC] anton at jclub dot ru
xslt:

<xsl:output method="html" encoding="windows-1251" omit-xml-declaration="yes"/>
..
<a href="привет">привет</a>
..

// the contents of the <a> tag and the href attribute are in win-1251 encoding

output with result_dump_mem():

<a href="%D0%BF%D1%80%D0%B8%D0%B2%D0%B5%D1%82">привет</a>

must be:

<a href="%EF%F0%E8%E2%E5%F2">привет</a>

It seems that the contents of href are percent-encoded as Unicode (UTF-8) rather than as win-1251.
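
The difference between the two outputs can be reproduced outside PHP. The following is a minimal Python sketch, not the DOM XML implementation: it percent-encodes the same string once as UTF-8 and once as windows-1251. The string "привет" is taken from what the escaped bytes in this report decode to; the variable names are illustrative only.

```python
from urllib.parse import quote

# "привет" is the word that the percent-encoded bytes quoted in the report decode to.
text = "привет"

# Percent-encode the same characters under two different byte encodings.
as_utf8 = quote(text, encoding="utf-8")      # two bytes per Cyrillic letter
as_cp1251 = quote(text, encoding="cp1251")   # one byte per Cyrillic letter

print(as_utf8)
print(as_cp1251)
```

The first value matches the href produced by result_dump_mem(), the second matches the href the reporter expected.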
 [2003-01-16 00:50 UTC] chregu@php.net
This is not a bug, but expected behaviour:

From http://www.w3c.org/TR/xslt#section-HTML-Output-Method
"The html output method should escape non-ASCII characters in URI attribute values using the method recommended in Section B.2.1 of the HTML 4.0 Recommendation."

And http://www.w3.org/TR/REC-html40/appendix/notes.html#h-B.2.1
"We recommend that user agents adopt the following convention for handling non-ASCII characters in such cases:

   1. Represent each character in UTF-8 (see [RFC2279]) as one or more bytes.
   2. Escape these bytes with the URI escaping mechanism (i.e., by converting each byte to %HH, where HH is the hexadecimal notation of the byte value)."
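
The two quoted steps can be sketched in a few lines of Python. This is only an illustration of the B.2.1 convention under the assumption that ASCII characters are passed through unchanged; the function name escape_uri_attribute and the sample URL are hypothetical and are not part of libxslt or the PHP extension.

```python
from urllib.parse import unquote


def escape_uri_attribute(value: str) -> str:
    """Escape non-ASCII characters as %HH over their UTF-8 bytes (hypothetical helper)."""
    out = []
    for ch in value:
        if ord(ch) < 128:
            # ASCII characters are left untouched in this sketch.
            out.append(ch)
        else:
            # Step 1: represent the character in UTF-8.
            # Step 2: escape each resulting byte as %HH.
            out.extend(f"%{byte:02X}" for byte in ch.encode("utf-8"))
    return "".join(out)


escaped = escape_uri_attribute("/search?q=привет")
# A receiver has to decode the escapes as UTF-8 to get the original text back.
round_trip = unquote(escaped, encoding="utf-8")
print(escaped)
print(round_trip)
```

Because the escaping always goes through UTF-8, the encoding attribute of xsl:output does not affect these bytes; a script receiving such a link would need to decode the escaped value as UTF-8 and convert it to windows-1251 itself if it needs that encoding.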

chregu


 