Bug #21616 encoding error in attributes with result_dump_*
Submitted: 2003-01-13 07:59 UTC Modified: 2003-01-16 00:50 UTC
From: anton at jclub dot ru Assigned:
Status: Not a bug Package: DOM XML related
PHP Version: 4.3.0 OS: linux
Private report: No CVE-ID: None
 [2003-01-13 07:59 UTC] anton at jclub dot ru
Attribute values are encoded incorrectly by result_dump_* when the stylesheet uses <xsl:output method="html" encoding="windows-1251"/>.

Characters in windows-1251 encoding that appear in href, src and other HTML attributes are converted into incorrect %HH escape sequences.


 [2003-01-15 02:17 UTC] chregu@php.net
Please provide the shortest possible example for reproducing your error.

thanks

chregu
 [2003-01-15 05:06 UTC] anton at jclub dot ru
xslt:

<xsl:output method="html" encoding="windows-1251" omit-xml-declaration="yes"/>
..
<a href="привет">привет</a>
..

// the contents of the <a> tag and of the href attribute are in windows-1251 encoding

output with result_dump_mem():

<a href="%D0%BF%D1%80%D0%B8%D0%B2%D0%B5%D1%82">привет</a>

must be:

<a href="%EF%F0%E8%E2%E5%F2">привет</a>

It seems that the contents of href are escaped as UTF-8 bytes rather than as windows-1251 bytes.
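
One way to check that suspicion is to percent-decode both escape sequences from the report and see which charset turns each back into readable text. The following is a minimal sketch in Python, not the code used by the DOM XML extension. The variable names are illustrative, and "cp1251" is Python's codec name for windows-1251.

```python
from urllib.parse import unquote

# Escape sequences copied from the report above.
observed = "%D0%BF%D1%80%D0%B8%D0%B2%D0%B5%D1%82"  # produced by result_dump_mem()
expected = "%EF%F0%E8%E2%E5%F2"                    # what the reporter wanted

# Decode each sequence with the charset it is suspected to use.
observed_text = unquote(observed, encoding="utf-8")
expected_text = unquote(expected, encoding="cp1251")

# True when both sequences carry the same characters in different byte encodings.
same_text = observed_text == expected_text
print(observed_text, expected_text, same_text)
```

If both strings decode to the same word, the output is a valid UTF-8 escaping of the original text and not corrupted data.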
 [2003-01-16 00:50 UTC] chregu@php.net
This is not a bug, but expected behaviour:

From http://www.w3c.org/TR/xslt#section-HTML-Output-Method
"The html output method should escape non-ASCII characters in URI attribute values using the method recommended in Section B.2.1 of the HTML 4.0 Recommendation."

And http://www.w3.org/TR/REC-html40/appendix/notes.html#h-B.2.1
"We recommend that user agents adopt the following convention for handling non-ASCII characters in such cases:

   1. Represent each character in UTF-8 (see [RFC2279]) as one or more bytes.
   2. Escape these bytes with the URI escaping mechanism (i.e., by converting each byte to %HH, where HH is the hexadecimal notation of the byte value)."

chregu
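
The two steps quoted above can be sketched in a few lines of Python. This is only an illustration of the convention, not the serializer's actual implementation. The function name `escape_non_ascii` and its `charset` parameter are hypothetical, and the parameter exists only so the windows-1251 result from the report can be compared with the recommended UTF-8 result.

```python
def escape_non_ascii(value: str, charset: str = "utf-8") -> str:
    """Escape non-ASCII characters as %HH, leaving ASCII characters untouched.

    Hypothetical helper: step 1 encodes each character to bytes
    (UTF-8 by default), step 2 writes each byte as %HH.
    """
    parts = []
    for ch in value:
        if ord(ch) < 128:
            parts.append(ch)  # ASCII is passed through unchanged
        else:
            parts.extend(f"%{byte:02X}" for byte in ch.encode(charset))
    return "".join(parts)


utf8_href = escape_non_ascii("привет")              # recommended convention
cp1251_href = escape_non_ascii("привет", "cp1251")  # what the reporter expected
print(utf8_href)
print(cp1251_href)
```

With the default charset the sketch reproduces the href value that result_dump_mem() emitted. Passing "cp1251" reproduces the value the reporter expected, which shows that the difference lies only in the byte encoding chosen before escaping.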


 