Bug #47151 PHP 6.0 incorrectly decodes base64 UTF-8 data
Submitted: 2009-01-19 17:53 UTC Modified: 2009-01-19 22:47 UTC
From: lunter at interia dot pl Assigned:
Status: Closed Package: Unicode Engine related
PHP Version: 6CVS-2009-01-19 (CVS) OS: all
Private report: No CVE-ID: None
 [2009-01-19 17:53 UTC] lunter at interia dot pl
Description:
------------
Problem:
--------

PHP 6.0 incorrectly decodes base64-encoded UTF-8 data.

If this report is bogus, please show how to decode 'zrEgKyDOsiA9IM6z' to the (unicode)string 'α + β = γ'.

--------------------------------------------------------------------------------------------

PHP 6.0 example (example.php):
------------------------------

<?php
// unicode.semantics = off
// unicode.runtime_encoding = iso-8859-1
// unicode.script_encoding = utf-8
// unicode.output_encoding = utf-8
// unicode.from_error_mode = U_INVALID_SUBSTITUTE
// unicode.from_error_subst_char = 3f


 $base64='zrEgKyDOsiA9IM6z';				// UTF-8 text, base64-encoded

 $binary=base64_decode($base64);			// raw UTF-8 bytes

 $text=unicode_decode($binary,'iso-8859-1');		// why is only iso-8859-x supported? where is the raw binary option?
// $text=bin2uni($binary);				// needed

 header('Content-Type: text/plain; charset=utf-8');
 print($text);						// SHOULD BE (utf-8): α + β = γ
?>
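
For illustration, here is a minimal sketch of what the decode step in the example above does to the bytes. It does not use PHP 6 or unicode_decode(); it assumes a current PHP release where strings are plain byte strings, and it assumes the mbstring extension is loaded. The variable names are illustrative only. Reinterpreting the UTF-8 bytes as ISO-8859-1 turns each two-byte Greek letter into two Latin-1 characters, which would explain garbled output.

```php
<?php
// Assumption: a current PHP release with the mbstring extension loaded.
$base64 = 'zrEgKyDOsiA9IM6z';
$binary = base64_decode($base64);       // 12 raw bytes

// Show the raw bytes: ce b1 is α, ce b2 is β, ce b3 is γ in UTF-8.
echo bin2hex($binary), "\n";            // ceb1202b20ceb2203d20ceb3

// Treat every byte as one ISO-8859-1 character and re-encode as UTF-8.
// Each Greek letter becomes two characters (0xCE is Î, 0xB1 is ±, ...).
$wrong = mb_convert_encoding($binary, 'UTF-8', 'ISO-8859-1');
echo $wrong, "\n";                      // Î± + Î² = Î³

// Counting characters under each interpretation shows the difference.
echo mb_strlen($binary, 'UTF-8'), "\n"; // 9  (bytes read as UTF-8)
echo mb_strlen($wrong, 'UTF-8'), "\n";  // 12 (bytes read as ISO-8859-1)
```

The 12 bytes are already valid UTF-8, so any decode step that assumes a single-byte charset will produce 12 characters instead of 9.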

--------------------------------------------------------------------------------------------

Solution:
---------

There is no way to get a (unicode)string from a (binary)string that consists of UTF-8 bytes.
Note: converting between (unicode) and (binary string of Unicode bytes) should never need charset information.

C#: System.Text.Encoding.UTF8.GetString()
Decodes a sequence of bytes from the specified byte array into a string.

PHP equivalent needed: unicode bin2uni( binary $b )
Decodes a sequence of bytes from the specified binary string into a unicode string.
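
As a rough sketch of the requested behaviour, the following shows how such a helper might look in a current PHP release, where there is no separate unicode string type and a string is just bytes. The name bin2uni_sketch() is hypothetical (modelled on the bin2uni() proposed above) and is not an existing PHP function. Under these assumptions, "decoding" reduces to checking that the bytes are well-formed UTF-8 and returning them unchanged.

```php
<?php
// Hypothetical helper, named after the bin2uni() requested in this report.
// Assumption: strings are byte strings, so decoding is only a validity check.
function bin2uni_sketch(string $bytes): string
{
    // With the /u modifier, preg_match() returns false when the subject
    // is not valid UTF-8; an empty pattern matches any valid subject.
    if (preg_match('//u', $bytes) !== 1) {
        throw new InvalidArgumentException('Input is not valid UTF-8');
    }
    return $bytes;
}

$text = bin2uni_sketch(base64_decode('zrEgKyDOsiA9IM6z'));
echo $text, "\n";   // α + β = γ

// A lone 0xCE lead byte is not valid UTF-8 and is rejected.
try {
    bin2uni_sketch("\xCE");
    $rejected = false;
} catch (InvalidArgumentException $e) {
    $rejected = true;
}
```

The validity check uses only the PCRE functions, so no charset argument is involved, which matches the point made above about not needing charset information.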

--------------------------------------------------------------------------------------------

C# working equivalent (example.ashx):
-------------------------------------

<%@ WebHandler Language="C#" Class="example_handler" %>

using System;
using System.Data;
using System.Web;

public class example_handler : IHttpHandler {
    
    public void ProcessRequest (HttpContext context) {




        string base64 = "zrEgKyDOsiA9IM6z";				// UTF-8 text, base64-encoded

        byte[] binary = Convert.FromBase64String(base64);		// binary utf-8 bytes

        string text = System.Text.Encoding.UTF8.GetString(binary);	// raw binary supported

        context.Response.ContentType = "text/plain; charset=utf-8";
        context.Response.Write(text);					// very good (utf-8): α + β = γ




    }
 
    public bool IsReusable {
        get {
            return false;
        }
    }

}







Reproduce code:
---------------
above

Expected result:
----------------
above

Actual result:
--------------
above

 [2009-01-19 22:47 UTC] lunter at interia dot pl
A new version of this report, with better examples:
http://bugs.php.net/bug.php?id=47155
 