Bug #47151 PHP 6.0 incorrectly decodes base64 UTF-8 data
Submitted: 2009-01-19 17:53 UTC Modified: 2009-01-19 22:47 UTC
From: lunter at interia dot pl Assigned:
Status: Closed Package: Unicode Engine related
PHP Version: 6CVS-2009-01-19 (CVS) OS: all
Private report: No CVE-ID: None
 [2009-01-19 17:53 UTC] lunter at interia dot pl
Description:
------------
Problem:
--------

PHP 6.0 incorrectly decodes base64-encoded UTF-8 data.

If this report is bogus, please show how to convert 'zrEgKyDOsiA9IM6z' to the (unicode)string 'α + β = γ'.

--------------------------------------------------------------------------------------------

PHP 6.0 example (example.php):
------------------------------

<?php
// unicode.semantics = off
// unicode.runtime_encoding = iso-8859-1
// unicode.script_encoding = utf-8
// unicode.output_encoding = utf-8
// unicode.from_error_mode = U_INVALID_SUBSTITUTE
// unicode.from_error_subst_char = 3f

 $base64 = 'zrEgKyDOsiA9IM6z';                  // base64-encoded UTF-8 text

 $binary = base64_decode($base64);              // raw UTF-8 bytes

 $text = unicode_decode($binary, 'iso-8859-1'); // why is only iso-8859-x supported? where is the raw binary option?
// $text = bin2uni($binary);                    // needed

 header('Content-Type: text/plain; charset=utf-8');
 print($text);                                  // SHOULD BE (utf-8): α + β = γ
?>
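
To make the symptom concrete, here is a minimal sketch for released PHP versions, where strings are plain byte strings and the PHP 6 unicode_decode() function is not available. It assumes the mbstring extension is loaded. It only illustrates what reading the UTF-8 bytes as ISO-8859-1 does to them. It is not the PHP 6 code path itself.

```php
<?php
// Assumption: mbstring is enabled; strings are byte strings (PHP 5/7/8).
$binary = base64_decode('zrEgKyDOsiA9IM6z');   // 12 UTF-8 bytes of "α + β = γ"

// Reading the bytes as ISO-8859-1 and re-encoding them as UTF-8 turns
// every byte >= 0x80 into its own two-byte sequence (double encoding).
$misread = mb_convert_encoding($binary, 'UTF-8', 'ISO-8859-1');

echo bin2hex($binary), "\n";    // original bytes
echo bin2hex($misread), "\n";   // double-encoded bytes
```

The first line of output is the original 12 bytes. The second is 18 bytes long, because each of the six non-ASCII bytes is expanded to two bytes.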

--------------------------------------------------------------------------------------------

Solution:
---------

We cannot get a (unicode)string from a (binary)string that consists of UTF-8 bytes.
Consider: converting (unicode) <-> (binary string of Unicode bytes) should never need charset information.

C#: System.Text.Encoding.UTF8.GetString()
Decodes a sequence of bytes from the specified byte array into a string.

PHP equivalent needed: unicode bin2uni( binary $b )
Decodes a sequence of bytes from the specified binary string into a Unicode string.
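
As a rough illustration of the requested behaviour, a userland stand-in could look like the sketch below. It assumes a released PHP version (7.0 or later) with the mbstring extension, where a "Unicode string" is simply a byte string holding valid UTF-8. The name bin2uni() is the hypothetical function proposed in this report, not an existing PHP function.

```php
<?php
// Hypothetical stand-in for the proposed bin2uni(); not part of PHP.
// Assumption: mbstring is enabled and UTF-8 is the intended encoding.
function bin2uni(string $bytes): string
{
    // Reject input that is not well-formed UTF-8 instead of guessing.
    if (!mb_check_encoding($bytes, 'UTF-8')) {
        throw new InvalidArgumentException('Input is not valid UTF-8');
    }
    return $bytes; // the bytes already are the UTF-8 text
}

$text = bin2uni(base64_decode('zrEgKyDOsiA9IM6z'));
echo mb_strlen($text, 'UTF-8'), "\n"; // character count, not byte count
```

Validating rather than converting keeps the sketch free of any charset guess: the bytes either are well-formed UTF-8 or the call fails.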

--------------------------------------------------------------------------------------------

C# working equivalent (example.ashx):
-------------------------------------

<%@ WebHandler Language="C#" Class="example_handler" %>

using System;
using System.Data;
using System.Web;

public class example_handler : IHttpHandler {

    public void ProcessRequest (HttpContext context) {

        string base64 = "zrEgKyDOsiA9IM6z";                         // base64-encoded UTF-8 text

        byte[] binary = Convert.FromBase64String(base64);           // raw UTF-8 bytes

        string text = System.Text.Encoding.UTF8.GetString(binary);  // raw binary supported

        context.Response.ContentType = "text/plain; charset=utf-8";
        context.Response.Write(text);                               // very good (utf-8): α + β = γ

    }

    public bool IsReusable {
        get {
            return false;
        }
    }

}

Reproduce code:
---------------
above

Expected result:
----------------
above

Actual result:
--------------
above

 [2009-01-19 22:47 UTC] lunter at interia dot pl
new version, better examples
http://bugs.php.net/bug.php?id=47155
 