Bug #69217 mb_convert_encoding() passes 0x80 as valid ASCII and ISO-8859-1 code
Submitted: 2015-03-11 03:27 UTC Modified: 2015-03-28 20:51 UTC
From: salsi at icosaedro dot it Assigned:
Status: Open Package: mbstring related
PHP Version: Irrelevant OS: Slackware 14.1, Windows VISTA
Private report: No CVE-ID: None

 [2015-03-11 03:27 UTC] salsi at icosaedro dot it
Description:
------------
mb_convert_encoding() does not reject invalid bytes in ill-formed ASCII and ISO-8859-1 input strings. The byte 0x80 appears to be blindly converted to its corresponding 2-byte UTF-8 sequence, 0xC2 0x80. By contrast, when converting from UTF-8 to UTF-8, the same invalid byte is detected.

Although this is not stated anywhere, mb_convert_encoding() should guarantee one of two outcomes: either the output string fully complies with the requested target encoding, or the call fails with an error. Returning unexpected results may have safety and security consequences.
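
Until the function behaves this way, the "valid output or failure" guarantee can be approximated in userland. The following is only a minimal sketch: both function names are hypothetical and not part of mbstring, it covers only the two single-byte encodings from this report, and it adopts the report's position that 0x80-0x9F is invalid in ISO-8859-1.

```php
<?php
// Hypothetical helper (not an mbstring function): returns true when $bytes
// contains only byte values accepted for $encoding under the assumptions
// of this report.
function is_wellformed_single_byte($bytes, $encoding)
{
    switch (strtoupper($encoding)) {
        case 'ASCII':
            // Only bytes 0x00-0x7F are ASCII.
            return preg_match('/^[\x00-\x7F]*$/D', $bytes) === 1;
        case 'ISO-8859-1':
            // Assumption taken from the report: 0x80-0x9F is rejected.
            return preg_match('/[\x80-\x9F]/', $bytes) === 0;
        default:
            throw new InvalidArgumentException("Unsupported encoding: $encoding");
    }
}

// Hypothetical wrapper: validates first, and converts only well-formed input.
// Returns false instead of a silently "repaired" string.
function strict_convert_to_utf8($bytes, $from)
{
    if (!is_wellformed_single_byte($bytes, $from)) {
        return false;
    }
    return mb_convert_encoding($bytes, 'UTF-8', $from);
}
```

Under these assumptions, strict_convert_to_utf8("\x80", "ASCII") returns false rather than "\xC2\x80". Validating before converting keeps the failure explicit and does not depend on the mbstring.substitute_character setting.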

Tested on:
- PHP 5.7.0-dev Slackware Linux 14.1
- PHP 5.5.16 Windows VISTA


Test script:
---------------
<?php
error_reporting(PHP_INT_MAX);
ini_set("mbstring.substitute_character", (string) ord("X"));
ini_set("mbstring.strict_detection", "1"); // no effect
// bytes 0x80-0xff are not valid ASCII:
echo 1, rawurlencode(mb_convert_encoding("\x80", "UTF-8", "ASCII")) . "\n";
// bytes 0x80-0x9f are not valid ISO-8859-1:
echo 2, rawurlencode(mb_convert_encoding("\x80", "UTF-8", "ISO-8859-1")) . "\n";
// 0x80 is a lone continuation byte, not valid UTF-8 (this case is detected):
echo 3, rawurlencode(mb_convert_encoding("\x80", "UTF-8", "UTF-8")) . "\n";


Expected result:
----------------
1X
2X
3X


Actual result:
--------------
1%C2%80
2%C2%80
3X
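
The %C2%80 output in cases 1 and 2 is consistent with the input byte 0x80 being taken as code point U+0080 and then encoded as UTF-8. The sketch below reproduces that arithmetic with a hypothetical helper, utf8_two_byte(); it illustrates the encoding rule only and makes no claim about how mbstring is implemented internally.

```php
<?php
// Hypothetical helper: encode a code point in U+0080..U+07FF as the
// two-byte UTF-8 form 110xxxxx 10xxxxxx.
function utf8_two_byte($cp)
{
    if ($cp < 0x80 || $cp > 0x7FF) {
        throw new InvalidArgumentException("Code point outside the two-byte range");
    }
    $lead  = 0xC0 | ($cp >> 6);    // upper 5 bits go into the lead byte
    $trail = 0x80 | ($cp & 0x3F);  // lower 6 bits go into the continuation byte
    return chr($lead) . chr($trail);
}

// 0x80 >> 6 is 2, so the lead byte is 0xC2; 0x80 & 0x3F is 0, so the trail byte is 0x80.
echo rawurlencode(utf8_two_byte(0x80)), "\n"; // prints %C2%80
```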


 [2015-03-28 20:51 UTC] yohgaki@php.net
-Summary: b_convert_encoding() passes 0x80 as valid ASCII and ISO-8859-1 code
+Summary: mb_convert_encoding() passes 0x80 as valid ASCII and ISO-8859-1 code
 