Bug #72752 mb_convert_encoding fails to convert chars from HTML-ENTITIES to some encodings
Submitted: 2016-08-03 23:45 UTC
Modified: 2016-08-04 12:11 UTC
Votes: 1
Avg. Score: 3.0 ± 0.0
Reproduced: 0 of 0 (0.0%)
From: contact at amb dot tf
Assigned:
Status: Verified
Package: mbstring related
PHP Version: 7.0.9
OS: Arch Linux
Private report: No
CVE-ID: None

 [2016-08-03 23:45 UTC] contact at amb dot tf
Description:
------------
mb_convert_encoding() does not decode HTML-ENTITIES input when the target is one of the following encodings; the entity is treated as literal text instead of being decoded:

* BASE64
* Quoted-Printable
* 7bit


Test script:
---------------
<?php

foreach ([
    'BASE64',
    'UUENCODE',
    'HTML-ENTITIES',
    'Quoted-Printable',
    '7bit',
    '8bit',
] as $encoding) {
    echo "$encoding: ", mb_convert_encoding('&#x61;', $encoding, 'HTML-ENTITIES'), PHP_EOL;
}


Expected result:
----------------
BASE64: YQ==
UUENCODE: a
HTML-ENTITIES: a
Quoted-Printable: a
7bit: a
8bit: a


Actual result:
--------------
BASE64: JiN4NjE7
UUENCODE: a
HTML-ENTITIES: a
Quoted-Printable: &#x61;
7bit: &#x61;
8bit: a


History

 [2016-08-04 12:11 UTC] cmb@php.net
-Status: Open
+Status: Verified
 [2016-08-04 12:11 UTC] cmb@php.net
Confirmed: <https://3v4l.org/8u1h4>.

The culprit is in mbfl_convert_filter_get_vtbl()[1], where the
failing encodings simply assume that the input is 8bit encoded.
Removing these shortcuts(?) would make the supplied test script
pass, except for BASE64, which would then report `a` instead of
the expected `YQ==`.

[1] <https://github.com/php/php-src/blob/PHP-7.0.9/ext/mbstring/libmbfl/mbfl/mbfl_convert.c#L581-L589>
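
Until this is fixed, one possible workaround is to decode the entities first and then apply the transfer encoding with the dedicated core functions. The following is only a minimal sketch: encode_entities_as() is a hypothetical helper name, not part of mbstring, and it assumes the decoded text should be UTF-8. The first check also illustrates the diagnosis above, since the reported BASE64 output is exactly the base64 of the undecoded entity text.

```php
<?php

// Hypothetical helper (not part of mbstring): decode the entities first,
// then apply the byte-level transfer encoding with core functions.
function encode_entities_as(string $input, string $target): string
{
    // Assumption: the decoded text should be UTF-8.
    $decoded = html_entity_decode($input, ENT_QUOTES | ENT_HTML5, 'UTF-8');

    switch ($target) {
        case 'BASE64':
            return base64_encode($decoded);
        case 'Quoted-Printable':
            return quoted_printable_encode($decoded);
        default:
            throw new InvalidArgumentException("Unsupported target: $target");
    }
}

// The reported BASE64 result matches the input being handled as raw bytes.
var_dump(base64_encode('&#x61;') === 'JiN4NjE7'); // bool(true)

echo encode_entities_as('&#x61;', 'BASE64'), PHP_EOL;           // YQ==
echo encode_entities_as('&#x61;', 'Quoted-Printable'), PHP_EOL; // a
```

This bypasses the mbstring conversion filter entirely, so it does not depend on how mbfl_convert_filter_get_vtbl() selects the source encoding.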
 