Bug #71183 mb_detect_encoding wrong with Windows-1252
Submitted: 2015-12-21 18:28 UTC Modified: 2015-12-21 19:14 UTC
From: martin at sutunam dot com Assigned:
Status: Not a bug Package: mbstring related
PHP Version: 5.5.30 OS: Ubuntu
Private report: No CVE-ID: None

 [2015-12-21 18:28 UTC] martin at sutunam dot com
Description:
------------
mb_detect_encoding does not detect strings encoded as Windows-1252 (cp1252).

The function is supposed to return the first compatible encoding from the encoding list passed as the second parameter.

Even with an ASCII string like 'aaa', it refuses the Windows-1252 encoding.

Test script:
---------------
<?php
$enc = 'Windows-1252';
if (!in_array($enc, mb_list_encodings())) {
    die($enc.' not supported');
}

$val = '100 '. 0x80; // € sign in cp1252
echo mb_detect_encoding($val, $enc.',ISO-8859-1', true).PHP_EOL;

$val = 'aaa';
echo mb_detect_encoding($val, $enc.',ISO-8859-1', true).PHP_EOL;

Expected result:
----------------
Windows-1252
Windows-1252

Actual result:
--------------
ISO-8859-1
ISO-8859-1

 [2015-12-21 19:14 UTC] ab@php.net
-Status: Open
+Status: Not a bug
 [2015-12-21 19:14 UTC] ab@php.net
Thanks for the report. 

Firstly, with '100 ' . 0x80 you append a number (the integer 128, which gives the string '100 128'), not a char. Use chr(0x80).
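
The difference between the two concatenations can be checked directly. This is a minimal sketch; the variable names $asNumber and $asByte are only illustrative and do not appear in the report.

```php
<?php
// 0x80 is an integer literal (128), so PHP casts it to the string "128".
$asNumber = '100 ' . 0x80;

// chr(0x80) returns a one-byte string holding the raw byte 0x80.
$asByte = '100 ' . chr(0x80);

echo $asNumber, PHP_EOL;        // 100 128 (seven ASCII bytes)
echo bin2hex($asByte), PHP_EOL; // 3130302080 (five bytes, the last one is 0x80)
```

Only the second string contains a byte outside the ASCII range, so only it could tell Windows-1252 apart from ASCII at all.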

Secondly, a longer string could have more success. It is generally hard to identify a single-byte encoding, and it is almost impossible when passing just 5 bytes, all of which are digits and white space. Detection can even fail sometimes for multibyte encodings. This is unlikely to be related to buggy behavior in any way.
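
To illustrate why short ASCII input is ambiguous, the sketch below checks the string 'aaa' against several candidate encodings with mb_check_encoding. It assumes the mbstring extension is loaded; the candidate list and the variable names are chosen for illustration only.

```php
<?php
// Hypothetical candidate list; any ASCII-compatible encodings would do.
$candidates = ['ASCII', 'Windows-1252', 'ISO-8859-1', 'UTF-8'];
$val = 'aaa';

// Collect every candidate in which the bytes of $val form valid text.
$valid = [];
foreach ($candidates as $enc) {
    if (mb_check_encoding($val, $enc)) {
        $valid[] = $enc;
    }
}

echo implode(', ', $valid), PHP_EOL; // ASCII, Windows-1252, ISO-8859-1, UTF-8
```

Every candidate accepts the input, so the bytes themselves carry no evidence for one encoding over another, and any detector has to fall back on its own ordering or heuristics.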

Thanks.