Bug #47876 Wrong mb_detect_encoding() with string "chr᚝any"
Submitted: 2009-04-02 10:08 UTC Modified: 2009-04-21 01:00 UTC
Votes:5
Avg. Score:3.6 ± 0.8
Reproduced:5 of 5 (100.0%)
Same Version:1 (20.0%)
Same OS:1 (20.0%)
From: FrancS at seznam dot cz
Assigned:
Status: No Feedback
Package: mbstring related
PHP Version: 5.2.9
OS: Windows XP
Private report: No
CVE-ID: None
 [2009-04-02 10:08 UTC] FrancS at seznam dot cz
Description:
------------
Hi,

Today I discovered a problem with the mbstring function mb_detect_encoding().

I have the string "chr᚝any" in Czech. The function seems to return UTF-8 every time, even when I load the text from a file encoded as "windows-1250" or "ISO-8859-2".


Reproduce code:
---------------
// test.txt is text file with charset "windows-1250" or "ISO-8859-2"

$string = file_get_contents('test.txt');

var_dump(mb_detect_encoding($string, mb_list_encodings(), true));

Expected result:
----------------
SJIS

Actual result:
--------------
utf-8

History

 [2009-04-02 10:13 UTC] FrancS at seznam dot cz
I looked at it again, and the problem occurs with "᚝" in a word that contains no other accented characters.
 [2009-04-13 17:58 UTC] jani@php.net
What happens if you pass the function only the possible encodings, and never "auto",
which always has UTF-8 first? Something like this:

echo mb_detect_encoding($str, "SJIS, sjis-win");

 [2009-04-14 08:25 UTC] FrancS at seznam dot cz
That is because I want to use it to find out which encoding an input string is in. A user may send data in any of these encodings: utf-8, windows-1250, or ISO-8859-2.

It is important for me to find out which encoding the data is in.
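
For the use case described above, one possible approach is to test for valid UTF-8 first and only then fall back to a single-byte charset. The sketch below is an illustration, not a confirmed fix for this report: guess_input_encoding() and its $fallback parameter are hypothetical names, and it relies only on mb_check_encoding(). Because nearly any byte sequence is acceptable in a single-byte charset, validity checks alone cannot tell windows-1250 from ISO-8859-2, so the caller has to pick the fallback.

```php
<?php
// Hypothetical helper, not part of mbstring: the function name and the
// $fallback parameter are assumptions made for this illustration.
function guess_input_encoding($bytes, $fallback = 'ISO-8859-2')
{
    // Text containing non-ASCII bytes is unlikely to be valid UTF-8 by
    // accident, so test UTF-8 first.
    if (mb_check_encoding($bytes, 'UTF-8')) {
        return 'UTF-8';
    }
    // Nearly any byte sequence is acceptable in a single-byte charset, so
    // validity cannot separate windows-1250 from ISO-8859-2 here.
    return $fallback;
}

$utf8   = "\xC4\x8D"; // "č" encoded as UTF-8
$latin2 = "\xE8";     // "č" encoded as ISO-8859-2 (same byte in windows-1250)

var_dump(guess_input_encoding($utf8));
var_dump(guess_input_encoding($latin2));
```

Pure ASCII input is reported as UTF-8 by this sketch, which is harmless because ASCII bytes mean the same thing in all three encodings.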
 [2009-04-21 01:00 UTC] php-bugs at lists dot php dot net
No feedback was provided for this bug for over a week, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
 