Bug #37724 mb_detect_encoding returns wrong result when text contains a trailing accent
Submitted: 2006-06-07 08:54 UTC Modified: 2007-08-17 23:02 UTC
Votes:2
Avg. Score:4.0 ± 1.0
Reproduced:1 of 2 (50.0%)
Same Version:0 (0.0%)
Same OS:1 (100.0%)
From: oylbqelmhfbxzg at mailinator dot com Assigned: hirokawa
Status: Closed Package: mbstring related
PHP Version: 4.4.2 OS: Linux
Private report: No CVE-ID: None
 [2006-06-07 08:54 UTC] oylbqelmhfbxzg at mailinator dot com
Description:
------------
Since bug 36994 was closed...

Both
 $string = "test?"
in a UTF-8 text file, and 
 $string = "test?"
in an ISO-8859-1 file (converted using iconv) return "UTF-8" from mb_detect_encoding, even when strict mode is on.
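
To make the report reproducible regardless of how the script file itself is saved, the two inputs can be written as explicit byte sequences. The sketch below assumes the accented character is "é" (the original character did not survive on this page) and uses mb_check_encoding, which validates a whole string against a single encoding, to show that the ISO-8859-1 bytes are not well-formed UTF-8 on a current PHP build with mbstring enabled.

```php
<?php
// Assumption: the accented character in the report is "é".
$latin1 = "test\xE9";      // ISO-8859-1: "é" is the single byte 0xE9
$utf8   = "test\xC3\xA9";  // UTF-8: "é" is the two bytes 0xC3 0xA9

// In UTF-8, 0xE9 opens a three-byte sequence, so a trailing 0xE9 is a
// truncated sequence and the string as a whole is not valid UTF-8.
$latin1IsUtf8 = mb_check_encoding($latin1, 'UTF-8');
$utf8IsUtf8   = mb_check_encoding($utf8, 'UTF-8');

var_dump($latin1IsUtf8, $utf8IsUtf8);
```

Writing the bytes with \x escapes removes the editor and iconv from the picture, so any difference in the result comes from mbstring alone.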


 [2006-07-23 12:05 UTC] sniper@php.net
Rui, does this also exist in the PHP 5.x branches/HEAD?
If so, and this is a real bug, update the version. :)
 [2006-09-14 22:48 UTC] hirokawa@php.net
Could you show me the mbstring part of your php.ini?
And please show me a simple script to verify your problem.
I executed this tiny script, and it works fine.
(with Fedora Linux 5, PHP 5.1.5)
<?php
$string = "test?";
echo mb_detect_encoding($string,array('ISO-8859-1','UTF-8'));
// returns ISO-8859-1
?>

 [2007-02-21 11:28 UTC] gabriel at unisolution dot de
I read that 

mb_detect_encoding($string,array('ISO-8859-1','UTF-8'));

always returns ISO-8859-1.

Try this:
<?php
$string1 = "test?";
echo mb_detect_encoding($string1,'UTF-8, ISO-8859-1');
// returns UTF-8
echo " ";
$string2 = "test?e";
echo mb_detect_encoding($string2,'UTF-8, ISO-8859-1');
// returns ISO-8859-1
?>
(php 4.4.0)
 [2007-08-17 22:32 UTC] hirokawa@php.net
This also happens in PHP 5.2.
 [2007-08-17 23:02 UTC] hirokawa@php.net
Thank you for your bug report. This issue has already been fixed
in the latest released version of PHP, which you can download at 
http://www.php.net/downloads.php

If strict mode detection is used in PHP 5.2.3,
both results are ISO-8859-1.

<?php
$string1 = "test?";
echo mb_detect_encoding($string1,'UTF-8, ISO-8859-1', true);
// returns ISO-8859-1
echo " ";
$string2 = "test?e";
echo mb_detect_encoding($string2,'UTF-8, ISO-8859-1', true);
// returns ISO-8859-1
?>
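
A self-contained variant of the strict-mode check is sketched below. It assumes the lost character is "é" stored as the ISO-8859-1 byte 0xE9, and detect_strict is a hypothetical helper introduced only for illustration, not an mbstring function. On a build that includes the fix, both calls should report ISO-8859-1, because neither byte sequence is valid UTF-8.

```php
<?php
// Hypothetical helper, not part of mbstring: strict detection over a fixed
// candidate list, with UTF-8 tried first.
function detect_strict($bytes)
{
    // Third argument true = strict mode: an encoding is only reported if
    // the whole string is valid in it.
    return mb_detect_encoding($bytes, array('UTF-8', 'ISO-8859-1'), true);
}

// Assumption: the accented character is "é" (byte 0xE9 in ISO-8859-1).
$string1 = "test\xE9";   // accent at the very end of the string
$string2 = "test\xE9e";  // accent followed by "e"

echo detect_strict($string1), " ", detect_strict($string2), "\n";
```

Listing UTF-8 first is a deliberate choice here: plain ASCII and well-formed UTF-8 input are reported as UTF-8, and ISO-8859-1 acts as the fallback for everything else.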

 