Bug #24218 mb_convert_encoding weirdness
Submitted: 2003-06-17 02:55 UTC Modified: 2003-06-26 18:22 UTC
From: mark at lange dot demon dot co dot uk Assigned:
Status: No Feedback Package: mbstring related
PHP Version: 4.3.2 OS: Win32
Private report: No CVE-ID: None
 [2003-06-17 02:55 UTC] mark at lange dot demon dot co dot uk
Description:
------------
A piece of code that worked correctly with PHP 4.2.3 now displays erroneous characters with PHP 4.3.2.

The script in question is a translation form, displaying two languages which may use different charsets.
The code determines the appropriate charset to use for the HTML header from the two languages. If the charsets are the same, no problem. If they are different, it tests whether the mbstring module is enabled, in which case it uses 'UTF-8' for the HTML header; otherwise it uses the charset for the second language.

If mbstring is enabled, the code then uses the mb_convert_encoding function to convert the text strings to UTF-8 for display...


Reproduce code:
---------------
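// Charsets of the two languages shown on the translation form.
$basecharset = 'ISO-8859-1';
$charset = 'ISO-8859-2';

// formheader() is the reporter's own helper (not shown in this report);
// presumably it emits the HTML header with the given charset.
$convertcharsets = ($basecharset != $charset);
if ($convertcharsets) {
   if (function_exists('mb_convert_encoding')) {
      formheader('UTF-8');
   } else {
      $convertcharsets = false;
      formheader($charset);
   }
} else {
   formheader($basecharset);
}

// "\xE7" is the ISO-8859-1 byte for the c-cedilla in "Français".
echo charsetText("Fran\xE7ais", $convertcharsets, $basecharset);
echo '<br />';
echo charsetText('Polska', $convertcharsets, $charset);
echo '<br />';

// Convert $text from $fromcharset to UTF-8 when conversion is enabled.
function charsetText($text, $convertcharset, $fromcharset)
{
   $returntext = $text;
   if ($convertcharset) { $returntext = mb_convert_encoding($returntext, "UTF-8", $fromcharset); }
   return $returntext;
} // function charsetText()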


Expected result:
----------------
Français
Polska



Actual result:
--------------
FranÃ§ais
Polska

The first odd character in FranXXais is A with a tilde (Ã); the second is the section sign (§, the HTML &sect; character).

 [2003-06-17 05:42 UTC] sniper@php.net
Please provide a complete but short example script.

 [2003-06-17 06:37 UTC] moriyoshi@php.net
That looks more like a browser issue.

You probably have to explicitly indicate the charset (encoding) of the page content being sent to the browser, either with a <META> tag or with a "Content-Type" header:

-- example 1 --

<?php
$output_charset = "UTF-8";
header("Content-Type: text/html; charset=$output_charset");
?>
<html>
<body>
<?php
$iso_8859_1 = "Fran\xE7ais"; // 0xE7 is c-cedilla in ISO-8859-1
print mb_convert_encoding($iso_8859_1, $output_charset, "iso-8859-1");
?>
</body>
</html>

-- example 2 --
<?php
$output_charset = "UTF-8";
?>
<html>
<head>
<?php print "<meta http-equiv=\"Content-Type\" content=\"text/html; charset=$output_charset\">"; ?>
</head>
<body>
<?php
$iso_8859_1 = "Fran\xE7ais"; // 0xE7 is c-cedilla in ISO-8859-1
print mb_convert_encoding($iso_8859_1, $output_charset, "iso-8859-1");
?>
</body>
</html>
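
The reported output (A with a tilde followed by a section sign) is what one would expect if the string was converted to UTF-8 correctly but the browser then decoded the page as ISO-8859-1. The following is only a minimal sketch of that effect, assuming the mbstring extension is loaded; the variable names are illustrative and the second conversion merely simulates the browser's misreading, it is not taken from the reporter's script.

```php
<?php
// "Fran\xE7ais" holds the single ISO-8859-1 byte 0xE7 for c-cedilla.
$latin1 = "Fran\xE7ais";

// Correct conversion: 0xE7 becomes the two UTF-8 bytes 0xC3 0xA7.
$utf8 = mb_convert_encoding($latin1, "UTF-8", "ISO-8859-1");
echo bin2hex($utf8), "\n";    // 4672616ec3a7616973

// Simulate a browser that treats those UTF-8 bytes as ISO-8859-1:
// 0xC3 is read as A-tilde and 0xA7 as the section sign.
$misread = mb_convert_encoding($utf8, "UTF-8", "ISO-8859-1");
echo bin2hex($misread), "\n"; // 4672616ec383c2a7616973
```

If the page is served with a matching charset declaration, as in the two examples above, the browser decodes 0xC3 0xA7 as a single c-cedilla instead.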


 [2003-06-26 18:22 UTC] sniper@php.net
No feedback was provided. The bug is being suspended because
we assume that you are no longer experiencing the problem.
If this is not the case and you are able to provide the
information that was requested earlier, please do so and
change the status of the bug back to "Open". Thank you.

