Bug #46370 translation mixup
Submitted: 2008-10-23 11:17 UTC Modified: 2008-10-23 15:08 UTC
From: nikitin at freshframes dot com Assigned:
Status: Not a bug Package: Filesystem function related
PHP Version: 5.2.6 OS: debian 4.0
Private report: No CVE-ID: None
 [2008-10-23 11:17 UTC] nikitin at freshframes dot com
Description:
------------
I upload a Latin-1 file to the Debian server (move_uploaded_file).

This file is stored in UTF-8 (PuTTY with UTF-8 translation shows me the correct characters; with Latin-1 translation it shows me the multibyte characters).

Then I read the file into a variable via file_get_contents and wonder why
mb_check_encoding( var, "UTF-8" ) returns false.

Sending the variable to the client via the browser with a UTF-8 charset produces wrong data, so I probably have Latin-1 data in my variable.

Now I use utf8_encode( var ) and everything works fine.

What do I need to set to get UTF-8 data from file reads, or why do I need to encode it again?


Reproduce code:
---------------
move_uploaded_file( $_FILES['file']['tmp_name'][0], $file );

$data = file_get_contents( $file );

var_dump( array( 
  mb_check_encoding( $data, "UTF-8" ), 
  mb_check_encoding( utf8_encode( $data ), "UTF-8" ) ) );





Expected result:
----------------
[ true, true ]

Actual result:
--------------
[ false, true ]

 [2008-10-23 12:09 UTC] jani@php.net
If you upload as latin1 it's stored as latin1, set your charsets properly in the upload page and it works as expected. (fyi: file_get_contents doesn't convert anything to anything, it's binary safe)
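
The "binary safe" point can be illustrated with a minimal sketch. It assumes a current PHP with the mbstring extension; the sample string and temporary file are hypothetical and not taken from the report. Latin-1 bytes come back from file_get_contents unchanged, fail the UTF-8 check, and pass only after an explicit conversion (mb_convert_encoding is used here in place of utf8_encode, which is deprecated as of PHP 8.2).

```php
<?php
// Hypothetical sample: "Käse" in ISO-8859-1, where ä is the single byte 0xE4.
$latin1 = "K\xE4se";

// Write the raw bytes to a temporary file, standing in for the uploaded file.
$path = tempnam(sys_get_temp_dir(), 'enc');
file_put_contents($path, $latin1);

// file_get_contents returns the stored bytes as-is; no charset conversion happens.
$data = file_get_contents($path);
unlink($path);

// The lone 0xE4 byte is not a valid UTF-8 sequence, so this check fails.
$isUtf8Before = mb_check_encoding($data, 'UTF-8');

// Explicit Latin-1 to UTF-8 conversion: ä becomes the two bytes 0xC3 0xA4.
$converted   = mb_convert_encoding($data, 'UTF-8', 'ISO-8859-1');
$isUtf8After = mb_check_encoding($converted, 'UTF-8');

var_dump($data === $latin1, $isUtf8Before, $isUtf8After);
```

This reproduces the [ false, true ] pattern from the report: the first check fails because the bytes on disk are Latin-1, not because the file read altered them.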
 [2008-10-23 13:43 UTC] nikitin at freshframes dot com
I am pretty sure that I use UTF-8 in the upload page.
My framework is Joomla, and it uses UTF-8.


When I download the uploaded file, the umlauts are corrupted -> UTF-8 encoded.


So there is still the question of why I need to encode it in PHP again.
 [2008-10-23 15:08 UTC] nikitin at freshframes dot com
OK, I just figured out the problem: it seems I encode the files to UTF-8 twice.

Now I have removed all encodings, and the file is uploaded in Latin-1.
Once encoded, the data compares as I need against MySQL utf8 columns.

So this is fixed and may be closed.

Sorry, and thanks for your support.
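
The double-encoding effect described above can be sketched as follows, again assuming a current PHP with mbstring. The helper name toUtf8Once is hypothetical, and its validity check is only a heuristic: a Latin-1 string can occasionally happen to be valid UTF-8 as well. Converting an already UTF-8 string as if it were Latin-1 yields corrupted umlauts that still pass mb_check_encoding, which is why the mistake is easy to miss.

```php
<?php
// Hypothetical helper: convert Latin-1 bytes to UTF-8 only if the input
// is not already valid UTF-8. This is a heuristic guard, not a detector.
function toUtf8Once(string $bytes): string
{
    if (mb_check_encoding($bytes, 'UTF-8')) {
        return $bytes; // Already valid UTF-8: leave it untouched.
    }
    return mb_convert_encoding($bytes, 'UTF-8', 'ISO-8859-1');
}

// "Käse" correctly encoded as UTF-8 (ä is 0xC3 0xA4).
$utf8 = "K\xC3\xA4se";

// Encoding a second time treats 0xC3 and 0xA4 as two Latin-1 characters,
// so each of them becomes its own two-byte UTF-8 sequence.
$twice = mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1');

// The guarded helper leaves UTF-8 input alone and converts Latin-1 input once.
$guarded    = toUtf8Once($utf8);
$fromLatin1 = toUtf8Once("K\xE4se");

var_dump(bin2hex($twice), mb_check_encoding($twice, 'UTF-8'), $guarded === $utf8, $fromLatin1 === $utf8);
```

The doubly encoded string is valid UTF-8 but no longer equal to the original, so comparisons against correctly encoded data fail even though every encoding check passes.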