Bug #46370 translation mixup
Submitted: 2008-10-23 11:17 UTC
Modified: 2008-10-23 15:08 UTC
From: nikitin at freshframes dot com
Assigned:
Status: Not a bug
Package: Filesystem function related
PHP Version: 5.2.6
OS: debian 4.0
Private report: No
CVE-ID: None
 [2008-10-23 11:17 UTC] nikitin at freshframes dot com
Description:
------------
I upload a latin1 file to the Debian server (move_uploaded_file).

The file is stored as utf8 (viewing it in PuTTY with utf8 translation shows the correct characters; with latin1 translation I see the multibyte characters).

Then I read the file into a variable via file_get_contents and wonder why
mb_check_encoding( var, "UTF-8" ) returns false.

Sending the variable to the browser with a utf8 charset produces wrong data, so I probably have latin1 data in my variable.

When I use utf8_encode( var ), everything works fine.

What do I need to set to get utf8 data from file reads, or why do I need to encode it again?


Reproduce code:
---------------
move_uploaded_file( $_FILES['file']['tmp_name'][0], $file );

$data = file_get_contents( $file );

var_dump( array( 
  mb_check_encoding( $data, "UTF-8" ), 
  mb_check_encoding( utf8_encode( $data ), "UTF-8" ) ) );





Expected result:
----------------
[ true, true ]

Actual result:
--------------
[ false, true ]

 [2008-10-23 12:09 UTC] jani@php.net
If you upload as latin1, it's stored as latin1. Set your charsets properly in the upload page and it works as expected. (FYI: file_get_contents doesn't convert anything; it's binary safe.)
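
The point about binary safety can be illustrated with a minimal sketch. It assumes the mbstring extension is loaded; the sample string and the temporary file are hypothetical stand-ins for the uploaded file, and mb_convert_encoding is used here in place of utf8_encode, which is deprecated as of PHP 8.2.

```php
<?php
// "ü" as the single Latin-1 byte 0xFC; a stand-in for the uploaded file's contents.
$latin1 = "M\xFCnchen";

// Write the bytes to a temporary file and read them back unchanged.
$path = tempnam(sys_get_temp_dir(), 'enc');
file_put_contents($path, $latin1);
$data = file_get_contents($path);
unlink($path);

// file_get_contents returns the stored bytes as-is, so Latin-1 stays Latin-1.
var_dump($data === $latin1);                  // bool(true)
var_dump(mb_check_encoding($data, 'UTF-8'));  // bool(false)

// An explicit conversion is what turns the Latin-1 bytes into UTF-8.
$utf8 = mb_convert_encoding($data, 'UTF-8', 'ISO-8859-1');
var_dump(mb_check_encoding($utf8, 'UTF-8'));  // bool(true)
var_dump(bin2hex($utf8));                     // string(16) "4dc3bc6e6368656e"
```

The check fails on the raw data because 0xFC is not a valid UTF-8 byte sequence on its own, which matches the [ false, true ] result in the report.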
 [2008-10-23 13:43 UTC] nikitin at freshframes dot com
I am pretty sure that I use utf8 in the upload page.
My framework is Joomla and it uses utf8.


When I download the uploaded file, the umlauts are corrupted -> utf8 encoded.


So there is still the question of why I need to encode it in PHP again.
 [2008-10-23 15:08 UTC] nikitin at freshframes dot com
OK, I just figured out the problem: it seems I was encoding the files to utf8 twice.

I have now removed all the extra encodings, and the file is uploaded as latin1.
Encoded once, the data compares against my MySQL utf8 columns the way I need.

So this is fixed and may be closed.

Sorry, and thanks for your support.
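
For readers hitting the same symptom, here is a hedged sketch of what double encoding looks like at the byte level. The thread does not say where the second encoding step happened, so the sample string and variable names are purely illustrative; the mbstring extension is assumed.

```php
<?php
// "München" already encoded as UTF-8: "ü" is the two bytes 0xC3 0xBC.
$utf8 = "M\xC3\xBCnchen";

// Treating those UTF-8 bytes as Latin-1 and converting again encodes each byte separately.
$double = mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1');

// The result is still valid UTF-8, so a validity check cannot detect the mistake.
var_dump(mb_check_encoding($double, 'UTF-8')); // bool(true)

// But "ü" has become the two characters "Ã¼" (bytes C3 83 C2 BC).
var_dump(bin2hex($double));                    // string(20) "4dc383c2bc6e6368656e"

// Converting back to Latin-1 once recovers the original UTF-8 bytes.
$repaired = mb_convert_encoding($double, 'ISO-8859-1', 'UTF-8');
var_dump($repaired === $utf8);                 // bool(true)
```

Because doubly encoded text passes mb_check_encoding, comparing the bytes against a known-good string is a more reliable check than validating the encoding alone.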