Bug #45594 json_encode loses accented characters
Submitted: 2008-07-22 15:32 UTC Modified: 2008-07-23 09:11 UTC
From: neonira at gmail dot com Assigned:
Status: Not a bug Package: JSON related
PHP Version: 5.2.6 OS: windows XP
Private report: No CVE-ID: None
 [2008-07-22 15:32 UTC] neonira at gmail dot com
Description:
------------
I checked that the PHP source file is saved with the UTF-8 character set. That means I am providing real UTF-8 input and should not need to use utf8_encode.

It seems that accented characters are truncated or mishandled when a string is JSON-encoded.



Reproduce code:
---------------
<?php

setlocale(LC_ALL, "fr");
class CT {
  var $link;
  var $str;
  var $arr;

  function document() {
     $this->link = "http://www.neonira.com";
     $this->str = "chaîne avec caractères accentués ... ";
     $this->arr = array(1, 3, "lala", "123", array ( 'a', 'b', 'c'), 134);
  }
}

$c = new CT();
$c->document();
echo json_encode($c);

?>

Expected result:
----------------
shell>php ../php/ct.php
{"link":"http:\/\/www.neonira.com","str":"chaîne avec caractères accentués ... ","arr":[1,3,"lala","123",["a","b","c"],134]}shell>

Actual result:
--------------
shell>{107}php ../php/ct.php
{"link":"http:\/\/www.neonira.com","str":"cha","arr":[1,3,"lala","123",["a","b","c"],134]}shell>


 [2008-07-22 22:04 UTC] jani@php.net
RTFM: "This function only works with UTF-8 encoded data."
 [2008-07-22 22:05 UTC] jani@php.net
Hint: You're not passing utf-8 here.. setlocale..
 [2008-07-23 09:11 UTC] neonira at gmail dot com
If I understand you correctly, a UTF-8 encoded PHP source file with embedded accented strings produces the odd result that those strings are not UTF-8 encoded.

Is that really what your answer means? Is this the default PHP behavior, or is it just a local bug?

What correction can I make to my code to get the right result?
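
One way to approach the question above is to check whether the bytes PHP actually receives are valid UTF-8, and to convert them only if they are not. The following is a minimal sketch, not a confirmed diagnosis of this report: it assumes the mbstring extension is loaded and that any non-UTF-8 input is ISO-8859-1 (a guess about the editor's real save encoding). The helper name to_utf8 is hypothetical and not part of PHP.

```php
<?php
// Hypothetical helper (not a PHP built-in): return $s as UTF-8, converting
// only when the bytes are not already valid UTF-8.
// ASSUMPTION: invalid input is ISO-8859-1; adjust if the real source
// encoding is different (for example Windows-1252).
function to_utf8($s) {
    if (mb_check_encoding($s, 'UTF-8')) {
        return $s; // already valid UTF-8, leave untouched
    }
    return mb_convert_encoding($s, 'UTF-8', 'ISO-8859-1');
}

// "chaîne" written as ISO-8859-1 bytes: 0xEE is "î" in that charset,
// but a lone 0xEE followed by ASCII is not a valid UTF-8 sequence.
$latin1 = "cha\xEEne";

var_dump(mb_check_encoding($latin1, 'UTF-8')); // bool(false)
echo bin2hex($latin1), "\n";                   // 636861ee6e65
echo json_encode(to_utf8($latin1)), "\n";      // "cha\u00eene"
```

If mb_check_encoding() already returns true for the string in the reproduce script, the input is valid UTF-8 and this conversion is not the cause; bin2hex() on the string shows the exact bytes either way.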