Bug #67055 output of json_encode does not comply with the definition
Submitted: 2014-04-10 12:50 UTC Modified: 2014-04-25 16:51 UTC
Votes:1
Avg. Score:3.0 ± 0.0
Reproduced:1 of 1 (100.0%)
Same Version:1 (100.0%)
Same OS:0 (0.0%)
From: mail+bugs dot php dot net at kazik dot de Assigned:
Status: Not a bug Package: JSON related
PHP Version: Irrelevant OS: All
Private report: No CVE-ID: None
 [2014-04-10 12:50 UTC] mail+bugs dot php dot net at kazik dot de
Description:
------------
The function json_encode does not encode strings according to the JSON definition. This affects at least PHP 5.3.28 through the latest version (currently 5.5.11).

According to the definition, a string may not contain an unescaped quotation mark, backslash, or control character.
Control characters are the C0 set (0x00-0x1f), delete (0x7f), and the C1 set (0x80-0x9f) (see http://en.wikipedia.org/wiki/Unicode_control_characters).

Source: ext/json/json.c, function json_escape_string

The function only checks for the C0 set; it does not escape delete or the C1 set.

The C1 case only occurs with the option JSON_UNESCAPED_UNICODE (available since PHP 5.4.0).


Test script:
---------------
echo json_encode(chr(0x7f)).PHP_EOL;

echo json_encode(chr(0xc2).chr(0x80), JSON_UNESCAPED_UNICODE).PHP_EOL; // the UTF-8 encoding of U+0080


Expected result:
----------------
"\u007f"

"\u0080"


Actual result:
--------------
'"'.chr(0x7f).'"'

'"'.chr(0xc2).chr(0x80).'"' // the UTF-8 encoding of U+0080, left unescaped
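
Because DEL and the C1 characters are invisible in most terminals, the actual result is easier to confirm byte by byte. The following is a minimal sketch that uses only json_encode and bin2hex; the variable names are illustrative and not part of the original report.

```php
<?php
// Encode DEL (U+007F) and the UTF-8 encoding of U+0080, then inspect the bytes.
$del       = json_encode(chr(0x7f));
$c1        = json_encode(chr(0xc2) . chr(0x80), JSON_UNESCAPED_UNICODE);
$c1Escaped = json_encode(chr(0xc2) . chr(0x80)); // default flags: non-ASCII is escaped

echo bin2hex($del), PHP_EOL; // 227f22   (quote, raw DEL, quote)
echo bin2hex($c1), PHP_EOL;  // 22c28022 (quote, raw UTF-8 bytes, quote)
echo $c1Escaped, PHP_EOL;    // "\u0080"
```

Without JSON_UNESCAPED_UNICODE the C1 character is written as a \u escape, which is why the report ties that case to the flag.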


Patches

patch_json.diff (last revision 2014-04-10 12:51 UTC by mail+bugs dot php dot net at kazik dot de)

 [2014-04-25 01:22 UTC] pleasestand at live dot com
> The function json_encode does not encode strings accordingly to the json definition.

In RFC 7159, section "7. Strings" defines the "control characters" as only "U+0000 through U+001F". More specifically, the characters you mention are explicitly allowed to be left unescaped:

    unescaped = %x20-21 / %x23-5B / %x5D-10FFFF

Note that JavaScript's JSON.stringify() doesn't escape "\u007f" either.
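
The grammar rule quoted above can be expressed as a small predicate. This is only a sketch of the RFC 7159 ranges: is_unescaped_allowed is a hypothetical helper written for illustration, not a function from PHP or ext/json.

```php
<?php
// Hypothetical helper: returns true if the code point falls inside the
// "unescaped" ranges of RFC 7159, section 7, and may therefore appear
// literally inside a JSON string.
function is_unescaped_allowed($codePoint)
{
    return ($codePoint >= 0x20 && $codePoint <= 0x21)
        || ($codePoint >= 0x23 && $codePoint <= 0x5B)
        || ($codePoint >= 0x5D && $codePoint <= 0x10FFFF);
}

var_dump(is_unescaped_allowed(0x7F)); // bool(true):  DEL may stay raw
var_dump(is_unescaped_allowed(0x80)); // bool(true):  first C1 character may stay raw
var_dump(is_unescaped_allowed(0x1F)); // bool(false): C0 controls must be escaped
var_dump(is_unescaped_allowed(0x22)); // bool(false): quotation mark
var_dump(is_unescaped_allowed(0x5C)); // bool(false): backslash

// A raw DEL inside a JSON string is also accepted when decoding.
var_dump(json_decode("\"\x7f\"") === "\x7f"); // bool(true)
```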
 [2014-04-25 16:51 UTC] aharvey@php.net
-Status: Open +Status: Not a bug
 [2014-04-25 16:51 UTC] aharvey@php.net
Yep; this is correct per the JSON spec.
 