Doc Bug #79481 data:// wrapper can hidden some characters
Submitted: 2020-04-16 05:47 UTC Modified: 2020-04-16 07:54 UTC
From: j7ur8 at qq dot com Assigned:
Status: Verified Package: Streams related
PHP Version: 7.4.4 OS: Windows Linux
Private report: No CVE-ID: None
 [2020-04-16 05:47 UTC] j7ur8 at qq dot com
Description:
------------
According to `https://www.php.net/manual/en/wrappers.data.php#refsect1-wrappers.data-description`, the data: wrapper follows RFC 2397. I suspect its behaviour may not be defined securely.

From RFC 2397

Syntax:
       dataurl    := "data:" [ mediatype ] [ ";base64" ] "," data
       mediatype  := [ type "/" subtype ] *( ";" parameter )
       data       := *urlchar
       parameter  := attribute "=" value

If <mediatype> is omitted, it defaults to `text/plain;charset=US-ASCII`. That suggests the charset can be changed freely, so that, for example, `data:text/plain;charset=iso-8859-7,%be%fg%be` would be decoded as iso-8859-7. However, I do not think PHP fully supports this, which means characters can be hidden in the data wrapper stream.

Test script:
---------------
<?php
echo file_get_contents('data:,cc')."\n"; # valid
echo file_get_contents('data://asdc/asd;ccc=ccc,cc')."\n"; # extra characters hidden in the media type and parameter

Expected result:
----------------
cc 
cc

Actual result:
--------------
cc 
cc
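
The test script above prints only the payload, so the media type and parameter in the second URI never show up in the output. A minimal sketch of how they can still be inspected, assuming `allow_url_fopen` is enabled; `read_data_uri()` is a hypothetical helper written for this illustration, not a PHP function:

```php
<?php
// Hypothetical helper (not part of PHP): opens a data: URI and returns
// both the decoded payload and the stream metadata.
function read_data_uri(string $uri): array
{
    $fp = fopen($uri, 'r');
    if ($fp === false) {
        throw new RuntimeException('cannot open data URI');
    }
    $meta = stream_get_meta_data($fp);  // media type and parameters end up here
    $body = stream_get_contents($fp);   // only the part after the comma
    fclose($fp);
    return [$body, $meta];
}

[$body, $meta] = read_data_uri('data://asdc/asd;ccc=ccc,cc');

var_dump($body);                       // payload only
var_dump($meta['mediatype'] ?? null);  // text before the first ';'
var_dump($meta['ccc'] ?? null);        // the "ccc=ccc" parameter
```

Anything placed before the comma is therefore not lost; it is simply not part of what `file_get_contents()` returns.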

 [2020-04-16 05:50 UTC] stas@php.net
-Type: Security +Type: Bug
 [2020-04-16 05:50 UTC] stas@php.net
Doesn't look like there's any security issue here.
 [2020-04-16 07:54 UTC] cmb@php.net
-Status: Open +Status: Verified
 [2020-04-16 07:54 UTC] cmb@php.net
> That means we can change it freely, such as
> `data:text/plain;charset=iso-8859-7,%be%fg%be` would use the
> iso-8859-7 to handle datas.

No, that is not the case.  Unless the base64 flag is set, the data
are just urldecode()d, and any desired/required character encoding
conversion has to be done by the application, which can retrieve
the specified charset from the stream meta data[1].

Given that PHP strings are actually byte arrays, this is pretty
much to be expected, but should be documented nonetheless.

[1] <https://3v4l.org/uGqjf>
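
A minimal sketch of the application-side conversion described above, assuming `allow_url_fopen` is enabled and the iconv extension is available. The example URI and the variable names are illustrative and not taken from the linked snippet:

```php
<?php
// Illustrative URI: one byte (0xBE), declared as ISO-8859-7.
$uri = 'data:text/plain;charset=iso-8859-7,%be';

$fp = fopen($uri, 'r');
if ($fp === false) {
    throw new RuntimeException('cannot open data URI');
}
$meta = stream_get_meta_data($fp);
$raw  = stream_get_contents($fp);  // URL-decoded bytes, no charset conversion
fclose($fp);

// RFC 2397 default when no charset parameter is given.
$charset = $meta['charset'] ?? 'US-ASCII';

// The conversion is the application's job; here the target is UTF-8.
$utf8 = iconv($charset, 'UTF-8', $raw);
if ($utf8 === false) {
    throw new RuntimeException("cannot convert from $charset");
}

echo bin2hex($raw), "\n";
echo bin2hex($utf8), "\n";
```

The wrapper hands back the raw byte unchanged; only the explicit `iconv()` call turns it into the UTF-8 sequence for the corresponding Greek letter.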
 [2020-04-16 07:54 UTC] cmb@php.net
-Type: Bug +Type: Documentation Problem
 
PHP Copyright © 2001-2024 The PHP Group
All rights reserved.
Last updated: Sat Dec 21 16:01:28 2024 UTC