Bug #78655 Substitute character (^Z) in PDF breaks mime type detection
Submitted: 2019-10-09 11:34 UTC Modified: 2019-11-11 09:54 UTC
Votes:3
Avg. Score:5.0 ± 0.0
Reproduced:3 of 3 (100.0%)
Same Version:3 (100.0%)
Same OS:3 (100.0%)
From: dick at tellow dot nl Assigned: cmb (profile)
Status: Not a bug Package: *Directory/Filesystem functions
PHP Version: 7.1.32 OS: macOS 10.15
Private report: No CVE-ID: None

 [2019-10-09 11:34 UTC] dick at tellow dot nl
Description:
------------
Substitute character (^Z) in PDF breaks mime type detection.


If the broken.pdf file contains:

```
%PDF-1.3^M$
%�����������^M$
4 0 obj^M$
<< /Length 2625 /Filter [ /FlateDecode ] >>^M$
stream^M$
x��^Z$
Ug�$
$
%%EOF$
```

getting the MIME type via `finfo` or `mime_content_type` returns the incorrect 'application/octet-stream' instead of the expected 'application/pdf'.

The problem occurs both locally on macOS and on a remote server running Linux.

Test script:
---------------
<?php
$finfo = new \finfo(FILEINFO_MIME_TYPE);
var_dump($finfo->file('broken.pdf'));
// Output: string(24) "application/octet-stream"

Expected result:
----------------
application/pdf

Actual result:
--------------
application/octet-stream

History

 [2019-10-09 12:00 UTC] leeuwd at me dot com
Substitute character (https://en.wikipedia.org/wiki/Substitute_character): "In Unix operating systems, this character is typically used to suspend the currently executing interactive process"

Could be the culprit.
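
One way to check whether a given file really contains the substitute character is to scan it for byte 0x1A. The following is a minimal sketch; `findSubOffsets` is a hypothetical helper name, not something from PHP or from this report, and it only confirms that the byte is present, not that it causes the misdetection.

```php
<?php
// Hypothetical helper (an assumption, not part of PHP or this report):
// returns the byte offsets of every SUB (0x1A, ^Z) character in $data.
function findSubOffsets(string $data): array
{
    $offsets = [];
    $pos = 0;
    // strpos() returns false once no further match exists.
    while (($pos = strpos($data, "\x1A", $pos)) !== false) {
        $offsets[] = $pos;
        $pos++; // continue searching after the current match
    }
    return $offsets;
}

// Usage: dump the offsets for the file from the report, if it is present.
$path = 'broken.pdf';
if (is_readable($path)) {
    var_dump(findSubOffsets(file_get_contents($path)));
}
```

An empty array would mean the file holds no ^Z byte, which would rule the character out as the trigger.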
 [2019-10-13 18:39 UTC] cmb@php.net
What does file(1) report?
 [2019-10-21 08:06 UTC] dick at tellow dot nl
var_dump(file('broken.pdf')) returns

array(9) {
  [0]=> string(13) "%PDF-1.3 "
  [1]=> string(36) "%����������� "
  [2]=> string(9) "4 0 obj "
  [3]=> string(45) "<< /Length 2625 /Filter [ /FlateDecode ] >> "
  [4]=> string(8) "stream "
  [5]=> string(9) "x�� "
  [6]=> string(6) "Ug� "
  [7]=> string(1) " "
  [8]=> string(6) "%%EOF "
}
 [2019-10-21 14:02 UTC] requinix@php.net
-Status: Open
+Status: Feedback
 [2019-10-21 14:02 UTC] requinix@php.net
@cmb means the file command run from the terminal.

$ file broken.pdf
 [2019-10-21 14:07 UTC] dick at tellow dot nl
file broken.pdf
=> broken.pdf: data
 [2019-11-03 04:22 UTC] php-bugs at lists dot php dot net
No feedback was provided. The bug is being suspended because
we assume that you are no longer experiencing the problem.
If this is not the case and you are able to provide the
information that was requested earlier, please do so and
change the status of the bug back to "Re-Opened". Thank you.
 [2019-11-04 09:16 UTC] cmb@php.net
-Status: No Feedback
+Status: Not a bug
-Package: *General Issues
+Package: *Directory/Filesystem functions
-Assigned To:
+Assigned To: cmb
 [2019-11-04 09:16 UTC] cmb@php.net
> file broken.pdf
> => broken.pdf: data

So this looks like an upstream issue, and should be reported to
<https://bugs.astron.com/my_view_page.php>.
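
Until the detection is fixed upstream, an application could fall back to checking the PDF signature itself when fileinfo only returns the generic type. This is a rough sketch and not a fix proposed in this thread: `detectMimeWithPdfFallback` is a hypothetical name, and the sketch assumes that a file starting with `%PDF-` should be treated as a PDF, which is weaker than real validation.

```php
<?php
// Hypothetical fallback (an assumption, not part of PHP or libmagic):
// trust finfo unless it returns the generic type, then look for the
// "%PDF-" signature at the very start of the file.
function detectMimeWithPdfFallback(string $path): string
{
    $finfo = new finfo(FILEINFO_MIME_TYPE);
    $mime = $finfo->file($path);
    if ($mime !== false && $mime !== 'application/octet-stream') {
        return $mime;
    }
    // Read only the first five bytes; file_get_contents() returns false
    // on failure, which simply fails the comparison below.
    $head = file_get_contents($path, false, null, 0, 5);
    return $head === '%PDF-' ? 'application/pdf' : 'application/octet-stream';
}
```

Calling `detectMimeWithPdfFallback('broken.pdf')` would then report `application/pdf` for any file that begins with the signature, whatever libmagic makes of the rest of the content.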
 [2019-11-11 09:47 UTC] dick at tellow dot nl
The issue is still present! 

Can't re-open: this website from the 90s does not allow me to log in or request a new password.
 [2019-11-11 09:54 UTC] cmb@php.net
> The issue is still present!

Has it already been fixed upstream?
 
Last updated: Sun Nov 17 07:01:34 2019 UTC