Bug #78987 Memory problems running finfo::buffer with PHP_CLI
Submitted: 2019-12-18 04:12 UTC Modified: 2019-12-18 10:58 UTC
Votes:11
Avg. Score:4.4 ± 0.8
Reproduced:10 of 10 (100.0%)
Same Version:7 (70.0%)
Same OS:6 (60.0%)
From: jhhillie at amazon dot com Assigned:
Status: Verified Package: Filesystem function related
PHP Version: 7.4Git-2019-12-18 (Git) OS: Ubuntu
Private report: No CVE-ID: None
 [2019-12-18 04:12 UTC] jhhillie at amazon dot com
Description:
------------
When finfo::buffer() is called with a large string (300MB in the test below), PHP tries to allocate an enormous amount of memory, far more than the size of the input. I found a similar bug here: https://bugs.php.net/bug.php?id=68819

I suspect the bug was reintroduced at some point in a newer version.

1) I tested it on the version my customer had been using - v7.0.33 (doesn't work)
2) I then tested on the latest php version - v7.4.0 (doesn't work)
3) Testing on an older version - v5.4.16 it works perfectly.

Test script:
---------------
Generated a random file of 300MB using:
      $ sudo fallocate -l 300M test-file

The script used:

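        <?php
        // Read the entire 300M test file into a string.
        $content = file_get_contents('test-file', true);
        $fileInfo = new \finfo(FILEINFO_MIME_TYPE);
        // Ask libmagic for the MIME type of the in-memory buffer.
        var_dump($fileInfo->buffer($content));
        ?>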

Expected result:
----------------
To provide the file metadata (I know there are other ways to do this):

string(19) "application/x-empty"

Actual result:
--------------
PHP Warning:  finfo::buffer(): Failed identify data 12:cannot allocate 2516582408 bytes (Cannot allocate memory)application/octet-stream in php shell code on line 1 bool(false)

 [2019-12-18 04:51 UTC] requinix@php.net
-Status: Open
+Status: Feedback
-Package: Performance problem
+Package: Filesystem function related
 [2019-12-18 04:51 UTC] requinix@php.net
> Failed identify data 12:cannot allocate 2516582408 bytes (Cannot allocate memory)application/octet-stream
According to that, libmagic is the one failing.

Working for me with PHP 7.0-7.4 on Ubuntu.

Does it actually include "application/octet-stream" in there? That is the correct MIME type...
I assume the system is, in fact, running out of memory? What version of libmagic?
 [2019-12-18 10:58 UTC] nikic@php.net
-Status: Feedback
+Status: Verified
 [2019-12-18 10:58 UTC] nikic@php.net
I can reproduce the large allocation. It is caused by https://github.com/php/php-src/blob/aadd5e69e01728adb78d03026beb9a9a0c7e75ef/ext/fileinfo/libmagic/encoding.c#L92.

Basically, libmagic tries to detect the encoding of the file and does this by ... allocating a buffer of unicode characters for the whole file.

For some reason this also uses "unsigned long" instead of "uint32_t" for each character, which means the allocation is not just 4 times as large, but 8 times as large as the original file!
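
The factor of 8 can be checked against the number in the reported warning. The sketch below assumes the buffer holds one 8-byte "unsigned long" per input byte plus one extra element, which is typical for 64-bit Linux; the function name is hypothetical and this is only an illustration of the arithmetic, not the libmagic code itself.

```php
<?php
// Assumption: the encoding buffer is sized as (input bytes + 1) elements,
// each element being an "unsigned long" of 8 bytes (64-bit Linux).
function estimated_encoding_buffer_bytes(int $inputBytes, int $charSize = 8): int
{
    return ($inputBytes + 1) * $charSize;
}

// 300M test file, as created with fallocate in the report.
$fileSize = 300 * 1024 * 1024;

// Prints 2516582408, the same figure as in the reported warning.
echo estimated_encoding_buffer_bytes($fileSize), PHP_EOL;
```

Under these assumptions the result matches the "cannot allocate 2516582408 bytes" message from the original report exactly.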

Current upstream version still looks about the same: https://github.com/file/file/blob/master/src/encoding.c

Ideally this issue would be reported upstream and fixed there first.
 [2020-06-12 07:12 UTC] magnar at myrtveit dot com
I've created two scripts that illustrate the problem with finfo_buffer:

Memory usage of finfo_file: https://3v4l.org/M3YLG
Memory usage of finfo_buffer: https://3v4l.org/LNNoV

On a 16M file, finfo_file uses 20M memory. finfo_buffer, however, uses 244M!
 [2021-02-04 11:00 UTC] halaeiv at gmail dot com
I have a quick question. If this is a libmagic bug, how come it exists in php-7.4 but not in php-7.2? Both are installed via ondrej/php on Ubuntu, and both packages depend on the same libmagic1. If two PHP versions depend on the same shared library and only one of them has the bug, doesn't that mean the bug is not actually in the shared library?
 [2021-02-06 06:58 UTC] halaeiv at gmail dot com
I have reported the bug to the file package at https://bugs.astron.com/view.php?id=234 and they fixed it really quickly:

"file_buffer(3) passed the full size of the buffer to the encoding
determination function. If the file was too large, we ended up
allocating (2 * size + 4 * size) buffers to scan for encoding. Now
we limit size to 64K."


Can we have this fix in PHP soon?

Thanks
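
Until that fix is available in PHP, one possible userland workaround is to apply a similar limit before calling finfo::buffer(). The following is a minimal sketch, not the upstream patch: the helper names and the constant are hypothetical, and the 64K figure is simply borrowed from the upstream commit message quoted above. It also assumes the fileinfo extension is loaded. Note that detection on a truncated prefix may give a different result for formats whose identifying data lies beyond the first 64K.

```php
<?php
// Hypothetical limit, borrowed from the upstream commit message ("64K").
const DETECTION_LIMIT = 65536;

/**
 * Return at most $limit leading bytes of $data.
 */
function detection_prefix(string $data, int $limit = DETECTION_LIMIT): string
{
    return strlen($data) > $limit ? substr($data, 0, $limit) : $data;
}

/**
 * Detect the MIME type from a bounded prefix of the buffer so that
 * libmagic never sees the full (possibly huge) string.
 *
 * @return string|false MIME type, or false if detection fails
 */
function mime_type_of_buffer(string $data)
{
    $finfo = new finfo(FILEINFO_MIME_TYPE);
    return $finfo->buffer(detection_prefix($data));
}
```

With the test script from the report, this would be called as var_dump(mime_type_of_buffer($content)); so that the buffer handed to libmagic is capped at 64K regardless of the file size.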
 
Last updated: Fri Apr 23 07:01:23 2021 UTC