Bug #78987 Memory problems running finfo::buffer with PHP_CLI
Submitted: 2019-12-18 04:12 UTC
Modified: 2021-09-19 14:48 UTC
Votes: 15
Avg. Score: 4.5 ± 0.7
Reproduced: 14 of 14 (100.0%)
Same Version: 11 (78.6%)
Same OS: 7 (50.0%)
From: jhhillie at amazon dot com
Assigned: ab
Status: Closed
Package: Filesystem function related
PHP Version: 7.4Git-2019-12-18 (Git)
OS: Ubuntu
Private report: No
CVE-ID: None

 [2019-12-18 04:12 UTC] jhhillie at amazon dot com
Description:
------------
When calling finfo::buffer() with a crafted string, PHP tries to allocate an insane amount of memory to do so. I found a similar bug here - https://bugs.php.net/bug.php?id=68819

I suspect the bug was reintroduced in a later version.

1) I tested it on the version my customer had been using - v7.0.33 (doesn't work)
2) I then tested on the latest php version - v7.4.0 (doesn't work)
3) Testing on an older version - v5.4.16 it works perfectly.

Test script:
---------------
Generated a 300 MB test file using:
      $ sudo fallocate -l 300M test-file

The script used:

        <?php
        // Read the whole 300 MB file into memory, then ask fileinfo for its MIME type.
        $content = file_get_contents('test-file', true);
        $fileInfo = new \finfo(FILEINFO_MIME_TYPE);
        var_dump($fileInfo->buffer($content));
        ?>

Expected result:
----------------
To provide the file metadata (I know there are other ways to do this):

string(19) "application/x-empty"

Actual result:
--------------
PHP Warning:  finfo::buffer(): Failed identify data 12:cannot allocate 2516582408 bytes (Cannot allocate memory)application/octet-stream in php shell code on line 1
bool(false)

Patches

fix-encoding-memory-allocation-too-big (last revision 2021-06-23 06:48 UTC by andrei at davisinfo dot ro)

History

 [2019-12-18 04:51 UTC] requinix@php.net
-Status: Open
+Status: Feedback
-Package: Performance problem
+Package: Filesystem function related
 [2019-12-18 04:51 UTC] requinix@php.net
> Failed identify data 12:cannot allocate 2516582408 bytes (Cannot allocate memory)application/octet-stream
According to that, libmagic is the one failing.

Working for me with PHP 7.0-7.4 on Ubuntu.

Does it actually include "application/octet-stream" in there? That is the correct MIME type...
I assume the system is, in fact, running out of memory? What version of libmagic?
 [2019-12-18 10:58 UTC] nikic@php.net
-Status: Feedback
+Status: Verified
 [2019-12-18 10:58 UTC] nikic@php.net
I can reproduce the large allocation. It is caused by https://github.com/php/php-src/blob/aadd5e69e01728adb78d03026beb9a9a0c7e75ef/ext/fileinfo/libmagic/encoding.c#L92.

Basically, libmagic tries to detect the encoding of the file and does this by ... allocating a buffer of unicode characters for the whole file.

For some reason this also uses "unsigned long" instead of "uint32_t" for each character, which means the allocation is not just 4 times as large, but 8 times as large as the original file!

Current upstream version still looks about the same: https://github.com/file/file/blob/master/src/encoding.c

Ideally this issue would be reported upstream and fixed there first.
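
As a rough check of the 8x figure against the original report, the arithmetic can be sketched as follows. It assumes a 64-bit Linux build where sizeof(unsigned long) is 8, and that the buffer holds one extra element beyond the input length. The extra element is an inference from the reported number, and the helper name is hypothetical.

```php
<?php
// Hypothetical helper: estimates the size of the Unicode scratch buffer
// that encoding detection allocates for an input of $nbytes bytes.
// Assumptions: sizeof(unsigned long) == 8 (64-bit Linux) and one extra
// element is allocated beyond the input length.
function estimate_encoding_buffer(int $nbytes, int $charSize = 8): int
{
    return ($nbytes + 1) * $charSize;
}

$fileSize = 300 * 1024 * 1024;                   // the 300M test file from the report
echo estimate_encoding_buffer($fileSize), "\n";  // prints 2516582408
```

Under these assumptions the result equals the 2516582408 bytes named in the reported warning.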
 [2020-06-12 07:12 UTC] magnar at myrtveit dot com
I've created two scripts that illustrate the problem with finfo_buffer:

Memory usage of finfo_file: https://3v4l.org/M3YLG
Memory usage of finfo_buffer: https://3v4l.org/LNNoV

On a 16M file, finfo_file uses 20M memory. finfo_buffer, however, uses 244M!
 [2021-02-04 11:00 UTC] halaeiv at gmail dot com
I have a quick question. If it is a libmagic bug, how come the bug exists in php-7.4 but not in php-7.2? Both are installed via ondrej/php on Ubuntu and both packages depend on the same libmagic1. If two PHP versions depend on the same shared library and only one has the bug, doesn't this mean the bug is actually not in the shared library?
 [2021-02-06 06:58 UTC] halaeiv at gmail dot com
I have reported the bug to the file package at https://bugs.astron.com/view.php?id=234 and they fixed it really quickly:

"file_buffer(3) passed the full size of the buffer to the encoding
determination function. If the file was too large, we ended up
allocating (2 * size + 4 * size) buffers to scan for encoding. Now
we limit size to 64K."
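
Until a fixed build is available, a user-level workaround in the same spirit might be to hand finfo::buffer() only the leading part of the data. This is a sketch, not the upstream patch: the helper name and the choice to truncate on the PHP side are assumptions, and results could differ for formats whose detection depends on bytes beyond the first 64K.

```php
<?php
// Hypothetical workaround: pass at most $limit bytes to finfo::buffer().
// The 64K default mirrors the limit quoted above, but here it is applied
// in PHP to the whole detection, not only to the encoding scan.
function detect_mime_from_head(string $data, int $limit = 65536)
{
    $head = substr($data, 0, $limit);        // keep only the leading bytes
    $finfo = new finfo(FILEINFO_MIME_TYPE);  // requires the fileinfo extension
    return $finfo->buffer($head);            // MIME type string, or false on failure
}

$content = str_repeat("\0", 1024 * 1024);    // stand-in for a large zero-filled file
var_dump(detect_mime_from_head($content));
```

With this approach the scratch buffer scales with the 64K slice and not with the full input, at the cost of ignoring everything after the slice.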


Can we have this fix in PHP soon?

Thanks
 [2021-06-16 11:07 UTC] info at it-can dot nl
Any news on this bug and the fix?
 [2021-06-23 06:48 UTC] andrei at davisinfo dot ro
The following patch has been added/updated:

Patch Name: fix-encoding-memory-allocation-too-big
Revision:   1624430915
URL:        https://bugs.php.net/patch-display.php?bug=78987&patch=fix-encoding-memory-allocation-too-big&revision=1624430915
 [2021-06-23 07:01 UTC] andrei at davisinfo dot ro
It was fixed on master because the code was updated to file version 5.40, which fixes this issue, but I think applying the fix to earlier versions would also be a good idea.
 [2021-06-23 08:40 UTC] andrei at davisinfo dot ro
The following pull request has been associated:

Patch Name: Fix #78987 - Memory problems running finfo::buffer
On GitHub:  https://github.com/php/php-src/pull/7187
Patch:      https://github.com/php/php-src/pull/7187.patch
 [2021-06-23 08:46 UTC] andrei at davisinfo dot ro
The following pull request has been associated:

Patch Name: Fix #78987 - Memory problems running finfo::buffer
On GitHub:  https://github.com/php/php-src/pull/7188
Patch:      https://github.com/php/php-src/pull/7188.patch
 [2021-09-19 14:48 UTC] ab@php.net
-Status: Verified
+Status: Closed
-Assigned To:
+Assigned To: ab
 [2021-09-19 14:48 UTC] ab@php.net
Fixed in 7.4 and 8.0.

Thanks
 