Bug #78987 Memory problems running finfo::buffer with PHP_CLI
Submitted: 2019-12-18 04:12 UTC Modified: 2021-09-19 14:48 UTC
Votes:15
Avg. Score:4.5 ± 0.7
Reproduced:14 of 14 (100.0%)
Same Version:11 (78.6%)
Same OS:7 (50.0%)
From: jhhillie at amazon dot com
Assigned: ab
Status: Closed
Package: Filesystem function related
PHP Version: 7.4Git-2019-12-18 (Git)
OS: Ubuntu
Private report: No
CVE-ID: None
 [2019-12-18 04:12 UTC] jhhillie at amazon dot com
Description:
------------
When calling finfo::buffer() with a large string, PHP tries to allocate an excessive amount of memory (about 2.5GB for a 300MB input, see the actual result below). I found a similar bug here: https://bugs.php.net/bug.php?id=68819

I suspect the bug was reintroduced at some point in a later version.

1) I tested it on the version my customer had been using, v7.0.33 (doesn't work)
2) I then tested on the latest PHP version, v7.4.0 (doesn't work)
3) I tested on an older version, v5.4.16 (works perfectly)

Test script:
---------------
Generated a 300MB test file (fallocate produces a file that reads back as zeros, not random data) using:
      $ sudo fallocate -l 300M test-file

The script used:

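        <?php
        // Load the whole 300MB file into memory (second argument: use_include_path).
        $content = file_get_contents('test-file', true);
        // Ask fileinfo for the MIME type only.
        $fileInfo = new \finfo(FILEINFO_MIME_TYPE);
        // On affected versions this call triggers the oversized allocation.
        var_dump($fileInfo->buffer($content));
        ?>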

Expected result:
----------------
To provide the file metadata (I know there are other ways to do this):

string(19) "application/x-empty"

Actual result:
--------------
PHP Warning:  finfo::buffer(): Failed identify data 12:cannot allocate 2516582408 bytes (Cannot allocate memory)application/octet-stream in php shell code on line 1
bool(false)

Patches

fix-encoding-memory-allocation-too-big (last revision 2021-06-23 06:48 UTC by andrei at davisinfo dot ro)

History

 [2019-12-18 04:51 UTC] requinix@php.net
-Status: Open
+Status: Feedback
-Package: Performance problem
+Package: Filesystem function related
 [2019-12-18 04:51 UTC] requinix@php.net
> Failed identify data 12:cannot allocate 2516582408 bytes (Cannot allocate memory)application/octet-stream
According to that, libmagic is the one failing.

Working for me with PHP 7.0-7.4 on Ubuntu.

Does it actually include "application/octet-stream" in there? That is the correct MIME type...
I assume the system is, in fact, running out of memory? What version of libmagic?
 [2019-12-18 10:58 UTC] nikic@php.net
-Status: Feedback
+Status: Verified
 [2019-12-18 10:58 UTC] nikic@php.net
I can reproduce the large allocation. It is caused by https://github.com/php/php-src/blob/aadd5e69e01728adb78d03026beb9a9a0c7e75ef/ext/fileinfo/libmagic/encoding.c#L92.

Basically, libmagic tries to detect the encoding of the file and does this by ... allocating a buffer of unicode characters for the whole file.

For some reason this also uses "unsigned long" instead of "uint32_t" for each character, which means the allocation is not just 4 times as large, but 8 times as large as the original file!

Current upstream version still looks about the same: https://github.com/file/file/blob/master/src/encoding.c

Ideally this issue would be reported upstream and fixed there first.
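
The factor of 8 described above can be checked against the figure in the reported warning. The following is a minimal arithmetic sketch, not code from libmagic: the function name `encodingBufferBytes` is hypothetical, the 8-byte character size assumes a platform where `unsigned long` is 8 bytes (such as 64-bit Linux), and the one extra element is an assumption chosen because it reproduces the logged number exactly.

```php
<?php
/**
 * Hypothetical helper: estimate the size of the per-character buffer
 * allocated for encoding detection.
 *
 * Assumptions (not stated in the thread):
 *  - each character slot is 8 bytes (sizeof(unsigned long) on 64-bit Linux)
 *  - one extra slot is allocated beyond the input length
 */
function encodingBufferBytes(int $inputBytes, int $bytesPerChar = 8): int
{
    return ($inputBytes + 1) * $bytesPerChar;
}

// The test file from the report: fallocate -l 300M
$fileSize = 300 * 1024 * 1024; // 314572800 bytes

// Matches "cannot allocate 2516582408 bytes" from the warning.
echo encodingBufferBytes($fileSize), PHP_EOL; // 2516582408
```

Under these assumptions the result equals the 2516582408 bytes quoted in the warning, which is consistent with the allocation being sized from the full input length.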
 [2020-06-12 07:12 UTC] magnar at myrtveit dot com
I've created two scripts that illustrate the problem with finfo_buffer:

Memory usage of finfo_file: https://3v4l.org/M3YLG
Memory usage of finfo_buffer: https://3v4l.org/LNNoV

On a 16M file, finfo_file uses 20M memory. finfo_buffer, however, uses 244M!
 [2021-02-04 11:00 UTC] halaeiv at gmail dot com
I have a quick question. If it is a libmagic bug, how come the bug exists in php-7.4 but not in php-7.2? Both are installed via ondrej/php on Ubuntu and both packages depend on the same libmagic1. If two PHP versions depend on the same shared library and only one has the bug, doesn't that mean the bug is not actually in the shared library?
 [2021-02-06 06:58 UTC] halaeiv at gmail dot com
I have reported the bug to the file package at https://bugs.astron.com/view.php?id=234 and they fixed it really quickly:

"file_buffer(3) passed the full size of the buffer to the encoding
determination function. If the file was too large, we ended up
allocating (2 * size + 4 * size) buffers to scan for encoding. Now
we limit size to 64K."


Can we have this fix in PHP soon?

Thanks
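
On PHP versions that do not yet include the fix, one possible userland workaround is to apply the same idea as the upstream change and pass only a bounded prefix of the data to finfo::buffer(). This is a sketch under assumptions, not the upstream patch: the names `DETECTION_LIMIT`, `headForDetection`, and `mimeTypeOfFile` are hypothetical, the 64K limit is borrowed from the fix note quoted above, and detection based on a prefix may give a different answer for formats that are identified by bytes beyond that prefix.

```php
<?php
// Hypothetical constant: 64 KiB, mirroring the limit named in the upstream fix note.
const DETECTION_LIMIT = 65536;

/**
 * Hypothetical helper: return at most $limit leading bytes of $data.
 */
function headForDetection(string $data, int $limit = DETECTION_LIMIT): string
{
    return substr($data, 0, $limit);
}

/**
 * Hypothetical helper: detect a MIME type from the first $limit bytes
 * of a file, without loading the whole file into memory.
 * Requires the fileinfo extension. Returns the MIME type string,
 * or false on failure.
 */
function mimeTypeOfFile(string $path, int $limit = DETECTION_LIMIT)
{
    // Read only the prefix: offset 0, at most $limit bytes.
    $head = file_get_contents($path, false, null, 0, $limit);
    if ($head === false) {
        return false;
    }

    $fileInfo = new \finfo(FILEINFO_MIME_TYPE);
    return $fileInfo->buffer($head);
}
```

For data that is already in memory, `$fileInfo->buffer(headForDetection($content))` keeps the detection buffer bounded in the same way; for files on disk, `mimeTypeOfFile('test-file')` also avoids reading the full 300MB.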
 [2021-06-16 11:07 UTC] info at it-can dot nl
Any news on this bug and the fix?
 [2021-06-23 06:48 UTC] andrei at davisinfo dot ro
The following patch has been added/updated:

Patch Name: fix-encoding-memory-allocation-too-big
Revision:   1624430915
URL:        https://bugs.php.net/patch-display.php?bug=78987&patch=fix-encoding-memory-allocation-too-big&revision=1624430915
 [2021-06-23 07:01 UTC] andrei at davisinfo dot ro
It was fixed on master because the code was updated to file version 5.40, which fixes this issue, but I think applying the fix to earlier versions would also be a good idea.
 [2021-06-23 08:40 UTC] andrei at davisinfo dot ro
The following pull request has been associated:

Patch Name: Fix #78987 - Memory problems running finfo::buffer
On GitHub:  https://github.com/php/php-src/pull/7187
Patch:      https://github.com/php/php-src/pull/7187.patch
 [2021-06-23 08:46 UTC] andrei at davisinfo dot ro
The following pull request has been associated:

Patch Name: Fix #78987 - Memory problems running finfo::buffer
On GitHub:  https://github.com/php/php-src/pull/7188
Patch:      https://github.com/php/php-src/pull/7188.patch
 [2021-09-19 14:48 UTC] ab@php.net
-Status: Verified
+Status: Closed
-Assigned To:
+Assigned To: ab
 [2021-09-19 14:48 UTC] ab@php.net
Fixed in 7.4 and 8.0.

Thanks
 