Bug #75101 `PharData` caches file contents, even through file deletion
Submitted: 2017-08-21 09:29 UTC Modified: -
Votes:12
Avg. Score:4.2 ± 0.7
Reproduced:12 of 12 (100.0%)
Same Version:4 (33.3%)
Same OS:3 (25.0%)
From: d28b312d at opayq dot com Assigned:
Status: Open Package: PHAR related
PHP Version: Irrelevant OS: Windows 10 + Cygwin
Private report: No CVE-ID: None

 [2017-08-21 09:29 UTC] d28b312d at opayq dot com
Description:
------------
 - Tested with the PHP shipped with Cygwin (7.0.19). I can't find anything in the patch notes to suggest this is fixed in newer versions.

`PharData` seems to cache file contents even when the file is no longer on disk: constructing a new `PharData` with the same filename produces the same archive as before. I understand there is an alias parameter, but this is totally unexpected behaviour, especially as these classes seem to be the go-to way of extracting tar archives these days.

Looking at the source code, this seems to be at least partially deliberate, as `phar_open_parsed_phar` tries to open an already parsed phar file. However, if the file has been deleted from disk and a new `PharData` object is created, surely the new data should be loaded? Also, does this mean that once a `PharData` object is created and a tar file opened, the entire contents of the tar are held in memory for the duration of the PHP script?

There seems to be no way to clear this cache. As the test script shows, the tgz file is removed and a new file (with different contents) is written under the same name, yet opening it again somehow yields the old archive.

I've tried calling `unset()` on the `PharData` object before creating a new one or calling `clearstatcache()`, but to no avail.

Interestingly, even though the default format is `Phar::TAR`, it seems to auto-detect TGZ files and decompress them fine.

Test script:
---------------
<?php

$url = 'http://thrysoee.dk/editline/libedit-%s-3.1.tar.gz';
$downloadFile = '/tmp/foo.tgz';

file_put_contents($downloadFile, fopen(sprintf($url, '20170329'), 'r'));
var_dump(new PharData($downloadFile));

unlink($downloadFile);

file_put_contents($downloadFile, fopen(sprintf($url, '20160903'), 'r'));
var_dump(new PharData($downloadFile));

Expected result:
----------------
I expect the second object to be the new file, as shown in the edited output below:

/tmp $ php /tmp/phpbugtest.php
object(PharData)#1 (4) {
  ["pathName":"SplFileInfo":private]=>
  string(40) "phar:///tmp/foo.tgz/libedit-20170329-3.1"
  ["fileName":"SplFileInfo":private]=>
  string(20) "libedit-20170329-3.1"
  ["glob":"DirectoryIterator":private]=>
  bool(false)
  ["subPathName":"RecursiveDirectoryIterator":private]=>
  string(0) ""
}
object(PharData)#2 (4) {
  ["pathName":"SplFileInfo":private]=>
  string(40) "phar:///tmp/foo.tgz/libedit-20160903-3.1"
  ["fileName":"SplFileInfo":private]=>
  string(20) "libedit-20160903-3.1"
  ["glob":"DirectoryIterator":private]=>
  bool(false)
  ["subPathName":"RecursiveDirectoryIterator":private]=>
  string(0) ""
}

Actual result:
--------------
/tmp $ php /tmp/phpbugtest.php
object(PharData)#1 (4) {
  ["pathName":"SplFileInfo":private]=>
  string(40) "phar:///tmp/foo.tgz/libedit-20170329-3.1"
  ["fileName":"SplFileInfo":private]=>
  string(20) "libedit-20170329-3.1"
  ["glob":"DirectoryIterator":private]=>
  bool(false)
  ["subPathName":"RecursiveDirectoryIterator":private]=>
  string(0) ""
}
object(PharData)#1 (4) {
  ["pathName":"SplFileInfo":private]=>
  string(40) "phar:///tmp/foo.tgz/libedit-20170329-3.1"
  ["fileName":"SplFileInfo":private]=>
  string(20) "libedit-20170329-3.1"
  ["glob":"DirectoryIterator":private]=>
  bool(false)
  ["subPathName":"RecursiveDirectoryIterator":private]=>
  string(0) ""
}

 [2017-11-26 03:39 UTC] zeitgeist at ukr dot net
In addition to this:

I use PharData to decompress downloaded .tar.gz files in an infinite loop.

<?php

// Note: $this->comment() and File:: come from the surrounding application, not from PHP itself.
while (true)
{
    $storagePrefix = '/home/myusername/storage'; // in my script this is another absolute file path
    echo "Downloading...\n";
    file_put_contents($storagePrefix.'/export_xml.tar.gz', file_get_contents('http://localhost/export/1402/630/export_xml.tar.gz')); // real url goes here...

    echo "Decompressing...\n";
    $p = new \PharData($storagePrefix.'/export_xml.tar.gz');
    $p->decompress(); // creates export_xml.tar; this is the call that fails on the second cycle
    $this->comment('Extracting...');
    $p = new \PharData($storagePrefix.'/export_xml.tar');
    $p->extractTo($storagePrefix.'/export_xml');

    // parsing logic goes here...

    echo "Cleaning out...";
    File::deleteDirectory($storagePrefix.'/export_xml');
    File::delete($storagePrefix.'/export_xml.tar.gz');
    File::delete($storagePrefix.'/export_xml.tar');
    echo "Done.\n";

    sleep(3);
}

/*
// lsb_release -a

No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 16.04.2 LTS
Release:        16.04
Codename:       xenial


PHP 7.1.4-1+deb.sury.org~xenial+1 (cli) (built: Apr 11 2017 22:12:32) ( NTS )
Copyright (c) 1997-2017 The PHP Group
Zend Engine v3.1.0, Copyright (c) 1998-2017 Zend Technologies
    with Zend OPcache v7.1.4-1+deb.sury.org~xenial+1, Copyright (c) 1999-2017, by Zend Technologies
*/

?>

On the first loop cycle everything is OK: I get the decompressed files in the export_xml folder, and I delete them at the end of that cycle.
But during the second loop cycle I see the following error:

 [BadMethodCallException]
  Unable to add newly converted phar "/home/myusername/storage/export_xml.tar" to the list of phars, a phar with that name already exists

This error is raised by the line `$p->decompress();`. It looks like PharData thinks the export_xml.tar file still exists at that moment, but it does not.


I expect PharData to successfully decompress the newly downloaded .tar.gz file, but that doesn't happen.
 [2018-04-20 08:56 UTC] inuyasha dot smith at dist dot info
I can confirm this bug. In my case I'm replacing a tar.gz file I previously opened, and reopening it gives me the same output. A quick workaround is to copy the file to a temporary location so that it has a different name and doesn't hit the cache.
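
The copy-to-a-different-name workaround could be sketched roughly as follows. `copyArchiveToUniqueName()` is a hypothetical helper, not part of the phar extension, and the list of recognised suffixes is an assumption. The only idea is that each copy gets a file name PharData has not seen before, while keeping a suffix it accepts as a data archive.

```php
<?php
declare(strict_types=1);

/**
 * Copy an archive to a uniquely named file in the system temp directory,
 * preserving its archive suffix. Hypothetical helper for illustration only.
 */
function copyArchiveToUniqueName(string $source): string
{
    // Assumed suffix list; extend it if other archive types are needed.
    if (!preg_match('/\.(tar\.gz|tar\.bz2|tgz|tar|zip)$/i', basename($source), $m)) {
        throw new InvalidArgumentException("Unrecognised archive suffix: $source");
    }

    // Random hex name without extra dots, followed by the original suffix.
    $copy = sys_get_temp_dir() . DIRECTORY_SEPARATOR
          . 'phardata_' . bin2hex(random_bytes(8)) . $m[0];

    if (!copy($source, $copy)) {
        throw new RuntimeException("Could not copy $source to $copy");
    }

    return $copy;
}
```

A caller would then open the copy, e.g. `new PharData(copyArchiveToUniqueName('/tmp/foo.tgz'))`, and `unlink()` the copy once done. Each copy presumably leaves its own cached entry behind for the rest of the request, so this may not suit long-running loops.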
 [2021-10-13 20:25 UTC] spl1nes dot com at googlemail dot com
Can confirm. PHP version 8.0.11.

I created a test.tar file with new \PharData(), then I deleted the test.tar file from the file system using \unlink() and then tried to create a new test.tar file with different content but received the error message "PharException: phar error: Cannot open phar archive ...". If I use different names I don't receive this error message.
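
One thing that may be worth trying here, although nobody in this report has confirmed it, is `Phar::unlinkArchive()` instead of plain `unlink()`: the manual describes it as removing an archive from disk and from memory. A minimal sketch, assuming a throwaway path under the system temp directory (the file and entry names are made up for illustration):

```php
<?php
declare(strict_types=1);

// Hypothetical demo path; the random part avoids clashing with earlier runs.
$archive = sys_get_temp_dir() . DIRECTORY_SEPARATOR
         . 'phar_unlink_demo_' . bin2hex(random_bytes(4)) . '.tar';

// Create the first archive under that name.
$first = new PharData($archive);
$first->addFromString('first.txt', 'one');

// unlinkArchive() refuses to run while objects still reference the archive.
unset($first);

// Delete the file and drop the extension's in-memory entry for it.
Phar::unlinkArchive($archive);

// Re-create an archive with different contents under the same name.
$second = new PharData($archive);
$second->addFromString('second.txt', 'two');

// Record which entries the new object can see.
$hasFirst  = isset($second['first.txt']);
$hasSecond = isset($second['second.txt']);
var_dump($hasFirst, $hasSecond);

// Clean up the demo file the same way.
unset($second);
Phar::unlinkArchive($archive);
```

If the stale entry is the cause, the second object should only see `second.txt`. `Phar::unlinkArchive()` throws a `PharException` when the archive still has open file handles or live objects, so every `PharData` instance for that path has to be released first.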
 