Doc Bug #80940 async function addFromString work incorrect
Submitted: 2021-04-07 19:00 UTC
Modified: 2021-04-08 10:45 UTC
From: skuratovichalex at gmail dot com
Assigned:
Status: Verified
Package: Zip Related
PHP Version: 7.4.16
OS: Debian 9
Private report: No
CVE-ID: None
 [2021-04-07 19:00 UTC] skuratovichalex at gmail dot com
Description:
------------
When the ZipArchive methods are used asynchronously, the required archive is produced, but it contains only the last file processed by the asynchronous run. (To be more precise, the archive contains N / t files, where N is the total number of files to be added and t is the number of processing threads.)
P.S. Whether the archive already exists beforehand does not change the situation.

Example: (6 files)
https://drive.google.com/file/d/1BAHn4iSElaADwfxQ_3_ZaTOmPKvggwH6/view?usp=sharing

Test script:
---------------
// one thread; $zipFile, $dir, $newFileName and $stringData are set by the caller
$za = new \ZipArchive();
if ($za->open($zipFile, \ZipArchive::CREATE) !== TRUE) {
    throw new \Exception('Cannot create a zip file');
}
$arFile = $dir . '/' . $newFileName . '.json';
$result = $za->addFromString($arFile, $stringData);
$error = $za->getStatusString();
$arIndex = $za->locateName($arFile);
$arInfo = $za->statName($arFile);
$closeResult = $za->close();
var_dump($error, $arIndex, $arInfo, $closeResult);

Expected result:
----------------
file2
file1
file3
file4
string(8) "No error"
int(0)
array(8) {
  ["name"]=>
  string(17) "/order/file2.json"
  ["index"]=>
  int(0)
  ["crc"]=>
  int(0)
  ["size"]=>
  int(811758)
  ["mtime"]=>
  int(1617821517)
  ["comp_size"]=>
  int(811758)
  ["comp_method"]=>
  int(0)
  ["encryption_method"]=>
  int(0)
}
bool(true)
string(8) "No error"
int(0)
array(8) {
  ["name"]=>
  string(17) "/order/file3.json"
  ["index"]=>
  int(0)
  ["crc"]=>
  int(0)
  ["size"]=>
  int(811758)
  ["mtime"]=>
  int(1617821517)
  ["comp_size"]=>
  int(811758)
  ["comp_method"]=>
  int(0)
  ["encryption_method"]=>
  int(0)
}
bool(true)
string(8) "No error"
int(0)
array(8) {
  ["name"]=>
  string(17) "/order/file4.json"
  ["index"]=>
  int(0)
  ["crc"]=>
  int(0)
  ["size"]=>
  int(811758)
  ["mtime"]=>
  int(1617821517)
  ["comp_size"]=>
  int(811758)
  ["comp_method"]=>
  int(0)
  ["encryption_method"]=>
  int(0)
}
bool(true)
string(8) "No error"
int(0)
array(8) {
  ["name"]=>
  string(17) "/order/file1.json"
  ["index"]=>
  int(0)
  ["crc"]=>
  int(0)
  ["size"]=>
  int(811758)
  ["mtime"]=>
  int(1617821517)
  ["comp_size"]=>
  int(811758)
  ["comp_method"]=>
  int(0)
  ["encryption_method"]=>
  int(0)
}
bool(true)


Actual result:
--------------
4 JSON-files in ZIP-archive.

Patches

async-add-fromstring-wrong (last revision 2021-04-07 19:08 UTC by skuratovichalex at gmail dot com)


History

 [2021-04-07 19:08 UTC] skuratovichalex at gmail dot com
The following patch has been added/updated:

Patch Name: async-add-fromstring-wrong
Revision:   1617822491
URL:        https://bugs.php.net/patch-display.php?bug=80940&patch=async-add-fromstring-wrong&revision=1617822491
 [2021-04-07 19:40 UTC] sjoerd@php.net
As I understand it, you run the test script several times concurrently and everything seems to succeed, except that the resulting ZIP file contains fewer files than expected. This looks like a classic race condition. It seems that ZipArchive does not handle the case where the ZIP file is modified between the calls to open and close. Perhaps you can use flock or similar file locking to prevent multiple processes from opening the ZIP file at the same time.
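
The locking suggestion above could look roughly like the following. This is only a minimal sketch, not a tested fix: the helper name addToZipLocked and the sidecar "<archive>.lock" file are assumptions made for illustration. A separate lock file is used on the assumption that the archive file itself may be replaced when it is written.

```php
<?php
// Hypothetical helper: serialize writers with an exclusive lock on a sidecar
// lock file, held from open() until close() has returned.
function addToZipLocked(string $zipFile, string $entryName, string $data): bool
{
    // Assumption: "<archive>.lock" is a lock file that all writers agree on.
    $lock = fopen($zipFile . '.lock', 'c');
    if ($lock === false) {
        return false;
    }
    try {
        // Blocks until no other process holds the lock.
        if (!flock($lock, LOCK_EX)) {
            return false;
        }
        $za = new ZipArchive();
        if ($za->open($zipFile, ZipArchive::CREATE) !== true) {
            return false;
        }
        $added = $za->addFromString($entryName, $data);
        // The archive is written by close(), so the lock is released only afterwards.
        $closed = $za->close();
        return $added && $closed;
    } finally {
        flock($lock, LOCK_UN);
        fclose($lock);
    }
}
```

Each concurrent worker would call addToZipLocked($zipFile, $arFile, $stringData) instead of opening the archive directly, so that only one open/close cycle runs at a time.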
 [2021-04-08 10:45 UTC] cmb@php.net
-Status: Open
+Status: Verified
 [2021-04-08 10:45 UTC] cmb@php.net
> It seems that ZipArchive does not handle the case where the ZIP
> file is modified between calls to open and close.

Actually, ZipArchive does not really touch the ZIP file before
::close() is called.  Only when ::close() is called, the ZIP is
written.  This stems from the underlying libzip which is
implemented this way for performance reasons (adding individual
files to a ZIP archive would be way slower).  Thus, you can't
write to the same archive from multiple processes/threads
concurrently, and this is not necessary to improve performance
anyway.
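
As an illustration of the point above, a single process can add every entry and let one ::close() call write the archive. This is a sketch under assumptions: the helper name writeEntries and the name => contents array shape are hypothetical and not part of the report.

```php
<?php
// Hypothetical helper: one process adds all entries, then writes the archive once.
function writeEntries(string $zipFile, array $entries): int
{
    $za = new ZipArchive();
    if ($za->open($zipFile, ZipArchive::CREATE) !== true) {
        throw new RuntimeException('Cannot create a zip file');
    }
    $added = 0;
    foreach ($entries as $name => $data) {
        // Entries are only queued here; nothing is written yet.
        if ($za->addFromString($name, $data)) {
            $added++;
        }
    }
    // All queued entries are written in this single call.
    if (!$za->close()) {
        throw new RuntimeException('Cannot write the zip file');
    }
    return $added;
}
```

The concurrent workers would then only produce the JSON strings and hand them to one writer, rather than each opening the same archive.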

This also implies that whenever you want to successfully add a
file to an archive, that file still needs to exist when close is
called.
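
A sketch of the ordering this implies when ZipArchive::addFile is used with a temporary source file: the source is deleted only after ::close() has returned. The helper name addViaTempFile and the use of a temporary file are assumptions for illustration only.

```php
<?php
// Hypothetical helper: stage data in a temporary file, add it with addFile(),
// and delete the temporary file only after close() has written the archive.
function addViaTempFile(string $zipFile, string $entryName, string $data): bool
{
    $tmp = tempnam(sys_get_temp_dir(), 'zip');
    if ($tmp === false) {
        return false;
    }
    if (file_put_contents($tmp, $data) === false) {
        unlink($tmp);
        return false;
    }
    $za = new ZipArchive();
    if ($za->open($zipFile, ZipArchive::CREATE) !== true) {
        unlink($tmp);
        return false;
    }
    // Registers the source file; per the comment above, the archive is not written yet.
    $added = $za->addFile($tmp, $entryName);
    // The archive is written here, so the source file must still exist.
    $closed = $za->close();
    // Deleting the source is safe only after close() has returned.
    unlink($tmp);
    return $added && $closed;
}
```

Moving the unlink() call before close() is the mistake this ordering avoids.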

I think this needs to be documented.
Last updated: Mon May 17 10:01:24 2021 UTC