php.net |  support |  documentation |  report a bug |  advanced search |  search howto |  statistics |  random bug |  login
Doc Bug #48763 ZipArchive produces corrupt OpenOffice.org files
Submitted: 2009-07-01 23:48 UTC Modified: 2016-09-09 23:29 UTC
Votes:9
Avg. Score:4.6 ± 0.7
Reproduced:8 of 8 (100.0%)
Same Version:5 (62.5%)
Same OS:3 (37.5%)
From: dani dot church at gmail dot com Assigned: cmb (profile)
Status: Closed Package: Zip Related
PHP Version: 5.2CVS-2009-07-01 (snap) OS: CentOS 5
Private report: No CVE-ID: None
 [2009-07-01 23:48 UTC] dani dot church at gmail dot com
Description:
------------
When PHP writes a .od* file using ZipArchive, the resulting archive cannot be opened by OpenOffice.org.  The error it gives is "The file 'filename' is corrupt and therefore cannot be opened.  Should OpenOffice.org repair the file?"

Repairing the file in OO.o does not work.  However, unzipping the file from the command line (using unzip in OSX) works without error, and unzip -t (test integrity) reports no errors.  Furthermore, the extracted files are byte-correct, and zipping the extracted files into a new .od* file results in a valid OpenOffice.org file.

This bug is a regression since 5.2.6.  Using zip.so from 5.2.6 results in a valid file.  Using zip.so from 5.2CVS-2009-07-01 results in a corrupt file.

Reproduce code:
---------------
<?php
if (isset($_FILES['od'])) {
  $odfile = $_FILES['od']['tmp_name'];
  $zip = new ZipArchive();
  $zip->open($odfile);
  $xml = $zip->getFromName('content.xml');
  $zip->addFromString('content.xml', $xml);
  $zip->close();
  header('content-type: application/vnd.oasis.opendocument.spreadsheet');
  header('content-disposition: attachment;filename=test.ods');
  readfile($odfile);
  unlink($odfile);
  return;
}
?>
<form method="POST" enctype="multipart/form-data">
<input type='file' name='od'
<input type='submit'>
</form>

Expected result:
----------------
Using the above testbed (for ODS spreadsheets) to update content.xml with its own data should result in a valid ODS spreadsheet.

Actual result:
--------------
The resulting spreadsheet is corrupt and cannot be opened.

Patches

Pull Requests

History

AllCommentsChangesGit/SVN commitsRelated reports
 [2009-07-02 00:13 UTC] dani dot church at gmail dot com
I sampled the compressed file using statName() three times: once before updating it, once after the addFromString() call, and once after closing the ZIP and reopening it from another object.  The results:

Original file: Array
(
    [name] => content.xml
    [index] => 11
    [crc] => 2790848447
    [size] => 16096
    [mtime] => 1246486396
    [comp_size] => 2379
    [comp_method] => 8
)
Compressed, not saved: Array
(
    [name] => content.xml
    [index] => 20
    [crc] => 0
    [size] => 16096
    [mtime] => 1246493476
    [comp_size] => -1
    [comp_method] => 0
)
Compressed and reopened: Array
(
    [name] => content.xml
    [index] => 19
    [crc] => 2790848447
    [size] => 16096
    [mtime] => 1246493476
    [comp_size] => 2355
    [comp_method] => 8
)

Notably, the compression method is the same, which means that PHP is not trying to use a compression method that OpenOffice.org does not support.
 [2009-07-02 22:02 UTC] jani@php.net
Can you try finding out what the difference is between the packages 
produced by 5.2.6 and latest release?
 [2009-07-02 22:35 UTC] dani dot church at gmail dot com
Here's a binary diff of the two zip files.  Summary: 5 places where an 0x08 byte changed to 0x00, and two places where an 0x06 byte changed to 0x44.  Given the values of comp_method in the stat calls, my wild guess is that comp_method somehow does not get stored properly in 5.2.8+.  I don't know the code, though.

If necessary, I can host the original files on my server for download.

--- empty-5.2.6.xxd     2009-07-02 18:27:28.000000000 -0400
+++ empty-5.2CVS2009-07-01.xxd  2009-07-02 18:27:28.000000000 -0400
@@ -7,7 +7,7 @@
 0000060: e23a 0000 0000 0000 0000 0000 0000 1a00  .:..............
 0000070: 0000 436f 6e66 6967 7572 6174 696f 6e73  ..Configurations
 0000080: 322f 7374 6174 7573 6261 722f 504b 0304  2/statusbar/PK..
-0000090: 1400 0808 0800 00b3 e23a 0000 0000 0200  .........:......
+0000090: 1400 0008 0800 00b3 e23a 0000 0000 0200  .........:......
 00000a0: 0000 0000 0000 2700 0000 436f 6e66 6967  ......'...Config
 00000b0: 7572 6174 696f 6e73 322f 6163 6365 6c65  urations2/accele
 00000c0: 7261 746f 722f 6375 7272 656e 742e 786d  rator/current.xm
@@ -32,7 +32,7 @@
 00001f0: 0000 b3e2 3a00 0000 0000 0000 0000 0000  ....:...........
 0000200: 001f 0000 0043 6f6e 6669 6775 7261 7469  .....Configurati
 0000210: 6f6e 7332 2f69 6d61 6765 732f 4269 746d  ons2/images/Bitm
-0000220: 6170 732f 504b 0304 1400 0808 0800 00b3  aps/PK..........
+0000220: 6170 732f 504b 0304 1400 0008 0800 00b3  aps/PK..........
 0000230: e23a 8d6c f030 aa05 0000 3619 0000 0a00  .:.l.0....6.....
 0000240: 0000 7374 796c 6573 2e78 6d6c dd59 6d6f  ..styles.xml.Ymo
 0000250: db36 10fe be5f 2128 43b1 0293 253b 6913  .6..._!(C...%;i.
@@ -171,7 +171,7 @@
 0000aa0: 6765 6e65 7261 746f 723e 3c2f 6f66 6669  generator></offi
 0000ab0: 6365 3a6d 6574 613e 3c2f 6f66 6669 6365  ce:meta></office
 0000ac0: 3a64 6f63 756d 656e 742d 6d65 7461 3e50  :document-meta>P
-0000ad0: 4b03 0414 0008 0808 0000 b3e2 3a70 6c4c  K...........:plL
+0000ad0: 4b03 0414 0000 0808 0000 b3e2 3a70 6c4c  K...........:plL
 0000ae0: 647b 0000 00f8 0200 0018 0000 0054 6875  d{...........Thu
 0000af0: 6d62 6e61 696c 732f 7468 756d 626e 6169  mbnails/thumbnai
 0000b00: 6c2e 706e 67eb 0cf0 73e7 e592 e262 6060  l.png...s....b``
@@ -182,7 +182,7 @@
 0000b50: c112 c939 26cd 93d3 0a2e f1f8 cd3a e869  ...9&........:.i
 0000b60: 6c91 392a 382a 38e0 82b7 582f 7e64 fb77  l.9*8*8...X/~d.w
 0000b70: 4d82 b51b 94a2 3d5d fd5c d639 2534 0100  M.....=].\.9%4..
-0000b80: 504b 0304 1400 0808 0800 00b3 e23a 2124  PK...........:!$
+0000b80: 504b 0304 1400 0008 0800 00b3 e23a 2124  PK...........:!$
 0000b90: 3734 8e03 0000 e01b 0000 0c00 0000 7365  74............se
 0000ba0: 7474 696e 6773 2e78 6d6c ed99 5153 e230  ttings.xml..QS.0
 0000bb0: 10c7 dfef 5330 7dc7 0272 dcd9 111c c4f1  ....S0}..r......
@@ -241,7 +241,7 @@
 0000f00: 5f4d 10f6 19d3 c536 8aa2 9f38 1910 ee00  _M.....6...8....
 0000f10: aba0 e022 7147 ba1a 2267 cb0a fc34 9467  ..."qG.."g...4.g
 0000f20: ab0f 51b6 0eb2 e24c f086 89c2 d3a5 f9ea  ..Q....L........
-0000f30: 2394 59f4 79ae f71f 504b 0304 1400 0808  #.Y.y...PK......
+0000f30: 2394 59f4 79ae f71f 504b 0304 1400 0008  #.Y.y...PK......
 0000f40: 0800 00b3 e23a 4154 fb43 4901 0000 6807  .....:AT.CI...h.
 0000f50: 0000 1500 0000 4d45 5441 2d49 4e46 2f6d  ......META-INF/m
 0000f60: 616e 6966 6573 742e 786d 6cb5 9541 6ec3  anifest.xml..An.
@@ -265,7 +265,7 @@
 0001080: 7bf5 2e3c f9f9 7fcd fc3b 97f8 6490 66c7  {..<.....;..d.f.
 0001090: 0ec8 30db c1b3 ddc7 6167 411b 927c 1956  ..0.....agA..|.V
 00010a0: def6 73c3 e72d 2c32 a7eb f25a da95 fc71  ..s..-,2...Z...q
-00010b0: 5bae 3f01 504b 0304 1400 0000 0800 0693  [.?.PK..........
+00010b0: 5bae 3f01 504b 0304 1400 0000 0800 4493  [.?.PK........D.
 00010c0: e23a 75e5 4403 7003 0000 6f0e 0000 0b00  .:u.D.p...o.....
 00010d0: 0000 636f 6e74 656e 742e 786d 6ce5 574d  ..content.xml.WM
 00010e0: 8fdb 3610 bde7 5718 2ad0 1bcd b5b7 4176  ..6...W.*.....Av
@@ -382,7 +382,7 @@
 00017d0: 54fb 4349 0100 0068 0700 0015 0000 0000  T.CI...h........
 00017e0: 0000 0000 0000 0000 0038 0f00 004d 4554  .........8...MET
 00017f0: 412d 494e 462f 6d61 6e69 6665 7374 2e78  A-INF/manifest.x
-0001800: 6d6c 504b 0102 0000 1400 0000 0800 0693  mlPK............
+0001800: 6d6c 504b 0102 0000 1400 0000 0800 4493  mlPK..........D.
 0001810: e23a 75e5 4403 7003 0000 6f0e 0000 0b00  .:u.D.p...o.....
 0001820: 0000 0000 0000 0000 0000 0000 b410 0000  ................
 0001830: 636f 6e74 656e 742e 786d 6c50 4b05 0600  content.xmlPK...
 [2009-07-02 22:52 UTC] rasmus@php.net
I don't know this code at all, but just poking around in CVS I would say the most likely change that is able to cause those differences is this one:

http://cvs.php.net/viewvc.cgi/php-src/ext/zip/lib/zip_close.c?r1=1.1.2.9&r2=1.1.2.12&sortby=date




 [2009-07-03 02:33 UTC] dani dot church at gmail dot com
You're absolutely right, that's the file with the bug.  Both PHP 5.2.6 and 5.2.8+ produce malformed ZIP files, in slightly different ways, though only when UPDATING (not creating from scratch) ZIP files.  The zip library does not write optional data descriptors, even if the input file has them.  In 5.2.6, the "data descriptor exists" flag is set, even though there is no data descriptor.  In 5.2.8+, the flag is (properly) cleared in the local file header, but not in the central directory.  OpenOffice.org tolerates the bug in 5.2.6 but not the one in 5.2.8+.  A patch to fix this bug (and another minor bug I found in the same area) follows:

--- ext/zip/lib/zip_close.c.orig        2009-07-02 21:40:03.000000000 -0400
+++ ext/zip/lib/zip_close.c     2009-07-02 22:24:54.000000000 -0400
@@ -175,6 +175,7 @@
                    de.filename = strdup("-");
                    de.filename_len = 1;
                    cd->entry[j].filename = "-";
+                   cd->entry[j].filename_len = 1;
                }
                else {
                    de.filename = strdup(za->cdir->entry[i].filename);
@@ -195,13 +196,14 @@
                error = 1;
                break;
            }
+           memcpy(cd->entry+j, za->cdir->entry+i, sizeof(cd->entry[j]));
            if (de.bitflags & ZIP_GPBF_DATA_DESCRIPTOR) {
                de.crc = za->cdir->entry[i].crc;
                de.comp_size = za->cdir->entry[i].comp_size;
                de.uncomp_size = za->cdir->entry[i].uncomp_size;
                de.bitflags &= ~ZIP_GPBF_DATA_DESCRIPTOR;
+               cd->entry[j].bitflags &= ~ZIP_GPBF_DATA_DESCRIPTOR;
                }
-           memcpy(cd->entry+j, za->cdir->entry+i, sizeof(cd->entry[j]));
        }
 
        if (za->entry[i].ch_filename) {
 [2009-07-03 17:35 UTC] pajoye@php.net
Ah thanks guys :)

Do you have test cases for these two bugs (this one and the other you found while writing the patch)? It will help to valid the patch before I apply it. If you can post a link to the patch and the tests.

Thanks for your work!
 [2009-07-04 00:53 UTC] dani dot church at gmail dot com
The patch, a PHP testbed, and a test ZIP file (empty.zip) can all be found at http://dchurch.ath.cx/phpzip/.  The test ZIP is minimal and contains one empty file that uses a data descriptor.  The PHP testbed takes this ZIP file, sets the modified flag by adding and removing a dummy file, and writes the results back to the browser.  The ZIP file that PHP writes back to the browser is identical to the input file with the following exceptions:

1) The data descriptor, addresses 0x23-0x32 in the original file, is missing.  The central directory starts at 0x33 in the original file, and at 0x23 in the modified file.
2) The central directory address, stored at 0x76 in the original file and 0x66 in the modified file, is updated from 0x33 to 0x23.
3) The local file header contains the flag 0x08 at address 0x06 to indicate that a data descriptor is present.  This flag is cleared.
4) The central directory file header contains the flag 0x08 at address 0x3b (corresponding to 0x2b in the modified file), which is a copy of the same flag at 0x06.  This flag SHOULD be cleared, but in the current CVS, it does not get cleared.  The patch clears this flag.

I don't have a test case for the other bug I found, since the if block at lines 173-185 seems to be something that isn't supposed to happen in the normal flow of execution.  At the very least, I can't figure out a way to get to that point with ch_filename == NULL.  However, if that block ever did get executed, it would result in a central directory entry with a listed filename length of 0 but the character "-" in the filename field-- again, an invalid ZIP file.
 [2009-07-04 08:29 UTC] RalfBecker at outdoor-training dot de
I can reproduce that bug with php5.2.9 under openSUSE11.0, thought I tried so far only oo3 *.odt files. It seems not to depend on the file, in fact I can not create a file, where I can replace content.xml with itself, without corrupting it.
Ralf
 [2009-07-04 14:37 UTC] dani dot church at gmail dot com
RalfBecker: In fact, one probable workaround, until this bug gets fixed, is to iterate through EVERY file in the ZipArchive, get the contents, and addFromString to put them back into the archive.  By overwriting every single file in the archive (with its own contents), you won't trigger the bug.
 [2009-07-19 16:37 UTC] pajoye@php.net
Thanks for your patch! I have applied it to  all branches and pecl. A pecl release will be done next week.

Please note that the patch has been applied upstream as well (libzip repo).

I will close the bug once the test is there too.
 [2009-11-05 11:31 UTC] levelak at post dot cz
This bug does not only affect OpenOffice, but also WinRar (I have 
version 3.80 under Windows).
The bug happens whenever a file with more than 255 chars is added via 
addFromString...

eg.:

$zip->addFromString("test.txt","asdjdjfdlksjdaf"); //OK
$zip->addFromString("test2.txt",str_repeat("A",256); //Corrupt archive

The issue is resolved by upgrading to 5.2.11, on 5.2.6 it also works 
with no problems.
 [2009-11-05 12:12 UTC] pajoye@php.net
This bug has been fixed in SVN.

Snapshots of the sources are packaged every three hours; this change
will be in the next snapshot. You can grab the snapshot at
http://snaps.php.net/.
 
Thank you for the report, and for helping us make PHP better.


 [2009-11-05 12:13 UTC] pajoye@php.net
Fixed in latest releases too.
 [2010-02-01 20:19 UTC] pajoye@php.net
reopen and assign to me.
 [2010-02-01 21:12 UTC] headofsteam at gmx dot com
pajoye at php.net thanks for reopening.
Don't suppose there is a way to transfer in my comment from #50893 ?

Anyway, I have read through this report again and there may be a difference other than the OS and version.
This report seems to be highlighting a problem in ZipArchive::AddFromString() I have been using the IMO more useful ZipArchive::AddFile() could that be significant.
 [2010-04-28 16:43 UTC] pajoye@php.net
-Status: Assigned +Status: Feedback
 [2010-04-28 16:43 UTC] pajoye@php.net
Can you try using 5.3.2 or 5.2.13 please?
 [2010-05-11 12:26 UTC] l dot kneschke at metaways dot de
I just want to let you know that I can confirm that creating .ods files is 
working under windows. First it was not working for us under windows too.

The main problem is, that .ods expects the internal name of the file(the second 
parameter of addFile) must have unix filesystem separators. When using windows 
filesystem separators the .ods zip archive is broken, even though extracting the 
archive is working.

I'll also attach working example.

Lars
 [2010-05-11 12:35 UTC] l dot kneschke at metaways dot de
Hm. I can't attach the working example. I'll place it on 

http://lars.kneschke.de/downloads/testODS.tar.bz2

Lars
 [2010-05-11 12:41 UTC] pajoye@php.net
Ok, it makes sense now.

It is a good thing that you spotted this limitation in the ODS loader. We should add a note to the documentation.

However please keep in mind that zip entry names are pure strings. They are not supposed to be path as we can see them. Even the notion of directory (filesystem directories) is not a known entity for zip archives.

Danke :)
 [2010-05-11 12:41 UTC] pajoye@php.net
-Status: Feedback +Status: To be documented -Assigned To: pajoye +Assigned To:
 [2010-05-11 12:41 UTC] pajoye@php.net
Ok, it makes sense now.

It is a good thing that you spotted this limitation in the ODS loader. We should add a note to the documentation.

However please keep in mind that zip entry names are pure strings. They are not supposed to be path as we can see them. Even the notion of directory (filesystem directories) is not a known entity for zip archives.

Danke :)
 [2010-05-11 12:42 UTC] pajoye@php.net
-Assigned To: +Assigned To: pajoye
 [2015-04-15 14:09 UTC] cmb@php.net
-Type: Bug +Type: Documentation Problem
 [2016-07-03 05:55 UTC] pajoye@php.net
-Status: Assigned +Status: Feedback
 [2016-07-03 05:55 UTC] pajoye@php.net
Please try with recent 5.6 or later. As far I can tell libzip got a tweak for that.
 [2016-07-17 04:22 UTC] php-bugs at lists dot php dot net
No feedback was provided. The bug is being suspended because
we assume that you are no longer experiencing the problem.
If this is not the case and you are able to provide the
information that was requested earlier, please do so and
change the status of the bug back to "Re-Opened". Thank you.
 [2016-09-09 23:16 UTC] cmb@php.net
-Status: No Feedback +Status: Re-Opened -Assigned To: pajoye +Assigned To: cmb
 [2016-09-09 23:16 UTC] cmb@php.net
I just checked ::addFile(), ::addFromString() and ::addGlob() wrt.
backslashes in the filename; these are inserted verbatim in the
ZIP file under PHP 7.0.10 using libzip 1.1.2.

In section 4.4.17, the ZIP specification states[1]:

| All slashes MUST be forward slashes '/' as opposed to backwards
| slashes '\' for compatibility with Amiga and UNIX file systems
| etc.

This should be documented.

[1] <https://pkware.cachefly.net/webdocs/casestudies/APPNOTE.TXT>
 [2016-09-09 23:29 UTC] cmb@php.net
Automatic comment from SVN on behalf of cmb
Revision: http://svn.php.net/viewvc/?view=revision&amp;revision=340010
Log: Fix #48763: ZipArchive produces corrupt OpenOffice.org files
 [2016-09-09 23:29 UTC] cmb@php.net
-Status: Re-Opened +Status: Closed
 [2020-02-07 06:06 UTC] phpdocbot@php.net
Automatic comment on behalf of cmb
Revision: http://git.php.net/?p=doc/en.git;a=commit;h=366509eacd25fd083006e411d553b5cb178db38d
Log: Fix #48763: ZipArchive produces corrupt OpenOffice.org files
 
PHP Copyright © 2001-2024 The PHP Group
All rights reserved.
Last updated: Mon Dec 30 14:01:28 2024 UTC