Bug #72200 ZipArchive::open() creates invalidly encoded filenames
Submitted: 2016-05-11 16:30 UTC
Modified: 2020-03-10 12:05 UTC
Votes:10
Avg. Score:5.0 ± 0.0
Reproduced:9 of 9 (100.0%)
Same Version:7 (77.8%)
Same OS:6 (66.7%)
From: thomas dot kuhn dot berlin at gmail dot com
Assigned: cmb
Status: Closed
Package: Zip Related
PHP Version: 7.0.6
OS: Windows
Private report: No
CVE-ID: None
 [2016-05-11 16:30 UTC] thomas dot kuhn dot berlin at gmail dot com
Description:
------------
New in PHP 7 on Windows: when filenames contain non-US-ASCII letters such as ö, ä, ü or Asian characters, ZipArchive creates the files under names that PHP itself cannot access afterwards, because the filename encoding is wrong.

Test script output with PHP 5.5.35:
	php default_charset: UTF-8
	bool(true)

Test script output with PHP 7.0.6:
	php default_charset: UTF-8
	bool(false)

Run on Windows Server 2012 R2 and Windows 7.


Test script:
---------------
print "php default_charset: ".ini_get('default_charset')."\n"; // for information only (UTF-8 here)

$filename = "bugtest_müller-lüdenscheid.zip"; // just an example
$filename = utf8_encode($filename);	// simulates my database delivering a UTF-8 string (assumes this script file is saved as ISO-8859-1)

$zip = new ZipArchive();
if( $zip->open($filename, ZipArchive::CREATE | ZipArchive::OVERWRITE) === true )
{
	$zip->addFile('bugtest.php', 'bugtest.php'); // copy of script file itself
	$zip->close();
}

var_dump( is_file($filename) );  // true on PHP 5.5.35, false on PHP 7.0.6





 [2016-05-17 12:32 UTC] cmb@php.net
-Status: Open +Status: Feedback
 [2016-05-17 12:32 UTC] cmb@php.net
Using the following slightly modified script, I can't reproduce
the issue with PHP 7.0.6 (VC14 x86 Non Thread Safe (2016-Apr-29
00:38:17)) on Windows 10:

    <?php
    
    $filename = "m\xc3\xbcller";
    $zip = new ZipArchive;
    var_dump($zip->open($filename,
            ZipArchive::CREATE|ZipArchive::OVERWRITE));
    var_dump($zip->addFromString('foo', 'bar'));
    var_dump($zip->close());
    var_dump(file_exists($filename));
    
Output

    bool(true)
    bool(true)
    bool(true)
    bool(true)

What results do you get running this script?
 [2016-05-17 12:40 UTC] thomas dot kuhn dot berlin at gmail dot com
-Status: Feedback +Status: Open
 [2016-05-17 12:40 UTC] thomas dot kuhn dot berlin at gmail dot com
@cmb: you missed the point: When filenames contain !!non-us-ascii-letters!! like !!öäü!! or !!asian chars!! etc., ZipArchive creates filenames in a way that those files are not accessible via PHP because the encoding is wrong. 

And by the way, !!adding!! files whose names contain such letters to an archive fails as well. I have to copy them to temporary files first :(
 [2016-05-17 13:34 UTC] cmb@php.net
> @cmb: you missed the point: When filenames contain
> !!non-us-ascii-letters!! like !!öäü!! or !!asian chars!! etc.,

Yes, I understood this point. The filename in my script is
actually `müller` encoded as UTF-8. I just wanted to make sure
that we're really dealing with UTF-8, which is not necessarily the
case with your script (if the script file were stored in an encoding
other than ISO-8859-1, utf8_encode() might produce the wrong
result). Furthermore, I wanted to make the script more portable and
generic so it could serve as the basis for a .phpt. And it might be
useful to check the return value of $zip->close(), too.
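
The encoding concern above can be illustrated with a small sketch. It is not part of the original report: `toUtf8Once()` is a hypothetical helper, and `iconv('ISO-8859-1', 'UTF-8', ...)` is used here as a stand-in for what utf8_encode() does. It shows that converting a string that is already UTF-8 a second time yields different bytes, which is why the test input is better written with explicit byte escapes.

```php
<?php
/**
 * Hypothetical helper (not from the report): return $name as UTF-8,
 * converting from ISO-8859-1 only if the bytes are not already valid UTF-8.
 */
function toUtf8Once(string $name): string
{
    // preg_match() with the /u modifier succeeds only on valid UTF-8 input.
    if (preg_match('//u', $name) === 1) {
        return $name;
    }
    return iconv('ISO-8859-1', 'UTF-8', $name);
}

$latin1 = "m\xfcller";      // "müller" as ISO-8859-1 bytes
$utf8   = "m\xc3\xbcller";  // "müller" as UTF-8 bytes

echo bin2hex(toUtf8Once($latin1)), "\n";                // 6dc3bc6c6c6572
echo bin2hex(toUtf8Once($utf8)), "\n";                  // 6dc3bc6c6c6572
// Blindly converting the UTF-8 string again double-encodes the "ü":
echo bin2hex(iconv('ISO-8859-1', 'UTF-8', $utf8)), "\n"; // 6dc383c2bc6c6c6572
```

Whether the original test script hit this case depends on how the script file was saved, which the report does not state.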
 [2016-05-17 15:57 UTC] thomas dot kuhn dot berlin at gmail dot com
@cmb: Oh, sorry, you are right, I apologize :) I copied and pasted your script and ran it on my system; the output was as follows:

bool(true)
bool(true)
bool(true)
bool(false)

Windows NT 6.1 build 7601 (Windows 7 Professional Edition Service Pack 1) AMD64 
PHP/7.0.6 
Compiler: MSVC14 (Visual C++ 2015)
Architecture: x64
 [2016-05-18 10:56 UTC] cmb@php.net
-Status: Open +Status: Verified -Operating System: Windows Server 2012 R2 & Win7 +Operating System: Windows
 [2016-05-18 10:56 UTC] cmb@php.net
Thanks for testing the script. Actually, I have to apologize:
I didn't check my test results carefully, and they were wrong.
I had run the script with PHP 5.6 first, which created the zip
file as expected; the later run with PHP 7.0.6 then found the zip
file left over from the earlier run.
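
A stale file from an earlier run is easy to guard against. The following is only a sketch under assumptions: `existsAfterFreshRun()` is a hypothetical helper that is not part of the report, and the temp-directory path is chosen for illustration. It deletes any leftover file before the archive is created, so a `bool(true)` at the end can only come from the current run.

```php
<?php
/**
 * Hypothetical helper (not from the report): remove any leftover file at
 * $path, run $create($path), and report whether the file exists afterwards.
 */
function existsAfterFreshRun(string $path, callable $create): bool
{
    if (file_exists($path)) {
        unlink($path);            // drop the artifact of an earlier run
    }
    $create($path);
    clearstatcache(true, $path);  // do not trust a cached stat result
    return file_exists($path);
}

// Illustrative location; the name is "müller" as UTF-8 bytes with a prefix.
$path = sys_get_temp_dir() . DIRECTORY_SEPARATOR . "bug72200_m\xc3\xbcller";

if (class_exists('ZipArchive')) {
    $found = existsAfterFreshRun($path, function ($p) {
        $zip = new ZipArchive();
        if ($zip->open($p, ZipArchive::CREATE | ZipArchive::OVERWRITE) === true) {
            $zip->addFromString('foo', 'bar');
            $zip->close();
        }
    });
    var_dump($found);
    if ($found) {
        unlink($path);            // clean up after ourselves
    }
}
```

Note that on an affected build the archive is written under a different name, so this cleanup cannot remove it; running the check in a fresh, empty directory avoids that.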

So, I can reproduce the issue with PHP 7.0.6 (both x86 and x64).
While file_put_contents("m\xc3\xbcller", …) creates the file
`mÃ¼ller` as expected, ZipArchive creates the file `müller`.
Apparently, ZipArchive maps filenames from UTF-8 to UTF-16/UCS-2,
while the plain file functions do not. That might be caused by
recent builds of the bundled libzip using the *W*idechar instead
of the *A*nsi variants of the WinAPI.
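
To make the two readings of the same bytes concrete, here is a minimal sketch. It does not call any WinAPI function; it only uses iconv() to show what a wide-char consumer and an ANSI consumer would each make of the string. CP1252 as the ANSI code page is an assumption (the real one depends on the system locale).

```php
<?php
// The bytes PHP holds for "müller" when the string is UTF-8.
$bytes = "m\xc3\xbcller";

// Wide-char (*W) view: decode the bytes as UTF-8, then encode as UTF-16LE.
$wide = iconv('UTF-8', 'UTF-16LE', $bytes);

// ANSI (*A) view: every byte is one character of the active code page.
// CP1252 is assumed here for illustration.
$ansiReading = iconv('CP1252', 'UTF-8', $bytes);

echo bin2hex($wide), "\n";        // 6d00fc006c006c0065007200 (6 characters)
echo $ansiReading, "\n";          // mÃ¼ller (7 characters)
echo bin2hex($ansiReading), "\n"; // 6dc383c2bc6c6c6572
```

The two readings name different files, so a file created through one path is not found when looked up through the other, which matches the `bool(false)` from file_exists() above.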
 [2016-05-19 12:16 UTC] thomas dot kuhn dot berlin at gmail dot com
@cmb: Great that you noticed that mistake and that you can verify the problem. Thank you for your time! Your suggestions sound plausible to me, but I have no real insight and therefore can't contribute much to the discussion. It does seem evident, though, that ZipArchive behaves differently than "the rest" of PHP.
 [2020-03-10 12:05 UTC] cmb@php.net
-Status: Verified +Status: Closed -Assigned To: +Assigned To: cmb
 [2020-03-10 12:05 UTC] cmb@php.net
This issue is supposed to be resolved as of PHP 7.1.0, which added
support for long and UTF-8 paths on Windows[1].

[1] <https://www.php.net/manual/en/migration71.windows-support.php#migration71.windows-support.long-and-utf8-path>
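
For scripts that must run on several PHP versions, a rough guard can be derived from the versions mentioned in this thread. This is a sketch, not an authoritative rule: `likelyAffected()` is a hypothetical helper, and the range "7.0.x on Windows" is an assumption based only on the reports above (not reproducible on 5.5/5.6, reproduced on 7.0.6, supposed to be resolved as of 7.1.0).

```php
<?php
/**
 * Hypothetical helper (not from the report): guess whether a build falls in
 * the range this thread describes, i.e. PHP 7.0.x on Windows.
 */
function likelyAffected(string $phpVersion, string $os): bool
{
    // PHP_OS is "WINNT" on Windows; "Darwin" must not match, hence === 0.
    $onWindows = stripos($os, 'WIN') === 0;

    return $onWindows
        && version_compare($phpVersion, '7.0.0', '>=')
        && version_compare($phpVersion, '7.1.0', '<');
}

var_dump(likelyAffected('7.0.6', 'WINNT'));   // bool(true)
var_dump(likelyAffected('5.5.35', 'WINNT'));  // bool(false)
var_dump(likelyAffected('7.1.0', 'WINNT'));   // bool(false)
var_dump(likelyAffected(PHP_VERSION, PHP_OS));
```

On a build flagged by this check, the workaround mentioned earlier in the thread (using temporary files with ASCII-only names) remains an option.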
 