Bug #72200 ZipArchive::open() creates invalidly encoded filenames
Submitted: 2016-05-11 16:30 UTC Modified: 2020-03-10 12:05 UTC
Votes:10
Avg. Score:5.0 ± 0.0
Reproduced:9 of 9 (100.0%)
Same Version:7 (77.8%)
Same OS:6 (66.7%)
From: thomas dot kuhn dot berlin at gmail dot com Assigned: cmb (profile)
Status: Closed Package: Zip Related
PHP Version: 7.0.6 OS: Windows
Private report: No CVE-ID: None
 [2016-05-11 16:30 UTC] thomas dot kuhn dot berlin at gmail dot com
Description:
------------
New in PHP 7 on Windows: when a filename contains non-US-ASCII characters such as ö, ä, ü or Asian characters, ZipArchive creates the file under a name that PHP's other file functions cannot access, because the encoding is wrong.

testscript output PHP 5.5.35:
	php default_charset: UTF-8
	bool(true)

testscript output PHP 7.0.6:
	php default_charset: UTF-8
	bool(false)

Run on Windows Server 2012 R2 and Windows 7.


Test script:
---------------
print "php default_charset: ".ini_get('default_charset')."\n"; // just 4 info (UTF-8)

$filename = "bugtest_müller-lüdenscheid.zip"; // just an example
$filename = utf8_encode($filename);	// simulating my database delivering utf8-string

$zip = new ZipArchive();
if( $zip->open($filename, ZipArchive::CREATE | ZipArchive::OVERWRITE) === true )
{
	$zip->addFile('bugtest.php', 'bugtest.php'); // copy of script file itself
	$zip->close();
}

var_dump( is_file($filename) );  // delivers ?





 [2016-05-17 12:32 UTC] cmb@php.net
-Status: Open +Status: Feedback
 [2016-05-17 12:32 UTC] cmb@php.net
Using the following slightly modified script, I can't reproduce
the issue with PHP 7.0.6 (VC14 x86 Non Thread Safe (2016-Apr-29
00:38:17)) on Windows 10:

    <?php
    
    $filename = "m\xc3\xbcller";
    $zip = new ZipArchive;
    var_dump($zip->open($filename,
            ZipArchive::CREATE|ZipArchive::OVERWRITE));
    var_dump($zip->addFromString('foo', 'bar'));
    var_dump($zip->close());
    var_dump(file_exists($filename));
    
Output

    bool(true)
    bool(true)
    bool(true)
    bool(true)

What results do you get running this script?
 [2016-05-17 12:40 UTC] thomas dot kuhn dot berlin at gmail dot com
-Status: Feedback +Status: Open
 [2016-05-17 12:40 UTC] thomas dot kuhn dot berlin at gmail dot com
@cmb: you missed the point: When filenames contain !!non-us-ascii-letters!! like !!öäü!! or !!asian chars!! etc., ZipArchive creates filenames in a way that those files are not accessible via PHP because the encoding is wrong. 

And btw, !!adding!! files whose filenames contain such letters to an archive fails as well. I have to copy them to tmp files first :(
 [2016-05-17 13:34 UTC] cmb@php.net
> @cmb: you missed the point: When filenames contain
> !!non-us-ascii-letters!! like !!öäü!! or !!asian chars!! etc.,

Yes, I understood this point. The filename in my script is
actually `müller` encoded as UTF-8. I just wanted to make sure
that we're really dealing with UTF-8, which is not necessarily the
case with your script (if the script file is stored in an encoding
other than ISO-Latin-1, utf8_encode() might produce a wrongly
encoded name). Furthermore, I wanted to make the script more
portable and generic so it could serve as the basis for a .phpt.
And it might be useful to check the return value of $zip->close(),
too.
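
The utf8_encode() concern can be made concrete. The following is only a minimal sketch, assuming the mbstring extension is loaded; mb_convert_encoding() from ISO-8859-1 stands in for utf8_encode(), and the variable names are illustrative, not taken from the report:

```php
<?php
// "müller" written as explicit UTF-8 bytes, so the result does not
// depend on the encoding this source file is saved in.
$utf8 = "m\xc3\xbcller";

// The string is already valid UTF-8, so no conversion is needed.
$isUtf8 = mb_check_encoding($utf8, 'UTF-8');

// Treating the same bytes as ISO-8859-1 and converting them to UTF-8
// (the conversion utf8_encode() performs) encodes both bytes of "ü"
// a second time.
$double = mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1');

var_dump($isUtf8);
echo bin2hex($utf8), "\n";   // bytes of the original name
echo bin2hex($double), "\n"; // bytes after the second encoding pass
```

If the two hex dumps differ, the name passed to ZipArchive::open() is no longer the intended UTF-8 string, which is why the modified script uses byte escapes instead.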
 [2016-05-17 15:57 UTC] thomas dot kuhn dot berlin at gmail dot com
@cmb: Oh, sorry, you are right, I apologize :) I copied and pasted your script and executed it on my system; the output was as follows:

bool(true)
bool(true)
bool(true)
bool(false)

Windows NT 6.1 build 7601 (Windows 7 Professional Edition Service Pack 1) AMD64 
PHP/7.0.6 
Compiler: MSVC14 (Visual C++ 2015)
Architecture: x64
 [2016-05-18 10:56 UTC] cmb@php.net
-Status: Open +Status: Verified -Operating System: Windows Server 2012 R2 & Win7 +Operating System: Windows
 [2016-05-18 10:56 UTC] cmb@php.net
Thanks for testing the script. But actually, I have to apologize,
because I didn't carefully check my test results which were wrong
as I did run the script with PHP 5.6 first, which created the zip
file as expected, and later running with PHP 7.0.6 found the zip
created by the earlier run.
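
A stale file from an earlier run can be ruled out by removing it before the check. This is a rough sketch only, assuming the zip extension is loaded; zipCreatesReachableFile() is a hypothetical helper name, not part of PHP or of the original test script:

```php
<?php
// Hypothetical helper: returns true only if ZipArchive creates an
// archive that PHP's plain file functions can see under the same name.
function zipCreatesReachableFile(string $filename): bool
{
    // Remove leftovers from earlier runs (possibly made by another
    // PHP version), so they cannot cause a false positive.
    if (file_exists($filename)) {
        unlink($filename);
    }
    clearstatcache();

    $zip = new ZipArchive;
    if ($zip->open($filename, ZipArchive::CREATE | ZipArchive::OVERWRITE) !== true) {
        return false;
    }
    if (!$zip->addFromString('foo', 'bar')) {
        return false;
    }
    if (!$zip->close()) {
        return false;
    }

    clearstatcache();
    return file_exists($filename);
}

// Same UTF-8 name as in the script above, placed in the temp directory.
$name = sys_get_temp_dir() . DIRECTORY_SEPARATOR . "m\xc3\xbcller";
var_dump(zipCreatesReachableFile($name));
```

On an affected build, the archive created under the other name is not removed by this sketch and has to be cleaned up by hand.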

So, I can reproduce the issue with PHP 7.0.6 (both x86 and x64).
While file_put_contents("m\xc3\xbcller", …) creates the file
`mÃ¼ller` as expected, ZipArchive creates the file `müller`.
Apparently, ZipArchive maps filenames from UTF-8 to UTF-16/UCS-2,
while the plain file functions do not. That might be caused by
recent builds of the bundled libzip using the *W*idechar instead
of the *A*nsi variants of the WinAPI.
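
The difference between the two code paths can be illustrated without touching the file system. This is a sketch under assumptions: mbstring is loaded, and Windows-1252 is taken as the active ANSI code page, which depends on the system locale. It does not call the WinAPI; it only shows which UTF-16 name each interpretation of the same bytes would produce:

```php
<?php
$bytes = "m\xc3\xbcller"; // "müller" as UTF-8

// Wide-char style: decode the bytes as UTF-8 and pass UTF-16 on.
$wide = mb_convert_encoding($bytes, 'UTF-16LE', 'UTF-8');

// ANSI style: every byte is one character of the active code page
// (Windows-1252 is an assumption here).
$ansi = mb_convert_encoding($bytes, 'UTF-16LE', 'Windows-1252');

echo bin2hex($wide), "\n"; // 6 UTF-16 code units
echo bin2hex($ansi), "\n"; // 7 UTF-16 code units
var_dump($wide === $ansi); // the two names differ
```

Because the resulting names differ, a file created through one path cannot be found through the other, which matches the bool(false) reported above.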
 [2016-05-19 12:16 UTC] thomas dot kuhn dot berlin at gmail dot com
@cmb: Great that you noticed that mistake and that you can verify the problem. Thank you for your time! Your suggestions sound plausible to me, but I have no real insight and therefore can't participate much in discussing the matter. But it seems evident that ZipArchive behaves differently than "the rest" of PHP.
 [2020-03-10 12:05 UTC] cmb@php.net
-Status: Verified +Status: Closed -Assigned To: +Assigned To: cmb
 [2020-03-10 12:05 UTC] cmb@php.net
This issue is supposed to be resolved as of PHP 7.1.0, which added
support for long and UTF-8 paths on Windows[1].

[1] <https://www.php.net/manual/en/migration71.windows-support.php#migration71.windows-support.long-and-utf8-path>
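
For installations that have to stay on an affected version, one possible workaround follows the temporary-file idea mentioned earlier in this thread: build the archive under an ASCII-only temporary name and move it into place with rename(), so that the final name is written by the same file functions that later read it. This is an untested sketch for the affected builds; createZipViaTempName() and its parameters are hypothetical:

```php
<?php
// Hypothetical helper: $entries maps names inside the archive to
// their contents. Returns true on success.
function createZipViaTempName(string $target, array $entries): bool
{
    // tempnam() yields an ASCII-only name in the temp directory.
    $tmp = tempnam(sys_get_temp_dir(), 'zip');
    if ($tmp === false) {
        return false;
    }

    $zip = new ZipArchive;
    if ($zip->open($tmp, ZipArchive::CREATE | ZipArchive::OVERWRITE) !== true) {
        unlink($tmp);
        return false;
    }
    foreach ($entries as $name => $contents) {
        if (!$zip->addFromString($name, $contents)) {
            $zip->close();
            unlink($tmp);
            return false;
        }
    }
    if (!$zip->close()) {
        unlink($tmp);
        return false;
    }

    // Let PHP's plain file functions write the final name.
    if (!rename($tmp, $target)) {
        unlink($tmp);
        return false;
    }
    return true;
}
```

A call such as createZipViaTempName("m\xc3\xbcller.zip", ['foo' => 'bar']) should then leave a file that is_file() finds under the same string, although other programs may display its name differently.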
 
Last updated: Thu Nov 21 15:01:30 2024 UTC