Bug #74619 ZipArchive::ExtractTo cuts off file names
Submitted: 2017-05-21 04:37 UTC Modified: 2018-02-14 16:27 UTC
Votes:5
Avg. Score:4.6 ± 0.5
Reproduced:5 of 5 (100.0%)
Same Version:4 (80.0%)
Same OS:3 (60.0%)
From: megaone at yandex dot ru Assigned:
Status: Open Package: Zip Related
PHP Version: Irrelevant OS: Ubuntu, Debian
Private report: No CVE-ID: None
 [2017-05-21 04:37 UTC] megaone at yandex dot ru
Description:
------------
ExtractTo truncates non-Latin file names, regardless of codepage, up to the first space, so 'имя файла.txt' becomes ' файла.txt'. File names without spaces are truncated to the extension alone: '.txt'. getNameIndex returns the full file names.


Also, ZipArchive relies on default_charset. When default_charset differs from the archive's charset, files are unpacked with broken names.


The problem appears on Ubuntu 12.04 with PHP 7.1.4 and on Debian 9 with PHP 7.0.16. The first problem does not appear on Windows.






 [2017-05-25 15:34 UTC] megaone at yandex dot ru
I've found that basename() does exactly the same, so it seems to be related.
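The basename() connection can be illustrated with a small sketch. basename() is documented as locale-aware, so under a non-UTF-8 LC_CTYPE it may drop the non-ASCII bytes, depending on the PHP version. entryLeafName() below is a hypothetical helper, not part of PHP or the zip extension; it splits on '/' bytes only and therefore does not consult the locale. No expected output is given for basename() because it varies by version and locale.

```php
<?php
// Hypothetical helper (not part of PHP or ext/zip): return the last path
// segment of an entry name by splitting on '/' bytes only, so the result
// does not depend on the current LC_CTYPE locale.
function entryLeafName(string $entryName): string
{
    // Drop trailing slashes so directory entries such as "dir/sub/" work too.
    $trimmed = rtrim($entryName, '/');
    $pos = strrpos($trimmed, '/');
    return $pos === false ? $trimmed : substr($trimmed, $pos + 1);
}

$name = 'docs/имя файла.txt';

// basename() is locale-aware; with the "C" locale the result may lose the
// non-ASCII part of the name, as described in this report.
setlocale(LC_CTYPE, 'C');
var_dump(basename($name));

// The byte-based helper returns the full leaf name regardless of locale.
var_dump(entryLeafName($name));
```

Comparing the two var_dump() lines on an affected system should show whether basename() is the source of the truncation.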
 [2017-12-14 10:16 UTC] 937787082 at qq dot com
I have the same problem. I can get the correct file name with "unzip -O cp936", but the file name is wrong when I use ZipArchive to get the file.
PHP 7.0.23 on Windows
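For the charset case, one possible approach is to read the stored name bytes and convert them explicitly, similar to what "unzip -O cp936" does. This is only a sketch: entryNameToUtf8() and listEntryNamesAsUtf8() are hypothetical helpers, CP936 is assumed as the source encoding because of the comment above, and the code assumes the mbstring extension is loaded and that the zip extension exposes ZipArchive::FL_ENC_RAW.

```php
<?php
// Hypothetical helper: convert a raw entry name from an assumed source
// encoding (CP936 by default, an assumption) to UTF-8. Requires mbstring.
function entryNameToUtf8(string $rawName, string $sourceEncoding = 'CP936'): string
{
    return mb_convert_encoding($rawName, 'UTF-8', $sourceEncoding);
}

// Hypothetical helper: list the entry names of an archive as UTF-8 strings.
// Returns an empty array if the archive cannot be opened.
function listEntryNamesAsUtf8(string $zipPath, string $sourceEncoding = 'CP936'): array
{
    $zip = new ZipArchive;
    if ($zip->open($zipPath) !== true) {
        return [];
    }

    $names = [];
    for ($i = 0; $i < $zip->numFiles; $i++) {
        // FL_ENC_RAW requests the stored bytes without any re-encoding.
        $raw = $zip->getNameIndex($i, ZipArchive::FL_ENC_RAW);
        if ($raw !== false) {
            $names[] = entryNameToUtf8($raw, $sourceEncoding);
        }
    }
    $zip->close();

    return $names;
}
```

The source encoding cannot be detected reliably from the archive itself, so it is passed in as a parameter here.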
 [2018-02-14 16:27 UTC] cmb@php.net
-Package: zip +Package: Zip Related
 [2023-04-05 08:50 UTC] ft1r1l at inf dot elte dot hu
I cannot reproduce the claim that this bug is independent of the codepage. Using Debian Bullseye (PHP 7.4.33):

$ unzip -l zip.zip 
Archive:  zip.zip
  Length      Date    Time    Name
---------  ---------- -----   ----
        0  2023-04-05 10:25   áéíuó.txt
        0  2023-04-05 10:34   тест тест.txt
---------                     -------
        0                     2 files
$ cat zip.php 
<?php
$zip = new ZipArchive;
$zip->open('zip.zip');
$zip->extractTo('output/');


An ASCII locale is indeed broken:

$ LANG=C php zip.php
$ ls output/
 uó.txt  ' тест.txt'


However, a UTF-8 locale works as expected:

$ LANG=en_US.UTF-8 php zip.php
$ ls output/
 áéíuó.txt  'тест тест.txt'
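Based on the result above, a possible workaround is to switch LC_CTYPE to a UTF-8 locale inside the script before calling extractTo(). This is a minimal sketch under assumptions: extractWithUtf8Locale() is a hypothetical helper, and the locale names 'C.UTF-8' and 'en_US.UTF-8' are guesses that may not be installed on a given system (check `locale -a`).

```php
<?php
// Hypothetical helper: extract an archive with LC_CTYPE temporarily set to a
// UTF-8 locale, then restore the previous locale. Returns true on success.
function extractWithUtf8Locale(string $zipPath, string $destination): bool
{
    // Passing "0" returns the current setting without changing it.
    $previous = setlocale(LC_CTYPE, '0');

    // Try the candidates in order; setlocale() returns false if none of them
    // is available, in which case extraction proceeds with the old locale.
    setlocale(LC_CTYPE, 'C.UTF-8', 'en_US.UTF-8');

    $zip = new ZipArchive;
    $ok = $zip->open($zipPath) === true;
    if ($ok) {
        $ok = $zip->extractTo($destination);
        $zip->close();
    }

    // Restore the caller's locale so other code is not affected.
    if ($previous !== false) {
        setlocale(LC_CTYPE, $previous);
    }

    return $ok;
}
```

With the test archive from this comment, the call would be extractWithUtf8Locale('zip.zip', 'output/').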
 