Bug #65815 ZipArchive reads filenames with UTF-8 characters wrong
Submitted: 2013-10-02 15:51 UTC Modified: 2015-05-05 14:55 UTC
Avg. Score:4.0 ± 0.8
Reproduced:6 of 6 (100.0%)
Same Version:1 (16.7%)
Same OS:1 (16.7%)
From: matti dot jarvinen at nitroid dot fi Assigned: cmb (profile)
Status: Closed Package: Zip Related
PHP Version: 5.4.20 OS: Fedora 3.8.6-203.fc18.x86_64
Private report: No CVE-ID: None
 [2013-10-02 15:51 UTC] matti dot jarvinen at nitroid dot fi
I have a valid Zip file, created on Windows 8 with iZarc, that contains filenames like 12-päivä.pdf and 13-päivä.pdf.

ZipArchive reads filenames wrong.

At least getNameIndex and extractTo are affected.

Test script:
<?php
ini_set('default_charset', 'UTF-8');

$Zip = new ZipArchive();

$open = $Zip->open('');

$length = $Zip->numFiles;

for ($i = 0; $i < $length; $i++) {
  // name of the entry at index $i, as returned by ZipArchive
  $brokenImportName = $Zip->getNameIndex($i);

  print $brokenImportName;

  // this is a specific workaround. Some characters are stuck in ASCII apparently
  //$fixedImportName = str_replace(chr(132),'ä',$brokenImportName);

  //print $fixedImportName;
}

Expected result:

Actual result:


 [2013-10-03 10:31 UTC] matti dot jarvinen at nitroid dot fi
If the zip file contains the following files:

test3/Российская Федерация.PDF

ZipArchive will read them as:

test3/Российская Федерация.PDF

Broken file names can be changed to correct UTF-8 characters with:


// correct UTF-8 should hold together through this round trip
if ($filename === mb_convert_encoding(mb_convert_encoding($filename, "UTF-32", "UTF-8"), "UTF-8", "UTF-32")) {
  $fixedFilename = $filename;
} else {
  // otherwise we should convert from the DOS code page
  $fixedFilename = mb_convert_encoding($filename, 'UTF-8', 'CP850');
}


The .ZIP File Format Specification, version 6.3.3, APPENDIX D - Language Encoding (EFS), might hold the answer to reading the file name encoding correctly from the zip file.
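
As far as I read Appendix D, the encoding is signalled by bit 11 of the general purpose bit flag in each file header: when the bit is set, the file name and comment are UTF-8. The following is only a sketch of checking that bit in one central directory file header. parseCentralHeader() is a made-up helper name, not part of the zip extension, and the header in the usage lines is synthetic.

```php
<?php
// Sketch: inspect the language encoding flag (EFS, general purpose bit 11)
// of a single central directory file header.
// parseCentralHeader() is a hypothetical helper, not a ZipArchive API.

const CENTRAL_SIG = "PK\x01\x02"; // signature 0x02014b50, little-endian
const EFS_BIT     = 0x0800;       // bit 11 of the general purpose bit flag
const FIXED_LEN   = 46;           // size of the fixed part of the header

function parseCentralHeader(string $bytes): array
{
    if (strlen($bytes) < FIXED_LEN || substr($bytes, 0, 4) !== CENTRAL_SIG) {
        throw new InvalidArgumentException('not a central directory file header');
    }

    // The flag word sits at offset 8, the file name length at offset 28.
    $flags   = unpack('v', substr($bytes, 8, 2))[1];
    $nameLen = unpack('v', substr($bytes, 28, 2))[1];

    if (strlen($bytes) < FIXED_LEN + $nameLen) {
        throw new InvalidArgumentException('header is truncated');
    }

    return [
        'name' => substr($bytes, FIXED_LEN, $nameLen),  // raw, undecoded bytes
        'utf8' => ($flags & EFS_BIT) !== 0,             // true: name is UTF-8
    ];
}

// Usage with a synthetic header for an entry named "päivä" (UTF-8 bytes).
$name   = "p\xC3\xA4iv\xC3\xA4";
$header = CENTRAL_SIG . pack('vvv', 20, 20, EFS_BIT) . str_repeat("\0", 18)
        . pack('v', strlen($name)) . str_repeat("\0", 16) . $name;

var_dump(parseCentralHeader($header)['utf8']); // bool(true)
```

An archiver that writes DOS code page names would leave the bit clear, in which case 'utf8' comes back false and the name still needs converting.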

If I understood the spec correctly, the code page should be CP437 when the name is not UTF-8, although that encoding is not supported in PHP. I got good results with CP850, but I cannot verify that this workaround holds for every character in CP850 and CP437.
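
The fallback described above could be wrapped in a small function, roughly like this. It is a sketch only: decodeZipEntryName() is a made-up name, it needs the mbstring extension, and it uses mb_check_encoding() instead of the UTF-32 round trip. CP850 is just the code page I tested with, not necessarily the one the archive was written in.

```php
<?php
// Sketch of the workaround: keep names that are already well-formed UTF-8,
// otherwise convert from a DOS code page.
// decodeZipEntryName() is a hypothetical helper, not part of the zip extension.

function decodeZipEntryName(string $raw, string $fallback = 'CP850'): string
{
    // Well-formed UTF-8 (plain ASCII included) is returned unchanged.
    if (mb_check_encoding($raw, 'UTF-8')) {
        return $raw;
    }

    // Assumption: anything else is in the fallback code page. The spec
    // names CP437; CP850 is the default here because mbstring accepts it.
    return mb_convert_encoding($raw, 'UTF-8', $fallback);
}

// 0x84 is "ä" in both CP437 and CP850.
echo decodeZipEntryName("12-p\x84iv\x84.pdf"), "\n"; // 12-päivä.pdf
```

One caveat: a code page byte sequence can by coincidence also be valid UTF-8, so this check is a heuristic and not a replacement for reading the encoding flag from the archive.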
 [2015-05-05 14:55 UTC]
-Status: Open +Status: Closed -Assigned To: +Assigned To: cmb
 [2015-05-05 14:55 UTC]
This issue is supposed to be fixed with libzip 0.11. As of PHP
5.6.0 libzip 0.11.2 or newer is bundled. For older versions PECL
provides up-to-date zip extension packages: