[2016-07-31 14:20 UTC] cmb@php.net
-Status: Open
+Status: Not a bug
-Assigned To:
+Assigned To: cmb
[2016-07-31 14:20 UTC] cmb@php.net
Description:
------------
In PHP 5.3.x, when using fgetcsv to read a Unicode file that starts with a UTF-8 Byte Order Mark (BOM) prefix 0xEF,0xBB,0xBF, the first row of the file is not read correctly. If the BOM is removed, fgetcsv reads the file correctly.

I have tried this with and without setlocale and the result is always wrong.

I have run the same program on PHP 5.2.4 and it works.

The test file is the simplest possible CSV with the BOM prefix: "a" followed by a newline. It contains 7 bytes in total:
0xEF,0xBB,0xBF,0x22,0x61,0x22,0x0A

When processed by fgetcsv, the double quotes should be removed and the value a should be in the returned array.

Test script:
---------------
<?php
echo mb_detect_encoding(file_get_contents($argv[1]))."\n";
setlocale(LC_CTYPE, 'en_GB.utf8');
$handle = fopen($argv[1], "r");
$data = fgetcsv($handle, 1000, ",");
print_r($data);
?>

Expected result:
----------------
UTF-8
Array
(
    [0] => a
)

Actual result:
--------------
UTF-8
Array
(
    [0] => "a"
)
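
Since the report was closed as "Not a bug", a caller who wants the expected result presumably has to skip the BOM before handing the stream to fgetcsv. The following is a minimal sketch of one common workaround, not something stated in this report: the helper name open_csv_skipping_bom is hypothetical, and the temporary file simply recreates the 7-byte test file described above. It assumes PHP 7 or later because of the scalar type hint.

```php
<?php
// Hypothetical helper (not part of PHP): open a file for reading and
// position the stream just past a leading UTF-8 BOM, if one is present.
function open_csv_skipping_bom(string $path)
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        return false;
    }
    // Read the first three bytes; if they are not the BOM, go back to the start.
    if (fread($handle, 3) !== "\xEF\xBB\xBF") {
        rewind($handle);
    }
    return $handle;
}

// Recreate the test file from the report: BOM, "a", newline (7 bytes).
$path = tempnam(sys_get_temp_dir(), 'bom');
file_put_contents($path, "\xEF\xBB\xBF\"a\"\n");

$handle = open_csv_skipping_bom($path);
// Delimiter, enclosure and escape are passed explicitly to avoid relying on defaults.
$data = fgetcsv($handle, 1000, ",", "\"", "\\");
fclose($handle);
unlink($path);

print_r($data);
```

With the BOM skipped, the first field starts with the enclosure character, so fgetcsv can strip the double quotes as the report expects. A file without a BOM is read from the beginning because of the rewind call.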