[2005-01-28 12:52 UTC] arjan at avoid dot org
Description:
------------
fgetcsv in PHP 5.0.3 has problems reading CSV fields that start with an umlaut character (and possibly other 'weird' non-ASCII characters as well): it simply skips those characters.
PHP 4.3.10 works fine.
Reproduce code:
---------------
csv_test.php:
<?php
$fp = fopen('csv_test.csv', 'r');
while($data = fgetcsv($fp, 2000, ';', '"')) {
var_dump($data);
}
fclose($fp);
?>
csv_test.csv:
language_name;country_name
Deutsch;?sterreich
Nederlands;Nederland
Deutsch;Deut?land
?nited Kingdom
Expected result:
----------------
array(2) {
[0]=>
string(13) "language_name"
[1]=>
string(12) "country_name"
}
array(2) {
[0]=>
string(7) "Deutsch"
[1]=>
string(9) "?sterreich"
}
array(2) {
[0]=>
string(10) "Nederlands"
[1]=>
string(9) "Nederland"
}
array(2) {
[0]=>
string(7) "Deutsch"
[1]=>
string(9) "Deut?land"
}
array(1) {
[0]=>
string(13) "?nited Kingdom"
}
Actual result:
--------------
array(2) {
[0]=>
string(13) "language_name"
[1]=>
string(12) "country_name"
}
array(2) {
[0]=>
string(7) "Deutsch"
[1]=>
string(9) "sterreich"
}
array(2) {
[0]=>
string(10) "Nederlands"
[1]=>
string(9) "Nederland"
}
array(2) {
[0]=>
string(7) "Deutsch"
[1]=>
string(9) "Deut?land"
}
array(1) {
[0]=>
string(13) "nited Kingdom"
}
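To check a given installation without depending on how the sample file was saved, the comparison above can be scripted. The following is only a sketch: the helper name `find_dropped_fields` is hypothetical, ISO-8859-1 bytes are an assumption about the original file, and the `explode()` comparison is only valid for lines without quoted fields.

```php
<?php
// Hypothetical helper (not part of the original report): returns the fields
// for which fgetcsv() disagrees with a plain explode() of the same line.
// Only meaningful for lines that contain no quoted fields.
function find_dropped_fields($line, $delimiter = ';')
{
    // Write the line to a temporary file so fgetcsv() can read it.
    $path = tempnam(sys_get_temp_dir(), 'csv');
    file_put_contents($path, $line . "\n");

    $fp = fopen($path, 'r');
    $parsed = fgetcsv($fp, 2000, $delimiter, '"');
    fclose($fp);
    unlink($path);

    $diff = array();
    foreach (explode($delimiter, $line) as $i => $field) {
        if (!is_array($parsed) || !isset($parsed[$i]) || $parsed[$i] !== $field) {
            $diff[$i] = $field; // fgetcsv() lost or altered this field
        }
    }
    return $diff;
}

// "\xD6" is O-umlaut in ISO-8859-1 (assumed encoding of the test file).
var_dump(find_dropped_fields("Deutsch;\xD6sterreich"));
// Plain ASCII control line: no differences are expected here.
var_dump(find_dropped_fields("Nederlands;Nederland"));
```

On an affected system the first call should list the field with the leading umlaut; an empty array means fgetcsv returned the bytes unchanged.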
In order to narrow the problem down as much as I can, I also tried the following script on the system that has problems with fgetcsv:
<?php
$fp = fopen('csv_test.csv', 'r');
while (!feof($fp)) {
$buffer = fgets($fp, 4096);
echo $buffer;
}
fclose($fp);
?>
In this case, the umlauts do get read and printed.

Again: the first-umlaut-vanishes problem (PHP version 5.2.6).
Sorry, but for many non-English web developers this "solution" is not helpful. Those of us who are not system administrators cannot influence the settings of PHP or Apache. In my environment "Safe mode" is switched on, which prevents the use of putenv. And I think it is an illusion to expect the big providers to switch safe mode off (even if this feature is DEPRECATED). Setting $_ENV['LANG'] = 'en_US' does not fix the problem (nor does setting de_DE). In any case, the environment variable LANG is not set (checked with getenv, $_ENV and phpinfo).
I also think this problem has nothing to do with the encoding: other "inline" umlauts are preserved as expected. If the data field is enclosed in quotes, the first umlaut after the opening quote (that is, the second character) survives.
So I now live with the ugly workaround of placing a magic sequence ~~ before every first umlaut in the CSV file with:
preg_replace ("/(\t)([\x80-\xFF])/", "\t~~$2", $import);
and removing these sequences after fgetcsv, while reading the field array, with:
foreach ( $columns as $col ) { $col = trim ($col, '~~'); …
Max.
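The marker workaround described in the last comment could be packaged roughly as follows. This is a sketch under stated assumptions, not the commenter's actual code: the function names are hypothetical, the input is assumed to be in a single-byte encoding, and no real field is assumed to start with `~~`. Unlike the regex in the comment, it also covers a field at the start of a line, and it removes the marker only from the front of a field rather than trimming tildes from both ends.

```php
<?php
// Hypothetical helpers illustrating the "~~" marker workaround.

// Insert "~~" before any byte in \x80-\xFF that starts a line or follows a delimiter.
function protect_leading_high_bytes($raw, $delimiter = "\t")
{
    $d = preg_quote($delimiter, '/');
    return preg_replace('/(^|' . $d . ')([\x80-\xFF])/m', '$1~~$2', $raw);
}

// Remove the marker from the front of a field only.
function strip_marker($field)
{
    return strncmp($field, '~~', 2) === 0 ? substr($field, 2) : $field;
}

// Parse raw CSV text with fgetcsv() after protecting leading high bytes.
function parse_protected_csv($raw, $delimiter = "\t")
{
    $fp = tmpfile(); // temporary stream so fgetcsv() has a handle to read from
    fwrite($fp, protect_leading_high_bytes($raw, $delimiter));
    rewind($fp);

    $rows = array();
    while (($columns = fgetcsv($fp, 4096, $delimiter, '"')) !== false) {
        if ($columns === array(null)) {
            continue; // fgetcsv() reports a blank line as array(null)
        }
        $rows[] = array_map('strip_marker', $columns);
    }
    fclose($fp);
    return $rows;
}

// "\xD6" and "\xDC" are ISO-8859-1 bytes, chosen here only as sample data.
$rows = parse_protected_csv("Deutsch\t\xD6sterreich\n\xDCnited Kingdom\n");
var_dump($rows);
```

Because the marker is plain ASCII, the first byte fgetcsv sees in each field is never a high byte, which is the condition the comments above identify as triggering the loss.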