Bug #41157 nl_langinfo() implementation conflicts with setlocale()
Submitted: 2007-04-21 21:13 UTC Modified: 2007-04-23 09:11 UTC
From: jo at feuersee dot de Assigned:
Status: Not a bug Package: I18N and L10N related
PHP Version: 5.2.1 OS: Linux
Private report: No CVE-ID: None
 [2007-04-21 21:13 UTC] jo at feuersee dot de
nl_langinfo() seems to expect that the locale has been set in the format
ll_CC (ll being the ISO 639-1 language code and CC the ISO 3166-1 alpha-2 country code). All other locales (such as a bare language like 'de', or one with an explicit encoding like 'ja_JP.UTF-8') produce output that looks like the 'C' locale.

This leaves nl_langinfo() unusable, since for languages with multiple possible encodings it is necessary to explicitly set the encoding in setlocale() (e.g. to get the strftime() format strings in a usable, defined encoding).

I compared the PHP results with the plain C API results and the restrictions do not appear in the C version. Thus I say the PHP implementation of nl_langinfo() is buggy.

To compare test4 with the C equivalent, here is the code:

#include <stdio.h>
#include <locale.h>
#include <langinfo.h>

// char buffer[1024];
char *buffer;

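int main (void)
{
	/* setlocale() returns NULL if the requested locale is not available. */
	buffer = setlocale(LC_ALL, "ja_JP.UTF-8");
	printf("Locale: %s\n", buffer ? buffer : "(setlocale failed)");

	printf("%s\n", nl_langinfo(D_T_FMT));
	return 0;
}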

Reproduce code:
(All examples are supposed to be typed into the shell):

test1 ~> php -r 'setlocale(LC_ALL, 'C'); printf("%d: %s\n", D_T_FMT, nl_langinfo('D_T_FMT'));'

test2 ~> php -r 'setlocale(LC_ALL, 'ja'); printf("%d: %s\n", D_T_FMT, nl_langinfo('D_T_FMT'));'

test3 ~> php -r 'setlocale(LC_ALL, 'ja_JP'); printf("%d: %s\n", D_T_FMT, nl_langinfo('D_T_FMT'));'

test4 ~> php -r 'setlocale(LC_ALL, 'ja_JP.UTF-8'); printf("%d: %s\n", D_T_FMT, nl_langinfo(D_T_FMT));'

Expected result:
test1:
131112: %a %b %e %H:%M:%S %Y

test2:
131112: %a %b %e %H:%M:%S %Y

test3: (output of C code)
131112: %Y&#495;%m&#65533;d&#65533; %H%M&#684;%S&#65533;

test4: (C code gives proper UTF-8 output)
%Y年%m月%d日 %H時%M分%S秒

Actual result:
test1: passed

test2: fallback to locale C (passed)

test3: (output is in undefined encoding like C code, passed)
131112: %Y&#495;%m&#65533;d&#65533; %H%M&#684;%S&#65533;

test4: (fallback to C locale, _not_ passed):
131112: %a %b %e %H:%M:%S %Y


 [2007-04-23 09:11 UTC]
Change the inner single quotes to double quotes; inside a single-quoted shell string they terminate the string instead of quoting the locale name.

Instead of:

php -r 'setlocale(LC_ALL, 'ja_JP.UTF-8'); printf("%d: %s\n", D_T_FMT, nl_langinfo(D_T_FMT));'

use:

php -r 'setlocale(LC_ALL, "ja_JP.UTF-8"); printf("%d: %s\n", D_T_FMT, nl_langinfo(D_T_FMT));'

And enable display_errors & error_reporting when debugging.
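
To see what PHP actually receives in each case, the code string can be echoed with printf instead of being passed to php -r. This is only a sketch of the shell's quoting behaviour: the variable names broken and fixed are made up for illustration, and no PHP or locale data is involved.

```shell
# Show the exact code string the shell would hand to `php -r`.
# printf '%s\n' simply echoes its argument back.

# Inner single quotes close and reopen the outer string, so they vanish:
broken=$(printf '%s\n' 'setlocale(LC_ALL, 'ja_JP.UTF-8');')

# Inner double quotes are ordinary characters inside single quotes:
fixed=$(printf '%s\n' 'setlocale(LC_ALL, "ja_JP.UTF-8");')

printf 'broken: %s\n' "$broken"
printf 'fixed:  %s\n' "$fixed"
```

In the broken form the locale name reaches PHP as the bare expression ja_JP.UTF-8 rather than as a string. A bare ja_JP on its own is presumably still treated as the string 'ja_JP' by PHP 5 (with a notice), which would explain why test3 worked while test4 fell back to the C locale; with display_errors enabled the notices would have made this visible.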
