php.net |  support |  documentation |  report a bug |  advanced search |  search howto |  statistics |  random bug |  login
Bug #53251 bindtextdomain with null directory doesn't return the previously set
Submitted: 2010-11-07 12:09 UTC Modified: 2010-11-26 21:03 UTC
Votes:6
Avg. Score:4.5 ± 0.5
Reproduced:5 of 5 (100.0%)
Same Version:3 (60.0%)
Same OS:0 (0.0%)
From: jeanseb at au-fil-du dot net Assigned: pajoye
Status: Assigned Package: Gettext related
PHP Version: 5.3.3 OS: Debian 5.0.6
Private report: No CVE-ID:
Have you experienced this issue?
Rate the importance of this bug to you:

 [2010-11-07 12:09 UTC] jeanseb at au-fil-du dot net
Description:
------------
man bindtextdomain : 
If dirname is NULL, the function returns the previously set base directory for domain domainname.

In PHP we are returning the CWD.

I have attached a patch with 2 tests. 

ext/gettext/tests/gettext_bindtextdomain-nulldir-alreadyset.phpt
=> PASS with my patch
ext/gettext/tests/gettext_bindtextdomain-nulldir.phpt
=> Fail, my patch introduce a BC. I'm not sure we want this but I don't see any workarround.

Test script:
---------------
<?php

echo 'getcwd() : ' . getcwd() . PHP_EOL;

echo 'bindtextdomain("messages", "./locale") : ' . bindtextdomain("messages", "./locale") . PHP_EOL;

echo 'bindtextdomain("messages",NULL) : ' . bindtextdomain("messages",NULL) . PHP_EOL;

Expected result:
----------------
getcwd() : /home/jeanseb
bindtextdomain("messages", "./locale") : /home/jeanseb/locale
bindtextdomain("messages",NULL) : /home/jeanseb/locale


Actual result:
--------------
getcwd() : /home/jeanseb
bindtextdomain("messages", "./locale") : /home/jeanseb/locale
bindtextdomain("messages",NULL) : /home/jeanseb

Patches

fix_broken_bindtextdomain (last revision 2010-11-26 17:09 UTC) by greno at verizon dot net)
Test_if_NULL_or_empty (last revision 2010-11-24 22:38 UTC) by pajoye@php.net)
gettext.patch (last revision 2010-11-07 11:49 UTC) by jeanseb at au-fil-du dot net)

Add a Patch

Pull Requests

Add a Pull Request

History

AllCommentsChangesGit/SVN commitsRelated reports
 [2010-11-07 15:12 UTC] greno at verizon dot net
The previous expected PHP results shown are not correct for 'bindtextdomain'.

The expected results should be:
getcwd() : /home/jeanseb
bindtextdomain("messages", "./locale") : ./locale
bindtextdomain("messages",NULL) : ./locale

When I test using python 'bindtextdomain' I get the correct result:
# cat test.py
#!/usr/bin/env python
from gettext import *

print bindtextdomain("messages", "./locale");
print bindtextdomain("messages",None);

# python test.py
./locale
./locale

Relative base directories are allowed.
 [2010-11-07 15:52 UTC] greno at verizon dot net
And to clarify regarding the case where no previous bindtextdomain call was made prior to the bindtextdomain call with a NULL directory argument.  

In this case bindtextdomain should return the default locale directory for the system.  In the case of Linux this is '/usr/share/locale'.

I tested this again using python and it produces the correct result:
# cat test2.py
#!/usr/bin/env python

from gettext import *

### no previous bindtextdomain call in effect
print bindtextdomain("messages",None);

# python test2.py
/usr/share/locale
 [2010-11-18 21:59 UTC] greno at verizon dot net
Please fix this bug for 5.2.15 as well as 5.3.3.

This flaw prevents being able to query the existing directory setting for the domain and in turn this destroys any ability to nest usage of different translation stores.


.
 [2010-11-18 22:31 UTC] jeanseb at au-fil-du dot net
I haven't svn access yet and I prefer that someone with more background review this patch because it introduce an bc break.

For the relative path, I don't see any benefits to change this behaviour. Can you produce a real use case. Thank's
 [2010-11-18 23:10 UTC] greno at verizon dot net
It's not whether there is a use case or not, that's how bindtextdomain is defined to work.

The PHP implementation needs to work exactly like the GNU implementation and other implementations in other languages such as python for example, and right now it doesn't.  

That's why I pointed out the fact that relative directories are valid and in the case where a call is made with a NULL directory argument but no previous setting was made, that the correct behavior is to return the default locale directory for the system.
 [2010-11-24 00:28 UTC] pajoye@php.net
hi,

About the relative directories, what's about making them absolute before passing them to gettext? The problem here is that the script CWD may not be the library (apache or whatever else) CWD, that's the typical usage of the virtual CWD.

Looking at the current implementation:

- has '0' any special meaning for bindtextdomain?
  if not and if it is about testing empty string, this code is wrong and ugly :)

- If an empty string is given, passing VCWD_GETCWD results should be the right thing to do

- the current code has a leak, dir_name should be freed if VCWD_GETCWD called succeeded

About including it in 5.3.4, it is too late. However trunk is open, so please try to agree on the behavior(s) and update the patch accordingly.
 [2010-11-24 05:50 UTC] greno at verizon dot net
The current working directory (CWD) has nothing to do with the behavior of bindtextdomain except in the sense of a point from which relative paths will start.  Any relative path must be stored as relative so that if the CWD does change that the bind is then relative to the new working directory.

The behavior must be exactly like I pointed out previously.  Please read through my previous posts carefully and try the examples shown from python.  Or for that matter run a test using straight GNU bindtextdomain and you will see the proper behavior is as I have described.

And this needs to get into 5.2.15 and 5.3.4 as soon as possible.


.
 [2010-11-24 09:48 UTC] pajoye@php.net
I read your last comments and the behavior was not clear.

It is still not clear to me that PHP's binding should map one to one to gettext behavior for this exact feature. It is exactly the opposite of what other PHP APIs do and will create confusions to our users.

Also you misunderstood my point. In apache or other TS SAPIs the gettext library has no way to know the script CWD. But they will get the server CWD which is most likely either not readable (permission) or really not what the users expect to be used.

As of getting this change in 5.3.4 or 5.2.15, it is too late. We are already in RC phases.
 [2010-11-24 17:23 UTC] greno at verizon dot net
Fixing the broken bindtextdomain behavior is not going to create any confusion for any user.  

In fact, right now I can simulate the correct behavior by having an app store/restore the domain directory mapping from a separate tracking mechanism and all of gettext including bindtextdomain in PHP then works perfectly as it should.  

The issue is that most users would be expecting bindtextdomain to work correctly so that they could retrieve the current directory mapping using a NULL argument without having to resort to creating their own separate tracking mechanisms for keeping track of that state.

When I create a small PHP file holding something like:
bindtextdomain("messages", "./locale");
textdomain("messages");

And then I import that file into my files such as:
/index.php
/subapp1/index.php
/subapp2/index.php

It means that each of the directories ( /, /subapp1, /subapp2 ) are expected to have their own 'locale' directories.  This way all translations related to /subapp1 are stored in /subapp1/locale/...,  those for /subapp2 are stored in /subapp2/locale/...

Because the binding is relative you do not need to be resetting and manipulating the domain directory binding all the time to achieve translations using a local locale directory.

There should be no difference between the PHP implementation of bindtextdomain and the GNU implementation(C) which is also the same as is found in Java, Python, and other languages.  If you have any doubt go test these languages.

// HERE IS THE BEHAVIOR THAT SHOULD OCCUR:
// no mapping in effect yet
bindtextdomain("messages", NULL); // should return default system locale directory, eg: "/usr/share/locale" on Linux
// now we create a relative mapping
bindtextdomain("messages", "./locale"); // set a relative domain directory mapping
bindtextdomain("messages", NULL); // should return "./locale"
// now we create an absolute mapping
bindtextdomain("messages", "/absolutepath/locale"); // set an absolute domain directory mapping
bindtextdomain("messages", NULL); // should return "/absolutepath/locale"

And this is exactly the behavior of the other languages and of the GNU (C) implementation.  And PHP needs needs to fix its broken behavior.


As far as getting this fix in the RC's for 5.2.15 and 5.3.4.  This change fixes something that is badly broken and it warrants an exception.


.
 [2010-11-24 17:43 UTC] greno at verizon dot net
In addition, consider the case where you need to temporarily remap the domain directory to another translation store.

// WORKING WITH MULTIPLE TRANSLATION STORES:
bindtextdomain("messages", "./locale");
textdomain("messages");
...
// work with local translation store
...
// now need to get some translations from a different translation store
savedmapping = bindtextdomain("messages", NULL);
bindtextdomain("messages", "/pathtodifferentstore/locale");
...
// temporarily work with different translation store
...
// now we need to restore previous mapping
bindtextdomain("messages", savedmapping);
...
// once again working with previous translation store
...

Right now with PHP you cannot work with any nested heirarchy of multiple translation stores because you have no idea what is the current domain directory mapping so you cannot save it and restore it later.


.
 [2010-11-24 17:47 UTC] pajoye@php.net
For my own sake, please take 2.5 min to understand my comment.

Questions:

- relative directories, relative to what? CWD? Then please double read my comment, check how PHP works (or any other languages working as module of a server)

At any point after the call, the CWD can be different. The CWD inside the gettext library (read: inside the gettext library) is not (read: is not) equal to the actual script CWD, which is a virtual CWD based on the directory where the script, unless the script changes using the chdir() PHP function.

I perfectly understand what you said about how it should work, but that does not help to resolve the actual problem. How in the world do you know which CWD will be used by the library?
 [2010-11-24 20:32 UTC] greno at verizon dot net
Yes, I read and perfectly understood your comment.

There is no problem with CWD when using gettext.  I've tested this many times.  

I prepared a little app that demonstrates the bindtextdomain problem.  There is a README.txt that explains in the zip.

I didn't see any way to attach a file to the bug so I'm taking the liberty of attaching it as a Patch but it is not really a patch.  It demonstrates the problem of multiple translation stores with the broken bindtextdomain.
 [2010-11-24 20:44 UTC] greno at verizon dot net
The patch upload apparently won't take a zip.

The PHP bindtextdomain brokenness demonstration app can be found here:  http://greno-misc.googlecode.com/files/testbindtextdomain.zip


.
 [2010-11-24 23:23 UTC] pajoye@php.net
Using your script: it happens exactly what I explained earlier. The library does not find the locale because its CWD is not the same as the script CWD (which is a virtual CWD, created by the engine and specific to the engine).

I suppose you use Apache in a non thread safe mode, that won't show you the effect I was referring to as it happens only with threaded SAPI.
 [2010-11-24 23:38 UTC] pajoye@php.net
The following patch has been added/updated:

Patch Name: Test_if_NULL_or_empty
Revision:   1290638321
URL:        http://bugs.php.net/patch-display.php?bug=53251&patch=Test_if_NULL_or_empty&revision=1290638321
 [2010-11-24 23:39 UTC] pajoye@php.net
-Status: Open +Status: Feedback
 [2010-11-24 23:39 UTC] pajoye@php.net
Besides our little difference about virtual CWD, please try the attached patch against 5.3. It checks if the given directory is empty or NULL (dir length == 0) and if yes, calls bindtextdomain with NULL.

Is it what you expect?
 [2010-11-24 23:43 UTC] greno at verizon dot net
Gettext is not thread-safe.

You cannot run PHP gettext in anything but a non-threaded CGI or prefork mode.

We run our webservers prefork to support PHP gettext.  

And for any production webserver it is best to run either CGI or prefork so that your connections are completely isolated.

And PHP gettext performs correctly when used w/CGI or preforking webservers.


.
 [2010-11-24 23:44 UTC] pajoye@php.net
It is TS on windows now (our builds). But the key part is in my question in the 2nd comment. Questions like to be answered from time to time you know :)
 [2010-11-24 23:51 UTC] greno at verizon dot net
I looked at the patch earlier.

It does not do what is necessary.

When the user calls bindtextdomain(domain, NULL) it needs to do one of two things.  If there was no previous bindtextdomain call made then it should return the default locale directory for the system.  Otherwise if bindtextdomain was called previously then it should return that setting.  

There are about 3-4 places above where I've outlined the correct behavior in detail.  Can you please tailor the behavior to match those?  Thanks.

And who cares about dir length = 0?  What does that have to do with returning the correct setting?  You do not even need to be looking at any directory length.  If you see a NULL there you go, that's the key that tells you to return existing setting.


.
 [2010-11-25 00:00 UTC] pajoye@php.net
If you read the code you will see that dir_len == 0  happens when you pass NULL or "" to the function.

We don't do any detection or whatever else in there. This is a pure binding, there is no logic in the PHP extension and we only rely on the underlying library to do the right thing with what the users provide. The only thing that we did not do was not to pass NULL when the user actually gives us NULL (or empty string, can be improved later to actually detect both separately).

Now, reading the bindtextdomain API reference:

"Note: Applications that wish to use chdir() should always use absolute pathnames to avoid misadvertently selecting the wrong or non-existant directory."

That pretty much confirms what I've been saying earlier.

http://refspecs.freestandards.org/LSB_3.1.1/LSB-Core-generic/LSB-Core-generic/baselib-bindtextdomain.html
 [2010-11-25 00:15 UTC] greno at verizon dot net
That warning is just to make sure that people understand that when you use relative domain dir setting that if you chdir() then the domain dir is going to follow as well since it is relative.  I mean it shouldn't even be necessary to give such a warning.  Only total newbies would not understand how relative and absolute would behave.


In many cases using relative setting is exactly what you want because you have a bunch of standalone apps with their own translation stores and you are just integrating as-is.  So relative mapping works very well in this case.

If you are building one humongous app and you want all translations from everywhere to be in one translation store then absolute mapping makes sense in that case.

The are good and valid use cases for both types of bindings.

bindtextdomain(domain, NULL) just needs to return whatever setting you gave to it previously.  If it was relative then you return the exact same relative setting (eg: ./locale ) and if it was absolute then you return the exact same absolute setting (eg: /absolutepath/locale ).




.
 [2010-11-25 00:44 UTC] pajoye@php.net
"That warning is just to make sure that people understand that when you use relative domain dir setting that if you chdir() then the domain dir is going to follow as well since it is relative"

It is not "just" a warning, it is exactly what I explained, php uses chdir before you even get the hand (pls keep in mind TS and NTS sapi).

"bindtextdomain(domain, NULL) just needs to return whatever setting you gave to it previously."

That's where it could be hard. However I could modify the patch (mines, not the other) to:

1. make a path absolute only when a TS SAPI is used (php builds in TS mode)
2. pass NULL when NULL is given, empty string may be considered as '.', for BC reasons

Doing so will create no wtf for the users and it will do what you are looking for.
 [2010-11-25 01:17 UTC] greno at verizon dot net
If it can pass the following tests on Linux then it should be Ok:

http://greno-misc.googlecode.com/files/testbindtextdomain.php


.
 [2010-11-25 01:19 UTC] pajoye@php.net
It would be very helpful if you can write it as phpt :)

See http://qa.php.net/write-test.php
 [2010-11-25 01:31 UTC] greno at verizon dot net
How do you run a phpt test from the command line?

.
 [2010-11-25 01:34 UTC] pajoye@php.net
make test TESTS=/path/to/php/ext/gettext/tests

from your build directory.

or you can run run-test.php manually, see the howto links for the details or php run-tests.php --help
 [2010-11-25 02:27 UTC] greno at verizon dot net
I think these phpt tests will work.  I ran them from the command line and they ran fine.  Had no idea what to name them so you may have to adjust.

http://greno-misc.googlecode.com/files/bindtextdomaintests.zip


.
 [2010-11-25 02:41 UTC] greno at verizon dot net
Found that Expect does not need quotes around strings.  Ignore previous tests.

Reposted updated tests:

http://greno-misc.googlecode.com/files/bindtextdomaintests-2.zip


.
 [2010-11-25 15:59 UTC] greno at verizon dot net
One important thing to note about the tests:

It is not necessary for the domain directory string set in bindtextdomain to actually exist in the file system at the time of binding.

A script could be dynamically creating this directory later on.  

The binding is to nothing more than a string that represents a potential directory path.  

It is up to the user to make sure that the string bound by bindtextdomain actually represents a valid path at time of execution of gettext commands.

I've tested this behavior in other languages and that is how they work.


.
 [2010-11-26 16:50 UTC] greno at verizon dot net
Looking at bindtextdomain gettext.c in the current 5_3 tree:

PHP_NAMED_FUNCTION(zif_bindtextdomain)
{
        char *domain, *dir;
        int domain_len, dir_len;
        char *retval, dir_name[MAXPATHLEN];

        if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "ss", &domain, &domain_len, &dir, &dir_len) == FAILURE) {
                return;
        }

        PHP_GETTEXT_DOMAIN_LENGTH_CHECK

        if (domain[0] == '\0') {
                php_error(E_WARNING, "The first parameter of bindtextdomain must not be empty");
                RETURN_FALSE;
        }

        if (dir[0] != '\0' && strcmp(dir, "0")) {
                if (!VCWD_REALPATH(dir, dir_name)) {
                        RETURN_FALSE;
                }
        } else if (!VCWD_GETCWD(dir_name, MAXPATHLEN)) {
                RETURN_FALSE;
        }

        retval = bindtextdomain(domain, dir_name);

        RETURN_STRING(retval, 1);
}

==================================================
This logic in that function makes no sense with respect to bindtextdomain.

The function should ONLY need to do this:

PHP_NAMED_FUNCTION(zif_bindtextdomain)
{
        char *domain, *dir;
        int domain_len, dir_len;
        char *retval, dir_name[MAXPATHLEN];

        if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "ss", &domain, &domain_len, &dir, &dir_len) == FAILURE) {
                return;
        }

        PHP_GETTEXT_DOMAIN_LENGTH_CHECK

        if (domain[0] == '\0') {
                php_error(E_WARNING, "The first parameter of bindtextdomain must not be empty");
                RETURN_FALSE;
        }

        retval = bindtextdomain(domain, dir_name);

        RETURN_STRING(retval, 1);
}

==================================================

All that should be necessary is to check that the first argument (domain) is not NULL and then just return the result of calling the underlying bindtextdomain.  Nothing else needs to be checked.  

All that is happening is that a string representing a potential directory path is being bound to a domain name OR a NULL directory arg is passed in which case the underlying bindtextdomain will return the previous setting.


Can this change PLEASE be made asap and put into 5.2.15 and 5.3.3?
It needs to get into 5.2.15 because that is the last rev from my understanding for the 5.2 releases and as this is currently being used almost everywhere right now it is important that the 5.2 release series have a fix for the broken bindtextdomain function.  The 5.3.3 is not as important but certainly would be nice to get this fix out quickly.


.
 [2010-11-26 18:17 UTC] pajoye@php.net
That's exaclty what I said in my comment earlier this week.
 [2010-11-26 18:17 UTC] pajoye@php.net
-Status: Feedback +Status: Assigned -Assigned To: +Assigned To: pajoye
 [2010-11-26 20:37 UTC] greno at verizon dot net
I've attached a patch (fix_broken_bindtextdomain) to the bug which should correct the bindtextdomain problem.  

And I prepared phpt tests (see my second set of phpt tests zip file in a previous posting) that can test the bindtextdomain NULL arg functionality.  

So everything is here and available to fix bindtextdomain in PHP 5.2.15 and 5.3.3.


.
 [2010-11-26 20:43 UTC] pajoye@php.net
Thanks for the patch. But it is sadly not correct. We have to get the realpath of the directory argument in TS mode.
 [2010-11-26 20:51 UTC] greno at verizon dot net
Gettext is not thread-safe.  You shouldn't need to worry about trying to get gettext working in threads.  The underlying gettext library would first have to be made thread-safe and no longer able to be driven by environment variables.

Gettext can be driven by environment variables and unless you have a way to set up completely different environments for each thread, then there is never going to be a way to reliably use gettext in threads.  One thread could set an env var to one value and another thread in the same process set the env var to a different value and both of these threads would see the value set by the latest thread.  And gettext would behave according to the latest setting which would totally screw up the earlier thread.


.
 [2010-11-26 21:03 UTC] pajoye@php.net
I really appreciate your feedback but it tends to be a monologue. You totally ignore any comment you disagree with or don't care about.

I explained already many times why and when we should care about that. I even pointed to the gettext documentation telling that. So let say that unless you have something new to say, the problem is identified and will be fixed asap.
 [2010-11-26 23:12 UTC] greno at verizon dot net
Please, I do read and consider all of your comments.

And I understand the concern about the TS build.  

I was merely trying to convey that I have not seen or read about any evidence that is convincing that using gettext in threads would be successful given the current state of the underlying GNU gettext library.

I based this on things like this:

Checking latest gettext manual here:
http://www.gnu.org/software/gettext/manual/gettext.html#PHP

It shows that PHP 'uses' not 'emulates' the GNU underlying gettext library.  The GNU library is not thread-safe.

Checking PHP gettext in the manual:
http://www.php.net/manual/en/gettext.requirements.php

It shows a comment stating that PHP gettext is not thread-safe and even gives an example of one of the environment variables that can drive the underlying GNU gettext library.  The comment supports other places of information that basically assert the same thing.



I certainly agree, if you need to put some type of ifdef TSMODE or whatever in the patch OK, please by all means put it in there.  That only would apply to the threaded model so it would be a NOP to those using the non-threaded model.  If you need to force the fact that the bound directory must exist for whatever reason in the threaded model, OK, that's some limitation of the threaded model. It deviates from the defined GNU bindtextdomain behavior but to get a threaded model working maybe that's a necessary tradeoff.


My only goal is to get a patch in PHP 5.2 and 5.3 asap to fix the current brokenness of bindtextdomain.


.
 [2011-09-26 23:55 UTC] tyrael@php.net
what is missing here to move forward with the fix?

Tyrael
 
PHP Copyright © 2001-2017 The PHP Group
All rights reserved.
Last updated: Tue Aug 29 15:01:52 2017 UTC