Doc Bug #62966 Random numbers prediction
Submitted: 2012-08-29 12:30 UTC Modified: 2014-10-26 15:57 UTC
From: ymaryshev at ptsecurity dot ru Assigned: salathe (profile)
Status: Closed Package: *General Issues
PHP Version: Irrelevant OS: All
Private report: No CVE-ID: None
 [2012-08-29 12:30 UTC] ymaryshev at ptsecurity dot ru
Description:
------------
Specialists at the Positive Research center have discovered a vulnerability in 
PHP's implementation of Linear Congruential Generator (LCG) seeding:

ext/standard/lcg.c:

LCG(s1) = tv.tv_sec ^ (tv.tv_usec<<11);
...
#ifdef ZTS
    LCG(s2) = (long) tsrm_thread_id();
#else
    LCG(s2) = (long) getpid();
#endif
...
LCG(s2) ^= (tv.tv_usec<<11);

The implemented seeding is weak for three reasons: first, the value of tv.tv_sec 
is known to the attacker (for example, through the Date HTTP header); second, the 
time interval between the two time measurements (tv.tv_usec) does not exceed 5 
microseconds on most systems; and finally, the process ID or thread ID takes only 
32768 different values on most systems.
Therefore an attacker who has obtained one random value generated with the LCG 
is able to bruteforce both of its seeds, and thus to predict all 
future random values. The internal function php_combined_lcg() is used in:
- lcg_value(), which is its wrapper
- uniqid() with the more_entropy argument set to true
- PHPSESSID generation
- the GENERATE_SEED macro, which generates the seed for rand() and mt_rand()

PHPSESSID can be effectively bruteforced to recover the seeds of the LCG 
generator, as described in:
http://crypto.di.uoa.gr/CRYPTO.SEC/Randomness_Attacks_files/paper.pdf
http://blog.ptsecurity.com/2012/08/not-so-random-numbers-take-two.html

As for GENERATE_SEED, the code is:

ext/standard/php_rand.h:

#ifdef PHP_WIN32
#define GENERATE_SEED() (((long) (time(0) * GetCurrentProcessId())) ^ ((long) (1000000.0 * php_combined_lcg(TSRMLS_C))))
#else
#define GENERATE_SEED() (((long) (time(0) * getpid())) ^ ((long) (1000000.0 * php_combined_lcg(TSRMLS_C))))
#endif

If an attacker manages to obtain a random number generated by rand() or 
mt_rand(), he is able to bruteforce its seed (for example, using precomputed 
rainbow tables) and subsequently to recover the seeds of the LCG generator, even 
without knowing any random values generated with the LCG.
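
To illustrate why a recovered (mt_)rand seed narrows down the LCG output, here is a minimal sketch in C. It assumes that the seed is truncated to 32 bits before use and that the LCG term, (long) (1000000.0 * php_combined_lcg()), is always below 1000000. The names sketch_generate_seed and candidate_pids are hypothetical and are not part of PHP. For each process ID the sketch undoes the XOR and keeps only the IDs that leave a plausible LCG term.

```c
#include <stddef.h>
#include <stdint.h>

/* Sketch of GENERATE_SEED(), ASSUMING the result is truncated to 32 bits.
 * "now" stands for time(0), "lcg" for the output of php_combined_lcg(). */
uint32_t sketch_generate_seed(uint32_t now, uint32_t pid, double lcg)
{
    uint32_t lcg_term = (uint32_t)(1000000.0 * lcg);
    return (now * pid) ^ lcg_term; /* unsigned multiplication wraps mod 2^32 */
}

/* Given a recovered seed and a known timestamp, collect every pid for which
 * undoing the XOR leaves a value that could be an LCG term (< 1000000).
 * Returns the number of candidates written to out (at most cap). */
size_t candidate_pids(uint32_t seed, uint32_t now, uint32_t max_pid,
                      uint32_t *out, size_t cap)
{
    size_t n = 0;
    for (uint32_t pid = 1; pid <= max_pid && n < cap; pid++) {
        uint32_t lcg_term = seed ^ (now * pid);
        if (lcg_term < 1000000u) {
            out[n++] = pid;
        }
    }
    return n;
}
```

Each surviving pid yields one candidate value of the LCG term, so the attacker only has to test that short list instead of the whole range.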

Our specialists have created a program to bruteforce the seeds of the LCG 
generator given the seed of rand() or mt_rand(). The bruteforce of the full range 
of microseconds (0-999999), the complete range of process IDs (0-32768), and two 
deltas (both 0-5) can be performed in less than 10 minutes on an Nvidia GT540M.
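
As a rough size check of that search, the following sketch multiplies the quoted ranges, assuming they are inclusive and independent of each other. The function name is hypothetical.

```c
#include <stdint.h>

/* Number of (usec, pid, delta1, delta2) combinations in the search described
 * above, ASSUMING inclusive and independent ranges. */
uint64_t lcg_seed_search_space(void)
{
    const uint64_t usec_values  = 1000000; /* microseconds 0-999999 */
    const uint64_t pid_values   = 32769;   /* process IDs 0-32768 */
    const uint64_t delta_values = 6;       /* each delta 0-5 */
    return usec_values * pid_values * delta_values * delta_values;
}
```

This gives about 1.18 * 10^12 candidates, so finishing in under 10 minutes means testing roughly 2 * 10^9 candidates per second.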


 [2012-08-29 14:34 UTC] pajoye@php.net
FYI, we don't use rand nor mt_rand anymore as default entropy source but 
/dev/urandom or cryptgen's random src on windows. Alternative sources can be set 
using an ini setting.
 [2012-08-30 09:17 UTC] ymaryshev at ptsecurity dot ru
We agree on this, but the bruteforce of PHPSESSID in our advisory was meant for 
5.3.*; sorry for not clarifying this. Our principal point is that LCG seeding, 
on both 5.4.* and 5.3.*, is not secure enough, as it makes it easy to bruteforce 
the LCG seeds given a seed produced by the GENERATE_SEED function. We are 
convinced that this is a vulnerability in PHP itself, as the seeding of the 
pseudo-random number generators, namely LCG and Mersenne Twister, is correlated. 
Thus if an attacker is able to recover the seed of one of them, he is able to 
obtain the seed of the other and thus to predict the random numbers of both.
Let us prove this by examining a vulnerability in a popular web application 
that we have discovered recently and which is possible due to the insecure 
seeding in PHP. The webapp leaks an mt_rand output through an md5 CSRF token. 
There is also password recovery functionality which generates a password reset 
token as follows:
sha1(uniqid(mt_rand(),1))
Having obtained the seed of the MT generator via its output in the CSRF token, it 
is now possible to obtain a number of possible seeds S1 and S2 of the LCG 
generator, as they are used in the generation of the MT’s seed. The result cannot 
be a single pair S1 and S2 because some collisions are present:

https://storage.ptsecurity.ru/download.php?hid=2c2a3187abdd60615144da5ee7711c9ce558fe09e0fe867ed7de9e85b29e7eda


However, by bruteforcing the sha1 hash locally, knowing a predicted mt_rand 
output and a set of predicted php_combined_lcg outputs, and having only 10^6 
values of microseconds to guess, it is possible to obtain the correct 
php_combined_lcg value and the value of microseconds. Now, consider a case where 
an attacker sends three Keep-Alive requests to a freshly seeded process: first, 
to get an md5 CSRF token, then to reset his own password, and finally, that of 
an administrator. Having bruteforced the sha1 hash sent to the attacker’s email, 
it is possible to bruteforce the microseconds in the administrator’s sha1 token 
within a couple of thousand requests, given that the password resets for both 
accounts were requested in close succession.
We believe that a stronger seeding of LCG generator and the absence of 
correlation between PRNGs would have made this attack infeasible.
 [2012-08-30 12:04 UTC] pajoye@php.net
Sorry, I may not have been clear about what we do.

The weakness of the old session ID generation is well known and there is 
already a CVE ID for it.

The fix for this CVE was to force PHP to use a reliable entropy source (read: 
not mt_rand or rand). Both 5.3 and 5.4 (and the master branch) have the new, more 
secure implementation.

It is indeed still possible to brute force crack urandom somehow, but heh, 
that's not something that easy in any modern operating system (the latest Windows 
cryptogen APIs are crypto safe btw, even better than /dev/urandom, more like 
/dev/random).
 [2012-08-30 13:06 UTC] ymaryshev at ptsecurity dot ru
Actually, the demonstrated example has nothing to do with PHPSESSID generation at 
all. The point we are trying to explain is the dependence of the GENERATE_SEED 
macro on the php_combined_lcg() function. Let us cite our previous message: “if 
an attacker is able to recover the seed of one of them (PRNG), he is able to 
obtain the seed of the other and thus to predict the random numbers of both”. 
In other words, if a web app is trying to add some additional entropy by 
employing different PRNGs present in PHP, namely LCG (through uniqid with 
more_entropy = true) and Mersenne Twister, there would actually be no added 
entropy, as the seeding of Mersenne Twister (via GENERATE_SEED) is dependent on 
LCG (php_combined_lcg), and the seeding of the LCG is not strong enough. 
The developers of the mentioned popular web app were obviously not aware of this 
fact, which is why we think that PHP should break the interdependence of the 
PRNGs or introduce stronger seeding in the lcg_seed() function.
 [2012-08-30 14:41 UTC] pajoye@php.net
hi!

I first thought you were also referring to the whole IDs generation, incl. for session.

How is uniqid supposed to be secure? It only generates unique ids. It was never meant as a way to generate passwords or, even worse, data that must be crypto safe. It is even clearly documented: http://www.php.net/uniqid

Or am I missing something?
 [2012-09-03 22:37 UTC] stas@php.net
There are a lot of details here, but it is kind of hard to understand where exactly the supposed vulnerability lies, what the consequences are, and how the details are supposed to come together. Is it specific to the usage of uniqid() by a specific application to generate password reset tokens, or is it a generic problem in some function that is supposed to be secure but is not? If the latter, could you please identify:
1. which function or functionality it is 
2. what is the specific problem in that code, in your opinion
3. what is the code or the scenario that reproduces the security problem and what the nature of this problem is (e.g. easier guessing PHP session ID, etc.)

Right now I'm not sure how practical even the attack proposed on the specific application is (as far as I can see it involves reversing random sha1 hashes, which I'm not sure is very practical), but more importantly I'm not sure how it is a generic PHP problem and not a problem of a specific application leaking too much information. If you prefer, you could write to the security@php.net list and describe it there if that is more convenient than using the bug tracker.
 [2012-09-04 17:55 UTC] ymaryshev at ptsecurity dot ru
Note: important parts are highlighted with arrows, -->like this<--

>Is it specific for the usage of uniqid() by specific application to generate 
>password reset tokens or is it a generic problem in some function that is 
>supposed to be secure but is not?

This is a generic problem as password token generation involving uniqid 
and/or (mt_)rand can be predicted if an attacker manages to obtain one of the 
pseudo random values produced either by the LCG generator or any PRNG seeded 
with the GENERATE_SEED macro, for example Mersenne Twister.

1. which function or functionality it is 

ext/standard/php_rand.h: 

#ifdef PHP_WIN32
#define GENERATE_SEED() (((long) (time(0) * GetCurrentProcessId())) ^ ((long) (1000000.0 * -->php_combined_lcg<--(TSRMLS_C))))
#else
#define GENERATE_SEED() (((long) (time(0) * getpid())) ^ ((long) (1000000.0 * -->php_combined_lcg<--(TSRMLS_C))))
#endif

ext/standard/lcg.c:

static void lcg_seed(TSRMLS_D) /* {{{ */
{
    struct timeval tv;

    if (gettimeofday(&tv, NULL) == 0) {
        LCG(s1) = -->tv.tv_sec<-- ^ (tv.tv_usec<<11);
    } else {
        LCG(s1) = 1;
    }
#ifdef ZTS
    LCG(s2) = (long) -->tsrm_thread_id<--();
#else
    LCG(s2) = (long) -->getpid<--();
#endif

    /* -->Add entropy to s2 by calling gettimeofday() again<-- */
    if (gettimeofday(&tv, NULL) == 0) {
        LCG(s2) ^= (tv.tv_usec<<11);
    }

    LCG(seeded) = 1;
}

2. what is the specific problem in that code, in your opinion

There are two problems:
1. The first is in php_rand.h: the GENERATE_SEED macro uses the output of the 
php_combined_lcg function to produce a seed that is meant to be secure.
2. The second is in lcg.c: the lcg_seed function seeds the LCG generator 
weakly. The seeding is weak because:
a. the timestamp is known to the attacker
b. the pid takes only 2^15 different values
c. the additional entropy from “calling gettimeofday() again” does not exceed 
5 microseconds on most systems.
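
To make the consequence concrete, here is a minimal sketch in C of the seeding quoted above, followed by one generator step. The inputs that lcg_seed() reads from the system are passed in as parameters. The multipliers and moduli are those of L'Ecuyer's combined generator as recalled from lcg.c, so treat them as an assumption and check them against the PHP source. The names lcg_state, lcg_seed_from, modmult and combined_lcg are hypothetical. The point is that the first output is a pure function of four guessable numbers.

```c
#include <stdint.h>

/* Hypothetical container for the two LCG states (s1 and s2 in lcg.c). */
typedef struct {
    int32_t s1;
    int32_t s2;
} lcg_state;

/* Same formulas as the quoted lcg_seed(), with the inputs made explicit:
 * sec/usec1 from the first gettimeofday(), id = pid or thread id,
 * usec2 from the second gettimeofday(). */
void lcg_seed_from(lcg_state *st, int32_t sec, int32_t usec1,
                   int32_t id, int32_t usec2)
{
    st->s1 = sec ^ (usec1 << 11);
    st->s2 = id ^ (usec2 << 11);
}

/* Schrage's method: computes (b * s) mod m without 32-bit overflow,
 * where a = m / b and c = m % b. */
int32_t modmult(int32_t a, int32_t b, int32_t c, int32_t m, int32_t s)
{
    int32_t q = s / a;
    s = b * (s - a * q) - c * q;
    if (s < 0) {
        s += m;
    }
    return s;
}

/* One step of the combined generator (constants ASSUMED, see lead-in). */
double combined_lcg(lcg_state *st)
{
    int32_t z;
    st->s1 = modmult(53668, 40014, 12211, 2147483563, st->s1);
    st->s2 = modmult(52774, 40692, 3791, 2147483399, st->s2);
    z = st->s1 - st->s2;
    if (z < 1) {
        z += 2147483562;
    }
    return z * 4.656613e-10;
}
```

Seeding two states with the same (sec, usec1, id, usec2) yields the same output sequence, so a bruteforce only has to enumerate those four values.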

3. what is the code or the scenario that reproduces the security problem and 
what the nature of this problem is (e.g. easier guessing PHP session ID, etc.)

If a web app leaks either mt_rand/rand number or a value produced by 
uniqid(more_entropy=true)/lcg_value, its password reset token generation 
mechanism employing one of these functions or a combination of them can be 
predicted.
In the previous messages we demonstrated how it is possible to exploit a web app 
if it leaks a random number produced by mt_rand() and generates a password reset 
token like this:
sha1(uniqid(mt_rand(),1))
Now consider an attack scenario in which a web app leaks a value produced by 
uniqid(more_entropy=true) and generates a password reset token using mt_rand.
Let us assume that an attacker has obtained a uniqid(more_entropy=true) output:
5045cea645c8b-->3.57116457<--
The highlighted value is produced by the LCG generator. With this value an 
attacker is able to recover the initial seeds, as they are too weak. Having the 
seeds of the LCG generator, he is now able to figure out the value produced by 
php_combined_lcg() that was used in the GENERATE_SEED macro. After this, given 
a token sent to the attacker’s own email, the attacker is able to bruteforce the 
seed of mt_rand locally and subsequently to predict the administrator’s token.
These attack scenarios assume that the attacker is able to spawn fresh 
processes with newly seeded PRNGs.
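
The leaked value can be split into its parts with a few lines of C. This is a sketch under the assumption that uniqid() with an empty prefix and more_entropy=true prints 8 hex digits of seconds, 5 hex digits of microseconds, and then the LCG output multiplied by 10. The names uniqid_parts and parse_uniqid are hypothetical.

```c
#include <stdio.h>

/* Hypothetical holder for the fields of a uniqid("", true) string. */
typedef struct {
    unsigned int sec;  /* seconds since the epoch */
    unsigned int usec; /* microseconds at the time of the call */
    double lcg;        /* php_combined_lcg() output, in (0, 1) */
} uniqid_parts;

/* Parses the ASSUMED layout "%08x%05x%.8F" with an empty prefix.
 * Returns 0 on success and -1 if the string does not match. */
int parse_uniqid(const char *id, uniqid_parts *out)
{
    double lcg_times_ten;
    if (sscanf(id, "%8x%5x%lf", &out->sec, &out->usec, &lcg_times_ten) != 3) {
        return -1;
    }
    out->lcg = lcg_times_ten / 10.0; /* undo the multiplication by 10 */
    return 0;
}
```

For the example string above this yields the timestamp, the microseconds, and an LCG output of about 0.357, which is the value an attacker would feed into a seed search.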

>As far as I can see it involves reversing random sha1 hashes, which I'm not 
>sure is very practical

We conducted a series of experiments and the attack turned out to be highly 
practical. In the previous messages we explained how one can obtain the 
constituent parts of the string that is passed to sha1:
1. mt_rand, which can be predicted
2. php_combined_lcg, which can be narrowed to about 100-200 possible outputs
3. only 2^15 values of pid
Reversing this sha1 hash is a matter of a few seconds. The bruteforce of the 
microseconds in the administrator’s password token is another matter, but in 
practice it took us several thousand requests to match it.
 [2012-09-04 20:08 UTC] pajoye@php.net
I really don't understand why this is a security problem.

rand, mt_rand and uniqid are well documented. It is clearly stated that they 
should not be used for password generation and the like. It is well known that 
they are not safe in any manner, for the exact reasons you are describing here.

It sounds really a bit weird to me to mention sha1, uniqid and mt_rand together 
with security. Or am I missing something obvious?
 [2012-09-05 08:15 UTC] ymaryshev at ptsecurity dot ru
>rand, mt_rand and uniqid are well documented. It is clearly stated that they 
>should not be used for password generations and the likes.
We have checked the manual pages for rand, mt_rand and uniqid functions and only 
the latter has a security related notice.

>It is well known that they are not safe in any manner
Practically all web apps rely on (mt_)rand in their token generation, so 
apparently these security problems are not well known.

>It sounds really a bit weird to me to mention sha1, uniqid and mt_rand together 
>with security.
If all of these functions are admitted by PHP to be insecure this should be 
stated in the documentation. 

Our research into different web apps shows that developers do not know the 
risks of using uniqid and (mt_)rand together. We really urge you to examine 
carefully the attack scenario that we described in earlier messages. The 
“more_entropy” argument of uniqid is misleading, as it makes it possible to 
predict the numbers generated by (mt_)rand. If this is the intended behavior of 
PHP then it should be mentioned in the documentation; otherwise it should be fixed.
 [2012-09-05 09:43 UTC] pajoye@php.net
[2012-09-05 08:15 UTC] ymaryshev at ptsecurity dot ru
> We have checked the manual pages for rand, mt_rand and uniqid 
> functions and only the latter has a security related notice.

OK, then let's fix that by adding these notices to the rand functions too.

> Our research of different web apps shows that the developers 
> do not know the risks of using uniqid and (mt_)rand together.
> We really urge you to examine carefully the attack scenario 
> that we described in earlier messages. The “more_entropy” 
> argument of uniqid is misleading as it allows to predict 
> the numbers generated by (mt_)rand. If this is an appropriate
> behavior of PHP then it should be mentioned in documentation,
> otherwise it should be fixed.

It should be fixed anyway, not in PHP but in the applications. They should actually use openssl_random_pseudo_bytes 
or urandom (or the like) when available. Drupal and many other major applications already made this move when 
the first attacks were carried out using simple brute force prediction a couple of years ago.

If nobody objects, I will make this bug public by Monday, and mark as a documentation problem.
 [2012-09-05 09:43 UTC] pajoye@php.net
-Status: Open +Status: Analyzed
 [2012-09-06 10:00 UTC] ymaryshev at ptsecurity dot ru
We certainly do not object to it, but we still think that fixing this problem 
would be a more appropriate solution just for the sake of web apps which do not 
track recent changes in PHP documentation.
 [2012-09-19 14:21 UTC] tony2001@php.net
How do you propose to fix it in PHP?
"Stronger seeding" sounds a bit too general to me.
 [2012-09-20 10:45 UTC] ymaryshev at ptsecurity dot ru
By stronger seeding we mean:
1. Use external sources of entropy, as is done for PHPSESSID in newer PHP 
versions (php_win32_get_random_bytes, urandom, etc.). In other words, we suggest 
applying “session.entropy_file/entropy_length” to the seed generating functions, 
namely lcg_seed() and GENERATE_SEED().
2. Do not use the output of the LCG for generating the seed of (mt_)rand (in the 
GENERATE_SEED macro).
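
As an illustration of the first suggestion only, the following C sketch reads a 32-bit seed from /dev/urandom. It is not PHP's implementation. It assumes a Unix-like system where /dev/urandom exists, and the function name seed_from_urandom is hypothetical.

```c
#include <stdint.h>
#include <stdio.h>

/* Fills *seed with 4 bytes from the operating system's entropy pool.
 * ASSUMES /dev/urandom is available; returns 0 on success, -1 otherwise,
 * so that a caller can fall back to another source. */
int seed_from_urandom(uint32_t *seed)
{
    FILE *f = fopen("/dev/urandom", "rb");
    size_t got;

    if (f == NULL) {
        return -1;
    }
    got = fread(seed, sizeof *seed, 1, f);
    fclose(f);
    return got == 1 ? 0 : -1;
}
```

A seed obtained this way does not depend on the timestamp, the pid or the LCG state, which also covers the second suggestion: the (mt_)rand seed and the LCG seeds would no longer be derivable from each other.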
 [2012-11-28 13:44 UTC] ymaryshev at ptsecurity dot ru
-Operating System: win +Operating System: All -PHP Version: 5.4.6 +PHP Version: Irrelevant
 [2012-11-28 13:44 UTC] ymaryshev at ptsecurity dot ru
Fixed issue summary
 [2012-11-28 15:31 UTC] pajoye@php.net
Doc needs to be updated to add the security concerns about the rand and mt_rand 
set of functions.
 [2012-11-28 15:31 UTC] pajoye@php.net
-Type: Security +Type: Documentation Problem
 [2014-10-26 15:57 UTC] salathe@php.net
-Status: Analyzed +Status: Closed -Assigned To: +Assigned To: salathe
 [2014-10-26 15:57 UTC] salathe@php.net
Fixed in r328689 by aharvey (2012-12-06).
 