Doc Bug #50996 crypt function incorrectly describes how to use MD5/Blowfish
Submitted: 2010-02-10 17:26 UTC Modified: 2010-02-20 17:00 UTC
From: cscott at ggot dot org Assigned: joey (profile)
Status: Closed Package: Documentation problem
PHP Version: 5.3.1 OS: Irrelevant
Private report: No CVE-ID: None
 [2010-02-10 17:26 UTC] cscott at ggot dot org
Description:
------------
The documentation provided at http://php.net/crypt is extremely weak 
and sometimes inaccurate.  At the bottom there is a note to check your 
Unix man pages for more info, which would probably have been a 
perfectly acceptable response, were it not for the fact that PHP added 
support for certain algorithms even if they were not present in the 
system's library, making this a universal PHP issue now instead of 
just a system/dependency issue.

I suggest the following (or something like it) be added to the PHP 
Manual with respect to the crypt function in order to make this 
valuable function and the changes made to its support more accessible.

Additional text that applies to all, and is implied but not made 
clear:
----
Since all salts are considered to have a fixed, maximum length, the 
result of a call to crypt() may be passed to subsequent calls to 
crypt() in the salt parameter as a method of ensuring the same salt is 
used for validation purposes.  Example:

crypt(p, crypt(p, s)) == crypt(p, s)
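
A minimal sketch of that validation idiom in PHP. The helper name check_password is hypothetical, the MD5 salt is only an example value, and hash_equals() assumes PHP 5.6 or later:

```php
<?php
// Hypothetical helper: re-hash the candidate using the stored hash as the salt.
// crypt() reads the algorithm prefix and salt back out of the stored hash.
function check_password($candidate, $storedHash)
{
    $rehash = crypt($candidate, $storedHash);
    // hash_equals() compares in constant time (PHP >= 5.6).
    return is_string($rehash) && hash_equals($storedHash, $rehash);
}

$stored = crypt('secret', '$1$SALTsalt$'); // example MD5 salt
var_dump(check_password('secret', $stored)); // same password, same salt
var_dump(check_password('wrong', $stored));  // different password
```

Passing the whole stored hash as the salt means the caller never has to parse out the salt portion by hand.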
----
Current, Erroneous (or at least misleading) text:
----
CRYPT_MD5 - MD5 encryption with a twelve character salt starting with 
$1$
CRYPT_BLOWFISH - Blowfish encryption with a sixteen character salt 
starting with $2$ or $2a$

At least on Windows systems, which probably means we're using the 
built in PHP support for these two algorithms, they follow these 
rules:

CRYPT_MD5 - The salt follows this convention: '$1$SALTsalt$'; only the 
first 8 characters of any given salt will be considered, so hashes 
produced with the salt '$1$SALTsalt$' will be identical to ones 
produced with a salt such as '$1$SALTsaltSALTsalt$'.  [[Note: I 
originally thought this was because the string was being treated as a 
base64 entity with a custom alphabet, but testing reveals that any 
characters are valid in the salt of this algorithm]]
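
The truncation described above can be checked with a short sketch like the following. It assumes only that CRYPT_MD5 is available, and the password and salt values are arbitrary examples:

```php
<?php
// Both salts share the same first 8 characters after the "$1$" prefix.
$short = crypt('password', '$1$SALTsalt$');
$long  = crypt('password', '$1$SALTsaltSALTsalt$');

// If only the first 8 salt characters are used, the two hashes match.
var_dump($short === $long);

// The leading part of the result is the "$1$" prefix, 8 salt characters and "$".
var_dump(substr($short, 0, 12));
```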

CRYPT_BLOWFISH - Salts are interpreted as, and hashes returned as, a 
base64 string; this string is based on the GNU C Library's base64 
alphabet for crypt (found here: 
http://www.gnu.org/s/libc/manual/html_node/crypt.html) but otherwise 
follows standard conventions of any base64 string.  The '$' character 
is considered a null, or padding character.

The salt follows this convention: '$2a$##$SALTsaltSALTsaltSALTsa'.  The 
beginning may of course be '$2$' or '$2a$'.  The next three characters 
should be a two-digit decimal number followed by '$', indicating how 
many times the key should be calculated (i.e., how expensive it is to 
calculate the key).  This number will be interpreted as a power of two 
(e.g., $2a$07$ would be interpreted as 2^7), and cannot be less than 3 
or greater than 31 [[Note: have not tested this on a 64-bit system, so 
it might be higher]].

The salt is composed of up to 22 characters from the alphabet 
./0-9A-Za-z, with the $ symbol considered a null/end-of-string 
character.  Salts longer than 22 characters will only have the first 
22 characters considered, while salts shorter than 22 characters will 
be padded with the '$' character, though this has no effect.  
Additionally, any part of the salt following a '$' character will not 
be considered (as processing of the salt will have terminated upon 
first encounter of '$').  Any characters which are not a part of the 
above alphabet and occur before a string-termination character ('$') 
will cause crypt to fail, returning an empty string.
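
As a rough sketch of these rules, the salt string could be assembled and checked before it is ever passed to crypt().  The helper name blowfish_salt is hypothetical, and the 04-31 cost range is the one reported in the follow-up comments on this bug rather than anything confirmed by the manual:

```php
<?php
// Hypothetical helper: assemble a "$2a$NN$" salt string from a cost and
// 22 salt characters, rejecting inputs the report describes as invalid.
function blowfish_salt($cost, $salt22)
{
    // Assumed valid range, taken from the follow-up comments (04-31).
    if ($cost < 4 || $cost > 31) {
        throw new InvalidArgumentException('cost must be between 04 and 31');
    }
    // Exactly 22 characters from the ./0-9A-Za-z alphabet, with no '$' padding.
    if (!preg_match('#\A[./0-9A-Za-z]{22}\z#', $salt22)) {
        throw new InvalidArgumentException('salt must be 22 characters from ./0-9A-Za-z');
    }
    // Zero-pad the cost to two digits, e.g. 7 becomes "07".
    return sprintf('$2a$%02d$%s', $cost, $salt22);
}

echo blowfish_salt(7, 'SALTsaltSALTsaltSALTsa'), "\n";
```

Rejecting short or padded salts up front avoids relying on how a particular crypt() build treats the '$' character.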

WARNING:{
Because the salt is interpreted as a base64 number, certain salts may 
potentially produce identical results.  This occurs when two salts are 
identical except for one extra character in the second salt that does 
not exist in the first (e.g., 'abcd' and 'abcde'), and the shorter of 
the two has a length divisible evenly by 4 (that is, Length Modulo 4 
== 0).  This happens because a base64 string is interpreted 4 
characters at a time, each character representing 6 bits in the target 
string, meaning that 1 decoded byte requires at least two base64 
characters to represent it.  Because of this, the last character will 
not be interpreted as part of the salt, causing identical hashes to be 
produced.

In general, you should never use a salt that is shorter than the 
maximum allowed.
}
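
One way to follow that advice is to always generate the full 22 characters.  The sketch below is only an illustration: the helper name random_blowfish_salt_chars is hypothetical, the cost '07' is an example, and random_int() assumes PHP 7 or later:

```php
<?php
// Hypothetical helper: draw 22 characters from the ./0-9A-Za-z alphabet so the
// salt always has the maximum length. random_int() requires PHP 7 or later.
function random_blowfish_salt_chars()
{
    $alphabet = './0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz';
    $out = '';
    for ($i = 0; $i < 22; ++$i) {
        $out .= $alphabet[random_int(0, strlen($alphabet) - 1)];
    }
    return $out;
}

$salt = '$2a$07$' . random_blowfish_salt_chars();
echo strlen($salt), "\n"; // 7-character prefix plus 22 salt characters
```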
----


 [2010-02-17 05:39 UTC] cscott at ggot dot org
Note: testing on a 64-bit system reveals that, at least for the built-in 
implementation, 31 is a hard upper limit, and not OS/CPU dependent.  So, 
for Blowfish, the valid range of the second parameter is 04-31.

Also, str_replace('for the face that','for the fact that',$OP); please 
and thanks.
 [2010-02-17 18:20 UTC] joey@php.net
I've tested on x86_64, sparc64, x86, and POWER5 - all these platforms
respect the hard-coded limit of 04-39. Which architecture are you
using, and what does it do for 31-39?

An example from my x86_64 machine, using the PHP-provided Blowfish
implementation:

php > var_dump(PHP_INT_MAX);
int(9223372036854775807)
php > var_dump(crypt('a', '$2a$39$a'));
string(60)
"$2a$39$a$$$$$$$$$$$$$$$$$$$$.2H37F8WmHJFqckIBxyxT7LF9.kojfOS"

I agree that it should not be OS or CPU dependent - it's limited by
the C code:

            salt[0] == '$' &&
            salt[1] == '2' &&
            salt[2] == 'a' &&
            salt[3] == '$' &&
            salt[4] >= '0' && salt[4] <= '3' &&
            salt[5] >= '0' && salt[5] <= '9' &&
            salt[6] == '$') {
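
To see which cost values that condition lets through on its own, here is a rough PHP mirror of the quoted check.  It is a sketch for illustration, not the actual C source, and the function name prefix_accepted is made up:

```php
<?php
// PHP mirror of the prefix check quoted above (a sketch, not the C source).
function prefix_accepted($salt)
{
    return strlen($salt) >= 7
        && $salt[0] === '$' && $salt[1] === '2' && $salt[2] === 'a' && $salt[3] === '$'
        && ord($salt[4]) >= ord('0') && ord($salt[4]) <= ord('3')
        && ord($salt[5]) >= ord('0') && ord($salt[5]) <= ord('9')
        && $salt[6] === '$';
}

// Try every two-digit cost and collect the ones the prefix check accepts.
$accepted = array();
for ($cost = 0; $cost < 100; ++$cost) {
    if (prefix_accepted(sprintf('$2a$%02d$', $cost))) {
        $accepted[] = $cost;
    }
}
echo min($accepted), '-', max($accepted), "\n";
```

Because the two digits are tested independently, this check by itself cannot tell 31 from 39; any tighter limit has to be enforced elsewhere.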

 [2010-02-20 02:06 UTC] cscott at ggot dot org
Testing on Windows Vista x64 / PHP 5.3.0

for ($i = 32; $i < 40; ++$i) {
    var_dump(crypt('a', '$2a$' . $i . '$a'));
}
echo "\n\n";
for ($i = 0; $i < 8; ++$i) {
    var_dump(crypt('a', '$2a$0' . $i . '$a'));
}

Output:

string(0) ""
string(0) ""
string(0) ""
string(0) ""
string(60) 
"$2a$36$a$$$$$$$$$$$$$$$$$$$$.ab4uk7zdS0i/IlhKXBIf8klDpk4gLJu"
string(60) 
"$2a$37$a$$$$$$$$$$$$$$$$$$$$.2vjzcADwWgyZ3m10pK7p1HZi2sECLzS"
string(60) 
"$2a$38$a$$$$$$$$$$$$$$$$$$$$.7jYCBxjwtLJUUDcOWfK.czjNbZO.LpW"
string(60) 
"$2a$39$a$$$$$$$$$$$$$$$$$$$$.2H37F8WmHJFqckIBxyxT7LF9.kojfOS"


string(0) ""
string(0) ""
string(0) ""
string(0) ""
string(60) 
"$2a$04$a$$$$$$$$$$$$$$$$$$$$.ab4uk7zdS0i/IlhKXBIf8klDpk4gLJu"
string(60) 
"$2a$05$a$$$$$$$$$$$$$$$$$$$$.2vjzcADwWgyZ3m10pK7p1HZi2sECLzS"
string(60) 
"$2a$06$a$$$$$$$$$$$$$$$$$$$$.7jYCBxjwtLJUUDcOWfK.czjNbZO.LpW"
string(60) 
"$2a$07$a$$$$$$$$$$$$$$$$$$$$.2H37F8WmHJFqckIBxyxT7LF9.kojfOS"

Notice how 32-35 AND 00-03 are invalid, and 36-39 AND 04-07 produce 
identical results.  Another easy way to tell it didn't calculate at 
the full value of '39' is as follows:

On my system, according to my simple benchmark, I can (via PHP) 
calculate about 133-134 hashes per second with an argument of '07'.  
Each increment of that argument should just about double the 
calculation time (e.g., when I increment to '08' only 67 hashes can be 
generated per second).  Extrapolating at roughly that rate, it should 
take more than one year to generate a hash on my system with the 
argument '39' (roughly 0.0000000311993062496185302734375 hashes per 
second).  Instead, it returns instantly.  Ostensibly, at 32 the 
parameter is being calculated as 0, at 33 as 1, and so on, until it 
cuts off at 40 and calculates using STD_DES/2-char salt.
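
The extrapolation can be reproduced with a few lines of arithmetic.  This sketch takes the measured 134 hashes per second at '07' as given and assumes each step doubles the work.  The modulo-32 reading of the aliasing is a guess that fits the output above, not something confirmed here:

```php
<?php
// Measured in the report: about 134 hashes per second at cost '07'.
$rateAt07 = 134.0;

// If each cost step doubles the work, cost 39 is 2^(39-7) = 2^32 times slower.
$rateAt39 = $rateAt07 / pow(2, 39 - 7);
$secondsPerHash = 1 / $rateAt39;

printf("%.3e hashes per second\n", $rateAt39);
printf("%.1f days per hash\n", $secondsPerHash / 86400);

// The observed aliasing (36 behaves like 04, 39 like 07) is consistent with
// the cost being reduced modulo 32. This is an assumption, not a confirmed cause.
var_dump(36 % 32, 39 % 32);
```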

By the way, the "benchmark" I used (and yes, I know it will incur some 
overhead, but it is trivial; testing by simply performing each action 
in sequence without loops shows an error factor of about .003 
seconds), is thus:

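define('ITERATIONS', 5);
 
$tt = $th = 0; // total elapsed time, total hashes computed
for ($j = 0; $j < ITERATIONS; ++$j) {
	$start = microtime(true);
	// Hash for roughly one second, counting completed calls in $i.
	for ($i = 0; ($z = microtime(true)) - $start < 1; ++$i) {
		$k = crypt((string)$i, '$2a$07$' . (string)$z);
	}
	$tt += ($z - $start);
	$th += $i;
}
 
// Average seconds per run, then average hashes per run.
var_dump($tt / ITERATIONS, $th / ITERATIONS);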
 [2010-02-20 17:00 UTC] joey@php.net
This bug has been fixed in the documentation's XML sources. Since the
online and downloadable versions of the documentation need some time
to get updated, we would like to ask you to be a bit patient.

Thank you for the report, and for helping us make our documentation better.

This was actually identified as a bug in crypt_blowfish and has now been fixed upstream. An upcoming release of PHP will return a boolean FALSE on all of these invalid rounds.
 