Bug #54547 wrong equality of string numbers
Submitted: 2011-04-16 20:07 UTC Modified: 2012-05-13 21:51 UTC
Votes:95
Avg. Score:3.9 ± 1.5
Reproduced:50 of 67 (74.6%)
Same Version:38 (76.0%)
Same OS:35 (70.0%)
From: peter dot ritt at gmx dot net Assigned: dmitry
Status: Closed Package: Unknown/Other Function
PHP Version: 5.3.6 OS: linux
Private report: No CVE-ID:
 [2011-04-16 20:07 UTC] peter dot ritt at gmx dot net
Description:
------------
Comparing strings with == gives wrong results when both strings are numeric (digits only) and close to PHP_INT_MAX;
the same comparison using === works correctly.
Tested on 64-bit systems only; also affects PHP 5.3.5.

Test script:
---------------
$a = '9223372036854775807';
$b = '9223372036854775808';
if ($a == $b) {
    echo "$a == $b\n";
}
else {
    echo "$a != $b\n";
}
// displays 9223372036854775807 == 9223372036854775808


Expected result:
----------------
should display
9223372036854775807 != 9223372036854775808

Actual result:
--------------
displays
9223372036854775807 == 9223372036854775808

Patches

bug54547-2.diff (last revision 2011-04-17 03:44 UTC by cataphract@php.net)
bug54547.diff (last revision 2011-04-16 23:59 UTC by cataphract@php.net)

History

 [2011-04-17 01:58 UTC] cataphract@php.net
-Status: Open +Status: Verified
 [2011-04-17 01:59 UTC] cataphract@php.net
The following patch has been added/updated:

Patch Name: bug54547.diff
Revision:   1302998399
URL:        http://bugs.php.net/patch-display.php?bug=54547&patch=bug54547.diff&revision=1302998399
 [2011-04-17 02:03 UTC] cataphract@php.net
-Assigned To: +Assigned To: dmitry
 [2011-04-17 05:44 UTC] cataphract@php.net
The following patch has been added/updated:

Patch Name: bug54547-2.diff
Revision:   1303011843
URL:        http://bugs.php.net/patch-display.php?bug=54547&patch=bug54547-2.diff&revision=1303011843
 [2011-04-18 03:04 UTC] cataphract@php.net
Maybe this should be Won't Fix to keep it consistent with 9223372036854775807 == 9223372036854775808 (with number literals).
 [2012-04-11 07:47 UTC] foobla at spambog dot com
I don't think it's about PHP_INT_MAX, but rather about the maximum precision of a double/float. "==" converts both strings to numbers (after spending CPU cycles to detect whether they look like numbers), as described in http://www.phpsadness.com/sad/47

once converted, the floats seem to actually *be* equal, even with "===":

php -r '
$a = (double)"9223372036854775807";
$b = (double)"9223372036854775808";
var_dump($a, $b, $a == $b, $a === $b);
'
float(9.2233720368548E+18)
float(9.2233720368548E+18)
bool(true)
bool(true)
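
The collision shown above can be reproduced without the comparison operator at all. The following is a minimal sketch, assuming PHP floats are IEEE 754 doubles (53-bit significand), which is the case on common platforms; the variable names are illustrative only.

```php
<?php
// Assumes IEEE 754 doubles, which is what PHP floats are on common platforms.
$a = '9223372036854775807'; // 2^63 - 1
$b = '9223372036854775808'; // 2^63

// Both casts round to the nearest double, which is exactly 2^63.
$collide   = ((float)$a === (float)$b);
$isTwoTo63 = ((float)$a === pow(2.0, 63));

// The same rounding already happens just above 2^53.
$collideAt53     = ((float)'9007199254740993' === (float)'9007199254740992');
$distinctBelow53 = ((float)'9007199254740991' !== (float)'9007199254740992');

var_dump($collide, $isTwoTo63, $collideAt53, $distinctBelow53);
```

All four values are true, which suggests the limit that matters is the precision of a double rather than the size of an integer.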
 [2012-04-11 08:44 UTC] net at janoszen dot hu
Same problem:

php > var_dump('0xff' == '255');
bool(true)
 [2012-04-11 10:45 UTC] hholzgra@php.net
If we indeed want to change the behavior here (and I'm still undecided on whether I'd want to do this, although slightly biased towards a 'yes'):

Wouldn't it be easier (although probably slightly less efficient performance-wise) to do a string comparison first if both arguments are strings, and only fall back to numeric auto casts if the string comparison fails?

If the strings really contain different numeric literals, I'd expect a string comparison to fail quickly, as there can only be so many digits (OK, in theory you could have up to 300+ digits, but not all of them significant).

This would take care of all possible edge cases (assuming that there may be others that we aren't aware of yet, even though I can't think of any right now) and not just the overflow case at hand, and the required engine changes would probably be a single chunk only, so we'd have better patch locality ...

Or are there other places where we'd need an extended is_numeric_string check with overflow control, too?
 [2012-04-11 10:52 UTC] hholzgra@php.net
On "0xFF" == 255:

since when do we actually consider hex in strings as numeric?
And is this actually documented?

The "String conversion to numbers" section in the manual says:

"Valid numeric data is an optional sign, followed by one or more digits (optionally containing a decimal point), followed by an optional exponent. The exponent is an 'e' or 'E' followed by one or more digits."

(http://www.php.net/manual/en/language.types.string.php#language.types.string.conversion)

By that description 0xsomething would *not* be considered
as numeric in a string context ...
 [2012-04-11 13:12 UTC] nik at naturalnet dot de
*Why* the heck is that implicit cast even done?

Are PHP developers really _that_ absent-minded that they cannot write actual number literals when they want them (i.e. leave out the '')?

I expect any programming language to use the data types I give it, not something it likes more!
 [2012-04-11 13:36 UTC] pajoye@php.net
@nik at naturalnet dot de

Please stay polite with other developers.

Please keep in mind that PHP is loosely typed; this is a root principle of PHP.
 [2012-04-12 05:42 UTC] a at hotmail dot com
@pajoye@php.net

How about *you* staying polite with your users by actually fixing your bugs? Can you imagine how much time is wasted worldwide as a consequence of your shortsighted "design" decisions?
 [2012-04-12 06:39 UTC] pajoye@php.net
@a at hotmail dot com

This is not a support channel. If you need further support for the base ideas 
about the loosely typed nature of PHP, please ask on one of the numerous 
channels.
 [2012-04-12 13:31 UTC] Jeff at bobmail dot info
I'm confused as to why there is even a conversation around "should we fix this".

The data objects are strings. Sure, PHP is "loosely typed" but shouldn't it do the comparison you tell it to do first before attempting anything else?

I agree with the previous suggestion: make it a real string comparison and drop the type casting.
 [2012-04-12 13:51 UTC] jabakobob at gmail dot com
The conversion to a number is necessary because programmers don't differentiate 
between strings and numbers in PHP. Consider the following code:

if ($_GET["a"] == $_GET["b"]) echo "a is same as b!";

The result will be the same if the query string is ?a=1&b=1 or ?a=1&b=1.0 or 
?a=01&b=1, because PHP is loosely typed.

Internally $_GET["a"] and $_GET["b"] are both strings, but we can't do a string 
comparison. If you want a string comparison, use strcmp.
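
As a small sketch of the difference described above, the three comparisons below are applied to the same pair of strings. The values are taken from the query-string example; the variable names are illustrative.

```php
<?php
$a = '1';
$b = '1.0';

// Loose comparison: both operands are numeric strings, so they are compared as numbers.
$loose = ($a == $b);

// Strict comparison: same type, so the strings are compared byte for byte.
$strict = ($a === $b);

// strcmp() returns 0 only when the two strings are identical.
$viaStrcmp = (strcmp($a, $b) === 0);

var_dump($loose, $strict, $viaStrcmp); // bool(true), bool(false), bool(false)
```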
 [2012-04-12 13:58 UTC] pajoye@php.net
@Jeff at bobmail dot info

that's what === is for (real comparisons without casting).
 [2012-04-12 13:59 UTC] nikic@php.net
@Jeff: You have to understand that in PHP, 1, 1.0 and "1.0" are all equivalent (in most situations). That's by design.

E.g. GET and POST variables are always strings, even if you put numbers into them (as per the HTTP standard). PHP obviously wants those GET/POST variables to still be useable just like they were numbers, that's why "1" and 1 can be used interchangeably throughout PHP.

In that context - in my eyes - this comparison also makes sense. Consider a very similar comparison:

    var_dump('0.1' == '0.10000000');

What would you expect to be the output - if you remember that in PHP numeric strings and actual numbers are interchangeable? Clearly it has to behave exactly as if you had written:

    var_dump(0.1 == 0.10000000); // => bool(true)

In most cases this type of comparison is what you want and it usually works exactly as expected.

What you see here in this issue is one of the edge cases (how often do you use large numbers in PHP?) where it does not work well.

I hope you understand that it is not viable to remove a handy feature from PHP, just because it fails under certain edge case conditions.

If you want to use a strict string comparison, just use ===.
 [2012-04-12 14:02 UTC] Jeff at bobmail dot info
That didn't address my comment. Why wouldn't the internal implementation check to see if the strings are the same? When doing a comparison and the internal data type is a string, wouldn't that be faster and most correct?

In all honesty I would prefer PHP's "loosely typed" system mimic JavaScript's in that any type can be put anywhere but the object still keeps its type information for situations just like this.
 [2012-04-12 14:17 UTC] nikic@php.net
@Jeff Please see jabakobob's comment why doing just a string comparison can be counterproductive. Remember: PHP is mainly used around the HTTP protocol (where everything is a string) and MySQL (where also everything is returned as a string). So in PHP you will often deal with numbers in strings, thus they should be handled as such.
 [2012-04-12 15:20 UTC] jpauli@php.net
I'd like to add that strcmp() and family are functions designed to compare 
strings, as they are in C, except that in PHP they are binary safe, like 
PHP strings are.
 [2012-04-12 15:55 UTC] yless42 at hotmail dot com
Wouldn't it make the most sense to compare the strings as strings (and thus pass in the original case), then fall back on other comparison methods when they don't match?  I admit I don't have test cases, but it seems that this would be backwards compatible in most cases (as you will eventually compare numerically) and fix the given issue.

Unless there are cases which rely on the two same strings failing to compare as equal.
 [2012-04-12 16:04 UTC] jacob at fakku dot net
I'm just gonna paste in that PHP Sadness article to show why this is such a big 
issue.

According to php language.operators.comparison, the type-coercing comparison 
operators will coerce both operands to floats if they both look like numbers, 
even if they are both already strings:

"If you compare a number with a string or the comparison involves numerical 
strings, then each string is converted to a number and the comparison performed 
numerically."
This can become especially important in situations where the developer chooses 
to use == to compare two values which will always be strings. For example, 
consider a simple password checker:

if (md5($password) == $hash) {
  print "Allowed!\n";
}

Assume that the $hash is loaded from a known safe string value from a database 
and contains a real MD5 hash. Now, suppose the $password is "ximaz", which has 
an all-numeric hex-encoded MD5 hash of "61529519452809720693702583126814". When 
PHP does the comparison, it will print "Allowed!" for any password which matches 
even the first half of the hash:

$ php -r 'var_dump("61529519452809720693702583126814" == 
"61529519452809720000000000000000");'
bool(true)

The solution, of course, is "never use type-coercing comparison operators" - but 
this remains an easily-overlooked bug factory for beginning and even 
intermediate developers. Some languages solve this situation by having two 
separate sets of comparison operators for numeric or string comparisons so that 
the developer can be explicit in their intent without needing to manually cast 
their arguments.
 [2012-04-12 16:53 UTC] rasmus@php.net
@jacob PHP has two sets of comparison operators as well. == and ===
They aren't numeric and string, they are loose and strict. In the majority of 
cases when dealing with HTTP requests and database results, which is what PHP 
deals with most, the loose comparison makes life easiest on the developer.

In your case, when comparing huge numeric strings that won't fit in any numeric 
type, a strict comparison is needed:

$ php -r 'var_dump("61529519452809720693702583126814" === 
"61529519452809720000000000000000");'
bool(false)

(and hopefully you aren't actually using md5 for password hashing)
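
For the hash-checking case specifically, a hedged sketch: besides ===, PHP 5.6 and later provide hash_equals(), which compares two strings byte for byte and does no numeric conversion. That function is newer than this thread, and the wrapper name hashes_match below is hypothetical, used only for illustration.

```php
<?php
// Hypothetical wrapper (not a PHP built-in). Requires PHP >= 5.6 for hash_equals().
function hashes_match($expected, $actual)
{
    // hash_equals() needs two strings; anything else is treated as a mismatch.
    if (!is_string($expected) || !is_string($actual)) {
        return false;
    }
    // Byte-for-byte comparison, with no numeric interpretation of the strings.
    return hash_equals($expected, $actual);
}

var_dump(hashes_match(
    '61529519452809720693702583126814',
    '61529519452809720000000000000000'
)); // bool(false)
```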
 [2012-04-12 17:03 UTC] jacob at fakku dot net
@rasmus

I just wanted to point out the issue mentioned in that article and how I felt it 
applied to this situation.

At least to me, it seems like a big deal when '9223372036854775807' == 
'9223372036854775808' returns true, even if it's an edge case. But you're right 
about just using ===, which I will do if I ever run into this situation. After 
doing a bit more research I can understand why it is the way it is and I was 
probably too hasty to jump into this thread.
 [2012-04-12 17:09 UTC] riel at surriel dot com
Conversion of numeric-looking strings to numbers does not have to be a problem, as long as the code in the back end uses arbitrary-precision math. This is slower than comparing a type that fits in a CPU register, but once you have already spent the time to do an automatic type conversion, that really does not matter.

When it comes to an operator like ==, every digit matters. Having == return true when two items are different violates the principle of least surprise.
 [2012-04-12 20:32 UTC] b at hotmail dot vom
I would like to point out Perl is a weakly typed language, just like PHP, and has 
no issue with these cases. It's pretty weak of the developers to hide behind 
the "But PHP is weakly typed!" argument.
 [2012-04-12 20:38 UTC] elementation at gmail dot com
It's absolutely unreal that this is even a discussion. PHP, the world doesn't 
take you seriously and with bugs like this you provide further fodder.

Principle of Least Surprise — this should be a string comparison.
 [2012-04-12 21:02 UTC] c at hotmail dot com
"In the majority of cases when dealing with HTTP requests and database results, which is what PHP deals with most, the loose comparison makes life easiest on the developer."

By 'the developer' I assume you mean people who can't type (string) or (int) ? No other language has this issue because they aren't designed around programmers who do not really understand how to program. Please make the developer's life easier by making comparisons make sense.
 [2012-04-12 21:23 UTC] vinny_182 at hotmail dot com
Equality is equality, and neither the string nor the numeric representations of the values 
are equal. The bug IMO is in the conversion from string to float: the conversion 
has failed but a valid value is still returned. That's just plain wrong. If you 
wrote unit tests for string to float conversions and this was the input you would 
expect it to return a null value or throw an exception.
 [2012-04-12 22:14 UTC] chx1975 at gmail dot com
Now, while I can understand why PHP chooses "1" == 1 (HTML, sure), I am not too 
sure how that is relevant when both sides are strings. I am not quite sure why 
the strings "1" and "1.0" would need to be ==. Just because "1" == 1 and "1.0" == 
1 does not mean "1" == "1.0". It's not transitive! Compare FALSE == 0; 0 == 'x'; 
'x' == TRUE. If it were transitive, then FALSE == TRUE; surely you don't 
want that.
 [2012-04-12 22:45 UTC] erowid at inbox dot lv
I want to marry it, lather this thread up, and have my way with it. I want to have little threads everywhere that are as funny as this xD
 [2012-04-13 01:10 UTC] the dot matt dot kantor at gmail dot com
@hholzgra:  Your only-coerce-on-failure proposal would not solve this issue.

Assuming that by "fail" you mean "the comparison evaluates to false", the strings would end up being coerced anyway (since they are indeed different), 
they'd become identical floats, and things would be the same as they are now.

If I misunderstood what you meant by "fail", then we'd lose "1" == "1.0", which I don't think is something that can (or should) happen.
 [2012-04-13 03:13 UTC] four dot zero dot one dot unauthorized at gmail dot com
This behavior is documented here:
http://php.net/manual/en/language.operators.comparison.php
"If you compare a number with a string or the comparison involves numerical strings, then each string is converted to a number and the comparison performed numerically. These rules also apply to the switch statement. The type conversion does not take place when the comparison is === or !== as this involves comparing the type as well as the value. "

Shouldn't this feature of converting numerical strings to numbers during loose comparison operations between two strings be dropped?  If a developer wanted to compare values given during POST or GET processing AS numbers, they should cast the inputs to (int) or (float) first.  There really should be a fundamental shift away from catering to developer laziness, and force developers to pay more attention to variable and input handling on their own.
 [2012-04-13 07:08 UTC] pajoye@php.net
ok, enough arguing. There is no bug here.
 [2012-04-13 07:08 UTC] pajoye@php.net
-Status: Verified +Status: Not a bug
 [2012-04-13 10:15 UTC] yohgaki@php.net
Just a comment for users who would like to use large numbers.

There are bcmath and gmp modules for large number arithmetic.
 [2012-04-13 10:53 UTC] sesser@php.net
This behaviour is for sure a bug. The == vs. === argument does not apply here.

PHP should not perform the type conversion for the comparison if the result of the 
type conversion does not fit into the actual type converted to.
 [2012-04-13 11:30 UTC] the dot assimilator at gmail dot com
This isn't just a bug, it's a summary of PHP as a language: broken by design.
 [2012-04-13 11:34 UTC] aharvey@php.net
Enough.

Gustavo has written a patch, the technical merits of which can be discussed 
somewhere with less noise. Additionally, it would be nice if the anti-PHP 
circlejerk took place somewhere other than PHP's bug tracker. Hacker News seems 
to enjoy it.

Closing the bug to public comments. Feel free to e-mail me about how I hate 
freedom, if it makes you feel better.
 [2012-04-13 11:34 UTC] aharvey@php.net
-Block user comment: No +Block user comment: Yes
 [2012-04-18 08:23 UTC] hholzgra@php.net
the dot matt dot kantor at gmail dot com: i stand corrected indeed
 [2012-05-13 21:48 UTC] stas@php.net
Automatic comment on behalf of stas
Revision: http://git.php.net/?p=php-src.git;a=commit;h=9344bf193c6e35c8706923953f3e63bb01cc05ed
Log: fix bug #54547
 [2012-05-13 21:51 UTC] stas@php.net
I've added Gustavo's patch to 5.4.
 [2012-05-13 21:51 UTC] stas@php.net
-Status: Not a bug +Status: Closed
 [2012-05-14 18:03 UTC] stas@php.net
Automatic comment on behalf of stas
Revision: http://git.php.net/?p=php-src.git;a=commit;h=47db8a9aa19f6e17a1018becf9978315c79a1cb0
Log: fix bug #54547
 [2012-05-15 07:45 UTC] mike@php.net
Automatic comment on behalf of stas
Revision: http://git.php.net/?p=php-src.git;a=commit;h=9344bf193c6e35c8706923953f3e63bb01cc05ed
Log: fix bug #54547
 [2012-05-20 13:41 UTC] kazuo at o-ishi dot jp
This change has a compatibility problem.

After this change,

 "01234" == "1234"
    => TRUE (OK)

but 

 "09223372036854775808" == "9223372036854775808"
    => FALSE

I think this behavior is not reasonable.
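
One way to get a consistent answer for both pairs above, independent of PHP version, is to compare the digit strings by value without converting them to int or float. This is only a sketch under stated assumptions: the helper name same_decimal_integer is hypothetical, it handles unsigned decimal integer strings only, and it is not what the committed patch does.

```php
<?php
// Hypothetical helper, not part of PHP: compares two unsigned decimal
// integer strings by value without ever converting them to int or float.
function same_decimal_integer($x, $y)
{
    $digits = '/^[0-9]+$/D';
    if (!is_string($x) || !is_string($y)
        || preg_match($digits, $x) !== 1 || preg_match($digits, $y) !== 1) {
        // Not plain digit strings: fall back to strict comparison.
        return $x === $y;
    }
    // Leading zeros do not change the value, so drop them before comparing.
    return ltrim($x, '0') === ltrim($y, '0');
}

var_dump(same_decimal_integer('01234', '1234'));                               // bool(true)
var_dump(same_decimal_integer('09223372036854775808', '9223372036854775808')); // bool(true)
var_dump(same_decimal_integer('9223372036854775807', '9223372036854775808'));  // bool(false)
```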
 [2012-05-30 10:32 UTC] kazuo at o-ishi dot jp
Related issue: #62097
https://bugs.php.net/bug.php?id=62097
 [2012-07-24 23:36 UTC] rasmus@php.net
Automatic comment on behalf of stas
Revision: http://git.php.net/?p=php-src.git;a=commit;h=47db8a9aa19f6e17a1018becf9978315c79a1cb0
Log: fix bug #54547
 [2013-11-17 09:32 UTC] laruence@php.net
Automatic comment on behalf of stas
Revision: http://git.php.net/?p=php-src.git;a=commit;h=47db8a9aa19f6e17a1018becf9978315c79a1cb0
Log: fix bug #54547
 
PHP Copyright © 2001-2014 The PHP Group
All rights reserved.
Last updated: Thu Apr 24 23:01:57 2014 UTC