php.net |  support |  documentation |  report a bug |  advanced search |  search howto |  statistics |  random bug |  login
Bug #62097 New behavior of string == has a compatibility problem
Submitted: 2012-05-21 17:09 UTC Modified: 2012-06-15 00:42 UTC
Votes:12
Avg. Score:4.1 ± 1.3
Reproduced:3 of 6 (50.0%)
Same Version:4 (133.3%)
Same OS:4 (133.3%)
From: kazuo at o-ishi dot jp Assigned: stas (profile)
Status: Closed Package: Scripting Engine problem
PHP Version: 5.4.4RC1 OS: Gentoo Linux
Private report: No CVE-ID: None
View Developer Edit
Welcome! If you don't have a Git account, you can't do anything here.
If you reported this bug, you can edit this bug over here.
(description)
Block user comment
Status: Assign to:
Package:
Bug Type:
Summary:
From: kazuo at o-ishi dot jp
New email:
PHP Version: OS:

Further comment on this bug is unnecessary.

 

 [2012-05-21 17:09 UTC] kazuo at o-ishi dot jp
Description:
------------
The behavior of string comparison using == operator is changed in
https://github.com/php/php-src/commit/47db8a9aa19f6e17a1018becf9978315c79a1cb0
to fix bug #54547.

This change has a compatibility problem.

Before this change (PHP 5.4.3),

  "01234" == "1234"
    => TRUE (OK)

  "09223372036854775808" == "9223372036854775808"
    => TRUE (compared as float, but result is acceptable)

After change,

  "01234" == "1234"
    => TRUE (OK)

  "09223372036854775808" == "9223372036854775808"
    => FALSE (compared as string)

This behavior is not reasonable in that case.
New rule is not clear.

I think this change should be reverted, before release of 5.4.4.


Test script:
---------------
echo (("01234" == "1234") ? "true" : "false"), "\n";
echo (("09223372036854775808" == "9223372036854775808") ? "true" : "false"), "\n";


Expected result:
----------------
true
true


Actual result:
--------------
true
false

Patches

Pull Requests

History

AllCommentsChangesGit/SVN commitsRelated reports
 [2012-05-21 23:05 UTC] cataphract@php.net
Thanks for your report.

I raised this issue in internals:

http://www.mail-archive.com/internals@lists.php.net/msg58598.html
(see near the end)

and this was deemed a "very narrow use case".

So I'm marking this Wont Fix now, but feel free to take this issue to the internals mailing list.
 [2012-05-21 23:05 UTC] cataphract@php.net
-Status: Open +Status: Wont fix
 [2012-05-21 23:09 UTC] cataphract@php.net
On a closer look, what's mentioned in the e-mail is not exactly the same thing because I was comparing a "always string" comparison and a string comparison as a fallback to a float comparison.

In any case, the problem you're mentioning could only be solved by a custom comparison function, a solution that was not favored in this discussion.
 [2012-05-22 06:22 UTC] kazuo at o-ishi dot jp
I'm not talking about new feature or functionality.
I'm talking about incompatible behavior which has been introduced by
https://github.com/php/php-src/commit/47db8a9aa19f6e17a1018becf9978315c79a1cb0
for bug #54547.


Other case.

Build 32bit PHP.  [PHP_INT_MAX = 2147483647]

Using PHP 5.3.3 (before change):

 "02147483647" == "2147483647"
    => TRUE (OK)

 "02147483648" == "2147483648"
    => TRUE (OK: compared as float, and float can handle this precision)


But using PHP 5.3.4RC1:

 "02147483647" == "2147483647"
    => TRUE (OK)

 "02147483648" == "2147483648"
    => FALSE (BAD!)
 [2012-05-22 09:48 UTC] kazuo at o-ishi dot jp
Sorry, typo in PHP version.
  PHP 5.3.3    -> PHP 5.4.3
  PHP 5.3.4RC1 -> PHP 5.4.4RC1
 [2012-05-22 10:03 UTC] cataphract@php.net
Ah, very well. The code is indeed wrong for 32-bit machines. I hadn't considered that.
 [2012-05-22 10:03 UTC] cataphract@php.net
-Status: Wont fix +Status: Assigned -Assigned To: +Assigned To: cataphract
 [2012-05-22 11:00 UTC] cataphract@php.net
-Assigned To: cataphract +Assigned To: stas
 [2012-05-23 23:59 UTC] stas@php.net
Automatic comment on behalf of cataphract
Revision: http://git.php.net/?p=php-src.git;a=commit;h=acd711685a592c52be200e248154283c6c49c9f8
Log: Fixed bug #62097
 [2012-05-24 09:10 UTC] cataphract@php.net
Automatic comment on behalf of cataphract
Revision: http://git.php.net/?p=php-src.git;a=commit;h=acd711685a592c52be200e248154283c6c49c9f8
Log: Fixed bug #62097
 [2012-05-26 11:52 UTC] cataphract@php.net
-Status: Assigned +Status: Closed
 [2012-05-26 11:52 UTC] cataphract@php.net
This bug has been fixed in SVN.

Snapshots of the sources are packaged every three hours; this change
will be in the next snapshot. You can grab the snapshot at
http://snaps.php.net/.

 For Windows:

http://windows.php.net/snapshots/
 
Thank you for the report, and for helping us make PHP better.


 [2012-05-27 18:11 UTC] kazuo at o-ishi dot jp
-Status: Closed +Status: Assigned
 [2012-05-27 18:11 UTC] kazuo at o-ishi dot jp
Hi,

This is one more case in 32-bit build:

  "1234567890123456789" == "12345678901.23456789E8"
     => TRUE in 5.4.3
        FALSE in latest


In addition, as I had already reported,

(64-bit environment)

   "9223372036854775808" == "09223372036854775808"
     => TRUE in 5.4.3
        FALSE in latest

   "9223372036854775808" == " 9223372036854775808"
     => TRUE in 5.4.3
        FALSE in latest

(32-bit environment)
  
   "9007199254740992" == "9007199254740992."
     => TRUE in 5.4.3
        FALSE in latest

   "9007199254740992" == " 9007199254740992"
     => TRUE in 5.4.3
        FALSE in latest

   "9007199254740992" == "09007199254740992"
     => TRUE in 5.4.3
        FALSE in latest

----------------------
I think that NEW RULE of == comparing is difficult
to understand and there are some incompatibility from OLD RULE.
At least, it is needed to be described explicitly.
(Of course, OLD RULE is already complex enough...)

OLD RULE:

 When both strings look like numbers, they are converted to numbers
 before == comparing.

 Conversion rule from string to number is described in
 http://www.php.net/manual/en/language.types.string.php#language.types.string.conversion
 
    If the string does not contain any of the characters '.', 'e', or
    'E' and the numeric value fits into integer type limits (as
    defined by PHP_INT_MAX), the string will be evaluated as an
    integer. In all other cases it will be evaluated as a float.

And NEW RULE?
 [2012-05-28 03:00 UTC] stas@php.net
-Assigned To: stas +Assigned To: cataphract
 [2012-05-28 09:07 UTC] cataphract@php.net
-Status: Assigned +Status: Closed
 [2012-05-28 09:07 UTC] cataphract@php.net
The old rule is:
If the strings look like numbers (i.e they follow the notation for a decimal or hexadecimal integer once any leading whitespace or leading zeros -- immediately before the first non-zero digit -- are ignored), then they are compared as numbers, except if the conversion result in infinite values with the same side, in which case they are compared as strings. The number comparison is a double comparison if any of the strings is converted to a double (due to a decimal separator, exponent or the number being too large in absolute value) and it's an integer comparison otherwise.

The new rule is the same as the old rule, with the following limitation of numeric comparisons:
If both strings look like integers (no decimal separator nor exponent) but they were both converted to doubles because of being too large in absolute value, if they both compare equal in a double comparison, and if they're both larger than 2^53-1 in absolute value, then compare them as a string.

In light of this:

"1234567890123456789" == "12345678901.23456789E8" (32-bit)
1234567890123456789 > 2^31-1, so it cannot be represented as long => converted to double => compared as double => compares equal => 1234567890123456789 > 2^53-1  => compare as string => FALSE

"9223372036854775808" == "09223372036854775808" (64-bit)
1234567890123456789 > 2^63-1, the same follows

"9223372036854775808" == " 9223372036854775808" (64-bit)
idem

For the rest of the 32-bit examples, 9007199254740992 is also larger than 2^53-1, so the the string comparison will be triggered.

I agree that this is not a model of simplicity -- I'd have preferred a custom string comparison for number-like values -- but it's working as expected.
 [2012-05-28 10:29 UTC] kazuo at o-ishi dot jp
-Status: Closed +Status: Assigned
 [2012-05-28 10:29 UTC] kazuo at o-ishi dot jp
> The new rule is the same as the old rule, with the following
> limitation of numeric comparisons: If both strings look like integers
> (no decimal separator nor exponent) but they were both converted to
> doubles because of being too large in absolute value, if they both
> compare equal in a double comparison, and if they're both larger than
> 2^53-1 in absolute value, then compare them as a string.

Why this change is Right, in spite of breaking backward compatibility
from 5.4.3 to 5.4.4?  What kind of tests are needed to the users to
migrate PHP 5.4.3 to PHP 5.4.4 safely?

I know this change was introduced to fix Bug #54547, but it's side
effects are not small.

I agree your first comment on #54547 :

>> Maybe this should be Won't Fix to keep it consistent with
>> 9223372036854775807 == 9223372036854775808 (with number literals).

------------------

> I agree that this is not a model of simplicity -- I'd have preferred a
> custom string comparison for number-like values -- but it's working as
> expected.

This need "Backward Incompatible Changes" section like
http://www.php.net/manual/en/migration54.incompatible.php ...

Please keep compatibility. 
Even though this change will be done, I would like you to do it for PHP 5.5.
 [2012-05-28 11:02 UTC] cataphract@php.net
I'm reassigning to stas, as which branches changes go to is primarily his decision.
 [2012-05-28 11:02 UTC] cataphract@php.net
-Assigned To: cataphract +Assigned To: stas
 [2012-05-29 01:14 UTC] stas@php.net
-Status: Assigned +Status: Feedback
 [2012-05-29 01:14 UTC] stas@php.net
Old comparison ("09223372036854775808" == "9223372036854775808") is a problem 
since it may lead to md5() hashes accepting wrong passwords. This has very high 
risk of negative consequences for many PHP users. Thus we decided to fix it 
ASAP.
This change can not be backward compatible - since the whole point was to change 
how this comparison works. 

I have very hard time understanding why your code would rely on comparison 
between two completely distinct strings return "true" - what exactly this code 
does and why it relates on == comparison truncating long numbers? Maybe if you 
explain the need better there would be a reason to postpone this change but for 
now I do not see a reason that would override very real security concern from 
bug #54547.
 [2012-05-29 03:25 UTC] kazuo at o-ishi dot jp
-Status: Feedback +Status: Assigned
 [2012-05-29 03:25 UTC] kazuo at o-ishi dot jp
> Old comparison ("09223372036854775808" == "9223372036854775808") is
> a problem since it may lead to md5() hashes accepting wrong
> passwords. This has very high risk of negative consequences for many
> PHP users. Thus we decided to fix it ASAP.

Clearly, it have to be compared using === instead of ==.

We should issue the statement at the PHP site:

  "String to string comparison using == has many problem including
   security problems.  We strongly recommend to use === instead."

And we should add new Security section at
http://www.php.net/manual/en/security.php

In addition, string to string comparison using == should be mentioned as NOT
RECOMMENDED explicitly on
http://www.php.net/manual/en/language.operators.comparison.php

These actions can be done at right now, and it is effective for all
PHP users including users of old versions.


Fortunately, it's rare the return value is constructed only decimal
number since md5() returns 32-character hexdecimal string
[(10/16)^32 -> 0.000029% ?].
And, for the application developers, it is comparatively easy to point
out and correct a part with such a problem.


> This change can not be backward compatible - since the whole point
> was to change how this comparison works.

Security fix which breaks compatibility in wide and unexpectable area
will not work well, because it makes the users difficult to migrate to new version.


> I have very hard time understanding why your code would rely on comparison 
> between two completely distinct strings return "true" - what exactly this code 
> does and why it relates on == comparison truncating long numbers?

That's HISTORICAL REASON.
There are many of legacy code with == comparison in the world,
they need to work AS IS.

If there is such a incompatibility, the maintainers of that code have
to check whole code using == operator before upgrading to PHP 5.4.4.
 [2012-05-29 05:33 UTC] stas@php.net
I am sorry, but I still do not see the explanation why any code would rely on 
comparison such as "09223372036854775808" == "9223372036854775808" to return 
true. If you have such reason - please bring it forward. Repeating "HISTORICAL 
REASON" does not explain anything. If you have valid use case why it needs to be 
preserved - please provide it ASAP. Otherwise the only thing to conclude will be 
that no application has a valid reason to rely on that and the right thing will 
be to change it, as this code should behave this way from the start.
 [2012-05-29 09:04 UTC] kazuo at o-ishi dot jp
Interface to user inputs or external format may rely on current
(released version of ) PHP behavior, ignore leading zeros or whitespaces.
I'm sorry that I cannot show you simple example.


Instead, this is example to show another incompatibility:

There is pgsql database table created by
  CREATE TABLE t (k INT, v NUMERIC(30,1));

This code print "true" on PHP 5.4.3 but "false" on latest.

-------------------------------------------------
<?php
$key = "1";
$num = "12345678901234567890";

$con = pg_connect("...");
pg_query($con, "DELETE FROM t");
pg_query($con, "INSERT INTO t (k, v) VALUES ($key, $num)");
$rs = pg_query($con, "SELECT v FROM t WHERE k = $key");
$row = pg_fetch_row($rs);
if ($row[0] == $num) {
    echo "true";
} else {
    echo "false";
}
-------------------------------------------------
 [2012-05-30 06:17 UTC] stas@php.net
Automatic comment on behalf of cataphract
Revision: http://git.php.net/?p=php-src.git;a=commit;h=78ff9ebb6bb501dff995727512c38fdeff50021b
Log: Fixed bug #62097
 [2012-05-30 07:09 UTC] stas@php.net
-Status: Assigned +Status: Feedback
 [2012-05-30 07:09 UTC] stas@php.net
What is in $row[0] in your example?
 [2012-05-30 07:18 UTC] kazuo at o-ishi dot jp
-Status: Feedback +Status: Assigned
 [2012-05-30 07:18 UTC] kazuo at o-ishi dot jp
Please think that it is money (or member ID).
But why?
 [2012-05-30 08:57 UTC] kazuo at o-ishi dot jp
In summary, this is my opinion:

Recent changes on string == have problems on compatibility.

Impact of the behavior change of == operator is heavy.
This change should be excluded from PHP 5.4.x series.

I'm sure that current (old) behavior also have problems, but
compatibility is better.


If it will be changed, it should be a natural extension of current
behavior; number-like string is compared as number.
(e.g. canonicalize string as number way before memcmp.)

Of course, such behavior should be described explicitly in PHP Manual.
 [2012-05-31 00:46 UTC] stas@php.net
I do not see "heavy" impact - so far I did not see any code sample that did 
something that makes sense in 5.4.3 but not on 5.4.4. If you have such code 
sample and can explain what data it accepts, what it does and why it relies on 
string comparisons cutting numbers, please do so. Your database example is 
missing data, so I can not see what is going on there and why you think it works 
differently in 5.4.3 and 5.4.4.
 [2012-05-31 02:36 UTC] kazuo at o-ishi dot jp
I have shown test cases that work on released version 5.4.3
but not work on developing version.

Now, YOU need explain real merit of this backward incompatible change.
md5() is not enough reason, because it should always be compared by
=== instead of ==.

Generally, at the case when new behavior (memcmp for large
value) is acceptable, we can and we should just use ===.


> If you have such code 
> sample and can explain what data it accepts, what it does and why it relies on 
> string comparisons cutting numbers, please do so. Your database example is 
> missing data, so I can not see what is going on there and why you think it works 
> differently in 5.4.3 and 5.4.4.

(I'm sorry but I cannot understand what you say in this two sentence.
Could you explain detail?)


In JPY (Japan Yen), we normally use it in integer (e.g. 100 yen).
But sometimes it take fraction (e.g. foreign exchange 1 USD = 78.80 JPY).
So database column type with fraction is reasonable.
And set to / get from the column in integer form is also reasonable.

Again, I just report incompatibility from PHP 5.4.3 to PHP 5.4.4RC.

This is wrong way if you want to fix security problem, because
incompatible change makes the users difficult to migrate to new version.
 [2012-05-31 06:23 UTC] stas@php.net
-Status: Assigned +Status: Wont fix
 [2012-05-31 06:23 UTC] stas@php.net
I believe I did explain the reason and I believe this is reason enough. If you 
disagree, please feel free to raise it on internals list and if enough of the 
community supports you it will be reversed. So far I did not hear any more 
complaints about it. I think it is clear that there is a disagreement between us 
about how to handle this, and more discussion is not going to bring any 
improvement. I am closing this bug, if you feel more discussion is required 
please raise it on the list.
 [2012-06-01 20:44 UTC] steve at home dot com
kazuo at o-ishi dot jp:

Just don't expect the PHP type system to make any sense, it's easier than trying to understand it at this point. Obviously the developer you are talking to does not understand this issue. If you want to have reliable behavior, PHP is not for you (unless you stick to a single version and work out the oddities).
 [2012-06-01 21:26 UTC] stas@php.net
-Block user comment: No +Block user comment: Yes
 [2012-06-01 21:26 UTC] stas@php.net
OK, we got trolls here, so I'm blocking comments to it. If you want to continue 
discussion, you're welcome to raise it on the list.
 [2012-06-15 00:42 UTC] hholzgra@php.net
Ok, so the problem is not only with leading zeros but also with trailing decimals
, eg.

  "12345678901234567890" == "12345678901234567890.0"

will now return false while it returned true before (that is the essence of the pgsql example further above).

This in combination with NUMERIC or DECIMAL fixed point values in databases does indeed look like a more likely BC issue to run into than the leading zero case. 

On the other hand we're in "never compare floats for equality" land here ...
 [2014-10-07 23:25 UTC] stas@php.net
Automatic comment on behalf of cataphract
Revision: http://git.php.net/?p=php-src-security.git;a=commit;h=acd711685a592c52be200e248154283c6c49c9f8
Log: Fixed bug #62097
 [2014-10-07 23:25 UTC] stas@php.net
-Status: Wont fix +Status: Closed
 [2014-10-07 23:36 UTC] stas@php.net
Automatic comment on behalf of cataphract
Revision: http://git.php.net/?p=php-src-security.git;a=commit;h=acd711685a592c52be200e248154283c6c49c9f8
Log: Fixed bug #62097
 
PHP Copyright © 2001-2024 The PHP Group
All rights reserved.
Last updated: Sat Nov 23 08:01:28 2024 UTC