Request #48012 String "0" conversion to boolean FALSE is not logical.
Submitted: 2009-04-18 19:05 UTC Modified: 2009-04-18 21:11 UTC
Votes:12
Avg. Score:4.6 ± 0.8
Reproduced:12 of 12 (100.0%)
Same Version:5 (41.7%)
Same OS:7 (58.3%)
From: dave at ifox dot com Assigned:
Status: Wont fix Package: Feature/Change Request
PHP Version: 5.3.0RC1 OS: All
Private report: No CVE-ID: None

 [2009-04-18 19:05 UTC] dave at ifox dot com
Description:
------------
PHP Developers,

I typed a very logical explanation of why more careful thought should have been put into the addition of this conversion:

  "0"  =>  FALSE

but the submission failed and so I will try to reproduce it.

When I learned about the exception of the string "0" converting to a boolean false, I had a sick feeling in my stomach. An entire team of language developers fell down the slippery slope created by a handful of programmers in 2003 and 2004 who wanted to shorten their code and thought that "0" -> TRUE was a bug.

Most other languages convert a non-empty string to a boolean TRUE, because they follow this logic:  the only meaning that SHOULD be derived from an alphanumeric string is its alphanumeric content, unless it is first cast to a numeric (or other) type.  For a language that tries to bring in the best of other languages, PHP should at least mimic the logical behavior of those languages.

I have edited dozens of web sites to fix well-structured code by skilled programmers that didn't expect this behavior, particularly when checking for the existence of strings from databases or user input (form POSTs).  As an example:

  if ($_POST["uid"] && ...) { ... }

now must be changed to:

  if (strlen($_POST["uid"]) && ...) { ... }

and sometimes there are dozens of these in series.  This invites programmers to be sloppy and just let "0" fail as an unusual case even when it is valid input.  And even skilled programmers can't be expected to read and catch this tiny exception in your documentation.
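
The presence check described above can be sketched as follows. This is only an illustration: `has_content()` is a hypothetical helper (not a PHP built-in), `$post` is a stand-in for `$_POST`, and the type declarations assume PHP 7 or later.

```php
<?php
// Hypothetical helper (not a PHP built-in): reports whether a form field
// was submitted with any content at all, so that "0" counts as present.
function has_content(array $input, string $key): bool
{
    return isset($input[$key])
        && is_string($input[$key])
        && $input[$key] !== '';
}

// Stand-in for $_POST so the sketch runs from the command line.
$post = ['uid' => '0', 'name' => '', 'note' => 'hello'];

var_dump((bool) $post['uid']);            // bool(false): truthiness treats "0" as absent
var_dump(has_content($post, 'uid'));      // bool(true):  content test treats "0" as present
var_dump(has_content($post, 'name'));     // bool(false): empty string is still absent
var_dump(has_content($post, 'missing'));  // bool(false): missing key, no notice raised
```

Comparing against the empty string does the same job as strlen() here, and the isset() guard also covers a field that was never posted.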

There are other ramifications from not respecting the alphanumeric string for the purpose it was intended: to hold alphanumeric values.  I will not expound on them here.  The fact that form POSTs yield strings should not be streamlined into the assumption that a user "meant" to enter an integer.  Only a beginning programmer wanting a shortcut hack would expect or want this behavior.

And what of " 0", "0 ", "00", "false", and "0.0"?  They are respected!  Originally I thought the "0" conversion was an attempt to make bool(string(FALSE)) == FALSE, but it already does since string(FALSE) == "" (although in other languages, it yields TRUE, because it is appropriate to conclude that the string representation of any boolean value has length and is therefore TRUE).

JavaScript, as a common example, understands what an alphanumeric string is for:

  <script>
    if ('0') document.write('YES');
  </script>

This yields "YES", of course.  Unfortunately there are now people out there posting that JavaScript is broken.  But they are beginners, of course.  A language should never make assumptions about a programmer's users' intent when providing input.  If a user intends "0" as a string, why assume that is a numerical value?  Don't you need to now assume all sorts of zero-value strings?

I have developed two loosely-typed languages, and I made the choice to treat non-empty strings as TRUE.  I find PHP for the web very usable, but I was completely surprised by this choice, and I'm sorry to say that it has resulted in high-quality code yielding unexpected subtle failures for its users.

I modified PHP (14 or so changes in the Zend engine) to remove the feature and I made a patch.  The ill stomach went away, and PHP now respects alphanumeric strings, but now I am uniquely conscious in a world of assumptions yielding unexpected results.  I guess that's the beauty and curse of open source.

What were the thought processes in creating this "feature"?  Please consider its removal!

Dave May


Reproduce code:
---------------
<?php
$str = "0";
// The string "0" converts to boolean false, so the else branch runs.
if ($str) echo "TRUE"; else echo "FALSE";
---
From manual page: language.types.boolean
---

Expected result:
----------------
TRUE

Actual result:
--------------
FALSE

 [2009-04-18 19:28 UTC] rasmus@php.net
PHP is first and foremost a Web language, not a general-purpose scripting language.  Since the Web is not typed and everything is a string, I had to do things slightly differently early on to make PHP do what people expected.  Specifically, "123"==123 needs to be true in order to not have to type cast every single numeric user input.  Given that, then it also follows that '0'==0 and if you continue with that and consider that 0==false then it makes sense that '0'==false.

However, '0'===false is, of course, false.  This is why we have the strict type-comparison operators in PHP.  
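
As a rough illustration of the chain of comparisons described above (the array names `$loose` and `$strict` are made up for this sketch), the loose and strict operators can be put side by side:

```php
<?php
// Loose comparison (==) applies type juggling before comparing.
$loose = [
    "'123' == 123"  => ('123' == 123),
    "'0' == 0"      => ('0' == 0),
    "0 == false"    => (0 == false),
    "'0' == false"  => ('0' == false),
];

// Strict comparison (===) also requires both operands to have the same type.
$strict = [
    "'123' === 123" => ('123' === 123),
    "'0' === false" => ('0' === false),
];

// Print each expression next to its result.
foreach ($loose + $strict as $expr => $result) {
    echo $expr, ' is ', var_export($result, true), "\n";
}
```

Every loose comparison in this sketch evaluates to true and every strict one to false.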

Basically if we change '0' to be true, then we also have to trickle that change up resulting in '123'!=123 which would break every app out there.  So, while I understand your point, it simply isn't going to happen.
 [2009-04-18 20:10 UTC] dave at ifox dot com
Thanks for the response, but I see you misunderstood my post.  I was talking about string-to-boolean conversion, not string-to-integer conversion.

Implicit conversion of string to integer is correct in PHP:

  '0' == 0      yields TRUE, of course
  '123' == 123  yields TRUE, of course
  ' 45 ' == 45  yields TRUE, of course
  '' == 0       yields TRUE, of course

I'm talking only about the special case that doesn't hold up well logically.  In conversion to boolean (explicit or implicit):

  'Hello'       yields TRUE, string is non-empty
  ' '           yields TRUE, string is non-empty
  '123'         yields TRUE, string is non-empty
  '00'          yields TRUE, string is non-empty (regardless of zero value)
  ' 0'          yields TRUE, string is non-empty (regardless of zero value)
  '0 '          yields TRUE, string is non-empty (regardless of zero value)
  '0.0'         yields TRUE, string is non-empty (regardless of zero value)
  '0x0'         yields TRUE, string is non-empty (regardless of zero value)
  ''            yields FALSE, string is empty
  '0'           yields FALSE, even though string is non-empty, simply because of a single ASCII '0' ???
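
The list above can be checked with a short script. This is a minimal sketch; the variable names `$samples` and `$falsy` are chosen only for illustration.

```php
<?php
// Strings taken from the list above.
$samples = ['Hello', ' ', '123', '00', ' 0', '0 ', '0.0', '0x0', '', '0'];

$falsy = [];
foreach ($samples as $s) {
    $asBool = (bool) $s;  // explicit conversion; if ($s) converts the same way
    printf("%-8s => %s\n", var_export($s, true), $asBool ? 'TRUE' : 'FALSE');
    if (!$asBool) {
        $falsy[] = $s;    // collect the strings that convert to FALSE
    }
}
```

After the loop, `$falsy` holds only the empty string and the single-character string '0'.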

But wait, '0' is an alphanumeric string! PHP is now one of the very few languages, web or otherwise, that make an assumption about a string's NUMERIC value when converting to a BOOLEAN. It may have been more appropriate to follow the languages that only analyze the presence of CONTENT in the string.

Am I at least making sense?

Obviously you won't take the step to assuming '0.0' is false, or any of the silly ideas people have submitted as reports ('false'), but why take the initial step?

By the same logic that you would argue '0.0' is an alphanumeric string and ' 0 ' is an alphanumeric string, and should not be interpreted as a boolean false, you should argue that '0' gets the same protection from coercion to a numeric value just for the boolean evaluation (and test).

Of course, you could have deviated more drastically and performed a numerical evaluation of every string to see whether it contains a zero number. That would be incredibly inefficient, but it would follow your logic.

If you do see my point, and have compared to other languages, you may see what I'm talking about.

A change of this behavior would yield fewer errors by programmers that liked the C-, Java- and JavaScript-esque beauty of PHP but didn't catch it in the documentation (most programmers). However, I realize why you might not change the behavior -- existing code might make assumptions about user input that is intended as numerical and where tests against that numerical value are made.  But realize those programs are already having to convert to integer anyway, because " 0" would be interpreted as non-zero.  You see?

I just wanted to comment, and hope that you recognize that this is unusual, and it leads to broken programs -- I know, I've fixed them!

Thanks of course!
 [2009-04-18 20:38 UTC] rasmus@php.net
No, I didn't misunderstand.  
if($val) is equivalent to if($val==true)
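
A small sketch of that equivalence, using a handful of sample values picked only for illustration (`$values` and `$mismatches` are made-up names):

```php
<?php
// Sample values chosen for illustration only.
$values = ['0', '', '0 ', '00', 'joe', 0, 1, null, [], [0]];

$mismatches = 0;
foreach ($values as $val) {
    $viaIf      = $val ? true : false;  // what if ($val) decides
    $viaCompare = ($val == true);       // loose comparison against true
    if ($viaIf !== $viaCompare) {
        $mismatches++;
    }
    printf("%-6s if: %-5s == true: %s\n",
        var_export($val, true) === "array (\n)" ? '[]' : json_encode($val),
        var_export($viaIf, true),
        var_export($viaCompare, true));
}
```

For these values the two tests agree, because comparing anything with a boolean first converts the other operand to boolean.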
 [2009-04-18 21:11 UTC] dave at ifox dot com
That's not what I was saying was being misinterpreted.

You took a failed logical path.  To quote you:

---
Specifically, "123"==123 needs to be true in order to not have to type cast every single numeric user input.  Given that, then it also follows that '0'==0 and if you continue with that and consider that 0==false then it makes sense that '0'==false.
---

And yet '0 '==true?  (note the space here and after)

By your own logic, '0 '==0 and therefore it makes sense that '0 '==false.  But '0 '==true in PHP.  So you DO have to "type cast every single numeric user input" (your words), whereas your point is that because of this special '0' check, you don't.  False conclusion.  The purpose of the check for a single alphanumeric zero in a string gained you nothing, you see.  And trimming the string doesn't correct '00', or '-0', to further illustrate.

Of course it shouldn't. My point is that you *should* have to cast strings to integer if doing a simple non-zero test.
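
To make the difference concrete, here is a minimal sketch comparing the plain truthiness test with an explicit integer cast. The names `$inputs`, `$truthy` and `$nonZero` are illustrative only.

```php
<?php
// Zero-like inputs mentioned in this thread, plus one non-zero value.
$inputs = ['0', '0 ', ' 0', '00', '-0', '7'];

$truthy  = [];  // strings that pass a plain if ($s) test
$nonZero = [];  // strings that pass an explicit integer non-zero test
foreach ($inputs as $s) {
    if ($s) {
        $truthy[] = $s;
    }
    if ((int) $s !== 0) {
        $nonZero[] = $s;
    }
}

echo 'truthy:   ', json_encode($truthy), "\n";
echo 'non-zero: ', json_encode($nonZero), "\n";
```

Only '0' fails the truthiness test, while the integer cast treats every zero-like string the same way and lets only '7' through.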

Of course the additional problem is now you have to call strlen() to check for the presence of content, to sidestep this check.

I will note that by your logic, since 'joe'==0, it should follow that 'joe'==false. Hence the contradictions that arise when you try to prove your point.

It's not a bug, and it's documented. No problem with that. Apparently it's been this way from as early as 2002. If I don't like that PHP is illogical, I can write my own language. Oh wait, I did. I'm just trying to help you folks out, and make it so these programs that end up on my desk don't have such strange behavior.

Cheers!
 