[2009-04-18 19:05 UTC] dave at ifox dot com
Description:
------------
PHP Developers,
I typed a very logical explanation of why more careful thought should have been put into the addition of this conversion:
"0" => FALSE
but the submission failed and so I will try to reproduce it.
When I learned about the exception of the string "0" converting to a boolean false, I had a sick feeling in my stomach. An entire team of language developers fell down the slippery slope caused by a handful of programmers in 2003 and 2004 that wanted to shorten their code, and thought that "0" -> TRUE was a bug.
Every other language in existence converts a non-empty string to a boolean TRUE, because they prevail in this logic: the only meaning that SHOULD be derived from an alphanumeric string is its alphanumeric content, unless first cast to a numeric (or other) type. For a language that tries to bring in the best of other languages, PHP should at least mimic the logical behavior of those languages.
I have edited dozens of web sites to fix well-structured code by skilled programmers that didn't expect this behavior, particularly when checking for the existence of strings from databases or user input (form POSTs). As an example:
if ($_POST["uid"] && ...) { ... }
now must be changed to:
if (strlen($_POST["uid"]) && ...) { ... }
and sometimes there are dozens of these in series. This begs programmers to be sloppy and just let "0" fail as an unusual case even when valid as input. And even skilled programmers can't be expected to read and catch this tiny exception in your documentation.
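One way to avoid repeating the strlen() test in every condition is a small presence check. This is only a sketch: has_value() is a hypothetical helper name, not a PHP built-in and not part of the original report, and the type declarations assume PHP 7 or later.

```php
<?php
// Hypothetical helper: true when the key exists and holds a non-empty string.
// The string "0" counts as present, unlike a plain truthiness test.
function has_value(array $input, string $key): bool
{
    return isset($input[$key]) && is_string($input[$key]) && $input[$key] !== '';
}

$post = ['uid' => '0', 'name' => '']; // stand-in for $_POST

var_dump(has_value($post, 'uid'));     // bool(true)
var_dump(has_value($post, 'name'));    // bool(false)
var_dump(has_value($post, 'missing')); // bool(false)
var_dump((bool) $post['uid']);         // bool(false): the plain test rejects "0"
```

The isset() call also avoids the undefined-index notice that the bare `$_POST["uid"]` test raises when the field is absent.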
There are other ramifications from not respecting the alphanumeric string for the purpose it was intended: to hold alphanumeric values. I will not expound here. The fact that form POSTs yield strings should not be stretched into the assumption that a user "meant" to enter an integer. Only a beginning programmer wanting a shortcut hack would expect or want this behavior.
And what of " 0", "0 ", "00", "false", and "0.0"? They are respected! Originally I thought the "0" conversion was an attempt to make bool(string(FALSE)) == FALSE, but it already does since string(FALSE) == "" (although in other languages, it yields TRUE, because it is appropriate to conclude that the string representation of any boolean value has length and is therefore TRUE).
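The cases listed above can be checked with a short script. This is a minimal sketch; the variable names $samples and $results are illustrative only.

```php
<?php
// Strings discussed above, plus the empty string for comparison.
$samples = ['', '0', ' 0', '0 ', '00', '0.0', 'false'];

$results = [];
foreach ($samples as $s) {
    $results[$s] = (bool) $s; // same conversion that if ($s) applies
    printf("%-8s => %s\n", var_export($s, true), $results[$s] ? 'TRUE' : 'FALSE');
}

// The string form of FALSE is the empty string, so it converts back to FALSE.
var_dump((string) false === ''); // bool(true)
```

Only '' and '0' come out FALSE; every other sample is TRUE.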
JavaScript, as a common example, understands what an alphanumeric string is for:
<script>
if ('0') document.write('YES');
</script>
This yields "YES", of course. Unfortunately there are now people out there posting that JavaScript is broken. But they are beginners, of course. A language should never make assumptions about a programmer's users' intent when providing input. If a user intends "0" as a string, why assume that is a numerical value? Don't you need to now assume all sorts of zero-value strings?
I have developed two loosely-typed languages, and I made the choice to treat non-empty strings as TRUE. I find PHP for the web very usable, but I was completely surprised by this choice, and I'm sorry to say that it has resulted in high-quality code yielding unexpected subtle failures for its users.
I modified PHP (14 or so changes in the Zend engine) to remove the feature and I made a patch. The ill stomach went away, and PHP now respects alphanumeric strings, but now I am uniquely conscious in a world of assumptions yielding unexpected results. I guess that's the beauty and curse of open source.
What were the thought processes in creating this "feature"? Please consider its removal!
Dave May
Reproduce code:
---------------
$str = "0";
if ($str) echo "TRUE"; else echo "FALSE";
---
From manual page: language.types.boolean
---
Expected result:
----------------
TRUE
Actual result:
--------------
FALSE
Thanks for the response, but I see you misunderstood my post. I was talking about string-to-boolean conversion, not string-to-integer conversion.
Implicit conversion of string to integer is correct in PHP:
'0' == 0 yields TRUE, of course
'123' == 123 yields TRUE, of course
' 45 ' == 45 yields TRUE, of course
'' == 0 yields TRUE, of course
I'm talking only about the special case that doesn't hold up well logically. In conversion to boolean (explicit or implicit):
'Hello' yields TRUE, string is non-empty
' ' yields TRUE, string is non-empty
'123' yields TRUE, string is non-empty
'00' yields TRUE, string is non-empty (regardless of zero value)
' 0' yields TRUE, string is non-empty (regardless of zero value)
'0 ' yields TRUE, string is non-empty (regardless of zero value)
'0.0' yields TRUE, string is non-empty (regardless of zero value)
'0x0' yields TRUE, string is non-empty (regardless of zero value)
'' yields FALSE, string is empty
'0' yields FALSE, even though string is non-empty, simply because of a single ASCII '0' ???
But wait, '0' is an alphanumeric string! PHP is now the only language in the world, web or otherwise, that would make an assumption about a string's NUMERIC value when converting to a BOOLEAN. It may have been more appropriate to follow other languages which only analyze the presence of CONTENT in the string. Am I at least making sense?
Obviously you won't take the step to assuming '0.0' is false, or any of the silly ideas people have submitted as reports ('false'), but why take the initial step? By the same logic that you would argue '0.0' is an alphanumeric string and ' 0 ' is an alphanumeric string, and should not be interpreted as a boolean false, you should argue that '0' gets the same protection from coercion to a numeric value just for the boolean evaluation (and test).
Of course, you could have deviated more drastically and performed a numerical evaluation of every string to check whether it represents zero. That would be incredibly inefficient, but it would follow your logic. If you do see my point, and have compared to other languages, you may see what I'm talking about. A change of this behavior would yield fewer errors by programmers that liked the C-, Java- and JavaScript-esque beauty of PHP but didn't catch it in the documentation (most programmers).
However, I realize why you might not change the behavior -- existing code might make assumptions about user input that is intended as numerical and where tests against that numerical value are made. But realize those programs already have to convert to integer anyway, because " 0" would be interpreted as non-zero. You see?
I just wanted to comment, and hope that you recognize that this is unusual, and it leads to broken programs -- I know, I've fixed them! Thanks of course!