Doc Bug #71598 Whitespace at the start of a function name counts as part of the function name.
Submitted: 2016-02-15 12:48 UTC Modified: 2016-02-15 13:16 UTC
From: lucas at lucassifoni dot info Assigned:
Status: Open Package: Scripting Engine problem
PHP Version: 7.0.3 OS: Debian 8
Private report: No CVE-ID: None

 [2016-02-15 12:48 UTC] lucas at lucassifoni dot info
Description:
------------
According to the PHP doc, «A valid function name starts with a letter or underscore, followed by any number of letters, numbers, or underscores.»

Some whitespace characters, like U+2000, U+2001, U+2002, U+2003, U+2004 [...], and the zero-width no-break space (U+FEFF, whose joining role has since passed to the word joiner, U+2060), are interpreted as part of a function name.

While this isn't a problem with a proper code editor that highlights "special" whitespace, it still conflicts with the doc and can be really confusing to users.

The given snippet has a zero-width no-break space just in front of one occurrence of foo. The declared name and the called name therefore differ, so invoking foo() fails with an undefined function error.

Test script:
---------------
<?php

function foo(){
    echo 'bar';
}

foo();

Expected result:
----------------
bar

Actual result:
--------------
Fatal error: Uncaught Error: Call to undefined function foo() in [...][...]:7 Stack trace: #0 {main} thrown in [...][...] on line 7 
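
One way to spot this kind of invisible character is to look at the raw bytes of each identifier. The following is only a sketch: findNonAsciiIdentifiers() is a hypothetical helper, not part of PHP, and it assumes the tokenizer extension is available and that the file is saved as UTF-8. It rebuilds the test script with U+FEFF placed in front of foo in the declaration, which is an assumption about where the character sat.

```php
<?php
// Hypothetical helper: lists identifiers and variables that contain
// bytes outside the ASCII range, with their line number and hex dump.
function findNonAsciiIdentifiers(string $source): array
{
    $hits = [];
    foreach (token_get_all($source) as $token) {
        if (!is_array($token)) {
            continue; // single-character tokens such as "(" or ";"
        }
        list($id, $text, $line) = $token;
        $isName = ($id === T_STRING || $id === T_VARIABLE);
        if ($isName && preg_match('/[\x80-\xff]/', $text)) {
            $hits[] = ['line' => $line, 'hex' => bin2hex($text)];
        }
    }
    return $hits;
}

// The test script, with U+FEFF assumed to sit before foo in the declaration.
$source = "<?php\n\nfunction \u{FEFF}foo(){\n    echo 'bar';\n}\n\nfoo();\n";

print_r(findNonAsciiIdentifiers($source));
```

The hex dump makes the extra bytes visible even when the editor or terminal renders nothing for them.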

 [2016-02-15 13:16 UTC] requinix@php.net
-Type: Bug
+Type: Documentation Problem
-Package: PHP Language Specification
+Package: Scripting Engine problem
 [2016-02-15 13:16 UTC] requinix@php.net
The documentation then goes on to specify a regex: [a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*

Given that files must be saved in an encoding, your zero-width non-breaking space was probably encoded using UTF-8, thus your function is named "\xEF\xBB\xBFfoo" - which is valid, though terribly inconvenient to write in actual code.
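
To illustrate the point, here is a small sketch that checks names against the regex quoted above. matchesLabelPattern() is a hypothetical helper written for this example, and the ^ and $ anchors are added so that the whole string has to match. It assumes PHP 7.0 or later for the "\u{...}" escape.

```php
<?php
// Hypothetical helper: true if $name matches the documented pattern
// for names, applied byte by byte (no Unicode awareness).
function matchesLabelPattern(string $name): bool
{
    $pattern = '/^[a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*$/';
    return preg_match($pattern, $name) === 1;
}

// U+FEFF saved as UTF-8 becomes the three bytes EF BB BF.
$zwnbsp = "\u{FEFF}";
echo bin2hex($zwnbsp), "\n"; // efbbbf

var_dump(matchesLabelPattern($zwnbsp . 'foo')); // every byte is in \x7f-\xff or a letter
var_dump(matchesLabelPattern(' foo'));          // an ASCII space (0x20) is rejected
var_dump($zwnbsp . 'foo' === 'foo');            // the two names are different strings
```

All three bytes of the encoded character fall in the high-byte range, which is why the name is accepted while an ordinary ASCII space is not.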

Rather than try to make PHP Unicode aware, something which has proven to be difficult in the past, for now perhaps we can update that one sentence to mention the high-byte range and how it consequently allows most non-ASCII characters (for better or worse) in names.
 [2016-06-27 11:47 UTC] lkppo at free dot fr
I can reproduce the same problem with a variable substitution:

<?php

$aaa = "abcdefgh";
$bbb = " $aaa ";

?>

Inserting a non-breaking space just after $aaa raises an error. Since most HTML pages are encoded in UTF-8, it is disconcerting that PHP avoids Unicode. Unicode will be hard, but you can't avoid it; it is ubiquitous.
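
This looks like the same mechanism as in the original report. The sketch below does not run the failing code; it only applies the name regex quoted earlier in this thread to the text of the string, assuming the non-breaking space is U+00A0 saved as UTF-8. The variable names $body and $m are made up for the example.

```php
<?php
// U+00A0 saved as UTF-8 becomes the two bytes C2 A0.
$nbsp = "\u{00A0}";
echo bin2hex($nbsp), "\n"; // c2a0

// Text of the double-quoted string as it sits in the source file:
// a space, then $aaa, then the non-breaking space.
$body = ' $aaa' . $nbsp;

// Apply the documented name pattern after the dollar sign.
preg_match('/\$([a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*)/', $body, $m);

// The matched name is "aaa" plus the two high bytes, not just "aaa".
echo bin2hex($m[1]), "\n"; // 616161c2a0
```

If that is what happens here, writing the variable as {$aaa} inside the string should keep the following character out of the name, since the braces delimit the variable explicitly.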
 