[2005-05-21 18:40 UTC] pmjones@php.net
Description:
------------
It appears that token_get_all() does not report T_OPEN_TAG and T_WHITESPACE properly, depending on the whitespace following the opening tag. For example, when parsing ...
<?php echo $var ?>
... you get T_OPEN_TAG, T_ECHO, T_WHITESPACE, T_VARIABLE, T_WHITESPACE, and T_CLOSE_TAG. This is not entirely the expected result (I would expect T_WHITESPACE between the open tag and the echo).
However, when parsing the functional equivalent...
<?php
echo $var
?>
... you get "<", "?", T_STRING ("php"), T_WHITESPACE, T_ECHO, T_WHITESPACE, T_VARIABLE, T_WHITESPACE, and T_CLOSE_TAG. In addition, the first whitespace value reported does not include all the newlines (it drops one).
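To compare the two cases side by side, it may help to print only the token names. The following is a minimal sketch, not the code used in this report: token_names() is a hypothetical helper, and it relies only on token_get_all() and token_name() from the tokenizer extension. The names printed can differ between PHP versions.

```php
<?php
// Hypothetical helper: reduce token_get_all() output to a flat list of names.
function token_names(string $source): array
{
    $names = [];
    foreach (token_get_all($source) as $token) {
        // Array tokens hold [id, text, ...]; single characters are plain strings.
        $names[] = is_array($token) ? token_name($token[0]) : $token;
    }
    return $names;
}

// Single-line form from the report.
echo implode(', ', token_names('<?php echo $var ?>')), "\n";

// Multi-line form from the report (\n newlines, as in the test code).
echo implode(', ', token_names("<?php\necho \$var\n?>")), "\n";
```

If the second line of output starts with "<", "?", T_STRING instead of T_OPEN_TAG, the build shows the behavior described above.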
Although Macs use \r for their newlines natively, the test code uses the Unix-standard \n, so I don't think it's Mac-related.
If this is in fact a bug, the current behavior makes it difficult to write a reliable userland code auditor and report proper line numbers.
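As an illustration of why dropped newlines matter for an auditor, here is a rough sketch that derives line numbers purely from the token text. tokens_with_lines() is a hypothetical helper and assumes \n newlines; any newline missing from a token's text would shift every later line number.

```php
<?php
// Hypothetical helper: pair each token with the line on which it starts,
// counting "\n" characters in the text of the tokens that precede it.
function tokens_with_lines(string $source): array
{
    $line = 1;
    $out = [];
    foreach (token_get_all($source) as $token) {
        $text = is_array($token) ? $token[1] : $token;
        $name = is_array($token) ? token_name($token[0]) : $token;
        $out[] = [$line, $name];
        // Advance the counter by the newlines this token contains.
        $line += substr_count($text, "\n");
    }
    return $out;
}

foreach (tokens_with_lines("<?php\necho \$var\n?>") as [$line, $name]) {
    echo $line, ' ', $name, "\n";
}
```

The counter is only as reliable as the token text: it reports the right line for the closing tag only if every newline in the source appears in some token.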
Am I missing some assumptions behind the behavior of the tokenizer function?
Where's the missing data?

php -r 'var_dump(token_get_all("<?php echo \$var ?>"));'

array(6) {
  [0]=> array(2) { [0]=> int(366) [1]=> string(6) "<?php " }
  [1]=> array(2) { [0]=> int(316) [1]=> string(4) "echo" }
  [2]=> array(2) { [0]=> int(369) [1]=> string(1) " " }
  [3]=> array(2) { [0]=> int(309) [1]=> string(4) "$var" }
  [4]=> array(2) { [0]=> int(369) [1]=> string(1) " " }
  [5]=> array(2) { [0]=> int(368) [1]=> string(2) "?>" }
}

php -r 'var_dump(token_get_all("<?php \necho \$var\n?>"));'

array(7) {
  [0]=> array(2) { [0]=> int(366) [1]=> string(6) "<?php " }
  [1]=> array(2) { [0]=> int(369) [1]=> string(1) " " }
  [2]=> array(2) { [0]=> int(316) [1]=> string(4) "echo" }
  [3]=> array(2) { [0]=> int(369) [1]=> string(1) " " }
  [4]=> array(2) { [0]=> int(309) [1]=> string(4) "$var" }
  [5]=> array(2) { [0]=> int(369) [1]=> string(1) " " }
  [6]=> array(2) { [0]=> int(368) [1]=> string(2) "?>" }
}

The second command-line test should have pairs of \n newlines, not singles.

A corollary issue is that the results on the same code are inconsistent. Sometimes my token_get_all() returns the expected result (T_OPEN_TAG) and sometimes an unexpected result ("<", "?", T_STRING of "php"). Could there be a reason for the engine being "finicky"?
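One way to check for missing data without reading the dumps by eye is to concatenate the token text and compare it with the input. This is a sketch under that assumption only: tokens_round_trip() is a hypothetical helper, and the second input uses the paired \n newlines mentioned above.

```php
<?php
// Hypothetical helper: true when the token text, joined in order,
// reproduces the source string byte for byte.
function tokens_round_trip(string $source): bool
{
    $rebuilt = '';
    foreach (token_get_all($source) as $token) {
        $rebuilt .= is_array($token) ? $token[1] : $token;
    }
    return $rebuilt === $source;
}

var_dump(tokens_round_trip("<?php echo \$var ?>"));
var_dump(tokens_round_trip("<?php\n\necho \$var\n\n?>"));
```

A false result for either input would confirm that the tokenizer dropped characters, which is the symptom reported for the first whitespace token.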