Request #24750 Heredoc-Syntax
Submitted: 2003-07-22 07:03 UTC Modified: 2010-11-18 23:31 UTC
Votes:1
Avg. Score:5.0 ± 0.0
Reproduced:1 of 1 (100.0%)
Same Version:1 (100.0%)
Same OS:0 (0.0%)
From: ssilk at fidion dot de Assigned: jani (profile)
Status: Closed Package: Scripting Engine problem
PHP Version: Irrelevant OS:
Private report: No CVE-ID: None
 [2003-07-22 07:03 UTC] ssilk at fidion dot de
Description:
------------
I read bug #13610 and the related reports, and I think the core of the problem is that heredoc syntax is confusing because of its sensitivity to whitespace characters (space, tab, CR, LF, ...), and perhaps also because it is tied to the OS newline character.


Here are some examples of really time-consuming error hunts:

// example #1
echo <<<heredoc 
There is a SPACE between
the identifier and the end of line;
PHP now thinks the identifier is
"heredoc " (with a space at the end)
and not "heredoc" as assumed.
heredoc;
// fails; PHP thinks it is
// still inside the heredoc,
// so it prints very odd error messages.
//
// Here is a tip for the documentation:
//
// This case can be tested easily by typing something
// that looks like an array variable, such as $a['b'].
// If PHP reports an error in the line above,
// then you have the "whitespace-identifier problem".
// (The ' is important here! Perhaps there are
// other test cases.)

// example #2
echo <<<heredoc
Now we did everything right above, but
for more readable code we added a space
before the semicolon at the end:
heredoc ;
// same problem as above, but the other way around. :)

This problem is not obvious even to experienced programmers.

So currently a developer must avoid any whitespace next to heredoc identifiers!
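Until the parser changes, one way to shorten the error hunt is to scan the source for the first case directly. The following is only a sketch: the function name `findHeredocTrailingWhitespace` is made up for illustration, and the pattern only covers plain ASCII identifiers (optionally quoted), so it is a rough check rather than a real tokenizer.

```php
<?php
/**
 * Hypothetical helper (not part of PHP): return the 1-based line numbers
 * on which a heredoc opener is followed by trailing spaces or tabs.
 */
function findHeredocTrailingWhitespace(string $source): array
{
    $hits = [];
    // \R splits on any newline convention (LF, CRLF, CR).
    $lines = preg_split('/\R/', $source);

    foreach ($lines as $index => $line) {
        // "<<<", optional blanks, optionally quoted ASCII identifier,
        // then at least one blank right before the end of the line.
        if (preg_match('/<<<[ \t]*([\'"]?)[A-Za-z_][A-Za-z0-9_]*\1[ \t]+$/', $line)) {
            $hits[] = $index + 1;
        }
    }

    return $hits;
}

// Line 1 has a space after the identifier; line 4 does not.
$source = "echo <<<heredoc \nsome text\nheredoc;\necho <<<ok\nfine\nok;\n";
print_r(findHeredocTrailingWhitespace($source));
```

Run over a whole file (for example with `file_get_contents()`), this lists the suspicious opener lines without having to decode the parser's error message.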

I suggest (see bug #5804 at the end, a feature request by Hartmut) changing the parser so that it stops parsing the identifier when it finds whitespace. I think this would avoid many problems, and I think the current implementation invites misinterpretation.

Perhaps, to enable more complicated things, a new heredoc syntax could be added, like

echo <<<<tag-like-syntax...>This is a tag-like syntax...
The end of the string 
is marked with <\/tag-like syntax...>
(the above \\/-escape sequence
must be added to this routine). 
The end can exist in the same line
and not only at the
begin of a line</tag-like-syntax...> ; // end of string

I think this kind of heredoc is very readable, because every PHP user knows what a tag is and how it works. It also goes somewhat in the direction of Perl (the ugly q-syntax, which I really hate :), but this is nicer.

Some further ideas: once tag-like strings are implemented, it could be useful to also add arguments:

      echo <<<<string ignore_indent=1 strip_nl>
      This enables you to indent the string,
      but in the output the indenting is stripped away.
      It also deletes the newlines at the beginning and
      end of the string.
      </string>
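The effect of the two proposed arguments can be approximated in userland today. The sketch below is an illustration only: `dedent` is a made-up helper name, and it interprets `ignore_indent` as "remove the smallest common indentation" and `strip_nl` as "drop blank lines at the start and end", which is one possible reading of the proposal.

```php
<?php
/**
 * Hypothetical helper (not part of PHP): remove the common leading
 * indentation from every line and strip leading/trailing newlines.
 */
function dedent(string $text): string
{
    $lines = preg_split('/\R/', $text);

    // Find the smallest indentation among the non-blank lines.
    $min = null;
    foreach ($lines as $line) {
        if (trim($line) === '') {
            continue;
        }
        $indent = strlen($line) - strlen(ltrim($line, " \t"));
        $min = ($min === null) ? $indent : min($min, $indent);
    }
    $min = $min ?? 0;

    // Cut that many characters off each non-blank line; blank lines become empty.
    foreach ($lines as $key => $line) {
        $lines[$key] = (trim($line) === '') ? '' : substr($line, $min);
    }

    // Drop the now-empty lines at the beginning and end of the string.
    return trim(implode("\n", $lines), "\n");
}

$text = "
      This enables you to indent the string,
        but in the output the indenting is stripped away.
      ";
echo dedent($text), "\n";
```

Relative indentation inside the string is kept; only the part shared by all lines is removed.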

More (very useful) arguments are conceivable.

More intelligent error handling when an error occurs inside a heredoc string could also help a lot: for example, printing the identifier that was found in quotes, or an error message saying that the expected identifier is not valid, or something of that kind.

All in all heredoc is very useful. See also the useful feature request in Bug #8685 ('<<-'-syntax).




 [2003-07-22 08:37 UTC] ssilk at fidion dot de
In addition to the above, arguments in the tag-like syntax would be very useful for writing queries:

$hu="a string with singlequotes '<- here";
$go="More singlequotes '''''";

...query(<<<<sql strip_nl ignore_indent addslashes>
     SELECT * FROM table
     WHERE bla = '$hu' and blubb='$go'
     </sql>);

This should be obviously useful.
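For comparison, this is roughly what the same query needs with the existing heredoc syntax: the escaping has to happen in separate statements before interpolation. This is a sketch only; the variable names `$huEscaped` and `$goEscaped` are made up, and `addslashes()` is used because the proposal names it, whereas real database code would normally rely on the driver's own escaping or on prepared statements.

```php
<?php
$hu = "a string with singlequotes '<- here";
$go = "More singlequotes '''''";

// Escape first: the heredoc itself only interpolates, it does not escape.
$huEscaped = addslashes($hu);
$goEscaped = addslashes($go);

// The closing identifier sits at the start of its line with no
// whitespace around it, which works in every PHP version with heredoc.
$sql = <<<SQL
SELECT * FROM table
WHERE bla = '$huEscaped' and blubb='$goEscaped'
SQL;

echo $sql, "\n";
```

The proposed `addslashes` argument would fold the two extra assignments into the string literal itself.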

[An abbreviation for the above could be
...query(<<<<sql sqlquery>
     SELECT * FROM table
     WHERE bla = '$hu' and blubb='$go'
     </sql sqlquery>);

or perhaps also

...query(<<<<sqlquery> ...</sqlquery>)

which makes this tag a special tag.

For those who don't like inline vars in a query:

...query(<<<<sql some other arguments>
     ($hu, $go)
     SELECT * FROM table
     WHERE bla = '?' and blubb='?'
     </sql sqlquery>);

Others are better at finding a useful syntax.

]

I can imagine many cases where other arguments change the behaviour in a way that matches a particular kind of API, for example regex/PCRE, urlencode, dates, printf, print_r, etc. This can be extended nicely and in an upward-compatible way to fit the needs we have.

Feature inflation is something that must be watched carefully.

Another aspect is that the documentation of some functions becomes easier. Take for example preg_match():


Current:
preg_match('/\\\\\?/',$var);
// match '\?' - nearly unreadable and hard to understand

With tagged quoting:

preg_match(<<<<pcre>/\\?/</pcre>,$var);
// much more readable :-)
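A similar effect is available with nowdoc strings (PHP 5.3 and later), which do no escape processing at all, so the pattern is written exactly as the regex engine receives it. The sketch below is an illustration, not part of the original request; the identifier `PCRE` and the variable names are arbitrary.

```php
<?php
// Nowdoc: the body is taken literally, so only the regex-level
// escaping remains (\\ for a backslash, \? for a question mark).
$pattern = <<<'PCRE'
/\\\?/
PCRE;

// The same pattern as a single-quoted string needs two more backslashes.
$quoted = '/\\\\\?/';

var_dump($pattern === $quoted);

// Both match a literal backslash followed by a question mark.
var_dump(preg_match($pattern, 'a\?b'));
```

The regex still needs its own escaping, but the extra layer of PHP string escaping disappears.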
 [2009-11-19 11:48 UTC] RQuadling at GMail dot com
This bug has been fixed.

<?php
echo <<< END_HEREDOC_WITH_A_TRAILING_SPACE 
1
2
3
END_HEREDOC_WITH_A_TRAILING_SPACE;

echo 1 / 0;


outputs...

Parse error: parse error in ... on line 2


The trailing space triggers the parse error.

With the trailing space removed, the output is as expected ...

1
2
3
Warning: Division by zero in ... on line 8
 [2010-11-18 23:31 UTC] jani@php.net
-Status: Open
+Status: Closed
-Package: Feature/Change Request
+Package: Scripting Engine problem
-Assigned To:
+Assigned To: jani