Bug #22407 Unix formatted text file parsed with DOS characters interpreted
Submitted: 2003-02-24 17:29 UTC Modified: 2003-03-09 19:12 UTC
From: dci at webquill dot com Assigned:
Status: No Feedback Package: Scripting Engine problem
PHP Version: 4.2.3 OS: FBSD 4.6-RELEASE
Private report: No CVE-ID: None
 [2003-02-24 17:29 UTC] dci at webquill dot com
I did note something similar in bug #10858, but that was in regards to an older version of PHP and it seemed like the feedback was never provided to solve the issue.

We've got code in a Unix-formatted text file, and one of the lines somehow ended up with a literal ^M character (a carriage return, as used in DOS line endings).  Although this had nothing to do with PHP, it produced very strange results: the line in question began with a single-line comment ( // ), and the ^M sat in the middle of it, followed by an old if statement.  I suspect this came about because our code was originally a quick conversion from an ASP site using asp2php.  The odd part, however, is that even though editors (vim, emacs) showed the whole thing as one commented-out line, PHP interpreted the ^M as a newline and parsed the code after it.  Because the old if compared 0 with a variable that did not exist (thus evaluating to 0), its condition was false, and the new if on the next line, which had become the body of the old one, was never executed.  Here is some example code (I am afraid I cannot provide the full code due to restrictions placed on me by my employer):

  error_log( "SQL: $ssql" );
  $res = pg_query( $dbconn, $ssql );
  error_log( "pg_last_error: " . pg_last_error($dbconn) );
  error_log( "Number of results: " . pg_num_rows($res) );

  //only send if there's an Email to send to^M  if (!($get_email==0))
  if( pg_num_rows($res) > 0 ){
    error_log( "here" );

The line beginning with // and ending with ($get_email==0)) is all one line.  The first error_log showed the correct SQL, pg_last_error reported no errors, and pg_num_rows correctly logged 1 result row.  We discovered the problem when the error_log("here") statement was never executed.  Upon removing the ^M character, everything performed as expected, and "here" was logged in the error_log.  The code that followed was then executed as well.
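
For anyone chasing a similar problem, here is a minimal sketch of a check for carriage returns that are not part of a DOS \r\n pair.  The helper name find_stray_cr is hypothetical (it is not a PHP built-in), and the sketch assumes the file is small enough to read into memory as a string:

```php
<?php
// Hypothetical helper: returns the 1-based numbers of "\n"-delimited lines
// that contain a carriage return which is not part of a "\r\n" pair.
function find_stray_cr(string $contents): array
{
    $hits  = [];
    $lines = explode("\n", $contents);
    $last  = count($lines) - 1;

    foreach ($lines as $index => $line) {
        // A trailing "\r" on a "\n"-terminated line is half of a DOS "\r\n"
        // pair, so drop it before looking for stray carriage returns.
        if ($index < $last && substr($line, -1) === "\r") {
            $line = (string) substr($line, 0, -1);
        }
        if (strpos($line, "\r") !== false) {
            $hits[] = $index + 1;
        }
    }

    return $hits;
}

// Line 2 carries a bare "\r" in the middle, like the commented line above.
print_r(find_stray_cr("first\n// comment\r  if (true)\nlast\n"));
```

To check a real file, pass it the result of file_get_contents() for that file; the line numbers it returns should match what vim or emacs shows, since those editors also split on \n in a Unix-format file.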

Now I realize that this could be by design, given the cross-platform nature of web development (all those silly Windows users out there and their bad character set ;), but I would expect that when a file is otherwise in one text format, characters that do not mark a line ending in that format would not be interpreted as one.  My guess (without taking the time to examine the PHP source code) is that PHP does not determine the file's format and simply treats every line-terminator character, including a bare carriage return, as a new line.
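
One way to test that guess without reading the PHP source is to ask PHP's own tokenizer how it splits such a line.  The sketch below assumes the tokenizer extension (token_get_all, token_name) is available; token_names is a hypothetical helper, and the two sample strings only mimic the shape of the line in this report:

```php
<?php
// Hypothetical helper: names of the tokens PHP's lexer produces for $source.
function token_names(string $source): array
{
    $names = [];
    foreach (token_get_all($source) as $token) {
        // Multi-character tokens are arrays; single characters such as "("
        // come back as plain strings and are skipped here.
        if (is_array($token)) {
            $names[] = token_name($token[0]);
        }
    }
    return $names;
}

// Same shape as the reported line: a // comment, a bare CR, then an if.
$withCr    = "<?php // only send if there's an Email to send to\r  if (!(\$get_email==0))";
// Identical text with the CR removed, so the if stays inside the comment.
$withoutCr = "<?php // only send if there's an Email to send to  if (!(\$get_email==0))";

var_dump(in_array('T_IF', token_names($withCr), true));
var_dump(in_array('T_IF', token_names($withoutCr), true));
```

If the lexer ends a single-line comment at a bare carriage return, the first check finds a T_IF token and the second does not, which would match the behaviour described above.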

Just a thought after finally tracking down a very odd problem.  Other than that, thank you for a wonderful product!

-chris

History

 [2003-03-09 19:12 UTC] sniper@php.net
No feedback was provided. The bug is being suspended because
we assume that you are no longer experiencing the problem.
If this is not the case and you are able to provide the
information that was requested earlier, please do so and
change the status of the bug back to "Open". Thank you.


 [2009-12-16 11:11 UTC] nileshkumar dot patel at alcatel-lucent dot com
Hi there,

I am facing the same problem. We have an XML data file in Unix format, and we wrote an application in PHP for Windows. When we parse this XML file using PHP functions on Windows, each tag is detected twice. When I copied the contents into an XML file in Windows format, it worked fine. So is this a bug in PHP, or do I need to use some setting while parsing XML?

Please help me, as it is an urgent issue.

Thanks
Nilesh
 