php.net |  support |  documentation |  report a bug |  advanced search |  search howto |  statistics |  random bug |  login
Bug #66156 date_create_from_format() fails for "!ndY" format
Submitted: 2013-11-23 06:15 UTC Modified: 2013-11-24 18:27 UTC
From: mjpelmear at gmail dot com Assigned:
Status: Not a bug Package: Date/time related
PHP Version: master-Git-2013-11-23 (Git) OS: Any
Private report: No CVE-ID: None
View Add Comment Developer Edit
Welcome! If you don't have a Git account, you can't do anything here.
You can add a comment by following this link or if you reported this bug, you can edit this bug over here.
(description)
Block user comment
Status: Assign to:
Package:
Bug Type:
Summary:
From: mjpelmear at gmail dot com
New email:
PHP Version: OS:

 

 [2013-11-23 06:15 UTC] mjpelmear at gmail dot com
Description:
------------
date_create_from_format() improperly parses dates in "!ndY" format, but reports the problem as being with the data, not with the parser's [lack of] support for the format.

This date format is a strange one, but we encountered it in data files received from a client, so it seems to be in use, albeit undesirable.

Analysis:
---------
The root issue is that the parser is reading the incoming string from left to right, but when doing so it has no way to determine whether the month is one or two digits (since "n" means the month would be a single digit if < 10).
See timelib_parse_from_format() in ext/date/lib/parse_date.c.
Note that while it would be impossible to support a format like "!njY" because this would have some ambiguous cases ("1112013" for example), it is at least possible to support "!ndY" and similar cases.

Test script:
---------------
$date = date_create_from_format( '!ndY', '3071989' );
assert( $date->format('Y-m-d') == '1989-03-07' ); // assertion fails.
print_r(DateTime::getLastErrors());
echo PHP_EOL;
print_r($date);
echo PHP_EOL;

Expected result:
----------------
The assertion should succeed, or we should receive a meaningful error message indicating that this date format isn't supported.

Actual result:
--------------
The assertion fails (DateTime shows the date as "0991-08-10", in Y-m-d format).
DateTime::getLastErrors() simply indicates that the parsed date was invalid ("The parsed date was invalid").

Patches

Add a Patch

Pull Requests

Add a Pull Request

History

AllCommentsChangesGit/SVN commitsRelated reports
 [2013-11-23 06:20 UTC] mjpelmear at gmail dot com
If someone wants to provide feedback as to which of the following is preferable to fix this problem, I would be happy to make a pull request to fix it:
1) We should simply provide an appropriate error message.
or
2) We should implement some sort of lookahead to actually handle this type of format.
or
3) Something else?
 [2013-11-23 16:46 UTC] salathe@php.net
It looks like the date string is taken to be (in Y-m-d format for presentation) 0989-30-71, which when wrapped around into a “normal” date gives the mentioned 0991-08-10. 

The string is broken up as follows:
“n” wants a two-digit month number, it accepts “30”.
“d” wants a two-digit day of month number, it accepts “71”.
“Y” wants a four-digit year number, it accepts “989”.

The warning is generated when “30” and “71” are checked for being “valid" month and day of month numbers, respectively.

Internally, in many cases, there is no difference between using the leading-zero format character and the non-leading zero one. E.g. “n” and “m” are extracted using the same internal code. These are described in the documentation already.

Your claim that this date format is not supported is incorrect. It is supported; it simply doesn’t behave as you were expecting. Should format characters like “n” and “m” give different outcomes for this function? Perhaps, but that’s opening the time string parsing up to a not insignificant re-write. Should the string parsing try to be a little more “clever” by attempting to get a “valid” date from the multiple possible combinations in ambiguous date strings? Perhaps, but do we want to introduce this complexity (and we’ll never get it right 100% of the time)?  I don’t have an answer.

In your particular case, I can only suggest manipulating the date string such that it is no longer ambiguous; perhaps by adding appropriate delimiter characters between the different date parts.
 [2013-11-24 03:43 UTC] mjpelmear at gmail dot com
@ salathe@php.net:
I have to strongly disagree with you here.

1) The documentation (http://us3.php.net/manual/en/datetime.createfromformat.php) is vague about "n" expecting a month with no leading zero, but it implies that "m" is for a month with a leading zero and that "n" is for a month without a leading zero. Furthermore, for the date() method, the documentation clearly states that "n" is "Numeric representation of a month, without leading zeros". It seems pretty clear to me that it ought to be exactly that in all cases, regardless of how it currently works internally. Otherwise, we should adjust the documentation to reflect that "n" and "m" both represent a month with a leading zero. I think you'd agree that would be a bad idea given how they are used elsewhere (i.e., for formatting dates as strings with date())

2) To be clear, I'm not saying the format isn't supported. Clearly it is or I would have received a more severe error message. The point is that it is not supported *correctly*. Consider this example:
$dt = new DateTime( '2013-12-01' );
$date = date_create_from_format( '!ndY', $dt->format('ndY') );
assert( $dt == $date ); // passes
$dt = new DateTime( '1989-03-07' );
$date = date_create_from_format( '!ndY', $dt->format('ndY') );
assert( $dt == $date ); // FAILS

Are you going to tell me that is an acceptable outcome?
You said it simply doesn't behave as I expected, but one would only expect it to be broken like this if they had read the underlying implementation (like I did) and found that it only works from left to right. The documentation does not mention this little tidbit.

3) I understand that it would require significant effort to alter the algorithm to take these types of cases into account. That's why I suggested that another alternative would be to store an error indicating that the format doesn't work properly, so that at if it's not going to work the way one would expect we would at least get feedback about that.

4) It was rather elementary to solve this issue (basically in the way you described) when we encountered it some months ago. I am just trying to fix this so that the next person doesn't run into this and have the same confusion that my team encountered. Our code used a dynamically entered php date format, so it required a hacked workaround to support this specific case when it should have just worked.

5) Why would you ever indicate that we might never get something right? Shouldn't we be striving for bug-free code here? Especially when someone from the community has already offered to do the work-- whatever that work may be. Granted, it is difficult to get complicated code to act in a sane manner 100% of the time, but that is no excuse for not trying to do so.

6) My frustration here is not that the problem exists. Bugs happen. People fix them. My frustration is that someone with an @php.net email address would think this is a non-issue.
 [2013-11-24 18:27 UTC] derick@php.net
-Status: Open +Status: Not a bug
 [2013-11-24 18:27 UTC] derick@php.net
Thank you for taking the time to write to us, but this is not
a bug. Please double-check the documentation available at
http://www.php.net/manual/ and the instructions on how to report
a bug at http://bugs.php.net/how-to-report.php

If you pass in an ambiguous string for a format, there is nothing we can do. Is 1112013 supposed to be 1-NOV-2013 or 11-JAN-2013 with a format of "!jnY"? 
Chances are that you can't tell us either. Or what about 1092013 with "!jnY" - that could also be both 1-SEP-2013 or 10-SEP-2013?

We *could* make n non-greedy, and m greedy, but that still does not solve the 1112013 case—it would just pick 1-NOV-2013 - even if you meant 11-JAN-2013.

Passing in an ambiguous string is not something that we can ever support for 100% - it only works properly if you add delimiters.
 
PHP Copyright © 2001-2024 The PHP Group
All rights reserved.
Last updated: Thu May 02 16:01:29 2024 UTC