Doc Bug #64739 Invalid Title and Author data returned
Submitted: 2013-04-30 07:06 UTC Modified: 2017-07-22 21:08 UTC
Votes:1
Avg. Score:3.0 ± 0.0
Reproduced:1 of 1 (100.0%)
Same Version:0 (0.0%)
Same OS:1 (100.0%)
From: php-qa at sebastianmendel dot de Assigned: kalle (profile)
Status: Closed Package: EXIF related
PHP Version: 5.4.14 OS: Linux
Private report: No CVE-ID: None
 [2013-04-30 07:06 UTC] php-qa at sebastianmendel dot de
Description:
------------
exif_read_data() returns invalid Title and Author data.

ExifTool cli returns valid data.

Test script:
---------------
php-5.4.14 -r "var_dump(exif_read_data('http://sebastianmendel.de/php/exif_segmentation_fault/test.jpg'));"


Expected result:
----------------
  ....
  ["Title"]=>
  string(8) "55845364"
  ["Author"]=>
  string(13) "100420.000000"
  ....

Actual result:
--------------
  ....
  ["Title"]=>
  string(8) "????????"
  ["Author"]=>
  string(13) "?????????????"
  ....

History

 [2013-04-30 15:35 UTC] cweiske@php.net
The reason is probably that the EXIF data was stored on a PowerPC Mac and is big-endian.
 [2016-08-05 06:29 UTC] kalle@php.net
@cweiske, you are right. The linked picture is stored in Motorola byte order, which is big-endian. The image has no MakerNote tag, so I guess it was produced by an editor (Paint.NET), and the COMPUTED section also correctly reports the byte order as Motorola.

Of all the data returned, it indeed seems that only the title and author are somehow corrupted.
 [2017-07-07 10:28 UTC] kalle@php.net
-Status: Open +Status: Feedback
 [2017-07-07 10:28 UTC] kalle@php.net
Hi, could you please re-upload the image somewhere so I can use it for testing?

Thanks
 [2017-07-13 12:35 UTC] php-qa at sebastianmendel dot de
-Status: Feedback +Status: Open
 [2017-07-13 12:35 UTC] php-qa at sebastianmendel dot de
link http://sebastianmendel.de/php/exif_segmentation_fault/test.jpg restored
 [2017-07-13 23:15 UTC] kalle@php.net
Thank you, Sebastian. From some quick debugging, it seems that the Motorola/Intel offset is poorly detected, which causes the affected values to be read incorrectly.
 [2017-07-13 23:15 UTC] kalle@php.net
-Status: Open +Status: Assigned -Assigned To: +Assigned To: kalle
 [2017-07-14 01:37 UTC] kalle@php.net
I debugged it a little further. It seems that internally ext/exif detects the image byte order as Motorola, which causes it to use the wrong encoding when processing the tags from the WINXP section.

A temporary workaround is to change the exif.decode_unicode_motorola ini directive:

[exif]
exif.decode_unicode_motorola=UCS-2LE

(It defaults to UCS-2BE)

I'm going to look into possible approaches for solving this.
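
To illustrate the mechanism described above, here is a minimal sketch that mimics the mis-decoding with mbstring instead of calling ext/exif. It assumes the tag bytes in the file are UCS-2LE and that the output charset is ISO-8859-15; both are assumptions for illustration, not something verified against the attached image.

```php
<?php
// Illustration only: mimics the mis-decoding with mbstring, not with ext/exif.
// Assumptions: the tag bytes are UCS-2LE, the output charset is ISO-8859-15.

// Use "?" for characters that the target charset cannot represent.
mb_substitute_character(0x3F);

// The title as it is assumed to be stored in the file (two bytes per character).
$stored = mb_convert_encoding('55845364', 'UCS-2LE', 'ASCII');

// Decoded with the directive's default value (big-endian).
$wrong = mb_convert_encoding($stored, 'ISO-8859-15', 'UCS-2BE');

// Decoded with the workaround value (little-endian).
$right = mb_convert_encoding($stored, 'ISO-8859-15', 'UCS-2LE');

var_dump($wrong, $right);
```

Under these assumptions, every byte pair read as big-endian lands on a code point that ISO-8859-15 cannot represent, so $wrong should consist only of question marks, one per character, while $right should be the original digits.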
 [2017-07-21 19:41 UTC] kalle@php.net
-Status: Assigned +Status: Closed
 [2017-07-21 19:41 UTC] kalle@php.net
I have given this some thought and made several attempts at solving it automatically, but that is almost impossible with the current implementation of EXIF in ext/exif. Therefore I recommend changing the ini setting if the image uses a byte order other than that of the default charset used to decode it.

TL;DR for your image you would therefore do:
exif.decode_unicode_motorola=UCS-2LE

Apologies again, but I don't see any other clear way to go about this
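
If editing php.ini is not an option, the same setting can probably be applied per call. The sketch below assumes the exif extension is loaded and that the directive can be changed at runtime with ini_set(). withMotorolaDecoding() is a hypothetical helper name and test.jpg is a placeholder path.

```php
<?php
// Hypothetical helper: run $callback with exif.decode_unicode_motorola set to
// $charset, then restore the previous value.
function withMotorolaDecoding(string $charset, callable $callback)
{
    // ini_set() returns the old value on success and false on failure,
    // for example when the exif extension is not loaded.
    $previous = ini_set('exif.decode_unicode_motorola', $charset);
    try {
        return $callback();
    } finally {
        if ($previous !== false) {
            ini_set('exif.decode_unicode_motorola', $previous);
        }
    }
}

// Usage, guarded so the script also runs without exif or without the file.
if (function_exists('exif_read_data') && is_file('test.jpg')) {
    $data = withMotorolaDecoding('UCS-2LE', function () {
        return exif_read_data('test.jpg');
    });
    var_dump($data['Title'] ?? null, $data['Author'] ?? null);
}
```

Restoring the previous value keeps the override from leaking into later exif_read_data() calls in the same request.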
 [2017-07-22 13:51 UTC] php-qa at sebastianmendel dot de
The problem is that you never know which encoding is used, and if you process a lot of constantly changing pictures, as in a CMS, you will always lose.

Ok, so at least the manual page should have a hint about this.
 [2017-07-22 14:52 UTC] kalle@php.net
-Status: Closed +Status: Assigned -Type: Bug +Type: Documentation Problem
 [2017-07-22 14:52 UTC] kalle@php.net
I totally agree with you on this; possibly having to iterate over charsets just to make it decode is nonsense to me as well.

I will update the docs accordingly.

Side note: I'm working on a new extension to see whether it can replace the exif one at some point, if you are interested. Keep in mind it's very basic: http://github.com/KalleZ/pecl-exifkit
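
For anyone who has to process mixed images anyway, the charset iteration mentioned above could look roughly like the sketch below. It is a heuristic under stated assumptions, not a recommended fix: looksUndecoded() and readExifTryingCharsets() are hypothetical helpers, the check only recognizes values made up entirely of "?", and the directive is assumed to be changeable at runtime.

```php
<?php
// Hypothetical heuristic: true when a value consists only of "?" characters,
// which is how the corrupted Title and Author appear in this report.
function looksUndecoded(string $value): bool
{
    return $value !== '' && trim($value, '?') === '';
}

// Hypothetical helper: read EXIF data, retrying with the next charset while
// any of the checked keys still looks undecoded.
function readExifTryingCharsets(
    string $file,
    array $charsets = ['UCS-2BE', 'UCS-2LE'],
    array $keys = ['Title', 'Author']
) {
    $data = false;
    foreach ($charsets as $charset) {
        ini_set('exif.decode_unicode_motorola', $charset);
        $data = @exif_read_data($file);
        if ($data === false) {
            return false; // unreadable file; another charset will not help
        }
        $undecoded = false;
        foreach ($keys as $key) {
            if (isset($data[$key]) && is_string($data[$key]) && looksUndecoded($data[$key])) {
                $undecoded = true;
            }
        }
        if (!$undecoded) {
            return $data;
        }
    }
    return $data; // last attempt, possibly still undecoded
}
```

A title that genuinely consists of question marks would be misclassified, and the directive is left at the last charset tried, so this is only a stopgap for the problem described in the comments above.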
 [2017-07-22 21:07 UTC] kalle@php.net
I have updated the docs for this
 [2017-07-22 21:08 UTC] kalle@php.net
-Status: Assigned +Status: Closed
 
PHP Copyright © 2001-2024 The PHP Group
All rights reserved.
Last updated: Fri Mar 29 13:01:29 2024 UTC