Request #27238 iptcparse() function misses some fields
Submitted: 2004-02-13 06:27 UTC Modified: 2004-03-06 12:24 UTC
From: philip at nancarrow dot net Assigned: pajoye (profile)
Status: Closed Package: Feature/Change Request
PHP Version: 4.3.4 OS: Windows and Linux
Private report: No CVE-ID: None

 [2004-02-13 06:27 UTC] philip at nancarrow dot net
Description:
------------
The iptcparse() function (GD extension) only returns IPTC/NAA records 2 and upward, skipping record 1. This appears to be by design, but it means that the returned data is incomplete: for example, the "destination" dataset 1:05 is missing. Worse than this, the "coded character set" dataset (1:90) is missing, and without this value the encoding of the data is unknown (for example, if 1:90 specifies ESC,%,G the data is UTF8 encoded). I assume that the current implementation defaults to ASCII or Latin1 encoding.
I can provide you with JPEG files containing IIM record 1 if required; they're quite common in the news industry.
Thank you
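
For illustration, here is a minimal sketch of how an application might look for dataset 1:90 once record 1 is returned. It assumes record 1 entries use the same "record#dataset" key format that iptcparse() already uses for record 2 (so 1:90 would appear as "1#090"); read_iim_charset_marker() is a hypothetical helper name, not part of PHP.

```php
<?php
// Hypothetical helper: returns the raw bytes of IIM dataset 1:90
// ("coded character set"), or null if the file has no such dataset.
function read_iim_charset_marker(string $path): ?string
{
    $info = [];
    // getimagesize() fills $info with the APPn segments of a JPEG.
    if (@getimagesize($path, $info) === false || !isset($info['APP13'])) {
        return null;
    }
    $iptc = iptcparse($info['APP13']);
    if ($iptc === false) {
        return null;
    }
    // Assumption: record 1 uses the same key format as record 2 ("2#025" etc.).
    return $iptc['1#090'][0] ?? null;
}
```

On a PHP build that still skips record 1, this simply returns null for every file.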



 [2004-02-13 09:29 UTC] pajoye@php.net
> I can provide you with JPEG files containing IIM record 1
> if required; they're quite common in the news industry.

Please do :)
Could you provide a URL with some images containing the required fields, plus a txt file with the expected result?

Note that I have never read about the charset part in any docs on the IPTC standard. Do you have a link that describes it?

pierre
 [2004-02-13 10:40 UTC] philip at nancarrow dot net
Pierre,
OK sure, I've put two JPEGs that include IIM record 1 at:
http://www.nancarrow.net/download/testpic1_latin1.jpg
[Latin1 encoded English]
and
http://www.nancarrow.net/download/testpic2_utf8.jpg
[UTF8 encoded Chinese]

The IPTC/NAA (aka "IIM") spec is freely downloadable from http://www.iptc.org/download/download.php?fn=IIMV4.1.pdf and it details all records, including record 1.

Appendix C lists the currently defined character sets; the one in use is specified in dataset 1:90. Note the strange IPTC terminology: an "octet" is a byte, so "octet 2/5" means 0x25. The character set sequence starts with ESC, so where it says ISO-8859-1 is "intermediate character 2/12 to 2/15" followed by "octet 4/1" this would be something like:
ESC,0x2F,0x41
or "ESC/A". Similarly UTF8 is ESC,2/5,4/7 or "ESC%G".
Where the spec says "intermediate character 2/12 to 2/15", most creators writing the file use the end character, i.e. 2/15 in this case.
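
The "x/y" notation above can be converted mechanically: the first number is the high nibble and the second is the low nibble of the byte. A small sketch of that arithmetic follows; iim_octet_to_byte() is a hypothetical name chosen for this example.

```php
<?php
// Hypothetical helper: converts "column/row" octet notation such as
// "2/5" into the byte value it denotes (0x25).
function iim_octet_to_byte(string $notation): int
{
    if (!preg_match('/\A(\d{1,2})\/(\d{1,2})\z/', $notation, $m)) {
        throw new InvalidArgumentException("Not in x/y notation: $notation");
    }
    $high = (int) $m[1];
    $low  = (int) $m[2];
    if ($high > 15 || $low > 15) {
        throw new InvalidArgumentException("Nibble out of range: $notation");
    }
    // High nibble shifted left by four bits, combined with the low nibble.
    return ($high << 4) | $low;
}
```

For example, iim_octet_to_byte('2/15') gives 0x2F ("/") and iim_octet_to_byte('4/1') gives 0x41 ("A"), matching the ESC,0x2F,0x41 sequence above.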

I'm not sure that PHP really needs to know about the encoding, does it? Since strings are just byte sequences in PHP, I guess it's down to the application to do the appropriate encoding/decoding... as long as it has access to the character set, of course!
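
As a rough sketch of what that application-side step could look like, the function below maps the raw 1:90 value to a character set name. It only covers the two sequences discussed in this report (ESC%G for UTF8, and ESC followed by 2/12 to 2/15 then 4/1 for ISO-8859-1); the other Appendix C entries are left out, and iim_charset_from_escape() is a hypothetical name, not a PHP function.

```php
<?php
// Hypothetical helper: maps the raw bytes of dataset 1:90 to a
// character set name, or null if the sequence is not recognised here.
function iim_charset_from_escape(string $raw): ?string
{
    // ESC, 2/5, 4/7 -> UTF-8
    if ($raw === "\x1B%G") {
        return 'UTF-8';
    }
    // ESC, intermediate character 2/12..2/15, 4/1 -> ISO-8859-1
    if (preg_match('/\A\x1B[\x2C-\x2F]A\z/', $raw) === 1) {
        return 'ISO-8859-1';
    }
    return null;
}
```

The application can then pass the returned name to whatever conversion routine it already uses, and choose its own fallback when the result is null.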

Thanks
Philip
 [2004-03-06 12:24 UTC] pajoye@php.net
This bug has been fixed in CVS.

Snapshots of the sources are packaged every three hours; this change
will be in the next snapshot. You can grab the snapshot at
http://snaps.php.net/.
 
Thank you for the report, and for helping us make PHP better.
