Exif and Character Sets


Loren M. Lang

Sep 3, 2010, 8:37:22 PM
to Life Logging

I was working on setting several text fields in an image's Exif data and ran into a few problems.  One issue was my attempt to use a literal copyright symbol (©) in the Copyright Exif tag.  The symbol was getting corrupted, so I pulled up the Exif 2.2 spec and was disappointed.  Exif does not really support text fields outside of 7-bit ASCII.  In particular, the Image Title, Copyright, Make, Model, Software, and Artist fields are limited to 7-bit ASCII.  There is no support for accented characters in the author's name, much less a Chinese name or a copyright symbol.  With Exif 2.1 and 2.2 dating from 1998 and 2002, respectively, I'm surprised that there's no real support for internationalization.  According to ExifTool's website, some imaging software fills these text fields using UTF-8 or a legacy encoding, but since there is no way to confirm which encoding was used, it's an unreliable way to store it.
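
To make the 7-bit ASCII limit concrete, here is a minimal Python sketch that checks whether a string would survive unchanged in an Exif ASCII-type field. The function names and the sample strings are hypothetical and only for illustration; they are not part of any Exif library.

```python
def non_ascii_chars(text: str) -> list:
    """Return (index, character) pairs for every character outside 7-bit ASCII."""
    return [(i, ch) for i, ch in enumerate(text) if ord(ch) > 0x7F]


def is_exif_ascii_safe(text: str) -> bool:
    """True if the string fits in an Exif ASCII-type field without loss."""
    return not non_ascii_chars(text)


# Hypothetical copyright notices: one with the literal symbol, one spelled out.
print(is_exif_ascii_safe("© 2010 Example Author"))
print(non_ascii_chars("© 2010 Example Author"))
print(is_exif_ascii_safe("(C) 2010 Example Author"))
```

A check like this could run before writing a tag, so that a value such as the copyright symbol is flagged instead of being silently corrupted.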

Loren M. Lang

Sep 3, 2010, 8:39:42 PM
to Life Logging
My previous message was truncated.

unreliable way to store it. The UserComment Exif tag, however, is marked
as a private tag and has type UNDEFINED instead of ASCII. The Exif spec
defines a marker placed at the beginning of the data that identifies it
as one of ASCII, JIS (Japanese), Unicode, or Undefined. It sounds like
this was an add-on to the spec and that the tag was originally meant
just to hold user-defined binary data for whatever the user needs. It's
the only tag that mentions any kind of Unicode support. Also, I've
confirmed that the Exif 2.2 spec has no support for UTC or time zone
offsets in its timestamps (except for GPSTimeStamp); they are simply in
some undefined local time.
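
As a rough illustration of the UserComment marker, the following Python sketch builds and parses the tag's raw bytes: an 8-byte character code followed by the text. It is only a sketch under assumptions. The function names are hypothetical, the "Unicode" payload is assumed to be UTF-16 with a byte order chosen by the caller (real files vary), and JIS is not handled.

```python
# 8-byte character codes that prefix the UserComment data.
ASCII_ID = b"ASCII\x00\x00\x00"
JIS_ID = b"JIS\x00\x00\x00\x00\x00"
UNICODE_ID = b"UNICODE\x00"
UNDEFINED_ID = b"\x00" * 8


def _utf16_codec(byteorder: str) -> str:
    # Assumption: the "Unicode" payload is UTF-16 in the given byte order.
    return "utf-16-be" if byteorder == "big" else "utf-16-le"


def encode_user_comment(text: str, byteorder: str = "big") -> bytes:
    """Use the ASCII code when possible, otherwise fall back to Unicode."""
    try:
        return ASCII_ID + text.encode("ascii")
    except UnicodeEncodeError:
        return UNICODE_ID + text.encode(_utf16_codec(byteorder))


def decode_user_comment(raw: bytes, byteorder: str = "big") -> str:
    """Decode UserComment bytes according to their 8-byte character code."""
    prefix, payload = raw[:8], raw[8:]
    if prefix == ASCII_ID:
        return payload.decode("ascii")
    if prefix == UNICODE_ID:
        return payload.decode(_utf16_codec(byteorder))
    raise ValueError(f"unsupported character code: {prefix!r}")


raw = encode_user_comment("© 2010 Example Author")
print(raw[:8])
print(decode_user_comment(raw))
```

Because the character code travels with the data, a reader does not have to guess the encoding, which is exactly what the plain ASCII-type tags lack.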

I ran across XMP, which seems like a nicer metadata format and can be
embedded in JPEG alongside Exif metadata. It uses RDF, which is stored
as XML, and therefore supports Unicode for all strings. It also supports
zone offsets or UTC for all date fields. Lastly, it's more extensible:
just add your own XML namespace for custom data. It also supports other
formats like JP2 (JPEG 2000), TIFF, PNG, GIF, MIFF, PS, PDF, PSD and
DNG, and can even be embedded in the ID3v2 tags of an MP3. Exif is not
supported for JPEG 2000, PNG or GIF according to Wikipedia.
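
To show what that looks like, here is a small Python sketch that builds the RDF/XML body of an XMP packet with a Unicode copyright string and a creation date carrying a zone offset. It is a sketch only: the build_xmp helper, the author name and the UTC-7 offset are made up for illustration, the xpacket wrapper is omitted, and nothing here embeds the packet in an image file.

```python
from datetime import datetime, timedelta, timezone
import xml.etree.ElementTree as ET

NS = {
    "x": "adobe:ns:meta/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
    "xmp": "http://ns.adobe.com/xap/1.0/",
}
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Register prefixes so the serialized XML uses x:, rdf:, dc: and xmp:.
for prefix, uri in NS.items():
    ET.register_namespace(prefix, uri)


def q(prefix: str, local: str) -> str:
    """Build an ElementTree qualified name such as {uri}rights."""
    return f"{{{NS[prefix]}}}{local}"


def build_xmp(rights: str, created: datetime) -> str:
    """Hypothetical helper: return the x:xmpmeta element as a string."""
    xmpmeta = ET.Element(q("x", "xmpmeta"))
    rdf = ET.SubElement(xmpmeta, q("rdf", "RDF"))
    desc = ET.SubElement(rdf, q("rdf", "Description"), {q("rdf", "about"): ""})
    # dc:rights is a language alternative: rdf:Alt holding rdf:li entries.
    alt = ET.SubElement(ET.SubElement(desc, q("dc", "rights")), q("rdf", "Alt"))
    li = ET.SubElement(alt, q("rdf", "li"), {XML_LANG: "x-default"})
    li.text = rights
    # ISO 8601 date with an explicit zone offset.
    ET.SubElement(desc, q("xmp", "CreateDate")).text = created.isoformat()
    return ET.tostring(xmpmeta, encoding="unicode")


offset = timezone(timedelta(hours=-7))  # assumed offset, for illustration
packet = build_xmp("© 2010 Example Author",
                   datetime(2010, 9, 3, 20, 37, 22, tzinfo=offset))
print(packet)
```

The copyright symbol is stored as ordinary XML text, and the date keeps its offset, so neither of the Exif problems above comes up.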

http://en.wikipedia.org/wiki/Extensible_Metadata_Platform

Image::ExifTool can read and write XMP.

--
Loren M. Lang
lor...@north-winds.org
http://www.north-winds.org/


Public Key: ftp://ftp.north-winds.org/pub/lorenl_pubkey.asc
Fingerprint: 10A0 7AE2 DAF5 4780 888A 3FA4 DCEE BB39 7654 DE5B
