Issue 113 in mp4v2: Support for tags in the "Xtra" atom used by Windows Media Player


mp...@googlecode.com

Aug 5, 2011, 2:17:08 PM
to mp...@googlegroups.com
Status: New
Owner: ----
Labels: Type-Defect Priority-Medium

New issue 113 by tdebaets: Support for tags in the "Xtra" atom used by
Windows Media Player
http://code.google.com/p/mp4v2/issues/detail?id=113

Since version 12 in Windows 7, Windows Media Player supports the MPEG-4
format, including tags. For most tags, WMP uses the standard iTMF metadata,
but for some other tags, like publisher, encoding time and volume leveling
info, it uses an undocumented "Xtra" atom. A sample M4A file that has a few
of these tags stored in such an Xtra atom is attached.

I'm the author of the WMP Tag Plus plug-in for Windows Media Player and my
plug-in adds MPEG-4 tag support to WMP version 11 as well. I would like to
extend that support while keeping compatibility with WMP 12's MPEG-4 tag
support, which means that I will also have to use the Xtra atom. Because
I'm already using the MP4v2 library, modifications to this library are
therefore required.

A user of WMP Tag Plus, ErikSka, was able to reverse engineer the format of
the Xtra atom and has already modified MP4v2 to add support for tags in
this atom. He has sent me his modified version, allowing me to freely
use/distribute it. Because his version wasn't C-compatible yet, I had to
change quite a few things, and I also added an "mp4xtratags" util, similar
to the existing mp4tags util.

The resulting patch with all these modifications is attached. Perhaps this
patch could be applied to the SVN repository? The only defect is that the
mp4xtratags util currently only compiles on Windows because it relies on
rpc.h for displaying/parsing GUIDs, though it shouldn't be too hard to add
POSIX support.
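
As a rough idea of what the POSIX side could look like, here is a minimal sketch that formats a GUID without rpc.h. The `Guid` struct and `GuidToString` are hypothetical names for illustration, not part of the patch or of MP4v2; the field layout mirrors the usual Windows GUID structure, and the output is the common 8-4-4-4-12 hexadecimal text form.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical portable stand-in for the Windows GUID struct.
struct Guid {
    std::uint32_t data1;
    std::uint16_t data2;
    std::uint16_t data3;
    std::uint8_t  data4[8];
};

// Format a GUID as lowercase hex in the 8-4-4-4-12 layout.
std::string GuidToString(const Guid& g)
{
    char buf[37];  // 36 characters plus the terminating NUL
    std::snprintf(buf, sizeof(buf),
                  "%08lx-%04x-%04x-%02x%02x-%02x%02x%02x%02x%02x%02x",
                  static_cast<unsigned long>(g.data1),
                  static_cast<unsigned>(g.data2),
                  static_cast<unsigned>(g.data3),
                  static_cast<unsigned>(g.data4[0]),
                  static_cast<unsigned>(g.data4[1]),
                  static_cast<unsigned>(g.data4[2]),
                  static_cast<unsigned>(g.data4[3]),
                  static_cast<unsigned>(g.data4[4]),
                  static_cast<unsigned>(g.data4[5]),
                  static_cast<unsigned>(g.data4[6]),
                  static_cast<unsigned>(g.data4[7]));
    return std::string(buf);
}
```

Parsing would be the mirror image (for example with `std::sscanf`), with the same caveat that the names here are placeholders.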

For more details, see ErikSka's post in the WMP Tag Plus support thread:
http://www.hydrogenaudio.org/forums/index.php?s=&showtopic=75123&view=findpost&p=757335

Attachments:
silence_xtra.m4a 3.2 KB
mp4v2_xtratags.patch 36.5 KB

mp...@googlecode.com

Aug 5, 2011, 2:50:34 PM
to mp...@googlegroups.com
Updates:
Status: Accepted
Owner: kid...@gmail.com
Labels: -Type-Defect Type-Enhancement

Comment #1 on issue 113 by kid...@gmail.com: Support for tags in the "Xtra"

Hey tdebaets,

Thanks for the patch--looks very nice ;) And I'm sure other people could
definitely use this. I'll review and patch this weekend, assuming all
looks good.

mp...@googlecode.com

Aug 8, 2011, 10:59:22 AM
to mp...@googlegroups.com

Comment #2 on issue 113 by tdebaets: Support for tags in the "Xtra" atom

Great, thanks :) I see that the patch hasn't been applied yet though - did
you find any issues?

mp...@googlecode.com

Aug 8, 2011, 3:12:36 PM
to mp...@googlegroups.com

Comment #3 on issue 113 by kid...@gmail.com: Support for tags in the "Xtra"

No, unfortunately I didn't have any free time this weekend, so...I didn't
get around to it. I browsed the patch briefly. One thing I'd like to see
is documentation for any public APIs, just so it's clear how someone would
use it.

mp...@googlecode.com

Aug 8, 2011, 3:42:05 PM
to mp...@googlegroups.com

Comment #4 on issue 113 by tdebaets: Support for tags in the "Xtra" atom

OK, no problem. I will write some documentation for the new public APIs and
will submit a new patch sometime this week.

mp...@googlecode.com

Aug 12, 2011, 2:25:39 PM
to mp...@googlegroups.com

Comment #5 on issue 113 by kid...@gmail.com: Support for tags in the "Xtra"

So, sorry for the delay. Been looking at this yesterday and today. There
were no VS solution updates, which broke the VS build because the patch
modified existing files. Also, you didn't add the new command line project
to the VS solution. I'll fix both of those when I check this in.

Also, why is this stuff commented out in the public interface? Do you
intend to add these at some future point, or is this stuff that can be
zapped for now?

//MP4V2_EXPORT const MP4XtraTags* MP4XtraTagsAlloc();
//MP4V2_EXPORT void MP4XtraTagsFetch( const MP4XtraTags* tags, MP4FileHandle hFile );
//MP4V2_EXPORT void MP4XtraTagsStore( const MP4XtraTags* tags, MP4FileHandle hFile );
//MP4V2_EXPORT void MP4XtraTagsFree ( const MP4XtraTags* tags );

Also, help me understand how this fits in with the existing itmf stuff; do
they overlap at all? Do they complement one another?

Thanks for the patch--it'll get in some time today.


mp...@googlecode.com

Aug 12, 2011, 3:52:41 PM
to mp...@googlegroups.com

Comment #6 on issue 113 by kid...@gmail.com: Support for tags in the "Xtra"

So, initially I checked this in as r484, but on closer inspection I decided
to revert the changes in r485. There's some stuff that needs to be fleshed
out before this can go in.

There are two major issues:

1. First, it's not clear to me what the string format is for metadata
items; my guess is UTF8, but this should be tested and made clear in the
API. So it needs to be tested with some metadata name containing non-ASCII
characters.

2. The second, much bigger issue is how strings are persisted. I see the
application is serializing wchar_t, which is a major problem for cross
platform support. Windows treats wchar_t as UTF-16 (so two bytes per
character), while Linux and OSX treat wchar_t as UTF-32 (4 bytes per
char). So working on the same platform would be fine, but if I attempted
to write string data on Linux and then read it on windows (or vice-versa),
it would fail because the character encoding differs. The iTmf stuff
avoids this by using UTF8 everywhere.
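
To make the second issue concrete, here is a minimal sketch (not code from the patch) that turns a `std::wstring` into fixed-width UTF-16 code units regardless of how wide `wchar_t` is on the build platform. `WideToUtf16` is a hypothetical name, and replacing out-of-range values with U+FFFD is an assumption about how errors might be handled.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper: produce UTF-16 code units from a wide string
// without depending on sizeof(wchar_t).
std::u16string WideToUtf16(const std::wstring& wide)
{
    std::u16string out;
    for (wchar_t wc : wide) {
        std::uint32_t cp = static_cast<std::uint32_t>(wc);
        if (cp < 0x10000) {
            // One code unit. With a 16-bit wchar_t this branch is always
            // taken, so existing surrogate pairs pass through unchanged.
            out.push_back(static_cast<char16_t>(cp));
        } else if (cp <= 0x10FFFF) {
            // 32-bit wchar_t holding a supplementary code point:
            // split it into a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        } else {
            out.push_back(u'\uFFFD');  // assumption: replace invalid values
        }
    }
    return out;
}
```

Serializing the resulting `char16_t` units, rather than raw `wchar_t` memory, gives the same bytes whether the writer runs on Windows, Linux or OSX.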

Lastly, this is an undocumented, reverse engineered MSFT atom...and I have
to wonder what sort of example we'd be setting by supporting such an atom
in the first place. I'm still open to accepting the patch, assuming #2 is
resolved (and #1 clarified) and you can successfully use the code across
platforms. And let me know if I'm wrong about any of this (certainly
wouldn't be the first time). I do really appreciate the patch, but I can't
accept code that isn't going to work cross platform.

I'll leave this open until you respond.

mp...@googlecode.com

Aug 13, 2011, 10:45:07 AM
to mp...@googlegroups.com

Comment #7 on issue 113 by tdebaets: Support for tags in the "Xtra" atom

> Also, why is this stuff commented out in the public interface? Do you
> intend to add these at some future point, or is this stuff that can be
> zapped for now?

These were already in ErikSka's modified version and provide (unfinished)
support for retrieving all metadata items in the Xtra atom. Not really
needed for now because one will usually know the exact name of the metadata
item he wants to get, so I'll leave these out.

> Also, help me understand how this fits in with the existing itmf stuff; do
> they overlap at all? Do they complement one another?

As this is a totally new way for storing metadata, it's completely separate
from the iTMF stuff - there's no overlap.

> 1. First, it's not clear to me what the string format is for metadata
> items; my guess is UTF8, but this should be tested and made clear in the
> API. So it needs to be tested with some metadata name containing non-ASCII
> characters.

Do you mean the name of the metadata items (e.g. WM/EncodingTime,
WM/Publisher)? This is unfortunately impossible to test, because WMP only
writes its predefined attributes to file and there isn't any WMP attribute
whose name contains non-ASCII characters. I would just say in the API that
these names should be plain ASCII strings.

> 2. The second, much bigger issue is how strings are persisted. I see the
> application is serializing wchar_t, which is a major problem for cross
> platform support. Windows treats wchar_t as UTF-16 (so two bytes per
> character), while Linux and OSX treat wchar_t as UTF-32 (4 bytes per
> char). So working on the same platform would be fine, but if I attempted
> to write string data on Linux and then read it on windows (or vice-versa),
> it would fail because the character encoding differs. The iTmf stuff
> avoids this by using UTF8 everywhere.

I see what you mean. Using UTF8 just like the iTMF API is probably the way
to go. Because the metadata value strings are stored as UTF16 in the Xtra
atom, a conversion would then be required while reading/writing. I searched
for some cross-platform code that performs this conversion and found this:
https://trac.transmissionbt.com/browser/trunk/libtransmission/ConvertUTF.h .
Is it OK if I include this in MP4v2? It also seems to be used by a lot of
other open source projects.
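
For reference, the write-side conversion is small enough to sketch by hand. The following is an illustrative UTF-8 to UTF-16 converter, not the ConvertUTF code and not anything in MP4v2; `Utf8ToUtf16` is a hypothetical name. It rejects truncated, overlong, surrogate and out-of-range sequences instead of guessing.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: convert UTF-8 to UTF-16 code units.
// Returns false on malformed input; 'out' is then left partially filled.
bool Utf8ToUtf16(const std::string& in, std::u16string& out)
{
    out.clear();
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char b0 = static_cast<unsigned char>(in[i]);
        std::uint32_t cp = 0;
        std::uint32_t minValue = 0;  // smallest code point for this length
        std::size_t extra = 0;       // number of continuation bytes
        if (b0 < 0x80)                { cp = b0; }
        else if ((b0 & 0xE0) == 0xC0) { cp = b0 & 0x1F; extra = 1; minValue = 0x80; }
        else if ((b0 & 0xF0) == 0xE0) { cp = b0 & 0x0F; extra = 2; minValue = 0x800; }
        else if ((b0 & 0xF8) == 0xF0) { cp = b0 & 0x07; extra = 3; minValue = 0x10000; }
        else return false;  // stray continuation byte or invalid lead byte

        if (in.size() - i - 1 < extra) return false;  // truncated sequence
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char b = static_cast<unsigned char>(in[i + k]);
            if ((b & 0xC0) != 0x80) return false;     // not a continuation byte
            cp = (cp << 6) | (b & 0x3F);
        }
        if (cp < minValue) return false;                 // overlong encoding
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;  // encoded surrogate
        if (cp > 0x10FFFF) return false;                 // beyond Unicode range

        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
        i += extra + 1;
    }
    return true;
}
```

Whether malformed input should fail outright, as here, or be replaced with U+FFFD is a design choice the API documentation would need to state.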

> Lastly, this is an undocumented, reverse engineered MSFT atom...and I have
> to wonder what sort of example we'd be setting by supporting such an atom
> in the first place.

Unlike the other issues, there's nothing that I can do about that - I can't
force MS to document the atom, sadly :p But I still think that this patch
would be a nice addition to MP4v2. Extra support is always good and it's
not *that* much code, so the increase in library size is relatively small.

mp...@googlecode.com

Aug 16, 2011, 4:11:59 PM
to mp...@googlegroups.com

Comment #8 on issue 113 by kid...@gmail.com: Support for tags in the "Xtra"

As it stands, mp4v2 already has some mild license problems (see issue 81),
so I'm not sure how I feel about adding yet another license into the fray
(Unicode license). Is there no public domain option for doing UTF*
conversion?

mp...@googlecode.com

Aug 16, 2011, 4:23:10 PM
to mp...@googlegroups.com

Comment #9 on issue 113 by dbyr...@gmail.com: Support for tags in
the "Xtra" atom used by Windows Media Player
http://code.google.com/p/mp4v2/issues/detail?id=113

Is Utf8ToFilename in libplatform/platform_win32_impl.h not sufficient? I
suppose not the entire class is relevant if we're not talking about
filenames, but the ConvertToUTF16 method might be. I'm happy to contribute
UTF16 --> UTF8 conversion routines as well.

mp...@googlecode.com

Aug 16, 2011, 5:29:30 PM
to mp...@googlegroups.com

Comment #10 on issue 113 by tdebaets: Support for tags in the "Xtra" atom

@kidjan: I might be wrong, but aren't ConvertUTF.h and ConvertUTF.c already
public domain? I certainly don't see/recognize any license in their notice
at the beginning. I also found this but the usage seems more awkward:
http://utfcpp.sourceforge.net .

@dbyron: That ConvertToUTF16 method looks alright. Note that the conversion
is also necessary on Linux and OSX, so that means that the method would
have to be moved outside of the Win32-specific platform code.

mp...@googlecode.com

Aug 23, 2011, 12:44:49 PM
to mp...@googlegroups.com

Comment #11 on issue 113 by kid...@gmail.com: Support for tags in
the "Xtra" atom used by Windows Media Player
http://code.google.com/p/mp4v2/issues/detail?id=113

They use a unicode license:

/*
 * Copyright 2001-2004 Unicode, Inc.
 *
 * Disclaimer
 *
 * This source code is provided as is by Unicode, Inc. No claims are
 * made as to fitness for any particular purpose. No warranties of any
 * kind are expressed or implied. The recipient agrees to determine
 * applicability of information provided. If this file has been
 * purchased on magnetic or optical media from Unicode, Inc., the
 * sole remedy for any claim will be exchange of defective media
 * within 90 days of receipt.
 *
 * Limitations on Rights to Redistribute This Code
 *
 * Unicode, Inc. hereby grants the right to freely use the information
 * supplied in this file in the creation of products supporting the
 * Unicode Standard, and to make copies of this file in any form
 * for internal or external distribution as long as this notice
 * remains attached.
 */

...definitely a very permissive license, but not public domain. And
another license we'd have to adhere to, and that end-users would have to
comply with when distributing MP4v2. That sourceforge project is also not
in the public domain; it uses some unnamed derivative license.

@dbyron0,

Big issue is running on *nix and OSX, so you'd need conversion routines.
This actually is somewhat of a problem, given:

wchar_t *
Utf8ToFilename::ConvertToUTF16 ( const string &utf8string )
{
    ...
    ASSERT(sizeof(wchar_t) == 2);

It's true on windows, but not true on Linux or OSX.

mp...@googlecode.com

Aug 24, 2011, 10:43:04 AM
to mp...@googlegroups.com

Comment #12 on issue 113 by tdebaets: Support for tags in the "Xtra" atom

@kidjan: I hear you about the license problems. I have found other code
that's public domain and that doesn't look too bad:
http://altdevblogaday.com/2011/03/30/putting-the-you-back-into-unicode .
The only problem is the complete lack of docs, but if I understand
correctly, conversion first requires calling one of the "Measure" functions
to determine the size of the destination buffer, allocating this buffer,
and then calling the right "Transcode" function. Does this look OK for
including into MP4v2?
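
The measure, allocate, transcode pattern described above can be sketched as follows for the UTF-16 to UTF-8 direction. All function names here are hypothetical and are NOT the functions from the linked article; treating an unpaired surrogate as U+FFFD is likewise an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Decode one code point starting at in[i] and advance i.
// Unpaired surrogates are reported as U+FFFD (an assumption).
static std::uint32_t NextCodePoint(const std::u16string& in, std::size_t& i)
{
    const std::uint32_t u = in[i++];
    if (u >= 0xD800 && u <= 0xDBFF && i < in.size()) {
        const std::uint32_t lo = in[i];
        if (lo >= 0xDC00 && lo <= 0xDFFF) {
            ++i;
            return 0x10000 + ((u - 0xD800) << 10) + (lo - 0xDC00);
        }
    }
    if (u >= 0xD800 && u <= 0xDFFF) return 0xFFFD;
    return u;
}

// Pass 1: count the UTF-8 bytes needed.
std::size_t MeasureUtf16AsUtf8(const std::u16string& in)
{
    std::size_t bytes = 0;
    for (std::size_t i = 0; i < in.size(); ) {
        const std::uint32_t cp = NextCodePoint(in, i);
        bytes += cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
    }
    return bytes;
}

// Pass 2: write into a buffer of at least the measured size.
void TranscodeUtf16ToUtf8(const std::u16string& in, char* dst)
{
    for (std::size_t i = 0; i < in.size(); ) {
        const std::uint32_t cp = NextCodePoint(in, i);
        if (cp < 0x80) {
            *dst++ = static_cast<char>(cp);
        } else if (cp < 0x800) {
            *dst++ = static_cast<char>(0xC0 | (cp >> 6));
            *dst++ = static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            *dst++ = static_cast<char>(0xE0 | (cp >> 12));
            *dst++ = static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            *dst++ = static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            *dst++ = static_cast<char>(0xF0 | (cp >> 18));
            *dst++ = static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            *dst++ = static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            *dst++ = static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
}

// Measure, allocate, then transcode.
std::string Utf16ToUtf8(const std::u16string& in)
{
    std::string out(MeasureUtf16AsUtf8(in), '\0');
    if (!out.empty()) TranscodeUtf16ToUtf8(in, &out[0]);
    return out;
}
```

Splitting the work into two passes keeps the transcoder free of allocation, at the cost of decoding the input twice.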

mp...@googlecode.com

Aug 25, 2011, 10:54:29 PM
to mp...@googlegroups.com

Comment #13 on issue 113 by dbyr...@gmail.com: Support for tags in
the "Xtra" atom used by Windows Media Player
http://code.google.com/p/mp4v2/issues/detail?id=113

You're right that ConvertToUTF16 only works as it is on windows.
Converting from UTF8 to UTF16 just before saving seems reasonable. Making
a version that uses an array of four bytes instead of an array of two
wchar_t's seems a simple enough change...though maybe using an array of
four bytes makes more sense on all platforms at this level. The filename
stuff could stick with wchar_t on Windows and know how to build those from
(up to) four bytes representing UTF-16.

Am I right that all of this Xtra stuff uses (or can use) UTF-8 everywhere
in memory and it's only on disk that it needs UTF-16?

mp...@googlecode.com

Aug 25, 2011, 10:58:32 PM
to mp...@googlegroups.com

Comment #14 on issue 113 by dbyr...@gmail.com: Support for tags in

mp...@googlecode.com

Aug 26, 2011, 2:09:09 PM
to mp...@googlegroups.com

Comment #15 on issue 113 by tdebaets: Support for tags in the "Xtra" atom

> Am I right that all of this Xtra stuff uses (or can use) UTF-8 everywhere
> in memory and it's only on disk that it needs UTF-16?

Correct. It currently doesn't use UTF-8 yet, but that's the plan.

mp...@googlecode.com

Sep 5, 2011, 9:54:31 PM
to mp...@googlegroups.com

Comment #16 on issue 113 by kid...@gmail.com: Support for tags in
the "Xtra" atom used by Windows Media Player
http://code.google.com/p/mp4v2/issues/detail?id=113

dbyron,

I'm no unicode guru, but I'm not sure using an array of two wchar_t would
work; on OSX and Linux, it's UTF-32, not UTF-16. UTF-16 is variable
length; UTF-32 is not. It's two completely different character encodings,
so basically the code would have to be able to handle UTF-32.

mp...@googlecode.com

Sep 6, 2011, 10:16:06 AM
to mp...@googlegroups.com

Comment #17 on issue 113 by da...@onmenetwork.com: Support for tags in
the "Xtra" atom used by Windows Media Player
http://code.google.com/p/mp4v2/issues/detail?id=113

I don't think UTF-32 would ever come into play. I figure we've got UTF-8
in memory and only when we need to use persistent storage do we ever
convert to/from UTF-16. That conversion may not even need to be platform
specific if we keep wchar_t out of it and use an array of four bytes to
hold the UTF-16 encoding instead.
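
A minimal sketch of the "array of four bytes" idea: one code point is encoded as UTF-16 straight into bytes, with no `wchar_t` involved. `EncodeUtf16LE` is a hypothetical name, and little-endian byte order is an assumption that would need checking against a file written by WMP.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: encode one code point as UTF-16 into a 4-byte
// buffer. Returns the number of bytes written (2 or 4), or 0 if cp is
// a surrogate or lies beyond U+10FFFF.
// Little-endian output is an assumption, not confirmed by the thread.
std::size_t EncodeUtf16LE(std::uint32_t cp, std::uint8_t out[4])
{
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) return 0;

    std::uint16_t units[2] = {0, 0};
    std::size_t count = 0;
    if (cp < 0x10000) {
        units[0] = static_cast<std::uint16_t>(cp);
        count = 1;
    } else {
        cp -= 0x10000;
        units[0] = static_cast<std::uint16_t>(0xD800 + (cp >> 10));    // high surrogate
        units[1] = static_cast<std::uint16_t>(0xDC00 + (cp & 0x3FF));  // low surrogate
        count = 2;
    }
    for (std::size_t i = 0; i < count; ++i) {
        out[2 * i]     = static_cast<std::uint8_t>(units[i] & 0xFF);  // low byte
        out[2 * i + 1] = static_cast<std::uint8_t>(units[i] >> 8);    // high byte
    }
    return count * 2;
}
```

Because the output is plain bytes, the same code produces identical results on every platform; UTF-32 never enters the picture.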

I should add that I'm super busy at the moment and don't have time to spend
on this likely til October...


mp...@googlecode.com

Sep 6, 2011, 3:11:07 PM
to mp...@googlegroups.com

Comment #18 on issue 113 by tdebaets: Support for tags in the "Xtra" atom

> I should add that I'm super busy at the moment and don't have time to
> spend on this likely til October...

Not sure if the UTF-8 to/from UTF-16 conversions really need to be written
from scratch. What about the public domain code at
http://altdevblogaday.com/2011/03/30/putting-the-you-back-into-unicode ?

mp...@googlecode.com

Sep 6, 2011, 3:15:08 PM
to mp...@googlegroups.com

Comment #19 on issue 113 by tdebaets: Support for tags in the "Xtra" atom

mp...@googlecode.com

Sep 6, 2011, 3:25:16 PM
to mp...@googlegroups.com

Comment #20 on issue 113 by da...@onmenetwork.com: Support for tags in
the "Xtra" atom used by Windows Media Player
http://code.google.com/p/mp4v2/issues/detail?id=113

It'd be great to get it from elsewhere, though it might be a little strange
to have two different hunks of code for encoding conversion in mp4v2.

More of an issue is that the download link I found there
(http://studiotekne.com/downloads/Unicode.zip) has the test routines, but I
couldn't find the conversion routines themselves. I could easily be
looking in the wrong place.

Also, I'd like to make sure whatever code we use handles all the weird
invalid cases here:
http://www.cl.cam.ac.uk/~mgk25/ucs/examples/UTF-8-test.txt


mp...@googlecode.com

Sep 6, 2011, 3:40:49 PM
to mp...@googlegroups.com

Comment #21 on issue 113 by tdebaets: Support for tags in the "Xtra" atom

In that archive, the conversion routines are in
Unicode\tests\Unicode.Test\Unicode.h.

But no problem if you really want to write it from scratch, there's no
hurry :)

mp...@googlecode.com

Oct 31, 2011, 3:19:06 PM
to mp...@googlegroups.com

Comment #22 on issue 113 by tdeb...@gmail.com: Support for tags in
the "Xtra" atom used by Windows Media Player
http://code.google.com/p/mp4v2/issues/detail?id=113

Are there any updates on newly written conversion routines or on
third-party code that could be used?

mp...@googlecode.com

Sep 9, 2012, 12:43:52 PM
to mp...@googlegroups.com

Comment #23 on issue 113 by tdeb...@gmail.com: Support for tags in
the "Xtra" atom used by Windows Media Player
http://code.google.com/p/mp4v2/issues/detail?id=113

I have removed the attached patch from my original post because it still
contained a bug. When writing a string tag to the Xtra atom, the code
incorrectly didn't include the null terminator. The result of this was a
corrupted Xtra atom. I hope that no one was already relying on my patch but
I apologize if this was the case.
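
To illustrate the kind of fix involved, here is a minimal sketch of building a string value's bytes with the terminating 0x0000 code unit included. It is not the code from the patch: `BuildUtf16Payload` is a hypothetical name, little-endian order is an assumption, and the surrounding Xtra field layout (sizes, types) is deliberately left out because it is not described in this thread.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: byte payload for a UTF-16 string value,
// including the null terminator.
std::vector<std::uint8_t> BuildUtf16Payload(const std::u16string& value)
{
    std::vector<std::uint8_t> out;
    out.reserve((value.size() + 1) * 2);  // +1 code unit for the terminator
    for (char16_t u : value) {
        out.push_back(static_cast<std::uint8_t>(u & 0xFF));  // low byte
        out.push_back(static_cast<std::uint8_t>(u >> 8));    // high byte
    }
    out.push_back(0);  // terminator, low byte
    out.push_back(0);  // terminator, high byte
    return out;
}
```

Any length field written alongside the payload would have to count these two extra bytes as well, otherwise a reader would again see a malformed atom.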

A fixed version of my patch is attached to this comment. Other changes are:

- Removed superfluous commented code.
- Added some error checking to tag writing functions (result type changed
from void to bool).
- Included changes to the Visual Studio solution and projects.

Some comments/pointers on which public domain Unicode conversion routines
to use would still be welcome.

Attachments:
mp4v2_xtratags_v2.patch 61.5 KB

mp...@googlecode.com

Oct 1, 2012, 10:38:13 AM
to mp...@googlegroups.com

Comment #24 on issue 113 by kid...@gmail.com: Support for tags in
the "Xtra" atom used by Windows Media Player
http://code.google.com/p/mp4v2/issues/detail?id=113

I meant to get back to this...sorry. Been stuck in a crazy work project for
a few months. I do want to get this code in because I've had several
requests for Xtra atom support, so when I get a chance I'll look into what
to do about the unicode issues.

mp...@googlecode.com

Dec 19, 2012, 12:16:48 PM
to mp...@googlegroups.com

Comment #25 on issue 113 by danahins...@gmail.com: Support for tags in
the "Xtra" atom used by Windows Media Player
http://code.google.com/p/mp4v2/issues/detail?id=113

Thought I'd check in and see if anyone had gotten a chance to look at
this. Users continue to want this functionality as it appears WMP and XBox
360 use it.

mp...@googlecode.com

Dec 28, 2012, 12:02:24 PM
to mp...@googlegroups.com

Comment #26 on issue 113 by kid...@gmail.com: Support for tags in
the "Xtra" atom used by Windows Media Player
http://code.google.com/p/mp4v2/issues/detail?id=113

This might be sort of a sloppy solution, but would there be any way to have
this patch be windows only for the time being? I'd like to commit this,
because there's definitely a desire to have this feature, but without
dealing with the character encoding issues it's pretty hard to handle.

Might also have the interface be marked with "beta" or "subject to change"
if possible, since if we fixed it later, that would entail dropping wchar_t
from the public interfaces to move to UTF8 instead?

mp...@googlecode.com

Dec 28, 2012, 12:08:55 PM
to mp...@googlegroups.com

Comment #27 on issue 113 by danahins...@gmail.com: Support for tags in
the "Xtra" atom used by Windows Media Player
http://code.google.com/p/mp4v2/issues/detail?id=113

From my perspective, Windows only would be great.

mp...@googlecode.com

Dec 28, 2012, 12:13:46 PM
to mp...@googlegroups.com

Comment #28 on issue 113 by tdeb...@gmail.com: Support for tags in
the "Xtra" atom used by Windows Media Player
http://code.google.com/p/mp4v2/issues/detail?id=113

I'm fine with that (temporary) solution as well.

mp...@googlecode.com

Dec 14, 2013, 4:17:21 PM
to mp...@googlegroups.com

Comment #29 on issue 113 by tdebaets: Support for tags in the "Xtra" atom

Attached to this comment is a new version of the patch. As requested by
Dan, all added code now only gets compiled on Windows. I have put the Xtra
tags API/implementation behind ifdefs for MP4V2_XTRA_TAGS, and in
platform.h, I only define MP4V2_XTRA_TAGS if _WIN32 is defined.
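
The gating described above might look roughly like this. Only the `MP4V2_XTRA_TAGS` and `_WIN32` macro names come from the comment; the exact placement inside platform.h and the `XtraTagsEnabled` helper are assumptions added for illustration.

```cpp
#include <cassert>

// In platform.h (sketch): enable the Xtra tags code on Windows only.
#if defined(_WIN32)
#   define MP4V2_XTRA_TAGS 1
#endif

// Around the Xtra tags API and implementation (sketch):
#if defined(MP4V2_XTRA_TAGS)
// ... Xtra tag declarations and definitions would go here ...
#endif

// Hypothetical helper so callers can check the build configuration.
constexpr bool XtraTagsEnabled()
{
#if defined(MP4V2_XTRA_TAGS)
    return true;
#else
    return false;
#endif
}
```

Keeping the switch in a single macro means a later cross-platform implementation only needs to change where `MP4V2_XTRA_TAGS` is defined.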

Also, GNUmakefile.am doesn't contain the new source files anymore since I
assume that the makefile won't be used under Windows.

The other small change is that the mp4xtratags util now also checks for a
WM/SharedUserRating tag in the Xtra atom. WMP uses this tag for saving star
ratings to MPEG-4 files.

Attachments:
mp4v2_xtratags_v3.patch 60.5 KB

--
You received this message because this project is configured to send all
issue notifications to this address.
You may adjust your notification preferences at:
https://code.google.com/hosting/settings