Using mutagen to get metadata from MP4 files

c...@isbd.net

Apr 9, 2023, 1:41:23 PM
to quod-libet-...@googlegroups.com
I have been a happy user of quodlibet for quite a while.

It now just happens that I'm trying to write a Python program (well,
add to one actually) to extract some metadata from MP4 files.

I can successfully do:-

>>> from mutagen.mp4 import MP4
>>> m = MP4("P1030761.MP4")
>>> print(m.info.pprint())
MPEG-4 audio (AAC LC), 50.88 seconds, 128000 bps

but I really don't see how to extract the metadata I want to get at,
specifically the date and time.

I can see there's a class mutagen.mp4.AtomDataType, but I don't see
how to instantiate it with my MP4 file.

Can anyone tell me what I need to do and/or (even better) give me some
example code?

Thank you.

--
Chris Green

Philipp Wolfer

Apr 10, 2023, 12:30:19 PM
to quod-libet-...@googlegroups.com
Hi Chris,

You can access the tags directly with m['atom-name']. E.g. to get the title you would use m["©nam"]. The result is a list of values.

I'm not quite sure which date and time you are referring to, but the release date is usually stored inside m["©day"].

For some example code maybe take a look at https://github.com/quodlibet/quodlibet/blob/main/quodlibet/formats/mp4.py#L82-L113 , where the data gets loaded from an MP4 file by iterating over all tags and applying some mapping.
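
A minimal sketch of the lookup described above, assuming the file actually contains the tag. The helper `first_value` is hypothetical (it is not part of mutagen) and works on any mapping of atom names to lists of values, so a plain dict stands in for the loaded file here.

```python
def first_value(tags, atom_name, default=None):
    """Return the first value stored under atom_name, or default if absent."""
    values = tags.get(atom_name) or []
    return values[0] if values else default


# Stand-in for a loaded file; with mutagen this would be MP4("some-file.m4a").
# "\xa9" is the copyright sign that starts many atom names.
example_tags = {"\xa9nam": ["Example Title"], "\xa9day": ["2023"]}

print(first_value(example_tags, "\xa9nam"))
print(first_value(example_tags, "\xa9day"))
print(first_value(example_tags, "\xa9alb", default="(no album tag)"))
```

Going through a helper like this avoids the KeyError that a direct m["©nam"] raises when the tag is missing.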

Best,
Philipp



 

c...@isbd.net

Apr 10, 2023, 12:58:12 PM
to quod-libet-...@googlegroups.com
On Mon, Apr 10, 2023 at 06:29:40PM +0200, Philipp Wolfer wrote:
> Hi Chris,
>
> You can access the tags directly with m['atom-name']. E.g. to get the title
> you would use m["©nam"]. The result is a list of values.
>
I think something has got lost in the character sets. What does that
© symbol really represent? I just get a KeyError when I try the above.
The © symbol is a copyright C in a circle for me.


> I'm not quite sure which date and time you are referring to, but the
> release date is usually stored inside m["©day"].
>
It's one of the tag values - creation_date.

> For some example code maybe take a look at
> https://github.com/quodlibet/quodlibet/blob/main/quodlibet/formats/mp4.py#L82-L113
> , where the data gets loaded from a MP4 file by iterating over all tags and
> applying some mapping.
>
At first glance I don't really follow that code but I'll take a longer
look later.

Thanks for the help! :-)

--
Chris Green

Philipp Wolfer

Apr 10, 2023, 3:21:46 PM
to quod-libet-...@googlegroups.com
On Mon, Apr 10, 2023 at 18:58, <c...@isbd.net> wrote:

> > You can access the tags directly with m['atom-name']. E.g. to get the title
> > you would use m["©nam"]. The result is a list of values.
>
> I think something has got lost in the character sets. What does that
> © symbol really represent? I just get a KeyError when I try the above.
> The © symbol is a copyright C in a circle for me.


It actually is a copyright symbol. I don't know why, but Apple chose to start most atom names with the copyright symbol. Years ago Apple used to provide a document called "iTunes Metadata Format Specification" which defined many of the tags as they are now widely used. Among them are ©alb for the album name, ©ART for the artist name (note the uppercase) and ©nam for the track name. Apple no longer makes this document available. But see e.g. https://picard-docs.musicbrainz.org/en/appendices/tag_mapping.html for how MusicBrainz Picard uses these tags.
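
In Python source the copyright sign can be typed literally or written as the escape `\xa9`, which sidesteps character-set trouble in terminals and mail clients. A small illustration using only atom names mentioned in this thread; the `ATOM_LABELS` table is a made-up name for this example, not something mutagen provides.

```python
# "\xa9" is U+00A9 COPYRIGHT SIGN, so both spellings give the same key.
assert "\xa9nam" == "©nam"

# Atom names mentioned in this thread, with the meaning given for each.
ATOM_LABELS = {
    "\xa9alb": "album name",
    "\xa9ART": "artist name",
    "\xa9nam": "track name",
    "\xa9day": "release date",
}

for atom, label in ATOM_LABELS.items():
    # ascii() shows the escaped form, which is safe to paste anywhere.
    print(ascii(atom), "->", label)
```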
 
> > I'm not quite sure which date and time you are referring to, but the
> > release date is usually stored inside m["©day"].
>
> It's one of the tag values - creation_date.

I'm still not sure which data you are actually looking for. The iTunes metadata specs have "©day" defined as the release date. Maybe that already is what you are looking for. It might also be that there is another tag some software commonly writes that I'm not aware of.

If you are looking for creation and modification times of the files, those are actually file system attributes and not part of the metadata. In Python you can access the creation and modification time for a file using the os.path.getctime and os.path.getmtime functions (see https://docs.python.org/3/library/os.path.html#os.path.getctime).
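
A short sketch of reading those file system timestamps. One caveat: on Unix-like systems os.path.getctime returns the time of the last metadata change rather than the creation time, while on Windows it returns the creation time. The helper name `file_times` is made up for this example.

```python
import os.path
from datetime import datetime, timezone


def file_times(path):
    """Return (ctime, mtime) of path as timezone-aware UTC datetimes."""
    ctime = datetime.fromtimestamp(os.path.getctime(path), tz=timezone.utc)
    mtime = datetime.fromtimestamp(os.path.getmtime(path), tz=timezone.utc)
    return ctime, mtime


# Uses the current directory so the example runs without an MP4 at hand;
# pass the path of the MP4 file instead.
changed, modified = file_times(".")
print("ctime:", changed.isoformat())
print("mtime:", modified.isoformat())
```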

 
> > For some example code maybe take a look at
> > https://github.com/quodlibet/quodlibet/blob/main/quodlibet/formats/mp4.py#L82-L113
> > , where the data gets loaded from a MP4 file by iterating over all tags and
> > applying some mapping.
>
> At first glance I don't really follow that code but I'll take a longer
> look later.


It would probably be best to check which tags are actually in your file. If you just take your original code and print the loaded data it'll show you all the metadata keys and values:

>>> from mutagen.mp4 import MP4
>>> m = MP4("P1030761.MP4")
>>> print(m)

Or you can make it a bit nicer and iterate over m.items(), as in the Quodlibet code I linked above.
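
A rough sketch of that iteration. The function names `describe_tags` and `describe_file` are hypothetical; `describe_tags` accepts any mapping with items(), so a plain dict is used as a stand-in and the example runs without mutagen or an MP4 file.

```python
def describe_tags(tags):
    """Return one 'key = value' line per tag, sorted by key."""
    lines = []
    for key, values in sorted(tags.items(), key=lambda item: item[0]):
        # Values are normally lists; join them for display.
        if isinstance(values, list):
            shown = ", ".join(str(v) for v in values)
        else:
            shown = str(values)
        lines.append(f"{key} = {shown}")
    return lines


def describe_file(path):
    """Load path with mutagen and describe its tags (requires mutagen)."""
    # Imported here so describe_tags stays usable without mutagen installed.
    from mutagen.mp4 import MP4
    return describe_tags(MP4(path))


# A plain dict stands in for MP4("P1030761.MP4").
for line in describe_tags({"\xa9nam": ["Example Title"], "trkn": [(1, 10)]}):
    print(line)
```

If the file has no tags the loop simply produces no lines.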

Chris Green

Apr 11, 2023, 8:33:36 AM
to quod-libet-...@googlegroups.com
On Mon, Apr 10, 2023 at 09:21:06PM +0200, Philipp Wolfer wrote:
>
> > > I'm not quite sure which date and time you are referring to, but the
> > > release date is usually stored inside m["©day"].
> >
> > It's one of the tag values - creation_date.
>
> I'm still not sure which data you are actually looking for. The iTunes
> metadata specs have "©day" defined as the release date. Maybe that
> already is what you are looking for. Might also be that there is
> another tag some software commonly writes that I'm not aware of.
>
> If you are looking for creation and modification times of the files
> those are actually file system attributes and not part of the metadata.
> In Python you can access the creation and modification time for a file
> using the os.path.getctime and os.path.getmtime functions (see
> https://docs.python.org/3/library/os.path.html#os.path.getctime).
>
Here's the JSON for one stream of one of my MP4 files, as output by
ffprobe:-

{
    "index": 1,
    "codec_name": "aac",
    "codec_long_name": "AAC (Advanced Audio Coding)",
    "profile": "LC",
    "codec_type": "audio",
    "codec_tag_string": "mp4a",
    "codec_tag": "0x6134706d",
    "sample_fmt": "fltp",
    "sample_rate": "48000",
    "channels": 2,
    "channel_layout": "stereo",
    "bits_per_sample": 0,
    "id": "0x2",
    "r_frame_rate": "0/0",
    "avg_frame_rate": "0/0",
    "time_base": "1/48000",
    "start_pts": 0,
    "start_time": "0.000000",
    "duration_ts": 2442240,
    "duration": "50.880000",
    "bit_rate": "125247",
    "nb_frames": "2385",
    "extradata_size": 2,
    "disposition": {
        "default": 1,
        "dub": 0,
        "original": 0,
        "comment": 0,
        "lyrics": 0,
        "karaoke": 0,
        "forced": 0,
        "hearing_impaired": 0,
        "visual_impaired": 0,
        "clean_effects": 0,
        "attached_pic": 0,
        "timed_thumbnails": 0,
        "captions": 0,
        "descriptions": 0,
        "metadata": 0,
        "dependent": 0,
        "still_image": 0
    },
    "tags": {
        "creation_time": "2022-10-01T17:09:31.000000Z",
        "language": "und",
        "vendor_id": "[0][0][0][0]"
    }
}

It's the "creation_time" value I'm after.
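
For reference, a small sketch of pulling that value out of the JSON and turning it into a Python datetime. It uses a trimmed copy of the stream object above; the function name `stream_creation_time` is invented for this example, and the timestamp format is assumed to match the one shown.

```python
import json
from datetime import datetime, timezone

# Trimmed copy of the stream object shown above.
stream_json = """
{
    "index": 1,
    "codec_name": "aac",
    "tags": {
        "creation_time": "2022-10-01T17:09:31.000000Z",
        "language": "und"
    }
}
"""


def stream_creation_time(stream):
    """Return the stream's creation_time tag as an aware datetime, or None."""
    raw = stream.get("tags", {}).get("creation_time")
    if raw is None:
        return None
    # The trailing "Z" means UTC; it is matched literally and the
    # timezone is attached afterwards.
    parsed = datetime.strptime(raw, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


created = stream_creation_time(json.loads(stream_json))
print(created.isoformat())
```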


> > > For some example code maybe take a look at
> > > https://github.com/quodlibet/quodlibet/blob/main/quodlibet/formats/mp4.py#L82-L113
> > > , where the data gets loaded from a MP4 file by iterating over all
> > > tags and applying some mapping.
> >
> > At first glance I don't really follow that code but I'll take a
> > longer look later.
>
> Maybe best would be that you check what tags are actually in your file.
> If you just take your original code and print the loaded data it'll
> show you all the metadata keys and values:
> >>> from mutagen.mp4 import MP4
> >>> m = MP4("P1030761.MP4")
> >>> print(m)

>>> from mutagen.mp4 import MP4
>>> m = MP4("P1030761.MP4")
>>> print(m)
{}
>>>

P1030761.MP4 is the file from which the ffprobe output above comes so it
definitely has some metadata! :-)

--
Chris Green

Philipp Wolfer

Apr 11, 2023, 9:15:01 AM
to quod-libet-...@googlegroups.com
So this is some metadata field specific to the stream. This is not what mutagen handles currently. mutagen primarily deals with the metadata inside the meta atom, which holds user-editable data like track title, number, artist name etc.

There is some stream information made available in the info object (see https://mutagen.readthedocs.io/en/latest/api/mp4.html#mutagen.mp4.MP4Info ), but this creation_time is not read.
 

> P1030761.MP4 is the file from which the ffprobe output above comes so it
> definitely has some metadata! :-)


But not of the kind mutagen handles: there is either no meta atom or it's empty.
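
For anyone who wants the value without an external tool: the MP4 container itself stores creation times in the mvhd (movie header) and mdhd (per-track media header) boxes, as seconds since 1904-01-01 UTC. As far as I know ffprobe's per-stream creation_time comes from mdhd, but treat that as an assumption. The following is only a sketch of a box walker, not anything mutagen offers; the names `iter_boxes` and `creation_times` are made up, and it reads the whole file into memory, which may be impractical for large videos.

```python
import struct
from datetime import datetime, timedelta, timezone

MP4_EPOCH = datetime(1904, 1, 1, tzinfo=timezone.utc)
CONTAINERS = {b"moov", b"trak", b"mdia"}  # boxes that only hold other boxes


def iter_boxes(data, start=0, end=None):
    """Yield (type, payload_start, payload_end) for each box in data[start:end]."""
    end = len(data) if end is None else end
    pos = start
    while pos + 8 <= end:
        size, box_type = struct.unpack_from(">I4s", data, pos)
        header = 8
        if size == 1:  # a 64-bit size follows the type field
            if pos + 16 > end:
                return
            size = struct.unpack_from(">Q", data, pos + 8)[0]
            header = 16
        elif size == 0:  # box extends to the end of the enclosing data
            size = end - pos
        if size < header or pos + size > end:
            return  # malformed or truncated box; stop rather than guess
        yield box_type, pos + header, pos + size
        pos += size


def creation_times(data, start=0, end=None, found=None):
    """Collect (box name, datetime) pairs from mvhd and mdhd boxes."""
    found = [] if found is None else found
    for box_type, lo, hi in iter_boxes(data, start, end):
        if box_type in CONTAINERS:
            creation_times(data, lo, hi, found)
        elif box_type in (b"mvhd", b"mdhd") and hi - lo >= 12:
            version = data[lo]
            # After 1 version byte and 3 flag bytes comes the creation time:
            # 64-bit in version 1, 32-bit in version 0.
            fmt = ">Q" if version == 1 else ">I"
            seconds = struct.unpack_from(fmt, data, lo + 4)[0]
            found.append((box_type.decode("ascii"),
                          MP4_EPOCH + timedelta(seconds=seconds)))
    return found


# Usage, assuming the file exists:
#     with open("P1030761.MP4", "rb") as f:
#         print(creation_times(f.read()))
```

Some cameras and tools leave these fields at zero, which shows up as a date in 1904.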

--
Philipp Wolfer

Chris Green

Apr 11, 2023, 9:18:26 AM
to quod-libet-...@googlegroups.com
On Tue, Apr 11, 2023 at 03:14:22PM +0200, Philipp Wolfer wrote:
> So this is some metadata field specific to the stream. This is not what
> mutagen handles currently. mutagen primarily deals with the metadata
> inside the meta atom, which holds user-editable data like track title,
> number, artist name etc.
>
> There is some stream information made available in the info object (see
> https://mutagen.readthedocs.io/en/latest/api/mp4.html#mutagen.mp4.MP4Info ),
> but this creation_time is not read.
>
> > P1030761.MP4 is the file from which the ffprobe output above comes so it
> > definitely has some metadata! :-)
>
> But not of the kind mutagen handles, there is either no meta atom or
> it's empty.

OK, thanks for answering my questions patiently. I think I will have
to use ffprobe to get the information I want.
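
Roughly what that could look like from Python, as a sketch only: it assumes ffprobe is installed and on the PATH, and the function names are made up for this example. The options used (-v quiet, -print_format json, -show_streams) are standard ffprobe options.

```python
import json
import subprocess


def ffprobe_command(path):
    """Build an ffprobe command line that prints stream info as JSON."""
    return ["ffprobe", "-v", "quiet", "-print_format", "json",
            "-show_streams", path]


def creation_times(probe):
    """Map stream index -> creation_time tag for streams that have one."""
    result = {}
    for stream in probe.get("streams", []):
        created = stream.get("tags", {}).get("creation_time")
        if created is not None:
            result[stream["index"]] = created
    return result


def probe_file(path):
    """Run ffprobe on path and return its parsed JSON output."""
    completed = subprocess.run(ffprobe_command(path), capture_output=True,
                               text=True, check=True)
    return json.loads(completed.stdout)


# Usage, assuming ffprobe is installed and the file exists:
#     print(creation_times(probe_file("P1030761.MP4")))
```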

--
Chris Green