Capturing mp3 metadata using the PHP program getID3

hairylarry

Jun 27, 2008, 10:01:04 PM
to Google App Engine
Hi,

I've been working on this problem for a while. The pure Python/App
Engine approach was not working for two reasons.

urlfetch() only fetches 1 megabyte. Most mp3 files are larger than
that.

mp3info() returns wrong playing times for VBR files.

So I decided to query another server to get the mp3 metadata. I have
used the PHP program getID3 before, and it is GPL, so that's where I
started.

I created a program called metamp3.php. Here's the code.

-------
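<?php
// getID3 library, installed in a getid3/ directory next to this script
require_once('getid3/getid3.php');
$link = $_GET["link"];
// Download the whole mp3 file into memory
$metadata = file_get_contents($link);

// getID3 needs a local file, so write the download to a temporary file
$tmpfname = tempnam(".", "tmp");
$handle = fopen($tmpfname, "w");
fwrite($handle, $metadata);
fclose($handle);

$getID3 = new getID3;
$ThisFileInfo = $getID3->analyze($tmpfname);
getid3_lib::CopyTagsToComments($ThisFileInfo);

// Remove the temporary copy of the mp3 file
unlink($tmpfname);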

$length_seconds = @$ThisFileInfo['playtime_seconds'];
$length_formatted = @$ThisFileInfo['playtime_string'];
$bitrate = @$ThisFileInfo['bitrate'];
$filesize = @$ThisFileInfo['filesize'];
$sample_rate = @$ThisFileInfo['audio']['sample_rate'] ;
$channelmode = @$ThisFileInfo['audio']['channelmode'] ;
$bitrate_mode = @$ThisFileInfo['audio']['bitrate_mode'] ;
$artist = @$ThisFileInfo['comments_html']['artist'][0];
$title = @$ThisFileInfo['comments_html']['title'][0];
$album = @$ThisFileInfo['comments_html']['album'][0];

echo "{";
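// addslashes() keeps an apostrophe in a tag from ending the quoted string early
echo "'artist':'".addslashes($artist)."',";
echo "'title':'".addslashes($title)."',";
echo "'album':'".addslashes($album)."',";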
echo "'filesize':'".$filesize."',";
echo "'seconds':'".round($length_seconds)."',";
echo "'bitrate':'".round($bitrate)."',";
echo "'samplerate':'".$sample_rate."',";
echo "'mode':'".$channelmode."',";
echo "'bitratemode':'".$bitrate_mode."'";
echo "}";

?>
-------
getID3's analyze() works with a file on the local file system. It
won't work with links, and I couldn't get it to work with the
$metadata string. It looks like I could avoid using this string by
nesting file_get_contents($link) inside the fwrite() call. I haven't
tested this, but it might be more efficient without the $metadata
string.

The tempnam() function guarantees a unique filename. The unlink()
call deletes the temporary file, which is a copy of the mp3 file.

I call this program with the link to the mp3 file like this.

http://mydomain.com/metamp3.php?link=http://anotherdomain.com/a_song.mp3

And it returns the desired metadata as text formatted like a Python
dictionary.
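
One thing the example call glosses over: if the mp3 link itself contains characters such as "&", "?" or spaces, it should be percent-encoded before it is appended as the link parameter, otherwise part of it can be cut off or misread on the PHP side. Here is a minimal sketch of that, assuming the endpoint from the example call. ENDPOINT and build_request_url are names made up for illustration, and the import fallback is only there because App Engine ran Python 2 at the time.

```python
try:
    from urllib.parse import quote   # Python 3
except ImportError:
    from urllib import quote         # Python 2

# Endpoint taken from the example call; substitute your own host.
ENDPOINT = "http://mydomain.com/metamp3.php"


def build_request_url(songlink):
    """Return the metamp3.php URL with songlink percent-encoded as the link parameter."""
    # safe="" also encodes ':' and '/', so the whole link stays inside one parameter.
    return ENDPOINT + "?link=" + quote(songlink, safe="")


print(build_request_url("http://anotherdomain.com/a_song.mp3"))
```

PHP decodes $_GET values automatically, so metamp3.php should not need any change for this.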

In the App Engine code I do this.

-------
from google.appengine.api import urlfetch

result = urlfetch.fetch("http://deltaboogie.com/metamp3/metamp3.php?link="+songlink)
# Turn the returned text into a dictionary
metadata = eval(result.content)
stereo = metadata['mode']
bitratemode = metadata['bitratemode'].replace("a", "v")
frequency = int(metadata['samplerate'])
# Split the total playing time into minutes and seconds
minutes = int(metadata['seconds']) // 60
seconds = int(metadata['seconds']) % 60
bitrate = int(metadata['bitrate'])
filesize = int(metadata['filesize'])
-------

Since the result is a small text file, urlfetch has no trouble
retrieving it.

Then I eval result.content to turn it into a dictionary.

And I read my metadata from the dictionary with appropriate
processing.
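
To make the "appropriate processing" concrete, here is a small sketch that applies the same conversions to an already-parsed dictionary. The function name summarize and the sample values are invented for illustration; only the keys come from metamp3.php.

```python
def summarize(metadata):
    """Convert the string values returned by metamp3.php into typed fields."""
    total = int(metadata['seconds'])
    return {
        'stereo': metadata['mode'],
        # Same replacement of "a" with "v" as in the App Engine code above.
        'bitratemode': metadata['bitratemode'].replace("a", "v"),
        'frequency': int(metadata['samplerate']),
        'minutes': total // 60,
        'seconds': total % 60,
        'bitrate': int(metadata['bitrate']),
        'filesize': int(metadata['filesize']),
    }


# Hypothetical values, not taken from a real file.
sample = {
    'artist': 'Someone', 'title': 'A Song', 'album': 'An Album',
    'filesize': '5000000', 'seconds': '245', 'bitrate': '160000',
    'samplerate': '44100', 'mode': 'stereo', 'bitratemode': 'abr',
}
fields = summarize(sample)
print(fields['minutes'], fields['seconds'])
```

Note that int() raises ValueError on an empty string, so a file for which getID3 could not determine one of these values would need extra handling.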

I have tested this with files from Delta Boogie and from the Live
Music Archive.

http://deltaboogie.com
http://archive.org

Thanks to Blixt and LogiLabs for their help getting me to understand
the difficulties I was having that eventually led to this solution.

I do think that having to use another server to get this data is not
ideal. However, the way it's coded, doing it this way doesn't seem as
kludgy as I thought it might.

I am certainly open to suggestions on a better way to accomplish this,
and I am glad to help anyone else who needs to capture metadata for
their project.

Thanks,

Hairy Larry

Greg

Jul 1, 2008, 4:52:47 AM
to Google App Engine
I'd be VERY careful about using eval() on data received from untrusted
sources. I know in this case you control the other source, but if
someone managed to hack the response, they would have total access to
your application.
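
To illustrate the difference, here is a rough sketch using ast.literal_eval from the Python standard library, which accepts only literals (strings, numbers, dicts, lists and so on) and refuses anything that would run code. The response text is made up, and note that literal_eval was added in Python 2.6, so it is an assumption that your runtime has it.

```python
import ast

# Hypothetical response text in the same format metamp3.php prints.
response = "{'artist':'Someone','seconds':'245'}"
metadata = ast.literal_eval(response)

# An expression that eval() would happily execute.
hostile = "__import__('os').getcwd()"
try:
    ast.literal_eval(hostile)
    rejected = False
except (ValueError, SyntaxError):
    # literal_eval refuses function calls and other non-literal input.
    rejected = True

print(metadata['seconds'], rejected)
```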

You might want to look at using JSON instead, or simply returning the
data as a series of POST or GET arguments. If you're dead set on eval,
then at least use a salted hash to verify it hasn't been messed with.
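
Something along these lines on the Python side, as a sketch only. It assumes the PHP script is changed to print JSON (json_encode) and to send a signature computed with the same shared secret (hash_hmac); how the signature travels, for example in a response header, is left open. It uses HMAC, which is a keyed hash rather than a plain salted hash. SECRET, sign and parse_verified are made-up names.

```python
import hashlib
import hmac
import json

# Hypothetical shared secret known to both servers.
SECRET = b"replace-with-a-shared-secret"


def sign(body):
    """Return the hex HMAC-SHA256 of the response body (bytes)."""
    return hmac.new(SECRET, body, hashlib.sha256).hexdigest()


def parse_verified(body, signature):
    """Parse body as JSON only if the signature matches, else raise ValueError."""
    if not hmac.compare_digest(sign(body), signature):
        raise ValueError("signature mismatch")
    return json.loads(body.decode("utf-8"))


# Made-up response body; in practice the signature would come from the other server.
body = b'{"artist": "Someone", "seconds": "245"}'
metadata = parse_verified(body, sign(body))
print(metadata["seconds"])
```

json.loads never executes anything in the response, so the signature check is an extra layer here rather than the only protection.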