[QUIZ] ID3 Tags (#136)

Ruby Quiz

Aug 24, 2007, 8:34:47 AM

The three rules of Ruby Quiz:

1. Please do not post any solutions or spoiler discussion for this quiz until
48 hours have passed from the time on this message.

2. Support Ruby Quiz by submitting ideas as often as you can:

http://www.rubyquiz.com/

3. Enjoy!

Suggestion: A [QUIZ] in the subject of emails about the problem helps everyone
on Ruby Talk follow the discussion. Please reply to the original quiz message,
if you can.

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

The MP3 file format didn't provide any means for including metadata about the
song. ID3 tags were invented to solve this problem.

You can tell if an MP3 file includes ID3 tags by examining the last 128 bytes of
the file. If they begin with the characters TAG, you have found an ID3 tag.
The format of the tag is as follows:

TAG song album artist comment year genre

The spaces above are just for us humans. The actual tags are fixed-width fields
with no spacing between them. Song, album, artist, and comment are 30 bytes
each. The year is four bytes and the genre just gets one, which is an index
into a list of predefined genres I'll include at the end of this quiz.

A minor change was later made to ID3 tags to allow them to include track
numbers, creating ID3v1.1. In that format, if the 29th byte of a comment is
null and the 30th is not, the 30th byte is an integer representing the track
number.
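
As a rough sketch of the ID3v1.1 rule just described, assuming `comment` holds
the raw 30-byte comment field as a String (the `split_track` name is made up
for illustration and is not part of any library):

```ruby
# Hypothetical helper: split a raw 30-byte comment field into
# [comment_text, track_number]. track_number is nil for plain ID3v1.
def split_track(comment)
  if comment.bytesize == 30 && comment.getbyte(28) == 0 && comment.getbyte(29) != 0
    # ID3v1.1: 28 bytes of text, a zero byte, then the track number.
    [comment.byteslice(0, 28).unpack("A28").first, comment.getbyte(29)]
  else
    # Plain ID3v1: all 30 bytes are comment text.
    [comment.unpack("A30").first, nil]
  end
end
```

The "A" directive is used here only to drop trailing padding; the track byte is
read separately with getbyte so it never ends up in the comment text.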

Later changes produced ID3v2, which is a scary beast we won't worry about.

This week's Ruby Quiz is to write an ID3 tag parser. Using a library is
cheating. Roll up your sleeves and parse it yourself. It's not hard at all.

If you don't have MP3 files to test your solution on, you can find some free
files at:

http://www.mfiles.co.uk/mp3-files.htm

Here's the official genre list with some extensions added by Winamp:

Blues
Classic Rock
Country
Dance
Disco
Funk
Grunge
Hip-Hop
Jazz
Metal
New Age
Oldies
Other
Pop
R&B
Rap
Reggae
Rock
Techno
Industrial
Alternative
Ska
Death Metal
Pranks
Soundtrack
Euro-Techno
Ambient
Trip-Hop
Vocal
Jazz+Funk
Fusion
Trance
Classical
Instrumental
Acid
House
Game
Sound Clip
Gospel
Noise
AlternRock
Bass
Soul
Punk
Space
Meditative
Instrumental Pop
Instrumental Rock
Ethnic
Gothic
Darkwave
Techno-Industrial
Electronic
Pop-Folk
Eurodance
Dream
Southern Rock
Comedy
Cult
Gangsta
Top 40
Christian Rap
Pop/Funk
Jungle
Native American
Cabaret
New Wave
Psychadelic
Rave
Showtunes
Trailer
Lo-Fi
Tribal
Acid Punk
Acid Jazz
Polka
Retro
Musical
Rock & Roll
Hard Rock
Folk
Folk-Rock
National Folk
Swing
Fast Fusion
Bebob
Latin
Revival
Celtic
Bluegrass
Avantgarde
Gothic Rock
Progressive Rock
Psychedelic Rock
Symphonic Rock
Slow Rock
Big Band
Chorus
Easy Listening
Acoustic
Humour
Speech
Chanson
Opera
Chamber Music
Sonata
Symphony
Booty Bass
Primus
Porn Groove
Satire
Slow Jam
Club
Tango
Samba
Folklore
Ballad
Power Ballad
Rhythmic Soul
Freestyle
Duet
Punk Rock
Drum Solo
A capella
Euro-House
Dance Hall

Robert Dober

Aug 24, 2007, 8:47:28 AM

On 8/24/07, Ruby Quiz <ja...@grayproductions.net> wrote:
> The three rules of Ruby Quiz:

<snip>

> The spaces above are just for us humans. The actual tags are fixed-width fields
> with no spacing between them. Song, album, artist, and comment are 30 bytes
> each. The year is four bytes and the genre just gets one, which is an index
> into a list of predefined genres I'll include at the end of this quiz.

zero based, I guess?
<snip>

Cheers
Robert
--
I'm an atheist and that's it. I believe there's nothing we can know
except that we should be kind to each other and do what we can for
other people.
-- Katharine Hepburn

James Edward Gray II

Aug 24, 2007, 9:53:54 AM

On Aug 24, 2007, at 7:47 AM, Robert Dober wrote:

> On 8/24/07, Ruby Quiz <ja...@grayproductions.net> wrote:
>> The three rules of Ruby Quiz:
> <snip>
>> The spaces above are just for us humans. The actual tags are
>> fixed-width fields
>> with no spacing between them. Song, album, artist, and comment
>> are 30 bytes
>> each. The year is four bytes and the genre just gets one, which
>> is an index
>> into a list of predefined genres I'll include at the end of this
>> quiz.
> zero based, I guess?

It is, yes.

James Edward Gray II

Eugene Kalenkovich

Aug 24, 2007, 11:11:51 AM

"Ruby Quiz" <ja...@grayproductions.net> wrote in message
news:20070824123444.YGDK2224...@eastrmimpo02.cox.net...

> TAG song album artist comment year genre
>

You've misplaced year and comment.
http://www.id3.org/ID3v1

--EK

James Edward Gray II

Aug 24, 2007, 11:15:38 AM

On Aug 24, 2007, at 10:03 AM, Cédric Finance wrote:

> I think that the fields order is wrong.
> I found this:
> TAG song artist album year comment genre

You are right. Sorry about that. I've fixed it on the Ruby Quiz site.

James Edward Gray II

John Miller

Aug 24, 2007, 11:29:18 AM

James Gray wrote:

> The format of the tag is as follows:

I assume that the song album artist and comment fields are NUL padded?

The 4 bytes of Year are 4 character and not a 32bit number?

John Miller
--
Posted via http://www.ruby-forum.com/.

James Edward Gray II

Aug 24, 2007, 11:43:03 AM

On Aug 24, 2007, at 10:29 AM, John Miller wrote:

> James Gray wrote:
>
>> The format of the tag is as follows:
>
> I assume that the song album artist and comment fields are NUL padded?
>
> The 4 bytes of Year are 4 character and not a 32bit number?

Yes and yes. :)
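
A small sketch of what those two answers mean in practice, assuming the fields
are NUL padded and the year is stored as four ASCII digits (the `decode_text`
helper name is invented for this example):

```ruby
# Hypothetical helper: decode one fixed-width text field.
# The "A" directive drops trailing NULs and spaces.
def decode_text(raw)
  raw.unpack("A*").first
end

title = decode_text("Moonloop".ljust(30, "\0"))  # padded to 30 bytes
year  = decode_text("1995")                      # four characters, not a packed integer

puts title      # Moonloop
puts year.to_i  # 1995
```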

James Edward Gray II

Brad Ediger

Aug 25, 2007, 6:00:09 PM

Brad Ediger

Aug 25, 2007, 6:04:31 PM

You fixed one problem, but artist and album are still flipped.

(this didn't come through the first time, trying without the S/MIME
signature)

James Edward Gray II

Aug 25, 2007, 6:15:32 PM

On Aug 25, 2007, at 5:04 PM, Brad Ediger wrote:

> On Aug 24, 2007, at 10:15 AM, James Edward Gray II wrote:
>
>> On Aug 24, 2007, at 10:03 AM, Cédric Finance wrote:
>>
>>> I think that the fields order is wrong.
>>> I found this:
>>> TAG song artist album year comment genre
>>
>> You are right. Sorry about that. I've fixed it on the Ruby Quiz
>> site.
>
> You fixed one problem, but artist and album are still flipped.

Egad. I must have had a massive dyslexia attack when I wrote that quiz.

It should be fixed now.
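
For reference, a minimal sketch of the layout after both corrections above,
assuming the order TAG, song, artist, album, year, comment, genre with the
field widths from the quiz. The `ID3V1_LAYOUT` constant and `slice_tag` helper
are names made up for this illustration:

```ruby
# [field name, byte offset, byte length] for a 128-byte ID3v1 tag.
ID3V1_LAYOUT = [
  [:tag,       0,  3],
  [:song,      3, 30],
  [:artist,   33, 30],
  [:album,    63, 30],
  [:year,     93,  4],
  [:comment,  97, 30],
  [:genre,   127,  1]
]

# Hypothetical helper: slice a raw 128-byte tag into a Hash of raw field Strings.
# No padding is removed here; each value is exactly as wide as its field.
def slice_tag(raw)
  ID3V1_LAYOUT.each_with_object({}) do |(name, offset, length), fields|
    fields[name] = raw.byteslice(offset, length)
  end
end
```

Keeping the offsets in one table makes it easy to check that the widths add up
to 128 bytes.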

James Edward Gray II

Jesse Merriman

Aug 26, 2007, 9:45:43 AM

Here's my solution. Should be pretty straightforward.
id3_tags.rb takes a list of filenames as arguments:

$ ./id3_tags.rb 04_Prepare_Yourself.mp3 05_Moonloop.mp3
04_Prepare_Yourself.mp3:
song: Prepare Yourself
track: 4
artist: Porcupine Tree
comment: some comment
year: 1995
album: The Sky Moves Sideways
genre: Progressive Rock

05_Moonloop.mp3:
song: Moonloop
track: 5
artist: Porcupine Tree
comment: test comment
year: 1995
album: The Sky Moves Sideways
genre: Progressive Rock

--
Jesse Merriman
jessem...@warpmail.net
http://www.jessemerriman.com/

hashy.rb

genres.rb

id3_tags.rb

Ken Bloom

Aug 26, 2007, 10:16:32 AM

class NoID3Error < StandardError
end

class ID3
Genres=" Blues

Dance Hall".split("\n").map{|x| x.gsub(/^\s+/,'')}

  attr_accessor :title, :artist, :album, :year, :comment, :genre, :track
  def genre_name
    Genres[@genre]
  end

  def initialize(filename)
    rawdata=open(filename) do |f|
      f.seek(f.lstat.size-128)
      f.read
    end
    tag,@title,@artist,@album,@year,@comment,@genre=rawdata.unpack "A3A30A30A30A4A30c"
    if rawdata[3+30+30+30+4+28]==0
      @track=rawdata[3+30+30+30+4+29]
      @track=nil if @track==0
    end
    if tag!="TAG"
      raise NoID3Error
    end
  end
end

--
Ken Bloom. PhD candidate. Linguistic Cognition Laboratory.
Department of Computer Science. Illinois Institute of Technology.
http://www.iit.edu/~kbloom1/

Jesse Merriman

Aug 26, 2007, 11:02:51 AM

On Sunday 26 August 2007, Jesse Merriman wrote:
> Here's my solution. Should be pretty straightforward.

Here's a very slightly improved version of id3_tags.rb (which still requires
the other two files I submitted, unchanged). The only change is less ugly
use of String#[], and no more Null constant.

id3_tags.rb

come

Aug 26, 2007, 12:53:01 PM

Hi,

Here is my solution :

require "delegate"

class ID3Tags < DelegateClass(Struct)
MP3_TYPE=%w(Blues Classic Rock Country Dance Disco Funk Grunge Hip-
Hop Jazz Metal New Age Oldies Other Pop R&B Rap Reggae Rock Techno

Industrial Alternative Ska Death Metal Pranks Soundtrack Euro-Techno
Ambient Trip-Hop Vocal Jazz+Funk Fusion Trance Classical Instrumental
Acid House Game Sound Clip Gospel Noise AlternRock Bass Soul Punk
Space Meditative Instrumental Pop Instrumental Rock Ethnic Gothic
Darkwave Techno-Industrial Electronic Pop-Folk Eurodance Dream
Southern Rock Comedy Cult Gangsta Top 40 Christian Rap Pop/Funk Jungle
Native American Cabaret New Wave Psychadelic Rave Showtunes Trailer Lo-

Fi Tribal Acid Punk Acid Jazz Polka Retro Musical Rock & Roll Hard

Rock Folk Folk-Rock National Folk Swing Fast Fusion Bebob Latin
Revival Celtic Bluegrass Avantgarde Gothic Rock Progressive Rock
Psychedelic Rock Symphonic Rock Slow Rock Big Band Chorus Easy
Listening Acoustic Humour Speech Chanson Opera Chamber Music Sonata
Symphony Booty Bass Primus Porn Groove Satire Slow Jam Club Tango
Samba Folklore Ballad Power Ballad Rhythmic Soul Freestyle Duet Punk

Rock Drum Solo A capella Euro-House Dance Hall)

  Tag=Struct.new(:song,:album,:artist,:year,:comment,:track,:genre)

  def initialize(file)
    raise "No ID3 Tag detected" unless File.size(file) > 128
    File.open(file,"r") do |f|
      f.seek(-128, IO::SEEK_END)
      tag = f.read.unpack('A3A30A30A30A4A30C1')
      raise "No ID3 Tag detected" unless tag[0] == 'TAG'
      if tag[5][-2] == 0 and tag[5][-1] != 0
        tag[5]=tag[5].unpack('A28A1C1').values_at(0,2)
      else
        tag[5]=[tag[5],nil]
      end
      super(@tag=Tag.new(*tag.flatten[1..-1]))
    end
  end

  def to_s
    members.each do |name|
      puts "#{name} : #{send(name)}"
    end
  end

  def genre
    MP3_TYPE[@tag.genre]
  end

end

Come

Brad Ediger

Aug 26, 2007, 1:32:05 PM

One of the biggest problems in software development is feature creep.
In the case of this Quiz, specification creep was the culprit, with
the spec being changed two times in two days. No offense intended,
JEG2 ;-)

Luckily, we can use the mighty power of Ruby to make our application
impervious to such changes, and save a couple heredocs to boot.

-------------------------

#!/usr/bin/env ruby -rubygems

%w(hpricot open-uri).each(&method(:require))

fields, genres = (Hpricot(open("http://www.rubyquiz.com/quiz136.html")) / "p.example").map{|e| e.inner_html}
fields = fields.split
genres = genres.split "<br />"

values = IO.read(ARGV.first)[-128..-1].unpack("A3 A30 A30 A30 A4 A30 A")

unless values.first == 'TAG'
  puts "No ID3 tag found"
  exit 1
end

fields.zip(values).each do |field, value|
  case field # this feels dirty
  when 'TAG': # nada
  when 'genre': puts "#{field}: #{genres[value[0]]}"
  when 'comment'
    puts "#{field}: #{value}"
    if value[28].to_i.zero? && !value[29].to_i.zero? # ID3v1.1
      puts "track: #{value[29]}"
    end
  else puts "#{field}: #{value}"
  end
end

Brad Ediger

Aug 26, 2007, 1:39:39 PM

On Aug 26, 2007, at 11:55 AM, come wrote:

> Hi,
>
> Here is my solution :
>
> require "delegate"
>
> class ID3Tags < DelegateClass(Struct)
> MP3_TYPE=%w(Blues Classic Rock Country Dance Disco Funk Grunge Hip-
> Hop Jazz Metal New Age Oldies Other Pop R&B Rap Reggae Rock Techno
> Industrial Alternative Ska Death Metal Pranks Soundtrack Euro-Techno
> Ambient Trip-Hop Vocal Jazz+Funk Fusion Trance Classical Instrumental
> Acid House Game Sound Clip Gospel Noise AlternRock Bass Soul Punk
> Space Meditative Instrumental Pop Instrumental Rock Ethnic Gothic
> Darkwave Techno-Industrial Electronic Pop-Folk Eurodance Dream
> Southern Rock Comedy Cult Gangsta Top 40 Christian Rap Pop/Funk Jungle
> Native American Cabaret New Wave Psychadelic Rave Showtunes Trailer
> Lo-
> Fi Tribal Acid Punk Acid Jazz Polka Retro Musical Rock & Roll Hard
> Rock Folk Folk-Rock National Folk Swing Fast Fusion Bebob Latin
> Revival Celtic Bluegrass Avantgarde Gothic Rock Progressive Rock
> Psychedelic Rock Symphonic Rock Slow Rock Big Band Chorus Easy
> Listening Acoustic Humour Speech Chanson Opera Chamber Music Sonata
> Symphony Booty Bass Primus Porn Groove Satire Slow Jam Club Tango
> Samba Folklore Ballad Power Ballad Rhythmic Soul Freestyle Duet Punk
> Rock Drum Solo A capella Euro-House Dance Hall)

That's not going to work like you think it will:

>> %w(New Age)
=> ["New", "Age"]

James Edward Gray II

Aug 26, 2007, 2:08:14 PM

On Aug 26, 2007, at 8:45 AM, Jesse Merriman wrote:

> Here's my solution.

Here's my own:

#!/usr/bin/env ruby -w

GENRES = %w[ Blues Classic\ Rock Country Dance Disco Funk Grunge Hip-Hop Jazz
             Metal New\ Age Oldies Other Pop R&B Rap Reggae Rock Techno
             Industrial Alternative Ska Death\ Metal Pranks Soundtrack
             Euro-Techno Ambient Trip-Hop Vocal Jazz+Funk Fusion Trance
             Classical Instrumental Acid House Game Sound\ Clip Gospel Noise
             AlternRock Bass Soul Punk Space Meditative Instrumental\ Pop
             Instrumental\ Rock Ethnic Gothic Darkwave Techno-Industrial
             Electronic Pop-Folk Eurodance Dream Southern\ Rock Comedy Cult
             Gangsta Top\ 40 Christian\ Rap Pop/Funk Jungle Native\ American
             Cabaret New\ Wave Psychadelic Rave Showtunes Trailer Lo-Fi Tribal
             Acid\ Punk Acid\ Jazz Polka Retro Musical Rock\ &\ Roll Hard\ Rock
             Folk Folk-Rock National\ Folk Swing Fast\ Fusion Bebob Latin
             Revival Celtic Bluegrass Avantgarde Gothic\ Rock Progressive\ Rock
             Psychedelic\ Rock Symphonic\ Rock Slow\ Rock Big\ Band Chorus
             Easy\ Listening Acoustic Humour Speech Chanson Opera Chamber\ Music
             Sonata Symphony Booty\ Bass Primus Porn\ Groove Satire Slow\ Jam
             Club Tango Samba Folklore Ballad Power\ Ballad Rhythmic\ Soul
             Freestyle Duet Punk\ Rock Drum\ Solo A\ capella Euro-House
             Dance\ Hall ]

abort "Usage: #{File.basename($PROGRAM_NAME)} MP3_FILE" unless ARGV.size == 1

tag, song, artist, album, year, comment, genre =
  ARGF.read[-128..-1].unpack("A3A30A30A30A4A30C")
if comment.size == 30 and comment[28] == ?\0
  track = comment[29]
  comment = comment[0..27].strip
else
  track = nil
end

abort "ID3v1 tag not found." unless tag == "TAG"

puts "Song: #{song}"
puts "Artist: #{artist}"
puts "Album: #{album}"
puts "Comment: #{comment}" unless comment.empty?
puts "Track: #{track}" unless track.nil?
puts "Year: #{year}"
puts "Genre: #{GENRES[genre] || 'Unknown'}"

__END__

James Edward Gray II

Aug 26, 2007, 2:19:58 PM

On Aug 26, 2007, at 12:32 PM, Brad Ediger wrote:

> One of the biggest problems in software development is feature
> creep. In the case of this Quiz, specification creep was the
> culprit, with the spec being changed two times in two days. No
> offense intended, JEG2 ;-)

I just did that to inspire you to such a clever solution. ;)

James Edward Gray II

come

Aug 26, 2007, 2:49:29 PM

Yes, you are right, I answered a little bit too fast ;-)
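
For what it's worth, two ways around the %w whitespace splitting, sketched with
a shortened genre list (the `ESCAPED` and `PER_LINE` constant names are only
for illustration):

```ruby
# %w splits on whitespace, so multi-word genres need a backslash-escaped space...
ESCAPED = %w[Blues Classic\ Rock New\ Age]

# ...or the list can be kept one genre per line and split on newlines.
PER_LINE = "Blues\nClassic Rock\nNew Age".split("\n")

p ESCAPED   # ["Blues", "Classic Rock", "New Age"]
p PER_LINE  # ["Blues", "Classic Rock", "New Age"]
```

Either form keeps "Classic Rock" and "New Age" as single entries, so the genre
byte still indexes the right name.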

come

Aug 26, 2007, 3:08:34 PM

My corrected version :

require "delegate"

class ID3Tags < DelegateClass(Struct)
  MP3_TYPE=["Blues","Classic Rock","Country","Dance","Disco","Funk","Grunge",
    "Hip-Hop","Jazz","Metal","New Age","Oldies","Other","Pop","R&B","Rap",
    "Reggae","Rock","Techno","Industrial","Alternative","Ska","Death Metal",
    "Pranks","Soundtrack","Euro-Techno","Ambient","Trip-Hop","Vocal",
    "Jazz+Funk","Fusion","Trance","Classical","Instrumental","Acid","House",
    "Game","Sound Clip","Gospel","Noise","AlternRock","Bass","Soul","Punk",
    "Space","Meditative","Instrumental Pop","Instrumental Rock","Ethnic",
    "Gothic","Darkwave","Techno-Industrial","Electronic","Pop-Folk",
    "Eurodance","Dream","Southern Rock","Comedy","Cult","Gangsta","Top 40",
    "Christian Rap","Pop/Funk","Jungle","Native American","Cabaret",
    "New Wave","Psychadelic","Rave","Showtunes","Trailer","Lo-Fi","Tribal",
    "Acid Punk","Acid Jazz","Polka","Retro","Musical","Rock & Roll",
    "Hard Rock","Folk","Folk-Rock","National Folk","Swing","Fast Fusion",
    "Bebob","Latin","Revival","Celtic","Bluegrass","Avantgarde","Gothic Rock",
    "Progressive Rock","Psychedelic Rock","Symphonic Rock","Slow Rock",
    "Big Band","Chorus","Easy Listening","Acoustic","Humour","Speech",
    "Chanson","Opera","Chamber Music","Sonata","Symphony","Booty Bass",
    "Primus","Porn Groove","Satire","Slow Jam","Club","Tango","Samba",
    "Folklore","Ballad","Power Ballad","Rhythmic Soul","Freestyle","Duet",
    "Punk Rock","Drum Solo","A capella","Euro-House","Dance Hall"]

Brad Ediger

Aug 26, 2007, 3:14:34 PM

On Aug 26, 2007, at 1:08 PM, James Edward Gray II wrote:

> ARGF.read[-128..-1].unpack("A3A30A30A30A4A30C")

Well played, sir. I always forget about ARGF. And to think I call
myself a Perl nerd.

-be

Ken Bloom

Aug 26, 2007, 3:38:06 PM

On Mon, 27 Aug 2007 02:32:05 +0900, Brad Ediger wrote:

> One of the biggest problems in software development is feature creep. In
> the case of this Quiz, specification creep was the culprit, with the
> spec being changed two times in two days. No offense intended, JEG2 ;-)
>
> Luckily, we can use the mighty power of Ruby to make our application
> impervious to such changes, and save a couple heredocs to boot.
>
> -------------------------
>
> #!/usr/bin/env ruby -rubygems
>
> %w(hpricot open-uri).each(&method(:require))
>

> fields, genres = (Hpricot(open("http://www.rubyquiz.com/quiz136.html")) / "p.example").map{|e| e.inner_html}

> fields = fields.split
> genres = genres.split "<br />"

You hard-coded the unpack template. If you wanted to download the spec
properly, you'd generate that from the spec as follows. (Picking up from the end
of what I've quoted above)

unpacktypes=Hash.new("A30")
unpacktypes["TAG"]="A3"
unpacktypes["year"]="A4"
unpacktypes["genre"]="c"
unpackstr=fields.map{|x| unpacktypes[x]}.join

id3=Hash.new
raw=open('/home/bloom/scratch/music/rondo.mp3') do |f|
  f.seek(f.lstat.size-128)
  f.read
end

values=raw.unpack(unpackstr)

fields.zip(values).each do |field,value|
  id3[field]=value
end

fail if id3["TAG"]!="TAG"

if id3["comment"].length==30 and id3["comment"][-2]==0
  id3["track"]=id3["comment"][-1]
  id3["comment"]=id3["comment"][0..-2].strip
end

id3["genre"]=genres[id3["genre"]] || "Unknown"
p id3

Ken Bloom

Aug 26, 2007, 3:38:06 PM

Apparently unpack('A30') doesn't work quite the way I thought --
it only shortens the string if the string ends in null characters.
If there are nulls in the middle, then those and the characters after
them are preserved.
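
A short sketch of that difference, assuming a comment field in ID3v1.1 form
(text, NUL padding, a zero byte, then the track byte). The constant names are
made up for the example:

```ruby
# 28 bytes of NUL-padded text, a zero byte, then track number 5.
COMMENT = "test comment".ljust(28, "\0") + "\0" + 5.chr

WITH_A = COMMENT.unpack("A30").first  # "A" trims only trailing NULs and spaces
WITH_Z = COMMENT.unpack("Z30").first  # "Z" stops at the first NUL

p WITH_A.bytesize  # 30 (the padding and track byte are all still there)
p WITH_Z           # "test comment"
```

Because the last byte is the track number rather than a NUL or a space, "A30"
has nothing to trim, while "Z30" cuts the string at the first NUL.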

--Ken

Brad Ediger

Aug 26, 2007, 3:50:05 PM

On Aug 26, 2007, at 2:40 PM, Ken Bloom wrote:

> On Mon, 27 Aug 2007 02:32:05 +0900, Brad Ediger wrote:
>
>> One of the biggest problems in software development is feature
>> creep. In
>> the case of this Quiz, specification creep was the culprit, with the
>> spec being changed two times in two days. No offense intended,
>> JEG2 ;-)
>>
>> Luckily, we can use the mighty power of Ruby to make our application
>> impervious to such changes, and save a couple heredocs to boot.
>>
>> -------------------------
>>
>> #!/usr/bin/env ruby -rubygems
>>
>> %w(hpricot open-uri).each(&method(:require))
>>
>> fields, genres = (Hpricot(open("http://www.rubyquiz.com/
>> quiz136.html")) / "p.example").map{|e| e.inner_html}
>> fields = fields.split
>> genres = genres.split "<br />"
>
> You hard-coded the value of the unpack field.

I know, I felt bad about doing it (and this was more of a "ha-ha,
have fun with the Quiz" submission than a "use this in production"
submission).

I was about to rewrite it to scrape the actual data structure from
the table in http://www.id3.org/ID3v1, but then I'd have to find
another quasi-official source for the genre list, and it began to
feel more like work.

I like your solution. Yes, I should have used a "c" for the genre
field, but my brain wasn't working.

-be

Johannes Held

Aug 26, 2007, 4:50:02 PM

Brad Ediger schrieb:

What the heck is ARGF?

--
Gruß, Johannes
Täglich http://blog.hehejo.de und du fühlst dich gut.

Joel VanderWerf

Aug 26, 2007, 5:05:18 PM

Johannes Held wrote:
> What the heck is ARGF?

It's a pseudo-IO that reads the concatenation of the files named in
ARGV, unless ARGV is empty, in which case it just reads standard input.
It's very useful in writing little command-line programs that can be
used as filters or on a list of named files (after you delete any
switches or options from the command line).

[~] cat >foo.txt
foo
[~] cat >bar.txt
bar
[~] ruby -e 'puts ARGF.read' foo.txt bar.txt
foo
bar

[~] echo zap | ruby -e 'puts ARGF.read'
zap

--
vjoel : Joel VanderWerf : path berkeley edu : 510 665 3407

Tom Metge

Aug 26, 2007, 5:16:33 PM

Hey all, here's another one for you. I admit that there isn't anything
special about it... I think it's one of the more direct solutions (i.e.
Nothing clever here guys). I didn't see a reason to include the entire
genre, so it's attached in a separate file. It simply declares a constant
(an array which is indexed in read_tags).

Tom

--BEGIN SOLUTION--
require 'id3_tag_genre'

class NoTagError < RuntimeError; end

class Mp3
attr_reader :song, :artist, :album, :year, :comment, :genre, :track

def initialize(file)
read_tags(file)
end

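  def read_tags(file)
    begin
      # Read only the final 128 bytes; the block form closes the file for us.
      tag = File.open(file, "rb") do |f|
        f.seek(-128, IO::SEEK_END)
        f.read
      end
      raise NoTagError unless tag[0..2] == "TAG"
      @song = tag[3..32].strip
      @artist = tag[33..62].strip
      @album = tag[63..92].strip
      @year = tag[93..96].strip
      @comment = tag[97..126]
      # ID3v1.1: a zero byte followed by a non-zero byte holds the track number.
      # getbyte returns the byte as an Integer, so no String-to-Integer
      # conversion is needed.
      if @comment.getbyte(28) == 0 && @comment.getbyte(29) != 0
        @track = @comment.getbyte(29)
        @comment = @comment[0..27].strip
      end
      @genre = Genre[tag.getbyte(127)]
    rescue NoTagError
      puts "No tags found!"
      return false
    end
    true
  end
end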

id3_tag_genre.rb

Johannes Held

Aug 26, 2007, 6:38:33 PM

Joel VanderWerf schrieb:

> Johannes Held wrote:
>> What the heck is ARGF?>
> It's a pseudo-IO that reads the concatenation of the files named in
> ARGV, unless ARGV is empty, in which case it just reads standard input.
> It's very useful in writing little command-line programs that can be
> used as filters or on a list of named files (after you delete any
> switches or options from the command line).

Thank you.

Erik Bryn

Aug 26, 2007, 7:18:48 PM

Here's mine. Takes a directory as input and exports a tab-separated
list.

- Erik

--

GENRES = ["Blues", "Classic Rock", "Country", "Dance", "Disco",
"Funk", "Grunge", "Hip-Hop", "Jazz", "Metal", "New Age", "Oldies",
"Other", "Pop", "R&B", "Rap", "Reggae", "Rock", "Techno",
"Industrial", "Alternative", "Ska", "Death Metal", "Pranks",
"Soundtrack", "Euro-Techno", "Ambient", "Trip-Hop", "Vocal",
"Jazz+Funk", "Fusion", "Trance", "Classical", "Instrumental", "Acid",
"House", "Game", "Sound Clip", "Gospel", "Noise", "AlternRock",
"Bass", "Soul", "Punk", "Space", "Meditative", "Instrumental Pop",
"Instrumental Rock", "Ethnic", "Gothic", "Darkwave",
"Techno-Industrial", "Electronic", "Pop-Folk", "Eurodance", "Dream",
"Southern Rock", "Comedy", "Cult", "Gangsta", "Top 40", "Christian Rap",
"Pop/Funk", "Jungle", "Native American", "Cabaret", "New Wave",
"Psychadelic", "Rave", "Showtunes", "Trailer", "Lo-Fi", "Tribal",
"Acid Punk", "Acid Jazz", "Polka", "Retro", "Musical", "Rock & Roll",
"Hard Rock", "Folk", "Folk-Rock", "National Folk", "Swing",
"Fast Fusion", "Bebob", "Latin", "Revival", "Celtic", "Bluegrass",
"Avantgarde", "Gothic Rock", "Progressive Rock", "Psychedelic Rock",
"Symphonic Rock", "Slow Rock", "Big Band", "Chorus", "Easy Listening",
"Acoustic", "Humour", "Speech", "Chanson", "Opera", "Chamber Music",
"Sonata", "Symphony", "Booty Bass", "Primus", "Porn Groove", "Satire",
"Slow Jam", "Club", "Tango", "Samba", "Folklore", "Ballad",
"Power Ballad", "Rhythmic Soul", "Freestyle", "Duet", "Punk Rock",
"Drum Solo", "A capella", "Euro-House", "Dance Hall"]

FIELDS = [:song, :artist, :album, :year, :comment, :genre]

def find_track_number(fields)
  if fields[:comment][-2] == 0 && fields[:comment][-1] != 0
    fields[:track_number] = fields[:comment].slice!(-2..-1)[1]
    fields[:comment].strip!
  end
end

abort "Usage: #{File.basename($PROGRAM_NAME)} <dir>" unless ARGV.size == 1
Dir["#{ARGV.first}/*.mp3"].each do |path|
  File.open(path, 'rb') do |f|
    f.seek(-128, IO::SEEK_END)
    bytes = f.read
    next if bytes.slice!(0..2) != "TAG"

    tags = Hash[*FIELDS.zip(bytes.unpack('A30A30A30A4A30C')).flatten]
    tags[:genre] = GENRES[tags[:genre]]
    find_track_number(tags)
    puts "#{File.basename(path)}\t#{tags[:artist]}\t#{tags[:song]}" \
         "\t#{tags[:album]}\t#{tags[:track_number]}\t#{tags[:year]}" \
         "\t#{tags[:genre]}\t#{tags[:comment]}"
  end
end

Alpha Chen

Aug 27, 2007, 12:04:45 PM

My fairly straightforward solution:

class ID3
  genre_list = <<-GENRES
Blues
... # snipped for brevity
Dance Hall
  GENRES

  GENRE_LIST = genre_list.split("\n")
  TAGS = [ :title, :artist, :album, :year, :comment, :track, :genre ]

  attr_accessor *TAGS

  def initialize(filename)
    id3 = File.open(filename) do |mp3|
      mp3.seek(-128, IO::SEEK_END)
      mp3.read
    end

    raise "No ID3 tags" if id3 !~ /^TAG/

    @title, @artist, @album, @year, @comment, @genre =
      id3.unpack('xxxA30A30A30A4A30C1')
    @comment, @track = @comment.unpack('Z*@28C1') if @comment =~ /\0.$/

    @genre = GENRE_LIST[@genre]
  end
end

if __FILE__ == $0
  id3 = ID3.new(ARGV.shift)
  ID3::TAGS.each do |tag|
    puts "#{tag.to_s.capitalize.rjust(8)}: #{id3.send(tag)}"
  end
end

John Miller

Aug 27, 2007, 12:12:42 PM

Here is my go at things:

__BEGIN__
#Note: this script assumes Ruby 1.8.6 style handling of strings. Some changes
#will need to be made for Ruby 1.9 to work correctly

require 'genre.rb' #an array of the official genre list

def id3(filename)
  id3 = File.open(filename,'r') do |file|
    file.seek(-128,IO::SEEK_END) #get to the end of the file
    file.read(128)
  end
  return "" unless id3 #protect against read error
  if id3.slice(0,3) == "TAG"
    #Skip the first 3 bytes, grab three thirty byte fields
    #and a 4 byte field, dropping trailing whitespace.
    #While we can assume the old style comment field and
    #take 30 bytes (we'll come back for the track number later)
    #we must use 'Z' instead of 'A' to avoid having the track
    #show up in our comment field.
    #The last byte is the genre index.
    song,artist,album,year,comment,genre = id3.unpack "x3A30A30A30A4Z30C"
    #grab the track with a plain slice
    track = id3.slice(-2) if id3.slice(-3) == 0 && id3.slice(-2) != 0
    desc = "#{artist}: #{album}(#{year})\n"
    desc << " #{song}. "
    desc << "tr. #{track}" if track
    desc <<"\n"
    desc << " Comment: #{comment.chomp(" ")}\n" if comment.length != 0
    desc << " Genre: #{Genres[genre]}\n"
    return desc
  end

  return "" #tag not found

end

#usage id3.rb filename [filename*]
ARGV.each do |filename|
  puts filename
  puts id3(filename) if File.exists? filename
  puts "\n"
end

__END__

I think the only real difference between what I'm seeing on this list
and my own solution is the unpack string. The 'Comment' field must use
'Z' and strip trailing white space separately, otherwise the track number
could get pulled in and stuck on the end of the output.

I like the use of ARGF in other implementations. Something new to put
in my hat.

Matthew Moss

Aug 29, 2007, 7:56:28 PM

I've been extremely busy lately, but I wanted to give this one a try.
This solution is not complete as far as the problem specification
goes, but my bit o' metaprogramming-type stuff works, though I'd have
liked to push it further.

class ID3

  @@recLen = 0

  def ID3.field(name, len, flags=[])
    class_eval(%Q[
      def #{name}
        @data[#{@@recLen}, #{len}].strip
      end
    ])

    unless flags.include?(:readonly)
      class_eval(%Q[
        def #{name}=(val)
          # need to pad val to len
          @data[#{@@recLen}, #{len}] = val.ljust(#{len}, "\000")
        end
      ])
    end
    @@recLen += len
  end

  # --------------------------------------------------------------
  #     name, length, flags
  field :sig, 3, [:readonly]
  field :song, 30
  field :artist, 30
  field :album, 30
  field :year, 4
  field :comment, 30
  field :genre, 1

  TAG_SIG = "TAG"
  TAG_SIZE = @@recLen
  raise "ID3 tag size not 128!" unless TAG_SIZE == 128

  # --------------------------------------------------------------

  def ID3.createFromBuffer(buffer)
    ID3.new(buffer)
  end

  def ID3.createFromFile(fname)
    size = File.size?(fname)
    raise "Missing or empty file" unless size
    raise "Invalid file" if size < TAG_SIZE

    # Read the tag and pass to createFromBuffer
    open(fname, "rb") do |f|
      f.seek(-TAG_SIZE, IO::SEEK_END)
      createFromBuffer(f.read(TAG_SIZE))
    end
  end

  # --------------------------------------------------------------

  def initialize(data)
    @data = data

    raise "Wrong buffer size" unless @data.size == TAG_SIZE
    raise "ID3 tag not found" unless self.sig == TAG_SIG
  end

end

id = ID3.createFromFile("maple-leaf-rag.mp3")
puts id.song

James Edward Gray II

Aug 29, 2007, 10:30:36 PM

On Aug 29, 2007, at 6:56 PM, Matthew Moss wrote:

> I've been extremely busy lately, but I wanted to give this one a try.
> This solution is not complete as far as the problem specification
> goes, but my bit o' metaprogramming-type stuff works, though I'd have
> liked to push it further.

This is a very clever solution. I have one suggestion though…

> class ID3
>
> @@recLen = 0
>
> def ID3.field(name, len, flags=[])

Changing flags=[] to *flags gives a nicer interface, I think.

James Edward Gray II

Matthew Moss

Aug 30, 2007, 11:30:50 AM

True... I had thought of that this morning, though I also wanted to
add a conversion parameter... so a lambda or block could be provided
that would convert between the record's string data and an integer
(e.g. the ID3 year).
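
Something like the following is roughly what I have in mind; it is only a
sketch, and the `Record` class, the explicit offsets, and the block-based
converter are all invented here rather than taken from the code I posted:

```ruby
# Hypothetical variation on the field DSL: an optional block converts the
# raw String to another type whenever the field is read.
class Record
  def self.field(name, offset, length, &convert)
    define_method(name) do
      raw = @data.byteslice(offset, length).unpack("A*").first
      convert ? convert.call(raw) : raw
    end
  end

  field(:song, 3, 30)
  field(:year, 93, 4) { |text| text.to_i }

  def initialize(data)
    @data = data
  end
end
```

With that in place, a record built from a 128-byte buffer would return the year
as an Integer while song stays a String.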

Matthew Moss

Aug 30, 2007, 11:35:51 AM

And, of course, the whole field/record thingy should be separated out
into its own class/module/whatever. I did see bit-struct out there,
and considered a solution using that, but it felt weird to be doing
things at a bit-level, so I just kept on with my own.

Ruby Quiz

Aug 30, 2007, 1:58:44 PM

This quiz was another idea I got out of the Erlang book. The author uses a
similar example to show how smooth processing binary data in Erlang can be. I'm
happy to say that I found the submitted Ruby solutions to be equally smooth, if
not more so.

The secret to binary parsing in Ruby is generally the String.unpack() method and
the majority of the solutions capitalized on this technique. Technically, ID3
tags are mainly in plain text, with some null characters thrown in. Still, I
think it's a good idea to get into the unpack() mindset anytime you start
slicing up binary data.

I want to take a look at Eugene Kalenkovich's code below. It's a pretty typical
usage of unpack() to parse some data. It also includes a nicety when reading
the file that I'm ashamed to admit I didn't think of. Let's start with that:

def fileTail (file, offset)
  f=File.new(file)
  f.seek(-offset,IO::SEEK_END)
  f.read
end

# ...

In my own code, I read the whole file into memory and indexed out the last 128
bytes. That's almost always the wrong approach and Eugene shows the correct
strategy above. This code just opens the file, seek()s to offset bytes before
the end, and read()s the needed data. That scales much better when the data
sizes are significant.
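
A variation on the same idea, sketched with the block form of File.open so the
file handle is closed automatically. The `file_tail` name follows the naming
suggestion in this summary, and the size check is an added assumption that is
not in Eugene's code:

```ruby
# Read only the last `length` bytes of a file. Returns nil when the file is
# too short to hold that many bytes.
def file_tail(path, length = 128)
  File.open(path, "rb") do |f|
    return nil if f.size < length
    f.seek(-length, IO::SEEK_END)  # position `length` bytes before the end
    f.read(length)
  end
end
```

Opening in binary mode ("rb") avoids any newline or encoding translation on
platforms where that matters.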

As a quick aside, file_tail() would probably be a more Rubyish method name.

The code now builds a data structure class to hold the tag details. It starts
like this:

# ...

class ID3Tag
  GENRES=["Blues","Classic Rock","Country",…,"Dance Hall"]
  attr_reader :title, :artist, :album, :year, :comment, :genre, :track

# ...

You can see that this class is mainly just a data structure that defines readers
for all of the elements in a tag. I've trimmed the GENRES listing here, but the
code included the full set.

I will say that some found more clever means to load the GENRES Array. Several
people did fancy heredoc manipulations, but the most clever pulled the list out
of the quiz document using open-uri and hpricot. That was especially wise this
time since I made so many mistakes in the quiz description.

We're now ready for the actual parsing code:

# ...

  def initialize fname
    tag,@title,@artist,@album,@year,@comment,@genre=
      fileTail(fname,128).unpack "A3A30A30A30A4A30C"
    raise "No ID3 Info" if tag!='TAG'
    s_com,flag,track=@comment.unpack "A28CC"
    if flag==0 and track!=0
      @comment=s_com
      @track=track
    end
    @genre=GENRES[@genre]
    @genre="Unknown" if !@genre
  end
end

# ...

As you can see, the majority of the work is done on the first line with a single
call to unpack(). The template fed to unpack() is the key to the whole puzzle.
An "A" in the unpack() template instructs it to extract a String, removing any
trailing spaces or null characters. By default the String is just one character
long, but you can provide a number after the "A" to increase that count. The
only other character used in the template is a "C" which is used to extract one
character as an Integer. The unpack() call returns an Array which Eugene just
mass-assigns to the relevant variables.
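
To see the template at work without an MP3 file on hand, here is a small
round-trip sketch. The sample values are made up, and the `TEMPLATE`, `SAMPLE`,
and `FIELDS` names are just for this illustration:

```ruby
TEMPLATE = "A3A30A30A30A4A30C"

# Build a synthetic 128-byte tag. With pack, "A" pads each String with
# spaces (real tags are usually NUL padded); unpack trims either kind.
SAMPLE = ["TAG", "Moonloop", "Porcupine Tree", "The Sky Moves Sideways",
          "1995", "test comment", 92].pack(TEMPLATE)

# The same template splits the bytes back into six Strings and one Integer.
FIELDS = SAMPLE.unpack(TEMPLATE)

p SAMPLE.bytesize  # 128
p FIELDS.last      # 92
```

Genre 92 is "Progressive Rock" when counted from zero in the quiz's genre list.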

The rest is simple. The code checks the first chunk for the identifying "TAG"
String and throws an error if it's not there. Then another call to unpack(),
with a template much like the first, pulls the track field out of the comment.
The if statement makes sure that assignment only happens when it is present.
The final two lines are just a longhand form of:

@genre = GENRES[@genre] || "Unknown"

With all of the fields stored away in the proper variables, reader calls can be
used to extract as needed. Eugene's actual application code just punted on that
point though:

# ...

p ID3Tag.new(ARGV[0])

My thanks to all who have helped me with my Erlang comparisons these last two
weeks. I promise, we're on to new topics now.

In fact, tomorrow we will tackle an interesting subproblem from this year's ICFP
contest...