The Music Ontology: a new ontology based on the MusicBrainz project

Frederick Giasson

Dec 21, 2006, 10:23:56 AM
to music-ontology-sp...@googlegroups.com

The Internet changed the music industry. At first, sharing systems like
Napster allowed people to share any song they had on their computer
with millions of other people. That new reality changed the music
industry's landscape for good, and many legal battles followed.
However, a bigger change followed a couple of years later.
Communities like MySpace started to appear. With millions of
regular users, such communities helped garage bands and obscure
musicians to create their musical niche: the long tail of the music
industry.

This second change is more profound than the first one: now any
musician has the possibility to reach their audience by sharing their
work on the Web. In the meantime, a free database called MusicBrainz
[1], archiving millions of artists, albums and tracks, appeared;
music recommendation services like Pandora started to appear, and Apple
started to sell individual tracks at $1 with iTunes.

At that point, the music industry of the eighties, led by
blockbusters, was completely changed.

Introduction

I am pleased to announce the publication of a new Music Ontology
Specification [2]. I spent the last few days writing it, with the aim
of describing the new MusicBrainz metadatabase structure using RDF
and, ultimately, of writing a specification that any music content
creator or publisher could use to export the data they are generating.

Please let me know if you find any errors in this new ontology, or if
you have any suggestions to enhance it or any other comments.

You can leave comments and suggestions on this blog post or on the
related Google Group [3].


The Music Ontology

The Music Ontology is an attempt to link all the information about
musical Artists, Albums and Tracks together: from MusicBrainz to
MySpace. The goal is to express all relations between pieces of
musical information to help people find anything about music and
musicians. It is based on the use of machine-readable information
provided by any web site or web service on the Web.


Why another music ontology?

Leigh Dodds [4] wrote an ontology based on MusicBrainz about three
years ago, called the MusicBrainz Metadata Vocabulary [5]. At that
time, the MusicBrainz database was not as developed as the one
available today.

For that reason, I chose to write a new ontology, also based on the
MusicBrainz project as its source of information about music. I
developed that new ontology with three goals in mind:

1. I needed to stay as close as possible to the MusicBrainz database.

2. I needed to reuse the basic principles of the MusicBrainz Metadata
Vocabulary.

3. I needed, at the same time, to develop a music ontology that people
could use in their own systems (MySpace, Pandora, blogs, etc.) and not
just in conjunction with the MusicBrainz relational database.


The first goal explains why this new ontology is so influenced by the
MusicBrainz database. In fact, most of the classes and properties come
from the relations described in the database, and most of the
descriptions of these relations come from the wiki of the project.

The second goal explains why the basic classes of the Music Ontology
are the same as the ones in the MusicBrainz Metadata Vocabulary.

The third goal explains why the name and the namespace of the
MusicBrainz Metadata Vocabulary have been changed.


What next?

From that point, I will export an RDF version of the MusicBrainz
database using this new music ontology. Then I'll index this new RDF
data into the triple store, based on Ping the Semantic Web, that I
talked about a couple of weeks ago (a first version should be released
soon, by the way).

Once that is done, people will be able to query the MusicBrainz
data using the SPARQL endpoint. As I showed in the specification
with a couple of SPARQL queries [6], people will have many more ways
to query the database to answer their questions about music.
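
As a rough illustration of what querying through a SPARQL endpoint
could look like, here is a minimal Python sketch that builds a query
and the matching GET request URL. The endpoint address, the mo:Artist
class, the use of foaf:name and the "/" separator after the namespace
are all assumptions made for the example; the actual terms are the
ones defined in the specification [2] and may differ.

```python
from urllib.parse import urlencode

# Namespace announced in this post; the trailing "/" is an assumption.
MO_NS = "http://purl.org/ontology/mo/"
FOAF_NS = "http://xmlns.com/foaf/0.1/"

# Hypothetical endpoint, for illustration only.
ENDPOINT = "http://example.org/sparql"


def build_artist_query(name, limit=10):
    """Return a SPARQL query listing resources typed mo:Artist with a given name.

    mo:Artist and foaf:name are assumed terms, not taken from the spec.
    """
    # Escape backslashes and double quotes so the name is a valid literal.
    escaped = name.replace("\\", "\\\\").replace('"', '\\"')
    return (
        f"PREFIX mo: <{MO_NS}>\n"
        f"PREFIX foaf: <{FOAF_NS}>\n"
        "SELECT ?artist WHERE {\n"
        "  ?artist a mo:Artist ;\n"
        f'          foaf:name "{escaped}" .\n'
        "}\n"
        f"LIMIT {int(limit)}"
    )


def build_request_url(query, endpoint=ENDPOINT):
    """Encode the query in the 'query' parameter of a GET request URL."""
    return endpoint + "?" + urlencode({"query": query})


if __name__ == "__main__":
    print(build_request_url(build_artist_query("David Bowie")))
```

The sketch only builds the request; sending it and parsing the result
would depend on the endpoint that ends up being deployed.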


For more information please read the entire Music Ontology
Specification [2].

The ontology has 19 classes and 58 properties.

The namespace of the ontology is http://purl.org/ontology/mo and the
prefix I suggest using is "mo".
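
To make the namespace and prefix concrete, here is a minimal sketch in
Python that writes a small description as Turtle using the "mo"
prefix. The class names (mo:Artist, mo:Track), the "/" separator after
the namespace, the choice of foaf:made and dc:title, and the
example.org URIs are assumptions made for illustration; the actual
terms are those defined in the specification [2].

```python
# Namespace and prefix from the announcement; the "/" separator is an assumption.
PREFIXES = {
    "mo": "http://purl.org/ontology/mo/",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def literal(text):
    """Quote a plain string as a Turtle literal."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def to_turtle(triples):
    """Serialize (subject, predicate, object) strings as a Turtle document.

    Terms are passed already formatted: <uri>, prefix:name, or a quoted literal.
    """
    lines = [f"@prefix {p}: <{ns}> ." for p, ns in PREFIXES.items()]
    lines.append("")
    lines.extend(f"{s} {p} {o} ." for s, p, o in triples)
    return "\n".join(lines) + "\n"


# Hypothetical resources; mo:Artist and mo:Track are assumed class names.
artist = "<http://example.org/artist/1>"
track = "<http://example.org/track/1>"

doc = to_turtle([
    (artist, "a", "mo:Artist"),
    (artist, "foaf:name", literal("Example Band")),
    (artist, "foaf:made", track),
    (track, "a", "mo:Track"),
    (track, "dc:title", literal("Example Song")),
])

if __name__ == "__main__":
    print(doc)
```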

[1] http://musicbrainz.org/
[2] http://pingthesemanticweb.com/ontology/mo/
[3] http://groups.google.com/group/music-ontology-specification-group
[4] http://www.ldodds.com/
[5] http://www.ldodds.com/projects/musicbrainz/schema/mb.html
[6] http://pingthesemanticweb.com/ontology/mo/#sec-sparqlexample

Salutations,


Frederick Giasson

danbri

Dec 21, 2006, 5:34:48 PM
to Music Ontology Specification Group
Hi

this looks just great, though I have only skimmed so far.

I have lately been trying to get a copy of the MB database running on
my dev laptop, to see if I could use one of the SQL to RDF mappers
(probably SquirrelRDF, maybe D2RQ or Virtuoso) to express an RDF view
of the data. Nothing running yet, ... but my lesson learnt was that it
is easier to install the entire Perl mb_server package, rather than try
to import just the raw SQL. There are Perl scripts to handle the
import. Would this approach be interesting to you? It would make a nice
scalability testbed...

It seems you are handling the cool new Advanced Relationships stuff,
which is great. Have you any plan for change control, since MB is
evolving in place? Just keep updating the schema to match the MB spec?

Would you be interested in hosting the namespace under xmlns.com
alongside FOAF? I am investigating longevity of that domain since FOAF
is so widely used, and since my dayjob at the moment relates to long
term data preservation.

Any thoughts on how it relates to Foafing the music,
http://foafing-the-music.iua.upf.edu/index.html ... or on the
RDFization of Wikipedia underway via Semantic MediaWiki (MB has some
Wikipedia links as special relations, and FOAF has
foaf:isPrimaryTopicOf ... can all those be fitted together somehow to
join the datasets?).
Also the BBC Creative Archive database (currently offline still I
think) used bits of FOAF and had mention of a lot of artists, cos it
covered radio etc. http://www.hackdiary.com/archives/000071.html

KUTGW!

Dan

Frederick Giasson

Dec 21, 2006, 10:36:54 PM
to music-ontology-sp...@googlegroups.com
Hi Dan,

> this looks just great, though I have only skimmed so far.

Thank you for having taken the time to look at it.

Thanks for the comment, but keep in mind that this is a first version
and I think many things have to be changed.

In fact I already changed some little things that DanC pointed out
when he took a look at it, and I already have another idea (from DanC
too) to handle the linkto_ relations better.


> I have lately been trying to get a copy of the MB database running on
> my dev laptop, to see if I could use one of the SQL to RDF mappers
> (probably SquirrelRDF, maybe D2RQ or Virtuoso) to express an RDF view
> of the data. Nothing running yet, ... but my lesson learnt was that it
> is easier to install the entire Perl mb_server package, rather than try
> to import just the raw SQL. There are Perl scripts to handle the
> import. Would this approach be interesting to you? It would make a nice
> scalability testbed...

Yeah, thank you for the advice; I will naturally take it into
consideration.

But I was also wondering about the updates of the database. I mean,
if 10k documents are changed/added each month, is there a way with
that DB to know which ones have been updated?

If the answer is no, then we will have to re-index the RDF mapping of
the MBZ database (how many gigs of SQL? Probably too many ;) ).
Considering that it should take some days, it would be nice to do
that another way ;)

But I'll know more about that issue later.


> It seems you are handling the cool new Advanced Relationships stuff,
> which is great. Have you any plan for change control, since MB is
> evolving in place? Just keep updating the schema to match the MB spec?


Yeah, I described most of them. But for some of them I am using FOAF,
REL, etc.


Also, I haven't had the time to write it up yet, but I have also
developed an ontology called SIM (Similarity) to describe the level
of similarity between two objects according to a relation.

In fact I re-used an idea from a guy who published a blog post about
a conference he attended on fuzzy logic. I don't remember his name,
but I can give it to you when I am back home tomorrow.


> Would you be interested in hosting the namespace under xmlns.com
> alongside FOAF? I am investigating longevity of that domain since FOAF
> is so widely used, and since my dayjob at the moment relates to long
> term data preservation.

Yeah sure, I will certainly consider that option. Let me think about
it tomorrow, but it should not be a problem.


> Any thoughts on how it relates to Foafing the music,
> http://foafing-the-music.iua.upf.edu/index.html ... or on the
> RDFization of Wikipedia underway via Semantic MediaWiki (MB has some
> Wikipedia links as special relations, and FOAF has
> foaf:isPrimaryTopicOf ... can all those be fitted together somehow to
> join the datasets?).

No idea about Foafing the music, but it looks great. Give me some time
tomorrow to take a deeper look at it and I will send my thoughts
about it.

Yeah, of course I thought about Wikipedia :)

In fact, I am building a triple store (Virtuoso) with PTSW data:
mainly SIOC and FOAF. Also: MusicBrainz, Wikipedia, GeoNames and the
DBLP database.

From there, you will have many query options :)

For Wikipedia, I am thinking of creating my own mapping of the
database dump using WikiOnt.

To relate it to the Music Ontology, DanC talked about using
foaf:primaryTopic instead of linkto_wikipedia.

But it gives me another idea to generalize the linkto_ properties (in
fact, getting rid of them and generalizing the concept).

However, I'll post my first idea to this mailing list to get
feedback before changing anything.


In any case, the goal here is clearly to make the Music Ontology and
an RDF version of Wikipedia fit together and be queryable via a
beautiful and simple SPARQL query.

It is already possible with the current version of MO and Wikipedia3
but I am not 100% satisfied with it.


> Also the BBC Creative Archive database (currently offline still I
> think) used bits of FOAF and had mention of a lot of artists, cos it
> covered radio etc. http://www.hackdiary.com/archives/000071.html


Yeah, I think the project is in revision after the beta phase, at
least I think I read that somewhere.


Okay, enough for tonight. Sorry if this email is not that clear, but I
will answer some parts of it again tomorrow with a fresh mind after a
good night ;)


Take care and thanks for these comments/suggestions/feedback/etc.

Salutations,


Fred

Dan Brickley

Dec 22, 2006, 12:18:36 AM
to music-ontology-sp...@googlegroups.com
real quick 1 topic reply for now:

> > I have lately been trying to get a copy of the MB database running on
> > my dev laptop, to see if I could use one of the SQL to RDF mappers
> > (probably SquirrelRDF, maybe D2RQ or Virtuoso) to express an RDF view
> > of the data. Nothing running yet, ... but my lesson learnt was that it
> > is easier to install the entire Perl mb_server package, rather than try
> > to import just the raw SQL. There are Perl scripts to handle the
> > import. Would this approach be interesting to you? It would make a nice
> > scalability testbed...
>
> Yeah, thank you for the advice; I will naturally take it into
> consideration.
>
> But I was also wondering about the updates of the database. I mean,
> if 10k documents are changed/added each month, is there a way with
> that DB to know which ones have been updated?
>
> If the answer is no, then we will have to re-index the RDF mapping of
> the MBZ database (how many gigs of SQL? Probably too many ;) ).
> Considering that it should take some days, it would be nice to do
> that another way ;)
>
> But I'll know more about that issue later.

http://bugs.musicbrainz.org/browser/mb_server/trunk/INSTALL
is the main doc to look up.

Look at:

- RT_REPLICATED
  -- This is a replicated database install. Once a data snapshot
     has been imported, the database can be kept up to date with
     running an hourly data import script.

via http://musicbrainz.org/doc/ServerDownload
http://bugs.musicbrainz.org/browser/mb_server/

ie.
svn checkout http://svn.musicbrainz.org/mb_server/trunk mb_server

If the structural aspects of the generated RDF will evolve based eg on
new relationship types in SQL field values, ... there might be a need
to write some scripts that update the (currently hypothetical) SQL to
RDF mapping configs.

I expect I'll look at SquirrelRDF first, but am enthused to try
Virtuoso as soon as I can get a MacOSX version, or get home to Bristol
to get access to my Linux box.

cheers,

Dan

Frederick Giasson

Dec 22, 2006, 10:20:51 AM
to music-ontology-sp...@googlegroups.com
Hi Dan,

> http://bugs.musicbrainz.org/browser/mb_server/trunk/INSTALL
> is the main doc to look up.
>
> Look at:
>
> - RT_REPLICATED
>   -- This is a replicated database install. Once a data snapshot
>      has been imported, the database can be kept up to date with
>      running an hourly data import script.
>
> via http://musicbrainz.org/doc/ServerDownload
> http://bugs.musicbrainz.org/browser/mb_server/
>
> ie.
> svn checkout http://svn.musicbrainz.org/mb_server/trunk mb_server

Great thank you.

> If the structural aspects of the generated RDF will evolve based eg on
> new relationship types in SQL field values, ... there might be a need
> to write some scripts that update the (currently hypothetical) SQL to
> RDF mapping configs.

Yeah, for sure: as soon as something is reported as new or modified, we
will need a script to translate these modifications into RDF according
to some criteria. But I don't think it is a big problem at the moment.
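
A rough sketch of what such a script could look like, in Python. It
assumes the replicated changes can be reduced to a list of
(operation, table, row) records; that record shape, the table and
column names, the URI pattern, the mo:Artist class and the "/"
namespace separator are all hypothetical, and nothing here reflects the
actual mb_server replication format.

```python
MO = "http://purl.org/ontology/mo/"  # "/" separator is an assumption
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical mapping from SQL table names to ontology classes.
TABLE_TO_CLASS = {"artist": MO + "Artist"}


def subject_uri(table, row_id):
    """Build a subject URI for a row (hypothetical URI pattern)."""
    return f"http://example.org/{table}/{row_id}"


def row_to_triples(table, row):
    """Translate one row (a dict with 'id' and 'name') into triples.

    For brevity the name is kept as a plain string in object position;
    a real store would distinguish literals from URIs.
    """
    subject = subject_uri(table, row["id"])
    return [
        (subject, RDF_TYPE, TABLE_TO_CLASS[table]),
        (subject, FOAF_NAME, row["name"]),
    ]


def apply_changes(store, changes):
    """Apply change records to an in-memory set of triples.

    Each change is (operation, table, row) with operation in
    {"insert", "update", "delete"}. Only the subjects that changed are
    touched, so the whole dataset is not re-indexed.
    """
    for operation, table, row in changes:
        if table not in TABLE_TO_CLASS:
            continue  # no mapping defined for this table yet
        subject = subject_uri(table, row["id"])
        # Drop whatever was previously known about this subject.
        store.difference_update({t for t in store if t[0] == subject})
        if operation != "delete":
            store.update(row_to_triples(table, row))
    return store
```

New relationship types would then mean adding entries to the mapping
table rather than rewriting the script.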


> I expect I'll look at SquirrelRDF first, but am enthused to try
> Virtuoso as soon as I can get a MacOSX version, or get home to Bristol
> to get access to my Linux box.


Good! I think you will not be disappointed by Virtuoso if you have the chance to
try it! What is great with Virtuoso is that it is not only a triple store
(in fact that feature is somewhat new in the Virtuoso landscape). You can create
a PL procedure that interacts with both SQL and SPARQL and turn that procedure into
a web service in a couple of clicks. This is only one example among many others
:) But I think that Kingsley already sold you the product, didn't he? :)

Take care,


Fred

Frederick Giasson

Dec 22, 2006, 11:56:19 AM
to music-ontology-sp...@googlegroups.com
Hi,


I just tagged what is currently under revision in the spec documentation of
the ontology.


An email about these revisions should be sent later today for peer review.


Salutations,


Fred

Brendan Quinn

Dec 26, 2006, 7:06:20 PM
to Music Ontology Specification Group
On Dec 22, 3:36 am, Frederick Giasson <f...@fgiasson.com> wrote:
> > Also the BBC Creative Archive database (currently offline still I
> > think) used bits of FOAF and had mention of a lot of artists, cos it
> > covered radio etc. http://www.hackdiary.com/archives/000071.html
>
> Yeah, I think the project is in revision after the beta phase, at
> least I think I read that somewhere.

Hi Fred, Dan and all,

FWIW, the BBC Open Archive programme catalogue database (not the
creative archive, that's a different archive ;-) should be back up now:

http://open.bbc.co.uk/catalogue/xml/contributor/4557
is an example FOAF page, for David Bowie.

in the programme catalogue, mattb also used foaf:isPrimaryTopicOf to
point to wikipedia. Not that that's necessarily the best way, or to say
that the XML format isn't going to change in the future... the site is
still an "experimental prototype", it's only back up because the rights
issues have been sorted out now! no code has changed.

brendan.

Frederick Giasson

Dec 29, 2006, 3:46:21 PM
to music-ontology-sp...@googlegroups.com
Hi Mr. Quinn,

> FWIW, the BBC Open Archive programme catalogue database (not the
> creative archive, that's a different archive ;-) should be back up now:
>
> http://open.bbc.co.uk/catalogue/xml/contributor/4557
> is an example FOAF page, for David Bowie.


Wow, this is great news.


> in the programme catalogue, mattb also used foaf:isPrimaryTopicOf to
> point to wikipedia. Not that that's necessarily the best way, or to say
> that the XML format isn't going to change in the future... the site is
> still an "experimental prototype", it's only back up because the rights
> issues have been sorted out now! no code has changed.


Okay, I'll take a deeper look at how the information is linked in the archive
at the beginning of 2007.


BTW, are you involved in that project (or do you know someone who is)?


Take care and happy new year.


Fred
