MusicBrainz URI syntax


Kurt J

Aug 19, 2010, 11:10:21 AM
to music-ontology-sp...@googlegroups.com
Hello,

In the MusicBrainz NGS server, the HTML page about a resource is
served from a URI of the form

http://musicbrainz.org/<type>/<mbid>

Note this is in contrast to the current server, which uses a ".html"
suffix. This means that to follow linked data practice we need to add
a hash fragment to the end of the HTML URI to make URIs for artists,
works, recordings, tracks, releases, release groups, labels, ARs, etc.

BBC music has a similar setup and uses a "#artist" suffix. If we
adopt this same URI syntax MusicBrainz URIs would become for example:

http://musicbrainz.org/artist/20ff3303-4fe2-4a47-a1b6-291e26aa3438#artist

Note that the word "artist" appears twice in this URI. Also for a
release group, it becomes even longer

http://musicbrainz.org/release-group/8902cb4d-444c-3bdf-ac22-b710ec0b65ae#release-group

I propose we borrow an idea from Perl syntax and replace "#artist" or
"#release-group" simply with "#_":

http://musicbrainz.org/artist/20ff3303-4fe2-4a47-a1b6-291e26aa3438#_

http://musicbrainz.org/release-group/8902cb4d-444c-3bdf-ac22-b710ec0b65ae#_
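
As a minimal sketch of the pattern above: the thing URI is just the page URI plus a fixed fragment. The function name `thing_uri` and its `fragment` parameter are made up for illustration and are not part of any MusicBrainz code.

```python
BASE = "http://musicbrainz.org"

def thing_uri(entity_type: str, mbid: str, fragment: str = "_") -> str:
    """Return the hash URI for the thing described by the page at /<type>/<mbid>."""
    return f"{BASE}/{entity_type}/{mbid}#{fragment}"

# Proposed generic suffix.
print(thing_uri("artist", "20ff3303-4fe2-4a47-a1b6-291e26aa3438"))
# BBC-style suffix that repeats the type, for comparison.
print(thing_uri("release-group", "8902cb4d-444c-3bdf-ac22-b710ec0b65ae",
                fragment="release-group"))
```

With a generic fragment the same function works unchanged for every entity type.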

Do you love it? Hate it? Alternative ideas? Do you prefer
"#release-group"? Weigh in now or forever hold your peace.

-Kurt J

Bob Ferris

Aug 19, 2010, 1:06:34 PM
to music-ontology-sp...@googlegroups.com
Hi Kurt,

I don't really understand why you need the hash at the end. If
e.g. http://musicbrainz.org/artist/20ff3303-4fe2-4a47-a1b6-291e26aa3438
should describe a specific music artist with the help of Semantic Graphs,
then I have the following triple in mind to start with:

http://musicbrainz.org/artist/20ff3303-4fe2-4a47-a1b6-291e26aa3438 a
mo:MusicArtist ;

....

and then I can continue to model this graph, which represents a
description of a specific musician. I don't see any need to use the
category twice here. Also, the first mention of the category is only
there for human readability; machines don't really need that
information in the URI. We could also simply use

http://musicbrainz.org/<mbid>

However, it's okay to include the category somewhere in the URI.
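
For illustration only, the starting triple above can be written out as one N-Triples line. This sketch assumes that the `mo:` prefix expands to the Music Ontology namespace `http://purl.org/ontology/mo/`; the helper name `type_triple` is hypothetical.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
# Assumption: mo: stands for the Music Ontology namespace.
MO_MUSIC_ARTIST = "http://purl.org/ontology/mo/MusicArtist"

def type_triple(subject_uri: str, class_uri: str) -> str:
    """Serialise '<subject> a <class>' as a single N-Triples line."""
    return f"<{subject_uri}> <{RDF_TYPE}> <{class_uri}> ."

print(type_triple(
    "http://musicbrainz.org/artist/20ff3303-4fe2-4a47-a1b6-291e26aa3438",
    MO_MUSIC_ARTIST,
))
```

The entity type is carried by the `rdf:type` statement, so a consumer never has to read it out of the URI.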

Cheers,


Bob

PS: For example,
http://musicbrainz.org/artist/20ff3303-4fe2-4a47-a1b6-291e26aa3438#tracks would
make some sense in my mind: it could point directly to all tracks of
that artist.

Ian Davis

Aug 19, 2010, 1:09:33 PM
to music-ontology-sp...@googlegroups.com
I would prefer a generic fragment e.g. #thing, #it, #self, #resource
or whatever.

Are the mbid's scoped to the entity or are they globally unique? If
the latter then i would prefer a URI like:

http://id.musicbrainz.org/mbid which 303 redirects to the right html/rdf pages

That way the URIs would be stable even if the entity names change
(e.g. release is renamed to record in some future iteration of the
database schema)

Kurt J

Aug 19, 2010, 1:38:06 PM
to music-ontology-sp...@googlegroups.com
Hi Ian,

> I would prefer a generic fragment e.g. #thing, #it, #self, #resource
> or whatever.
>
> Are the mbid's scoped to the entity or are they globally unique? If
> the latter then i would prefer a URI like:
>
> http://id.musicbrainz.org/mbid which 303 redirects to the right html/rdf pages

actually that's a pretty nice idea. the mbids are globally unique -
they are UUIDs - universally unique ids - so there is some
astronomically low chance of a collision. but that's not really
affected by including or excluding the type.
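
Since MBIDs are UUIDs, any standard UUID parser can check them. A minimal sketch using Python's standard `uuid` module (the helper name `parse_mbid` is made up for illustration) also shows that nothing in the identifier itself encodes the entity type:

```python
import uuid

def parse_mbid(text: str) -> uuid.UUID:
    """Return the MBID as a UUID object; raises ValueError if it is malformed."""
    return uuid.UUID(text)

artist = parse_mbid("20ff3303-4fe2-4a47-a1b6-291e26aa3438")
release_group = parse_mbid("8902cb4d-444c-3bdf-ac22-b710ec0b65ae")

# The UUID carries a version number, but no hint of "artist" or "release-group".
print(artist.version, release_group.version)
```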

> That way the URIs would be stable even if the entity names change
> (e.g. release is renamed to record in some future iteration of the
> database schema)

right. this does seem cleaner.

> I don't really understand, why do you really need the hash the end? If e.g.
> http://musicbrainz.org/artist/20ff3303-4fe2-4a47-a1b6-291e26aa3438 should
> describe a specific music artist with help of Semantic Graphs, then I have
> he following triple to start in my mind

Bob, the problem is, in the MB architecture that's already built, the
HTML is served from that URI. I could try to talk the MB dev team
into going back to ".html" or adding "#about" and content negotiation,
but i don't think that'd be a popular idea. It's the document vs. thing
issue...

We don't have to commit right away, so let's let this discussion stew
for a bit :-)

-Kurt J

Kurt J

Aug 19, 2010, 2:02:39 PM
to music-ontology-sp...@googlegroups.com
On Thu, Aug 19, 2010 at 12:38 PM, Kurt J <kur...@gmail.com> wrote:
> Hi Ian,
>
>> I would prefer a generic fragment e.g. #thing, #it, #self, #resource
>> or whatever.
>>
>> Are the mbid's scoped to the entity or are they globally unique? If
>> the latter then i would prefer a URI like:
>>
>> http://id.musicbrainz.org/mbid which 303 redirects to the right html/rdf pages
>
> actually that's a pretty nice idea.  the mbids are globally unique -
> they are UUIDs - univerally unique ids - so there is some
> astronomically low chance of a collision.  but that's not really
> affected by including or excluding the type.

oh whoops, one problem. actually since there's a one-to-one
correspondence between track and recording and a recording is sort of
the "new" track, they share the same MBID but arguably have a subtle
difference...

Kurt J

Aug 19, 2010, 2:07:13 PM
to music-ontology-sp...@googlegroups.com
On Thu, Aug 19, 2010 at 1:02 PM, Kurt J <kur...@gmail.com> wrote:
> On Thu, Aug 19, 2010 at 12:38 PM, Kurt J <kur...@gmail.com> wrote:
>> Hi Ian,
>>
>>> I would prefer a generic fragment e.g. #thing, #it, #self, #resource
>>> or whatever.
>>>
>>> Are the mbid's scoped to the entity or are they globally unique? If
>>> the latter then i would prefer a URI like:
>>>
>>> http://id.musicbrainz.org/mbid which 303 redirects to the right html/rdf pages
>>
>> actually that's a pretty nice idea.  the mbids are globally unique -
>> they are UUIDs - univerally unique ids - so there is some
>> astronomically low chance of a collision.  but that's not really
>> affected by including or excluding the type.
>
> oh whoops, one problem.  actually since there's a one-to-one
> correspondence between track and recording and a recording is sort of
> the "new" track, they share the same MBID but arguably have a subtle
> difference...
>

and another problem, if we implement RDFa first and a full RDF from
RDB wrapper later, using these sub-domain URIs in the RDFa before they
actually de-reference is perhaps bad form. A hash URI doesn't suffer
from this because the RDFa is sorta the dereferencing. Or is my
thinking on this issue mixed up?
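
To make the point about hash URIs concrete: an HTTP client drops the fragment before sending the request, so a "#_" URI dereferences to the HTML page that carries the RDFa. A small sketch with Python's standard library:

```python
from urllib.parse import urldefrag

thing = "http://musicbrainz.org/artist/20ff3303-4fe2-4a47-a1b6-291e26aa3438#_"

# urldefrag splits off the fragment; only `document` is ever requested over HTTP.
document, fragment = urldefrag(thing)
print(document)
print(fragment)
```

The thing URI therefore works as soon as the page exists, with no extra server-side routing.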

Ian Davis

Aug 19, 2010, 2:24:43 PM
to music-ontology-sp...@googlegroups.com
On Thu, Aug 19, 2010 at 7:07 PM, Kurt J <kur...@gmail.com> wrote:
> and another problem, if we implement RDFa first and a full RDF from
> RDB wrapper later, using these sub-domain URIs in the RDFa before they
> actually de-reference is perhaps bad form.  A hash URI doesn't suffer
> from this because the RDFa is sorta the dereferencing.  Or is my
> thinking on this issue mixed up?
>

I'm not sure that's the best reason to base an architectural decision
on. Having stable identifiers for the long term would be very
valuable.

I don't think the redirect would be that hard to implement over the
current database. Grab the mbid and look up in the various database
tables to find its type and map it to a web page using the current
logic.
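
A rough sketch of that lookup, under stated assumptions: the `TABLES` dictionary is a hypothetical stand-in for the per-type database tables, and `resolve` is an invented name, not an existing MusicBrainz function.

```python
from typing import Optional, Tuple

# Hypothetical stand-ins for the per-type database tables, keyed by MBID.
TABLES = {
    "artist": {"20ff3303-4fe2-4a47-a1b6-291e26aa3438"},
    "release-group": {"8902cb4d-444c-3bdf-ac22-b710ec0b65ae"},
}

def find_type(mbid: str) -> Optional[str]:
    """Look the MBID up in each table and return the first type that has it."""
    for entity_type, mbids in TABLES.items():
        if mbid in mbids:
            return entity_type
    return None

def resolve(mbid: str) -> Tuple[int, Optional[str]]:
    """Return (status, Location) for a request to id.musicbrainz.org/<mbid>."""
    entity_type = find_type(mbid)
    if entity_type is None:
        return 404, None
    return 303, f"http://musicbrainz.org/{entity_type}/{mbid}"

print(resolve("20ff3303-4fe2-4a47-a1b6-291e26aa3438"))
```

The type-specific URL appears only in the redirect target, so it can change without touching the identifier.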

Ian

Ian Davis

Aug 19, 2010, 2:29:43 PM
to music-ontology-sp...@googlegroups.com
On Thu, Aug 19, 2010 at 7:02 PM, Kurt J <kur...@gmail.com> wrote:
>
> oh whoops, one problem.  actually since there's a one-to-one
> correspondence between track and recording and a recording is sort of
> the "new" track, they share the same MBID but arguably have a subtle
> difference...
>

Yes, that is a problem. So, given a mbid, it's not possible to
determine whether that is a track or recording?

This would be a problem for services like spotify that hold the mbid.
They also appear to hold the URL of the mb page which would be broken
by the change in url structure that prompted this thread? See
http://developer.spotify.com/en/metadata-api/search/artist/

That said, I have crawled the spotify API and I found zero mbids in practice :)

Ian

Kurt J

Aug 19, 2010, 3:25:33 PM
to music-ontology-sp...@googlegroups.com
On Thu, Aug 19, 2010 at 1:29 PM, Ian Davis <m...@iandavis.com> wrote:
> On Thu, Aug 19, 2010 at 7:02 PM, Kurt J <kur...@gmail.com> wrote:
>>
>> oh whoops, one problem.  actually since there's a one-to-one
>> correspondence between track and recording and a recording is sort of
>> the "new" track, they share the same MBID but arguably have a subtle
>> difference...
>>
>
> Yes, that is a problem. So, given a mbid, it's not possible to
> determine whether that is a track or recording?

yes, impossible. actually one MBID represents both a track and a
recording. in the NGS the track is just a sort-of abstraction for
placing a recording on a release.

> This would be a problem for services like spotify that hold the mbid.
> They also appear to hold the URL of the mb page which would be broken
> by the change in url structure that prompted this thread? See
> http://developer.spotify.com/en/metadata-api/search/artist/

currently the test server at test.musicbrainz.org does not handle the
'.html' suffix but i'm told on IRC freenode #musicbrainz-devel that a
redirect will be implemented to prevent link rot. also
http://musicbrainz.org/track/<mbid> will apparently redirect to
http://musicbrainz.org/recording/<mbid>
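
Roughly, the two redirects described above amount to the following rewrite. This is only a sketch of the mapping as i understand it, not the actual server code; the function name `redirect_target` and the all-zero MBID are placeholders.

```python
from urllib.parse import urlsplit, urlunsplit

def redirect_target(url: str) -> str:
    """Map a legacy URL to its NGS-style equivalent."""
    parts = urlsplit(url)
    path = parts.path
    # Drop the legacy ".html" suffix.
    if path.endswith(".html"):
        path = path[: -len(".html")]
    # Tracks are served from the recording page.
    if path.startswith("/track/"):
        path = "/recording/" + path[len("/track/"):]
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))

print(redirect_target(
    "http://musicbrainz.org/artist/20ff3303-4fe2-4a47-a1b6-291e26aa3438.html"))
print(redirect_target(
    "http://musicbrainz.org/track/00000000-0000-0000-0000-000000000000.html"))
```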

> That said, I have crawled the spotify API and I found zero mbids in practice :)

really? zero mbids in spotify!?

Yves Raimond

Aug 19, 2010, 4:16:19 PM
to music-ontology-sp...@googlegroups.com

Hello!

I am definitely in favor of # uris. No 303 (redirections have a cost), and much cleaner imho - let's not mint another / uri for all mbz resources...

I quite like the #_ idea, but maybe #this, #self or #artist would be more understandable. I like #artist as it is quite self-descriptive  - the mbz uris are already as long as my arm, so no need to worry about length, i would think :)

Cheers,
y

Kurt J

Aug 19, 2010, 4:53:31 PM
to music-ontology-sp...@googlegroups.com
Hello!

Welcome back?

On Thu, Aug 19, 2010 at 3:16 PM, Yves Raimond <yves.r...@gmail.com> wrote:
> Hello!
>
> I am definitely in favor of # uris. No 303 (redirections have a cost), and
> much cleaner imho - let's not mint another / uri for all mbz resources...

actually that's a good point too. a whole other set would be a bit of
a step back in some sense.

> I quite like the #_ idea, but maybe #this, #self or #artist would be more
> understandable. I like #artist as it is quite self-descriptive  - the mbz
> uris are already as long as my arm, so no need to worry about length, i
> would think :)

true they're already long but having 'artist' or 'release-group' twice
in the URI doesn't sit well with me. what about "#thing" - perhaps
that gives us the most clarity here. perhaps "#_" would leave too
many would-be users saying "wtf?" while "#thing" would invoke thoughts
of the "document v. thing" dichotomy.

-kurt j

Alexandre Passant

Aug 19, 2010, 4:56:59 PM
to music-ontology-sp...@googlegroups.com
Hi,

On 19 Aug 2010, at 21:53, Kurt J wrote:

> Hello!
>
> Welcome back?
>
> On Thu, Aug 19, 2010 at 3:16 PM, Yves Raimond <yves.r...@gmail.com> wrote:
>> Hello!
>>
>> I am definitely in favor of # uris. No 303 (redirections have a cost), and
>> much cleaner imho - let's not mint another / uri for all mbz resources...
>
> actually that's a good point too. a whole other set would be a bit of
> a step back in some sense.
>
>> I quite like the #_ idea, but maybe #this, #self or #artist would be more
>> understandable. I like #artist as it is quite self-descriptive - the mbz
>> uris are already as long as my arm, so no need to worry about length, i
>> would think :)
>
> true they're already long but having 'artist' or 'release-group' twice
> in the URI doesn't sit well with me. what about "#thing" - perhaps
> that gives us the most clarity here. perhaps "#_" would leave too
> many would-be users saying "wtf?" while "#thing" would invoke thoughts
> of the "document v. thing" dichotomy.

What about a simple #id approach ?
Short, but less obscure than #_

Alex.

--
Dr. Alexandre Passant
Digital Enterprise Research Institute
National University of Ireland, Galway
:me owl:sameAs <http://apassant.net/alex> .


Kurt J

Aug 19, 2010, 5:56:56 PM
to music-ontology-sp...@googlegroups.com
On Thu, Aug 19, 2010 at 3:56 PM, Alexandre Passant
<alexandr...@deri.org> wrote:
> Hi,
>
> On 19 Aug 2010, at 21:53, Kurt J wrote:
>
>> Hello!
>>
>> Welcome back?
>>
>> On Thu, Aug 19, 2010 at 3:16 PM, Yves Raimond <yves.r...@gmail.com> wrote:
>>> Hello!
>>>
>>> I am definitely in favor of # uris. No 303 (redirections have a cost), and
>>> much cleaner imho - let's not mint another / uri for all mbz resources...
>>
>> actually that's a good point too.  a whole other set would be a bit of
>> a step back in some sense.
>>
>>> I quite like the #_ idea, but maybe #this, #self or #artist would be more
>>> understandable. I like #artist as it is quite self-descriptive  - the mbz
>>> uris are already as long as my arm, so no need to worry about length, i
>>> would think :)
>>
>> true they're already long but having 'artist' or 'release-group' twice
>> in the URI doesn't sit well with me.  what about "#thing" - perhaps
>> that gives us the most clarity here.  perhaps "#_" would leave too
>> many would-be users saying "wtf?" while "#thing" would invoke thoughts
>> of the "document v. thing" dichotomy.
>
> What about a simple #id approach ?
> Short, but less obscure than #_

yeah i like "#id"

Nicholas J Humfrey

Aug 20, 2010, 4:38:31 AM
to music-ontology-sp...@googlegroups.com

> BBC music has a similar setup and uses a "#artist" suffix. If we
> adopt this same URI syntax MusicBrainz URIs would become for example:
>
> http://musicbrainz.org/artist/20ff3303-4fe2-4a47-a1b6-291e26aa3438#artist
>
> Note that the word "artist" appears twice in this URI. Also for a
> release group, it becomes even longer
>


(Unsurprisingly!) this is what I would very much prefer. Perhaps something to think about is if there was an RDFa version too. I think '#artist' would make more sense in that case. Having id='id' doesn't make as much sense. As Yves said, the URIs are already very long anyway...

I am opposed to id.musicbrainz.org because of the additional HTTP 303 request which can't even be optimised with a Keep-Alive.


nick.

Ian Davis

Aug 20, 2010, 4:55:34 AM
to music-ontology-sp...@googlegroups.com
On Fri, Aug 20, 2010 at 9:38 AM, Nicholas J Humfrey <n...@aelius.com> wrote:
> (Unsurprisingly!) this is what I would very much prefer. Perhaps something to think about is if there was an RDFa version too. I think '#artist' would make more sense in that case. Having id='id' doesn't make as much sense. As Yves said, the URIs are already very long anyway...
>
> I am opposed to id.musicbrainz.org because of the additional HTTP 303 request which can't even be optimised with a Keep-Alive.

I don't think that's an issue. Within the site you would just link
HTML pages to HTML pages. I do this on e.g.
http://semanticlibrary.org/people/isaac-asimov.html so there are no
redirects for standard web browsing. Philosophically I don't see why
HTML should be linking to real-world things in anchor tags, that's
what RDF is for.

The suggested id.musicbrainz.org URIs would be a hub redirecting to
the right data for the occasions when the identifiers are
dereferenced. The main advantage is I think you could guarantee their
persistence longer than the current URL structure (which is already
changing because the db schema and modelling has changed). It's like
using a PURL server to manage stable identifiers.

Ian

Bob Ferris

Aug 20, 2010, 5:37:13 AM
to music-ontology-sp...@googlegroups.com

After the long discussion on #swig yesterday about this issue (see
[1]), I still think that it is also possible to serve a description
of, e.g., a music artist directly at

http://musicbrainz.org/artist/<mbid>

The HTML representation of that description could be wrapped in a
document, which would be delivered at

http://musicbrainz.org/artist/<mbid>.html

an N3 representation of that description could be wrapped in a
document, which would be delivered at

http://musicbrainz.org/artist/<mbid>.n3

and so on ...

That means that the content negotiation on

http://musicbrainz.org/artist/<mbid>

should directly deliver a serialisation of the music artist description,
e.g. as it is represented by the RDF graph in this document[2]
(but also in other representation formats).

And the documents in the different representation formats, which include
this music artist description, could then look like this[3] (for
RDF/XML) or this[4] (for text/html).

The important thing for me is that an application (a machine) that
wants to consume that information (here the music artist description)
isn't really interested in the document wrapped around that description.
It will simply consume the music artist description itself (the Semantic
Graph, whatever). So it wouldn't really need the document wrapper
here; that is more or less only for the "view" part (and not needed by
the model itself), i.e. for when the information is processed for
human readability.
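
As a hedged sketch of the content negotiation step described above: the media-type table is an assumption (only the ".html" and ".n3" suffixes come from this proposal), and the names `parse_accept` and `negotiate` are invented for illustration. Whether the server then redirects to the chosen document or serves it directly is left open.

```python
# Assumed media types; only the .html and .n3 suffixes are named in the proposal.
SUFFIX_BY_MEDIA_TYPE = {
    "text/html": ".html",
    "text/n3": ".n3",
}

def parse_accept(header: str):
    """Return (media_type, q) pairs sorted by descending q-value."""
    entries = []
    for item in header.split(","):
        fields = [f.strip() for f in item.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        entries.append((media_type, q))
    # sorted() is stable, so equal q-values keep their header order.
    return sorted(entries, key=lambda e: e[1], reverse=True)

def negotiate(base_uri: str, accept: str, default: str = "text/html") -> str:
    """Pick the format-specific document URI for an Accept header."""
    for media_type, q in parse_accept(accept):
        if q > 0 and media_type in SUFFIX_BY_MEDIA_TYPE:
            return base_uri + SUFFIX_BY_MEDIA_TYPE[media_type]
    # Wildcards and unknown types fall back to the default representation.
    return base_uri + SUFFIX_BY_MEDIA_TYPE[default]

base = "http://musicbrainz.org/artist/20ff3303-4fe2-4a47-a1b6-291e26aa3438"
print(negotiate(base, "text/n3"))
print(negotiate(base, "text/html,application/xhtml+xml;q=0.9,*/*;q=0.8"))
```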

Don't hesitate to ask if you need further explanation of the issue
I'd like to address here.

Cheers,


Bob

PS: The code of the examples is always based on [5]

[1] http://chatlogs.planetrdf.com/swig/2010-08-19.html#T18-06-51
[2] http://smiy.sourceforge.net/musicbrainz/arcade_fire.rdf
[3] http://smiy.sourceforge.net/musicbrainz/arcade_fire_desc_-_in_document.rdf
[4] http://smiy.sourceforge.net/musicbrainz/arcade_fire_desc_-_in_document.html
[5] http://www.bbc.co.uk/music/artists/52074ba6-e495-4ef3-9bb4-0703888a9f68

Kurt J

Aug 20, 2010, 12:19:01 PM
to music-ontology-sp...@googlegroups.com
Hi Bob

yes, i agree this is perhaps the cleanest solution. but that means we
have to convince all the MB developers to _change_ their software back
to serving HTML from such a URI. The system you describe is great,
but it simply is not an option unless we convince the MB dev team to
change.

Bob Ferris

Aug 21, 2010, 7:13:06 AM
to music-ontology-sp...@googlegroups.com

>
> yes, i agree this is perhaps the cleanest solution. but that means we
> have to convince all the MB developers to _change_ their software back
> to serving HTML from such a URI. The system you describe is great,
> but it simply is not an option unless we convince the MB dev team to
> change.
>

While we are thinking all the time about the design of URLs, I get the
feeling that in the end this might only be a thing that is more or
less important to machines, but not to the end user.
Okay, today we see the URL in our Web browser (which might be obsolete
in the future). However, do we really need it? In apps we often
don't see this information. The link itself is the important thing, but
we don't have to show the URL to the user.
A title for the piece of information we are consuming might be enough. To
build a bit more trust in the provider of the information, we can
additionally use an Information Service description, which should
include some kind of trust mechanism. That simply means that people can
be more or less sure that this piece of information really is from that
Information Service and that the description of this Information Service
is also "correct" (authorized).
To sum up, we can create really long and ugly URLs, but we shouldn't
annoy the end user with them ;)

Cheers,


Bob

Kurt J

Aug 21, 2010, 8:27:55 PM
to music-ontology-sp...@googlegroups.com

good point Bob. they're for machines, who cares what they look like ^_^

i think some variant of the "#thing" URIs requires the least effort
and fits well with RDFa.

> Cheers,
>
>
> Bob

Ian Davis

Oct 8, 2010, 9:02:47 AM
to music-ontology-sp...@googlegroups.com
Hi Kurt,

On Sun, Aug 22, 2010 at 1:27 AM, Kurt J <kur...@gmail.com> wrote:
>
> i think some variant of the "#thing" URIs requires the least effort
> and fits well with RDFa.
>

Did you make a final decision on the URI patterns?

I looked on the wiki page at

http://wiki.musicbrainz.org/NGS_to_RDF_mappings#URIs_-_Thing_v_Page

Which suggests using #_ as a suffix

However, the examples shown there don't have the .html suffix required
by the mb site.

So, is the plan to include the .html or not?

i.e. should the URI be

http://musicbrainz.org/release/83ab3d86-dc77-4711-972d-d9e5f209f669#_

or

http://musicbrainz.org/release/83ab3d86-dc77-4711-972d-d9e5f209f669.html#_


Ian

Kurt J

Oct 8, 2010, 10:58:10 AM
to music-ontology-sp...@googlegroups.com
Hi Ian,

On Fri, Oct 8, 2010 at 8:02 AM, Ian Davis <m...@iandavis.com> wrote:
> Hi Kurt,
>
> On Sun, Aug 22, 2010 at 1:27 AM, Kurt J <kur...@gmail.com> wrote:
>>
>> i think some variant of the "#thing" URIs requires the least effort
>> and fits well with RDFa.
>>
>
> Did you make a final decision on the URI patterns?
>
> I looked on the wiki page at
>
> http://wiki.musicbrainz.org/NGS_to_RDF_mappings#URIs_-_Thing_v_Page
>
> Which suggests using #_ as a suffix

yes this was voted on in a musicbrainz-devel meeting on IRC. I don't
recall the exact date tho.

> However, the examples shown there don't have the .html suffix required
> by the mb site.
>
> So, is the plan to include the .html or not?

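the NGS server totally does away with the .html suffix. see it with RDFa:

http://test.musicbrainz.org/release/83ab3d86-dc77-4711-972d-d9e5f209f669

> i.e. should the URI be
>
> http://musicbrainz.org/release/83ab3d86-dc77-4711-972d-d9e5f209f669#_

this one ^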

> or
>
> http://musicbrainz.org/release/83ab3d86-dc77-4711-972d-d9e5f209f669.html#_
>
>

not this one ^

-kurt

> Ian

Ian Davis

Oct 8, 2010, 11:12:01 AM
to music-ontology-sp...@googlegroups.com
On Fri, Oct 8, 2010 at 3:58 PM, Kurt J <kur...@gmail.com> wrote:
>
> the NGS server totally does away with .html suffix.  see with RDFa:
>
> http://test.musicbrainz.org/release/83ab3d86-dc77-4711-972d-d9e5f209f669
>
>> i.e. should the URI be
>>
>> http://musicbrainz.org/release/83ab3d86-dc77-4711-972d-d9e5f209f669#_
>>
>
> this one ^
>

Perfect!

I am doing some correlation work so will be minting URIs in this pattern.

Thanks,

Ian
