Meaning of a Tag - the Moat Project


Paul Lamere

Aug 14, 2008, 7:03:09 AM
to APML.Public.General
In the spirit of "invent nothing" I suggest we take a look at the
moat-project as a method of assigning meaning to concepts using URIs of
semantic web resources. The MOAT framework consists of a lightweight
ontology, used to represent different objects (Tags, Meanings …) and
to let services (clients and servers) exchange information with each
other; a MOAT server that stores the different meanings (i.e. Semantic
Web resource URIs) of tags, which can be queried and updated by users;
and MOAT clients, which interact with a server to let users easily
annotate their content with those URIs.

More info is on the moat-project site at http://moat-project.org/

Also there is a good slide deck that gives a high level view of MOAT
here:

http://apassant.net/blog/2008/07/23/social-music-meets-the-semantic-web/

The MOAT stuff starts at slide 42.

Paul Jones

Aug 14, 2008, 7:05:07 AM
to apml-...@googlegroups.com
Hi Paul,

Given that you obviously have some familiarity with the project, would you be able to post a summary here? And how might you envision it being used?

Thanks,
Paul.

Paul Lamere

Aug 14, 2008, 10:07:21 AM
to APML.Public.General
I'll have a crack at a summary but I'm certainly no MOAT expert.

The goal of the MOAT project is to provide a framework for attaching
machine-understandable meaning to tags.

The classic example is a resource that is tagged with 'apple'. A
human may be able to tell from context that the 'apple' tag is
referring to the fruit and not the computer company or the record
label (or NYC for that matter) - but the ambiguity is hard for
machines to resolve. With MOAT, the tag 'apple' would be replaced
with a URI such as:

http://dbpedia.org/resource/Apple (for the fruit)
http://dbpedia.org/resource/Apple_inc. (for the company)
http://dbpedia.org/resource/Apple_records (for the label)

This helps resolve problems of ambiguity/polysemy that plague tagging
systems. MOAT does this by using URIs to resources (such as dbpedia,
dbtunes, geonames etc). Now of course, when someone tags an item
with 'apple' they don't want to have to remember and type
http://dbpedia.org/resource/Apple_inc - so MOAT defines a client
protocol that allows a client to help the user attach the proper
tag. Using the protocol, the client can retrieve potential tag
meanings (which version of 'apple') or let the user add a new meaning
for the tag if the proper meaning has not been found.
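
As a rough illustration of the client flow described above (look up candidate meanings for a tag, let the user pick one, or add a new meaning), here is a minimal Python sketch. It uses an in-memory dictionary in place of a real MOAT server; the class and method names are hypothetical and are not taken from the MOAT specification.

```python
from collections import defaultdict


class MeaningStore:
    """In-memory stand-in for a MOAT server (hypothetical, for illustration)."""

    def __init__(self):
        # tag label -> list of URIs that users have attached to it
        self._meanings = defaultdict(list)

    def add_meaning(self, tag, uri):
        """Record a new meaning URI for a tag, ignoring exact duplicates."""
        key = tag.strip().lower()
        if uri not in self._meanings[key]:
            self._meanings[key].append(uri)

    def meanings(self, tag):
        """Return the candidate meaning URIs known for a tag."""
        return list(self._meanings.get(tag.strip().lower(), []))


store = MeaningStore()
store.add_meaning("apple", "http://dbpedia.org/resource/Apple")
store.add_meaning("apple", "http://dbpedia.org/resource/Apple_inc.")
store.add_meaning("apple", "http://dbpedia.org/resource/Apple_records")

# A client would show these candidates and let the user pick one.
candidates = store.meanings("Apple")
chosen = candidates[1]  # e.g. the user selects the company
```

If none of the candidates fits, the client would call `add_meaning` with a URI supplied by the user, so the next person tagging with 'apple' sees it too.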

MOAT defines a lightweight ontology to represent how different
meanings can be related to a tag. A site like Last.fm could provide
a MOAT server that uses this ontology to disambiguate a tag like
'progressive' into more precise URIs such as 'progressive metal',
'progressive jazz' and 'progressive rock'.

APML concepts are very similar to tags (they are often
derived directly from social tags). Representing the concepts
as URIs (as the 'rdf:about' property does in the APML 1.0 draft spec)
will help disambiguate tags and allow for machine processing of the
information. However, I worry that there will be little consensus
as to what URIs to use here, resulting in the very interoperability
problems that we are trying so hard to avoid. If Last.fm publishes
an APML concept for heavy metal like so:
rdf:about="http://last.fm/tags/heavy+metal" and MyStrands does it
like so: rdf:about="http://strands.com/tags/heavy+metal" we are
back to the problem where we don't know for sure if Last.fm and
MyStrands mean the same thing when they indicate that I have a
preference for 'heavy metal'. What we'd really like is for both
Last.fm and MyStrands to agree on a URI such as
rdf:about="http://musicbrainz.org/tags/heavy+metal". Anything we
can do to encourage this will enhance the usefulness of an APML
profile - I will be able to take my Last.fm taste data and get
recommendations from MyStrands, MatchMine or whatever. I suggest
that MOAT may be a good way to help arrive at URIs that are
interoperable. If both Last.fm and MyStrands used MOAT to resolve
tags to URIs then the taste data from both entities would be
compatible.
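
To make the interoperability concern concrete, here is a small Python sketch. Two services describe the same preference with different site-local URIs; without an agreed mapping the profiles share nothing, and with one they line up. The weights and the mapping table are invented for illustration, and the musicbrainz.org URI is only the hypothetical shared URI from the example above.

```python
# Site-local concept URIs, as in the example above (weights are made up).
lastfm_profile = {"http://last.fm/tags/heavy+metal": 0.9}
strands_profile = {"http://strands.com/tags/heavy+metal": 0.7}

# Hypothetical agreed mapping from site-local URIs to a shared URI.
SHARED = {
    "http://last.fm/tags/heavy+metal": "http://musicbrainz.org/tags/heavy+metal",
    "http://strands.com/tags/heavy+metal": "http://musicbrainz.org/tags/heavy+metal",
}


def canonicalize(profile, mapping):
    """Rewrite concept URIs to shared ones; unknown URIs pass through."""
    return {mapping.get(uri, uri): weight for uri, weight in profile.items()}


def common_concepts(a, b):
    """Concept URIs that appear in both profiles."""
    return set(a) & set(b)


# Compared as published, the two profiles have nothing in common.
raw_overlap = common_concepts(lastfm_profile, strands_profile)

# After mapping to the shared URI, the preference lines up.
shared_overlap = common_concepts(
    canonicalize(lastfm_profile, SHARED),
    canonicalize(strands_profile, SHARED),
)
```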

I don't think the APML spec needs to address where the URIs come
from - a URI is just a URI - but I do think it would improve
portability of APML if we offer guidance to implementers as to how
to arrive at URIs that will maximize interoperability. Since the
MOAT-Project is trying to address this very issue I suggest that
we at least keep abreast of what the MOAT-Project is doing.

Some caveats: MOAT is new and not widely adopted - perhaps even
less mature than APML, so we don't want to hitch our wagon to just
this horse, but I have heard of at least one large media corporation
that is looking hard at MOAT.


TSchultz55

Aug 14, 2008, 10:40:47 AM
to APML.Public.General
Paul L.,

Great to see you back lurking in the APML discussions!

RE:
> information. However, I worry that there will be little consensus
> as to what URIs to use here resulting in the very interoperability
> problems that we are trying so hard to avoid. If last.fm publishes
> an apml concept for heavy metal like so:
> rdf:about="http://last.fm/tags/heavy+metal" and MyStrands does it
> like so: rdf:about="http://strands.com/tags/heavy+metal" we are
> back to the problem where we don't know for sure if Last.fm and
> MyStrands mean the same thing when they indicate that I have a
> preference for 'heavy metal'.

In a perfect world, rdfs:seeAlso could be used to overcome this
problem. But it's up to individual developers how, or whether, they
choose to do this.

RE:
> I don't think the APML spec needs to address where the URIs come
> from - a URI is just a URI - but I do think it would improve
> portability of APML if we offer guidance to implementers as to how
> to arrive at URIs that will maximize interoperability. Since the
> MOAT-Project is trying to address this very issue I suggest that
> that we at least keep abreast of what the MOAT-Project is doing.

Could the APML API libraries come with "built-in" functionality for
arriving at these URIs? So, instead of APML libs being just XML
serializers/deserializers, they would also contain functionality that
would allow for the negotiation of concept URIs from DBPedia, MOAT,
etc.
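
One way such library support could look, sketched in Python: a function that tries a list of pluggable resolvers in order and returns the first URI found. The backends here are stubbed with lookup tables rather than real network calls, and every name (including the example.org URI) is hypothetical.

```python
from typing import Callable, Dict, List, Optional

Resolver = Callable[[str], Optional[str]]


def dict_resolver(table: Dict[str, str]) -> Resolver:
    """Build a resolver backed by a fixed table (stand-in for a remote service)."""
    def resolve(tag: str) -> Optional[str]:
        return table.get(tag.lower())
    return resolve


def resolve_concept(tag: str, resolvers: List[Resolver]) -> Optional[str]:
    """Try each resolver in order and return the first URI found."""
    for resolver in resolvers:
        uri = resolver(tag)
        if uri is not None:
            return uri
    return None


# Stubbed backends; a real library would query DBpedia, a MOAT server, etc.
moat_stub = dict_resolver(
    {"progressive rock": "http://example.org/moat/progressive-rock"}
)
dbpedia_stub = dict_resolver({"apple": "http://dbpedia.org/resource/Apple"})

found = resolve_concept("Apple", [moat_stub, dbpedia_stub])
missing = resolve_concept("zzz", [moat_stub, dbpedia_stub])
```

When no backend knows the tag, the caller gets `None` back and can fall back to exporting the plain tag.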

I'm throwing a screencast together (hopefully) this weekend to review
and demo this stuff, as it might be helpful. Working on a personal
side-project (CodeIgniter) that does just this and wouldn't mind
opening up that part of the code if it could help.

Cheers,

Tim



Chris Saad

Aug 14, 2008, 1:59:28 PM
to apml-...@googlegroups.com
Consider also that, to date, we have found it effective to remember that the APML profile should be taken as a set of multiple data points to 'triangulate' a user's interest. With this framing it becomes slightly less important if "Apple" means fruit or the company because the other concepts should reveal either "Iphone" or "Bananas".

The result is clear if considering the whole cloud.
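
A minimal Python sketch of this "triangulation" idea: score each candidate sense of an ambiguous tag by how many of the other concepts in the profile it is associated with. The sense vocabularies are made up for the example; a real consumer would need far richer data.

```python
# Hypothetical context words associated with each sense of "apple".
SENSES = {
    "company": {"iphone", "mac", "ipod"},
    "fruit": {"bananas", "orchard", "cider"},
}


def guess_sense(profile_concepts, senses):
    """Pick the sense sharing the most concepts with the profile.

    Returns None when no sense overlaps the profile at all.
    """
    cloud = {c.lower() for c in profile_concepts}
    scores = {name: len(words & cloud) for name, words in senses.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


tech_user = guess_sense(["Apple", "Iphone", "Linux"], SENSES)
food_user = guess_sense(["Apple", "Bananas"], SENSES)
sparse_user = guess_sense(["Apple"], SENSES)
```

The last call is the case to watch: a profile with nothing but "Apple" in it gives the consumer nothing to triangulate from.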

That being said I am not arguing against the *option* of concrete semantic links - I just want to encourage everyone not to go down a rabbit hole with it (in the spirit of KISS).

Chris
--
Chris Saad

FaradayMedia - For Audiences of One
Particls - Are You Paying Attention?
Engagd - The Open Attention Platform
Media 2.0 Workgroup - Social, Democratic, Distributed
APML - Your Attention Profile
DataPortability - Connect, Control, Share, Remix

Mason Lee

Aug 14, 2008, 2:33:44 PM
to apml-...@googlegroups.com
To help along these lines, apml could recommend a "neutral" uri namespace for simple "social tags".  E.g.
urn:apml-org:apple

That way you get simple social tag support AND everything is still a URI.

-mason

J. Trent Adams

Aug 14, 2008, 3:16:38 PM
to APML.Public.General
I'm a fan of more fully exploring the MOAT approach, specifically as
it relates to evaluating our eventual approach to something like RDF.
As a company trying to develop against APML as an input, we've found
it difficult to infer meaning from the "cloud" with any degree of
certainty. This is at least in relation to the data points we
generate which have much tighter semantic meaning.

While I completely understand the KISS approach, we're running into
the fact that importing APML as it stands (ie. without universal
context) measurably decreases our confidence in identifying the user's
tastes and interests. This problem could go away with enough data
points, but we're not there, yet.

Any steps we can collectively make toward disambiguating the concepts
earlier within APML will be valuable. As it stands right now, we have
no problem producing APML from our datasets, it's just that we can't
support consuming it until there's more context (or we have orders of
magnitude more data for disambiguation).

Anyone else in the same boat?

- Trent



Phil Barker

Aug 15, 2008, 4:51:05 AM
to apml-...@googlegroups.com
I don't understand how this, on its own, helps. If the URI is not
resolvable against an ontology, formal or informal, i.e. if I can't look
up the URI and get something semantically specific, whether it be, e.g.,
a Wikipedia article or an entry in a SKOS-encoded vocabulary or whatever,
then I still don't know what you mean by apple. Are you suggesting
apml.org maintain a vocabulary/identifier service that will disambiguate
Apple along the lines of supplying identifiers urn:apml-org:apple,
urn:apml-org:apple_inc, urn:apml-org:apple_records? Or am I missing something?

Phil.

Mason Lee wrote:
> To help along these lines, apml could recommend a "neutral" uri
> namespace for simple "social tags". E.g.
> urn:apml-org:apple
>
> That way you get simple social tag support AND everything is still a URI.
>
> -mason

--
Phil Barker Learning Technology Adviser
ICBL, School of Mathematical and Computer Sciences
Mountbatten Building, Heriot-Watt University,
Edinburgh, EH14 4AS
Tel: 0131 451 3278 Fax: 0131 451 3327
Web: http://www.icbl.hw.ac.uk/~philb/

--
Heriot-Watt University is a Scottish charity
registered under charity number SC000278.

Mason Lee

Aug 15, 2008, 5:56:57 AM
to apml-...@googlegroups.com
Hi Phil, gd, et al.,

This "urn:apml-org:tags:" would not be for disambiguating, it would
only be to make sure that all Concept Keys are URIs, even when
implementers only have a simple tag cloud to work from. There was
some discussion that both URIs and simple "tags" would be supported as
Concept Keys. Seems to me that would result in syntactic ambiguity.
(For example, take the string "http://foo.org". Is that a simple tag,
or a URI? Well, it sure *looks* like a URI, but what if it was
actually just a tag on a photo of my browser history? Or what if my
interest is in the URL itself and not the concept represented by the
URL, but all my weak system has to work with are simple tags? If you
require that *everything* have a namespace, or at least use URI
character restrictions, there's no chance for a problem. You know
what's a namespace, and what's a tag.) For this reason I suggested
that all simple tags could be prefixed with a generic urn namespace,
promoted by APML. Nitpicking, sure, but if there's one thing worse
than semantic ambiguity for humans, it's syntactic ambiguity for
computers.

There are a lot of web sites out there that use simple tags and
simple tag clouds, where the tags are just strings, but not
necessarily URIs. The meaning of these tags has to be inferred from
looking at the greater tag cloud surrounding them, or from auxiliary
information, and this is a hard problem no matter what.
Unfortunately, that's a problem that pretty much all of web2.0 has
right now. Flickr knows that I click on photos tagged "apple" a lot,
but they still don't know if I like Apple Records, Apple Inc., or
apples. If Flickr wants to export to APML the fact that I like the
tag "apple", they should still be able to do that, without having to
disambiguate. So what "key" do they use? "urn:flickr-tag:apple"? I
guess that's okay, but it's not too portable! "apple"? That's not a
URI.

So my suggestion: APML could recommend that anyone with a generic
ambiguous tag cloud can just use "urn:apml-org:tags:[their tags]".
APML would *not* provide any disambiguation information about these
URNs, because it is precisely to say "There is no disambiguation
available, all I know is he or she likes this tag". The proposal was
so that all keys should be URIs, and not a mix of URIs and simple tags
that have no namespace, thus guaranteeing that there's no possibility
of syntactic ambiguity. Computers will thank us.
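
A sketch of how an exporter might apply this in Python. The prefix is the one proposed above; percent-encoding the tag is an assumption (the proposal does not say how spaces or reserved characters should be handled), and the function names are hypothetical.

```python
from urllib.parse import quote, unquote

# Prefix proposed above; it is not a registered URN namespace.
TAG_PREFIX = "urn:apml-org:tags:"


def tag_to_key(tag):
    """Wrap a plain tag in the generic URN namespace.

    Percent-encoding is an assumption; it keeps the result within
    URI character restrictions.
    """
    return TAG_PREFIX + quote(tag, safe="")


def key_to_tag(key):
    """Return the plain tag for a generic-namespace key, else None."""
    if key.startswith(TAG_PREFIX):
        return unquote(key[len(TAG_PREFIX):])
    return None


apple_key = tag_to_key("apple")
# A tag that merely looks like a URL stays recognizably a tag.
url_tag_key = tag_to_key("http://foo.org")
```

With this in place a consumer can tell the two cases apart by prefix alone: `key_to_tag` returns the original tag for a wrapped key and `None` for any other URI.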

Though I could be missing something, too-- wouldn't be the first
time. :)

--Mason

p.s. maybe a better default urn would be "urn:apml:tagcloud:[x]"

Phil Barker

Aug 15, 2008, 7:44:50 AM
to apml-...@googlegroups.com

Mason Lee wrote:
> Hi Phil, gd, et al.,
>
> This "urn:apml-org:tags:" would not be for disambiguating, it would
> only be to make sure that all Concept Keys are URIs, even when
> implementers only have a simple tag cloud to work from.

OK, I see what you mean now. Thanks.

> There was
> some discussion that both URIs and simple "tags" would be supported as
> Concept Keys. Seems to me that would result in syntactic ambiguity.
> (For example, take the string "http://foo.org". Is that a simple tag,
> or a URI? Well, it sure *looks* like a URI, but what if it was
> actually just a tag on a photo of my browser history? Or what if my
> interest is in the URL itself and not the concept represented by the
> URL, but all my weak system has to work with are simple tags. If you
> require that *everything* have a namespace, or at least use URI
> character restrictions, there's no chance for a problem. You know
> what's a namespace, and what's a tag.) For this reason I suggested
> that all simple tags could be prefixed with a generic urn namespace,
> promoted by APML. Nitpicking, sure, but if there's one thing worse
> than semantic ambiguity for humans, it's syntactic ambiguity for
> computers.
>

Yes, that's a real issue: I pay attention to several wikipedia articles,
I suppose I could refer to them as "Wikipedia's APML article" and the
like, but it's likely they're recorded some places as
http://en.wikipedia.org/wiki/APML and similar.

I think I would say that if you are interested in a website/URL or have
a URI as a label as in your example, then you show that as the key to
the concept:
<Concept key="http://foo.org" ...>
which will be interpreted differently to
<Concept rdf:about="http://foo.org" ...>
since the semantics of the attributes are different: one is a literal,
just a string of text, probably a label; the other is a URI. I think there
is some sense in keeping the distinction.

Stretching the point, it even allows me to distinguish between
<Concept key="http://en.wikipedia.org/wiki/APML"> (interest in the
article),
<Concept key="Wikipedia APML article"
rdf:about="http://en.wikipedia.org/wiki/APML"> (interest in the article)
and
<Concept key="APML" rdf:about="http://en.wikipedia.org/wiki/APML#">
(interest in APML itself)

(the last example assumes someone is using wikipedia's URLs to generate
URIs for the concepts that the articles describe, which may or may not
be a good idea)
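
To illustrate the distinction, the following Python sketch reads a couple of Concept elements and reports the literal key and the rdf:about URI separately. The wrapper element and the sample attribute values are assumptions made for the example; only the key and rdf:about attributes come from the discussion above.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Hypothetical fragment; a real APML file has more structure around it.
SAMPLE = """
<Concepts xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <Concept key="http://foo.org" value="0.5"/>
  <Concept key="APML" value="0.8"
           rdf:about="http://en.wikipedia.org/wiki/APML#"/>
</Concepts>
"""


def read_concepts(xml_text):
    """Return (key, about) pairs; about is None when no URI is given."""
    root = ET.fromstring(xml_text)
    pairs = []
    for concept in root.iter("Concept"):
        key = concept.get("key")  # literal label, even if it looks like a URL
        about = concept.get("{%s}about" % RDF_NS)  # semantic identifier
        pairs.append((key, about))
    return pairs


concepts = read_concepts(SAMPLE)
```

The first concept comes back with a URL-shaped key and no identifier, the second with a plain label and a URI, so a consumer never has to guess which is which.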

> There are a lot of web sites out there that use simple tags and
> simple tag clouds, where the tags are just strings, but not
> necessarily URIs. The meaning of these tags has to be inferred from
> looking at the greater tag cloud surrounding them, or from auxiliary
> information, and this is a hard problem no matter what.
> Unfortunately, that's a problem that pretty much all of web2.0 has
> right now. Flickr knows that I click on photos tagged "apple" a lot,
> but they still don't know if I like Apple Records, Apple Inc., or
> apples. If Flickr wants to export to APML the fact that I like the
> tag "apple", they should still be able to do that, without having to
> disambiguate. So what "key" do they use? "urn:flickr-tag:apple"? I
> guess that's okay, but it's not too portable! "apple"? That's not a
> URI.
>

I think the key and URI have different semantics and support different
usage scenarios and I'm not convinced there is a case for merging them.

Phil

Phil Barker

Aug 15, 2008, 7:44:19 AM
to apml-...@googlegroups.com
Hello all,
I've been lurking here for a while, I think I am beginning to understand
what's going on, but please excuse me if I demonstrate how incomplete
that understanding is.

Chris Saad wrote:
> Consider also that, to date, we have found it effective to remember
> that the APML profile should be taken as a set of multiple data points
> to 'triangulate' a user's interest. With this framing it becomes
> slightly less important if "Apple" means fruit or the company because
> the other concepts should reveal either "Iphone" or "Bananas".
>
> The result is clear if considering the whole cloud.

I think there seem to be two usage scenarios for APML. I don't know if
these are written down anywhere, I haven't really looked for them.
There's the one that Chris describes above, which I would characterize
with there being no semantic processing at the APML provider end, the
hard work has to be done by the APML consumer. However I think there is
an equally important scenario where the APML provider has done semantic
processing and is willing to share the results. The APML spec could save
the APML consumer a lot of work if it allows semantically enabled providers
to expose the results of their semantic processing.


>
> That being said I am not arguing against the *option* of concrete
> semantic links

I'm with you there. And I don't think it should be a problem, I imagine
that any semantically aware APML provider will have access to a
vocabulary/thesaurus/ontology which will provide something like what
SKOS calls preferred labels, alternative labels, hidden labels (or
preferred terms and those terms to which they have a UF [UseFor]
relationship if you prefer zThes / ISO 2788 & ISO 5964) ... in other
words it will almost certainly be able to provide a label to use as the
key attribute and probably be able to give you more than one term/label
for each concept (e.g. dogs, hounds, perros, chiens) which might be
helpful. (Discuss?)

One consequence of this line of thinking is that the (optional) semantic
link must reference a shared ontology. [Or I suppose a compulsory
URI-key should optionally be resolvable to a shared ontology -- but I
think the key and rdf:about attributes are both necessary options since
they have different semantics and are for different purposes].

A slight aside: there is a danger of trying to ship an entire ontology
in the APML file, which I think would be a mistake in terms of "scope
creep". Perhaps clarity on where we expect the semantics to be derived
and how we expect them to be communicated would help limit that.

> - I just want to encourage everyone not to go down a rabbit hole with
> it (in the spirit of KISS).

I'm all for KISS, who wouldn't be, but it's sometimes worth remembering
that not everything is simple.
Perhaps "Make Something Simpler" would be a better slogan for spec
development :-).

All the best, Phil

J. Trent Adams

Aug 15, 2008, 11:24:08 AM
to APML.Public.General

Much good discussion going on, and it's interesting to see the
clarification of everyone's thinking. In fact, I think Phil Barker's
recent note [1] succinctly captures the current discussion. His two
cases hit the nail on the head:

A. Context free APML = burden on consumers
B. Added APML context = burden on providers

The trade-offs appear to be that option A is easier to produce, while
option B delivers higher value to the APML consumer.

Speaking to Chris's point [2], we have found that with enough data
points it's possible to statistically infer a degree of aligned
interests across context-free tags (and associated weight definitely
helps). The trouble is, however, that in most real-world uses we've
explored, there simply aren't enough data points. At least this is
true for a generalized solution outside a single-purpose system.
Further, converging on meaningful inferences is computationally
expensive for anything nearing real-time utility
for millions of users across even just a score of sites.

Basically, what we've found is that even a moderate level of
reasonably confident disambiguation information in preference data
significantly improves the precision and recall (as well as other
metrics) within most standard recommendation models. This is
precisely why we had to "roll our own" personal preference model
within a semantically controlled environment. Extending beyond our
own universe of discourse, however, we'd love to allow users to feed
their preferences with outside APML sources.

That being said... I don't believe rdf:about support alone will be
enough without encouraging something like MOAT as described by Paul
Lamere [3]. As others have pointed out, with unbounded rdf:about
declarations APML consumers need to be aware of all the associated URI
endpoints.

Without something like MOAT, it's the predicate labeling itself rather
than the endpoints that is of the most value in the short run.
Later, when we're evaluating deeper semantics and building an
authority index, the rdf:about URIs increase in value.

For us, this isn't a theoretical exercise looking for a perfect
solution. We have a handful of programmers and statistical scientists
coding against this type of model right now. So, while I totally
agree with the basic principle of KISS, without more semantic meaning
APML doesn't reach the minimum acceptable standards we require as a
reliable input format.

Phil had another great point speaking to this:

> I'm all for KISS, who wouldn't be, but it's sometimes
> worth remembering that not everything is simple.

It's simple to blindly convert tags into outbound APML concepts, but
from our experience it's not simple consuming them without more
semantic guidance.

- Trent


[1] http://groups.google.com/group/apml-public/msg/bc9ba3f2c9f4d525?hl=en
[2] http://groups.google.com/group/apml-public/msg/5563db420ea11dba?hl=en
[3] http://groups.google.com/group/apml-public/msg/952b2eb6f4956d5a?hl=en


---
J. Trent Adams
=jtrentadams

About: http://www.mediaslate.org/jtrentadams/
Follow: http://twitter.com/jtrentadams