Defining Locations in APML


Paul Jones

Aug 8, 2008, 12:17:25 PM
to apml-...@googlegroups.com
Hi,

In draft 1 of the APML 1.0 specification, the element <Location> is made available. The intent of this element is to provide a description of places of interest. However, I'm not yet happy with the descriptiveness of this element, and was hoping for some ideas as to how to enhance it.

Currently, the element specifies a longitude/latitude pair. Whilst this accurately locates a position, it does nothing to describe the position. For instance, nothing indicates whether you are talking about the square kilometre in the middle of a country versus the entire country.

Thoughts?

Paul.

TSchultz55

Aug 8, 2008, 1:32:02 PM
to APML.Public.General
Paul,

These are great ideas and certainly things that need to be worked into
APML. I guess my concern is if we "hard-code" these entities into the
specification (i.e. "Location", "Person", etc.), you'll get a lot of:

"Well....I like 'Beer'....where's the 'Beer' entity? Where's the
'Cat' entity? Where's the 'Fruit' entity? Where's the 'Protein'
entity? Where's the 'Scotch' entity?"......etc. etc. etc.

We can encompass EVERYTHING without having to hard-code exactly what
they are with the APML-RDF proposal (by using "cool" URIs), whether
the concept is an animal, a sport, a piece of clothing, a molecular
compound, etc. Let me use this suggestion as an explanation as to why
I personally favor the RDF format........

RE:
> "However, I'm not yet happy with the descriptiveness of this element, and was hoping for some ideas as to how to enhance it."
>"For instance, nothing indicates whether you are talking about the square kilometre in the middle of a country versus the entire country."

And this is exactly the issue with a vanilla XML implementation. The
specification can get completely unwieldy when taking this approach:

<Location name="Philadelphia" lat="##" lng="##" attr1="xxx"
attr2="xxx" attr3="xxx" attr4="xxx" ......... />

as opposed to (in RDF):

<apml:Resource rdf:about="http://dbpedia.org/resource/Philadelphia" />

Point your browser to "http://dbpedia.org/resource/Philadelphia", and
take a look at all the information that can automatically be asserted
from that single line of markup:

- This resource is in fact a city, which is a location.
- A description of what the concept "Philadelphia" is in 20 different
languages (p:abstract)
- The two area codes (telephone) for Philadelphia (p:areaCode)
- The land size of Philadelphia (p:areaMetroSqMi)
- A photo collection of "Philadelphia" (p:hasPhotoCollection)
- The lat and long coords of Philadelphia (geo:lat, geo:long)
- A list of famous people who were born in Philadelphia (p:birthPlace)
----- I just learned Bob Saget was born here???
- A list of famous people who died in Philadelphia (p:deathPlace)
(i.e. Ben Franklin)
- The population of the city (p:populationTotal)
- The nicknames of the city (p:nickname) (i.e. "The City of Brotherly
Love", "Birthplace of America", etc.)
- The list goes on and on and on
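
As a rough illustration of how an agent might read such properties once it has the resource description in hand, here is a minimal sketch in Python. It makes several assumptions: the RDF/XML below is a hand-written stand-in (not an actual DBpedia response), the coordinate values are approximate, and the helper name `describe` is hypothetical. Only the namespaces and the two nicknames come from the discussion above.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
GEO_NS = "http://www.w3.org/2003/01/geo/wgs84_pos#"
DBP_NS = "http://dbpedia.org/property/"

# Hand-written stand-in for what a linked-data service might return.
# The values are illustrative only, not copied from DBpedia.
SAMPLE_RDF = f"""<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:geo="{GEO_NS}" xmlns:p="{DBP_NS}">
  <rdf:Description rdf:about="http://dbpedia.org/resource/Philadelphia">
    <geo:lat>39.95</geo:lat>
    <geo:long>-75.16</geo:long>
    <p:nickname>The City of Brotherly Love</p:nickname>
    <p:nickname>Birthplace of America</p:nickname>
  </rdf:Description>
</rdf:RDF>"""


def describe(rdf_xml, uri):
    """Collect every property asserted about `uri` as {qualified_tag: [values]}."""
    root = ET.fromstring(rdf_xml)
    facts = {}
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        # Skip descriptions of other resources in the same document.
        if desc.get(f"{{{RDF_NS}}}about") != uri:
            continue
        for prop in desc:
            facts.setdefault(prop.tag, []).append((prop.text or "").strip())
    return facts


facts = describe(SAMPLE_RDF, "http://dbpedia.org/resource/Philadelphia")
for tag, values in facts.items():
    print(tag, values)
```

The point of the sketch is that the profile itself only carries the URI; everything else is looked up from the description, so nothing here needs to be an attribute in the APML specification.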

The information you can automatically get from the URI representing
the concept of "Philadelphia" is absurd - there's no possible way of
jamming all that information as attributes into a specification. One
single entity (apml:Resource) to describe everything is much easier to
maintain - the APML agent can handle understanding what the resource
is once it pulls down the resource metadata.

Plus, I foresee some synergies with the Linking Open Data project
(http://esw.w3.org/topic/SweoIG/TaskForces/CommunityProjects/LinkingOpenData)
with the RDF approach, in that people's public
attention profiles become part of the larger web of linked data. But
that's an added discussion for another day.

In my opinion, these possibilities of what can be done with APML and
RDF are almost mind-boggling - and all the more reason why I feel BOTH
APML-Vanilla (or whatever you want to call it) and APML-RDF are needed!

Just some thoughts.

Cheers,

Tim

Paul Jones

Aug 9, 2008, 6:24:15 AM
to apml-...@googlegroups.com
Hi Tim,

Thanks for the thoughts - they are certainly something to mull over.

The idea of describing things simply as a linked URL is certainly exciting, but I have quite a number of substantial reservations. I'm also entirely opposed to two base-representations of APML. The splintering of the implementation base would be an incredibly bad thing.

As for my reservations:
  • I certainly understand that attempting to cram the definitions of everything into one spec is likely to be limiting. However, we certainly need a base ontology that can be worked from.
  • URLs allow for the re-use of more detailed information, but generating this data becomes substantially harder. For example, if you parse someone's lifestream, how do you know they are talking about the Philadelphia that is at the given dbpedia URL? How do you handle internal ontologies that don't have any defined resource?
  • I'm also concerned about the fact that relying solely on linked data introduces a substantial degree of fragility. Your APML profile becomes only as useful as the lifespan and validity of the associated linked data resources. Parsers also need to spend a substantial amount of time downloading external resources in order to parse - and as someone who has worked behind one too many corporate firewalls, I can tell you that parsers that download things tend to become very problematic.
I'm in no way against linking to other data, however, I do believe the base APML document should attempt to define common cases containing enough information for basic interpretation. Your location case is certainly a good one, in that there is so much you can indicate about a location - but I'd question the harm in actually having a specified Location element containing a base set of attributes (such as those proposed by Mason in a separate email) and the URL.
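
To make the idea concrete, here is a minimal sketch in Python of a Location element that carries a small base set of attributes plus an optional link. It rests on assumptions: the `name`, `lat` and `lng` attributes follow the vanilla XML example earlier in the thread, while the `uri` attribute name and both helper functions are hypothetical and not part of any APML draft.

```python
import xml.etree.ElementTree as ET


def build_location(name, lat, lng, uri=None):
    """Build a <Location> element with base attributes and an optional link."""
    attrs = {"name": name, "lat": str(lat), "lng": str(lng)}
    if uri is not None:
        attrs["uri"] = uri  # hypothetical attribute name for the linked resource
    return ET.Element("Location", attrs)


def read_location(xml_text):
    """Interpret a <Location> offline; the link is returned but never fetched."""
    el = ET.fromstring(xml_text)
    return {
        "name": el.get("name"),
        "lat": float(el.get("lat")),
        "lng": float(el.get("lng")),
        "uri": el.get("uri"),  # None when no link was supplied
    }


element = build_location(
    "Philadelphia", 39.95, -75.16, "http://dbpedia.org/resource/Philadelphia"
)
xml_text = ET.tostring(element, encoding="unicode")
loc = read_location(xml_text)
print(xml_text)
print(loc)
```

A parser behind a firewall can stop at the base attributes, while an agent with network access is free to follow the link for richer data.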

Paul.

Elias Bizannes

Aug 10, 2008, 6:05:12 AM
to APML.Public.General
>    - I certainly understand that attempting to cram the definitions of
>    everything into one spec is likely to be limiting. However, we certainly
>    need a base ontology that can be worked from.
I think this "dysfunction" is necessary. We need to protect the
competitive advantage of a company that has gone to the hard work of
generating and capturing this user's attention. Sure - a user should
determine how that data about them is used, but a company that
generates it also has some shared custody of the data. And whilst it
should give up the data per the user's requests, the exporting of the
data from their system severely hinders their competitiveness.

Having concepts with a common ontology reduces a company's natural
advantage. If another company wants to utilise the concepts generated
from that original company - the burden is on them to associate the
different concepts generated by different systems. Every independent
system should have its own base ontology - but a company should
be able to define that ontology however they wish. It's a
market opportunity to be able to link these concepts from different
systems. Everyone can get the APML file, but the company that has the
most relationships with APML generators, is more valuable - because
they create stronger associations between concepts that seem different
but are actually similar.

>    - URLs allow for the re-use of more detailed information, but generating
>    this data becomes substantially harder. For example, if you parse someone's
>    lifestream, how do you know they are talking about the Philadelphia that is
>    at the given dbpedia URL? How do you handle internal ontologies that don't
>    have any defined resource?
The company that generated it knows. That's all that matters. It's up
to the user to associate the concept linked to that URL with another
one. So yes, a company is required to have their own base ontology, but
they can store it wherever they want.

>    - I'm also concerned about the fact that relying solely on linked data
>    introduces a substantial degree of fragility. Your APML profile becomes only
>    as useful as the lifespan and validity of the associated linked data
>    resources. Parsers also need to spend a substantial amount of time
>    downloading external resources in order to parse - and as someone who has
>    worked behind one too many corporate firewalls, I can tell you that parsers
>    that download things tend to become very problematic.
...and? Doesn't an APML file constantly regenerate itself, so that
only the most recent, relevant concepts come up? An APML file has a
fourth dimension in time - if a URL breaks, so be it. That APML file
has relevance at the time of generation, and as time passes, so does
its value. The breaking of a link is an example of how only the most
recently generated attention data has value.

As for the comment about parsers, I draw that back to my point about
market opportunity. The company that can do this cheaper, faster,
better - wins. We are all equal with the core spec, but some can be
more equal than others.

gdupont

Aug 11, 2008, 4:48:19 AM
to APML.Public.General
I'm a bit confused here... We are talking about "internal hidden
privately held ontologies" in order to preserve the competitiveness of
the company that generates the APML. Don't you think a company in that
mood will provide non-explicit concepts with proprietary URIs in order to
keep its advantage? You will then be able to export something, but
you will never be able to do anything with it without the generator's
agreement, because of these hidden concepts...

I like the idea of links to define entities (locations, people, or
whatever you need to define attention data), but I see the problem Paul
is talking about with parsers that need to download something
to work on APML. Thus, IMO you should always have a simple definition
(using keys) and an optional extended one. But keep it optional, please,
and never make encrypted (or otherwise unusable) keys just to keep your
business alive.

I remember proposing something like embedded RDF, which could be
an APML section dedicated to defining the RDF resources used in
the attention data. What do you think about it? In my personal
experience with data exchange models, hybrid models have been a good
starting point for getting people to work with RDF without plunging
them into semantics.

Back to the location thing. I see that location could be very
useful, since it is at least one half of personalisation in information
presentation. That said, should the location be linked to a piece of data
(implicit or explicit) as a stamp, as suggested by Elias Bizannes, or to a
profile, saying that my office profile is linked to that location and
my home is... at home ;-) ? Not sure that's useful; at least, most
search engines already do that kind of thing using IP location.

Well, we also have to think about location more broadly: the localisation
of the user, the locale of their system (preferred languages), and then the
location of the concepts involved... As far as I understand, the current
location is related to the last one. Don't you think the other two could be
useful? Where should we hang them? At the profile level?

Finally, geolocation is not something obvious. You can define a point
(x,y) with the geo extension (lat and long), or a place (name or RDF URI),
but also an area, which can be seen as a location restriction (the right
bank of Rouen, where you can find the hottest bars). Here again,
there are some awesome ontologies to do the job, and I'll be able to
provide some (as soon as my colleague, an expert in geo things, comes back
from vacation ;-) ). At the least, we should propose the easy way [ x, y
(or lat/long or any standard?) and a radius to define a simple area or
fuzziness of localisation ] and an extended version using a URI.
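
The "easy way" could look something like the following minimal Python sketch: a centre point plus a radius, with a great-circle (haversine) distance check to decide whether another point falls inside the area. The class and function names are hypothetical, the Rouen and Paris coordinates are approximate, and treating the Earth as a sphere of radius 6371 km is a simplifying assumption.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; spherical approximation


@dataclass
class Area:
    """A simple area: a centre point and a radius (or fuzziness) in km."""
    lat: float
    lng: float
    radius_km: float


def distance_km(lat1, lng1, lat2, lng2):
    """Great-circle distance between two lat/long points (haversine formula)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def contains(area, lat, lng):
    """True when the point lies within the area's radius of its centre."""
    return distance_km(area.lat, area.lng, lat, lng) <= area.radius_km


rouen = Area(lat=49.44, lng=1.10, radius_km=5.0)  # approximate centre of Rouen
print(contains(rouen, 49.44, 1.10))  # the centre itself
print(contains(rouen, 48.86, 2.35))  # approximate centre of Paris
```

The same radius also answers the earlier question of scale: a square kilometre and an entire country differ only in the size of the radius, while irregular shapes would still need the extended, URI-based version.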

gd

TSchultz55

Aug 11, 2008, 9:34:45 AM
to APML.Public.General
> but I see the problem Paul
> is talking about with parsers that need to download something
> to work on APML

Isn't that partly what DataPortability is all about, though? REUSE
of pre-existing data! DBPedia already has all the geospatial
information that is needed.

Either way - for an APML agent to insert geospatial information about
a "Location" means that the APML agent itself will be reliant on some
type of service to acquire the information. 1,000 APML agents will do
it 1,000 different ways.

We're left with redundancy and quite possibly disparate data.

No matter how you look at it, an APML agent will most likely be
reliant on downloading SOME sort of information from a third-party
service.

And when dealing with "plain-text" tags to get this type of
information, you get (somewhat comical) mix-ups like this:
http://ramandwhiskey.com/2008/08/11/semantic-fail/

Plain-text "Location" concepts and retrieving their corresponding
geospatial information have a bunch of issues that would need to be
addressed.
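
One of those issues, ambiguity, can be sketched in a few lines of Python. Everything here is an assumption for illustration: the tiny lookup table stands in for whatever third-party service an agent would really query, the second URI is only meant to show that more than one place shares the name, and the function names are hypothetical.

```python
# Illustrative stand-in for a gazetteer or lookup service (not a real API).
GAZETTEER = {
    "philadelphia": [
        "http://dbpedia.org/resource/Philadelphia",
        "http://dbpedia.org/resource/Philadelphia,_Mississippi",
    ],
}


def resolve(tag):
    """Return candidate URIs for a plain-text tag; an empty list means unknown."""
    return GAZETTEER.get(tag.strip().lower(), [])


def is_ambiguous(tag):
    """A tag is ambiguous when it maps to more than one candidate resource."""
    return len(resolve(tag)) > 1


print(resolve("Philadelphia"))
print(is_ambiguous("Philadelphia"))
```

A profile that stores the URI directly skips this resolution step, whereas a profile that stores only the text pushes the guesswork onto every consuming agent.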

Cheers,

Tim