Similar thoughts


harry

Dec 4, 2008, 3:16:35 PM
to Open GUID Discussion
Hi all,

I've been thinking about a similar problem for a while now, but about
more than just identification. A GUID can uniquely identify an
instance of a particular object, but I also wanted to be able to
attach some more information to this GUID. The main reason for this is
disambiguation, although there are other benefits and complications to
doing this. With just a GUID I have no way to disambiguate it from
another GUID unless I have more information.

This is not a fully formed idea so is likely to have some holes.

The method I looked at was to use OIDs, GUIDs and Gellish English:

At the following URL you can register an OID using a GUID:
http://www.oid-info.com/cgi-bin/manage

These two links have more on OIDs:
http://www.iana.org/assignments/enterprise-numbers
http://www.alvestrand.no/objectid/

Gellish English:
http://en.wikipedia.org/wiki/Gellish_English


The following is just one method of obtaining data.

The following OID is an identifier for me and all numbers beneath it
are delegated to me.

1.3.6.1.4.1.32424

Using DNS we can do something like:

dig 1.3.6.1.4.1.32424.example.com

to get a list of origin servers for the information about that OID. If
you wanted, you could pass the information directly using CNAME
records, but I think that would be pushing DNS too far. I have had this
working using PowerDNS. Personally I prefer getting the origin server
from that DNS lookup and then using REST to query the server, e.g.:

http://otherdomain.com/oid/1.3.6.1.4.1.32424/

or something similar, and getting back the information held about that
OID. For the time being I am ignoring who decides who is the origin for
what. Taking this a bit further, you can designate OIDs to have
particular types, e.g.:

let 1.3.6.1.4.1.32424 == PFIX
PFIX.3 == /type/
PFIX.3.4 == /type/people/
PFIX.3.4.6 == /type/people/person/
PFIX.3.4.6.7 == /type/people/person/birth_date
PFIX.3.4.6.8 == /type/people/person/gender

The web page obtained at

http://otherdomain.com/oid/1.3.6.1.4.1.32424/

would have the equivalent of

1.3.6.1.4.1.32424.3.4.6.7 == 1/1/1970
1.3.6.1.4.1.32424.3.4.6.8 == “male”

or in a more RDF style

<http://www.example.com/rdf/people/Harry> -> gender -> male
<http://www.example.com/rdf/people/Harry> -> birth_date ->
1/1/1970
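
A minimal Python sketch of the lookup names and typed values described above. The arc numbers chosen for birth_date and gender, the helper names, and the default zone and host are assumptions for illustration only, not an agreed scheme.

```python
# Illustrative sketch only: arc numbers, hosts and helper names are assumptions.
PFIX = "1.3.6.1.4.1.32424"

# Hypothetical type registry: full OID -> type path.
TYPES = {
    PFIX + ".3": "/type/",
    PFIX + ".3.4": "/type/people/",
    PFIX + ".3.4.6": "/type/people/person/",
    PFIX + ".3.4.6.7": "/type/people/person/birth_date",
    PFIX + ".3.4.6.8": "/type/people/person/gender",
}


def expand(oid):
    """Replace a leading PFIX alias with the full numeric prefix."""
    if oid == "PFIX" or oid.startswith("PFIX."):
        return PFIX + oid[len("PFIX"):]
    return oid


def dns_name(oid, zone="example.com"):
    """Name to query (e.g. with dig) for the origin servers of an OID."""
    return expand(oid) + "." + zone


def rest_url(oid, host="otherdomain.com"):
    """URL of the REST resource describing an OID."""
    return "http://" + host + "/oid/" + expand(oid) + "/"


def to_triples(subject, record):
    """Turn an OID-keyed record into (subject, predicate, value) triples."""
    triples = []
    for oid, value in record.items():
        path = TYPES[expand(oid)]
        # The predicate is the last segment of the type path.
        predicate = path.rstrip("/").rsplit("/", 1)[-1]
        triples.append((subject, predicate, value))
    return triples


harry = "http://www.example.com/rdf/people/Harry"
record = {"PFIX.3.4.6.7": "1/1/1970", "PFIX.3.4.6.8": "male"}

print(dns_name("PFIX"))
print(rest_url("PFIX"))
for triple in to_triples(harry, record):
    print(" -> ".join(triple))
```

The sketch does no network access. It only shows how one OID can yield a DNS name, a REST URL and a set of RDF-style statements.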

Using these values we can compare types between entities to see if the
two entities are the same or close enough to flag for further
processing. I do not expect this to be infallible but knowledge never
has been.
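
One naive way to sketch that comparison in Python is shown here. The OID keys, the thresholds and the minimum number of shared types are all arbitrary assumptions, so treat this as a starting point rather than a matching algorithm.

```python
# Hypothetical OIDs for two person attributes (assumed arc numbers).
BIRTH_DATE = "1.3.6.1.4.1.32424.3.4.6.7"
GENDER = "1.3.6.1.4.1.32424.3.4.6.8"


def agreement(a, b):
    """Return (fraction of shared types with equal values, shared types)."""
    shared = sorted(set(a) & set(b))
    if not shared:
        return 0.0, shared
    matches = sum(1 for key in shared if a[key] == b[key])
    return matches / len(shared), shared


def classify(a, b, same_at=1.0, review_at=0.5, min_shared=2):
    """Label a pair of entities as same, review, different or unknown."""
    score, shared = agreement(a, b)
    if len(shared) < min_shared:
        return "unknown"  # too little overlap to say anything
    if score >= same_at:
        return "same"
    if score >= review_at:
        return "review"  # close enough to flag for further processing
    return "different"


harry_a = {BIRTH_DATE: "1/1/1970", GENDER: "male"}
harry_b = {BIRTH_DATE: "1/1/1970", GENDER: "male"}
other = {BIRTH_DATE: "1/1/1970", GENDER: "female"}

print(classify(harry_a, harry_b))
print(classify(harry_a, other))
```

Exact string equality is the weakest part of this sketch. Real data would need value normalisation (date formats, spelling variants) before comparing.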

The reason for Gellish English is so we have a reference point for
what the OIDs actually mean. For instance, one problem I have always
had with RDF is that predicates that mean the same thing can have
different representations, e.g.:

Harry “Is married to” Jenny
Harry “Has Spouse” Jenny

and this is before we even get into different languages. Using Gellish
English as the reference point, we can then assign OIDs as predicates
(or the Gellish equivalent) and use something like:

Harry “Is married to” Jenny
Harry “Has Spouse” Jenny
Harry “PFIX.3.4.7.8.7” Jenny

These would all mean the same thing, and the OID method is language
agnostic. Gellish English recognizes synonyms, so “Is married to” ==
“Has Spouse”.
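
As a rough Python sketch of the synonym idea: map each phrase to one canonical OID and compare statements after normalising. The synonym table here is hand-written for illustration and is not taken from the Gellish dictionary; the OID is the example one used above.

```python
PFIX = "1.3.6.1.4.1.32424"
MARRIED_TO = PFIX + ".3.4.7.8.7"  # example predicate OID from the post

# Hypothetical synonym table: lower-cased phrase -> canonical predicate OID.
SYNONYMS = {
    "is married to": MARRIED_TO,
    "has spouse": MARRIED_TO,
}


def canonical(predicate):
    """Map a phrase or PFIX-style OID to its canonical predicate OID."""
    if predicate.startswith("PFIX."):
        return PFIX + predicate[len("PFIX"):]
    # Unknown phrases are returned unchanged.
    return SYNONYMS.get(predicate.strip().lower(), predicate)


statements = [
    ("Harry", "Is married to", "Jenny"),
    ("Harry", "Has Spouse", "Jenny"),
    ("Harry", "PFIX.3.4.7.8.7", "Jenny"),
]

# After normalising, duplicates collapse in the set.
normalized = {(s, canonical(p), o) for s, p, o in statements}
print(normalized)
```

With this table the three statements collapse to a single triple. Phrases in other languages could be added to the same table without changing the stored predicate.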

There are other ways you could make this system work or add to it and
there are quite a few combinations of the above that could be used to
do this but these are just a few of the thoughts I had on it.

I've got as far as downloading the Freebase database (211 million
quads):

http://download.freebase.com/datadumps/

Example quads where you can see types in use:
http://download.freebase.com/datadumps/quad-sample.txt

I've converted it to an integer representation so it's faster and was
about to start assigning types to OIDs using Gellish English to see
what sort of problems I ran into. Adding PowerDNS to this would provide
a DNS lookup of the entire Freebase database.
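
The post does not say how the integer conversion was done. One common approach is to intern every distinct string as a small integer, sketched below in Python. The tab-separated four-field layout and the sample lines are assumptions made up for illustration, not copied from the dump.

```python
class Interner:
    """Assign each distinct string a stable integer id."""

    def __init__(self):
        self._ids = {}
        self._values = []

    def intern(self, value):
        ident = self._ids.get(value)
        if ident is None:
            ident = len(self._values)
            self._ids[value] = ident
            self._values.append(value)
        return ident

    def lookup(self, ident):
        return self._values[ident]


def encode_quad(interner, line):
    """Encode one tab-separated quad line as a tuple of four integers."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 4:
        raise ValueError("expected 4 tab-separated fields, got %d" % len(fields))
    return tuple(interner.intern(field) for field in fields)


def decode_quad(interner, quad):
    """Turn an integer quad back into its four strings."""
    return tuple(interner.lookup(ident) for ident in quad)


# Made-up sample lines in an assumed source/property/destination/value layout.
lines = [
    "/m/abc\t/type/object/type\t/people/person\t\n",
    "/m/xyz\t/type/object/type\t/people/person\t\n",
]

interner = Interner()
quads = [encode_quad(interner, line) for line in lines]
print(quads)
print(decode_quad(interner, quads[0]))
```

Repeated strings such as the property and type share one id, which keeps comparisons and joins on integers. At 211 million quads an in-memory dictionary may not be practical, so an on-disk store would likely be needed.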

Thoughts appreciated!

Regards,
Harry

ja...@openguid.net

Dec 5, 2008, 8:00:24 PM
to Open GUID Discussion
Hi Harry, welcome. These are definitely interesting ideas.

I agree with your sentiment of aiding disambiguation. I decided to
keep it simple and use a textual description and some semantic-free
tags to aid in this process. I would be open to adding more fields
for this purpose, though it is a slippery slope to create any sort of
classification scheme. Although the text and tag fields are for human
use, a reasoner could investigate the semantic nature of entities
declared as openguid:identical to help automate disambiguation, e.g.
between people and companies.

Re: OIDs. I considered the use of OIDs at the outset, but was turned
off by the fairly complicated registration authority process. In the
end I don't think the particular format of the identifier matters, as
long as it is constant and open. Being able to use DNS as a
resolution implementation definitely has merits. I believe the OKKAM
project is using a similar model.

As far as using OIDs to encode attribution on entities, it is an
interesting proposal. For Open GUID, I absolutely don't want to start
modeling the details of a particular class such as Person. That said,
it is quite possible there will exist an Open GUID for "birth date",
that is specifically described as a person's date of birth. This
could then be used in RDF statements to provide such data.

Re: Gellish English. I guess it's somewhere between a formal OWL
description of an attribute and a textual comment. Like anything, it
would take adoption of that standard to have published data and
semantic web interpreters able to make sense of it. It would still be
unfairly biased towards English, but so is much of the Internet today.

In any case, I'd be interested to hear what you learn from your
efforts.

Jason
