Re: LLUP Blips, and specifically how to capture and effectively describe keywords


M. David Peterson

Apr 18, 2007, 2:57:00 AM
to llup
Oh, good point!  Done. :D

On 4/18/07, Sylvain Hellegouarch <s...@defuze.org> wrote:

If you folks don't mind I would prefer if we could move this thread to
the LLUP google group so that everyone can participate.

- Sylvain

Russell Miles wrote:
> Absolutely agree. What I'm trying to do is find a balance between the
> specification and intent of a Blip (what its contents should be and why,
> that's the area the spec needs to be completed to cover).
>
> But after that comes the question of implementation and that's what the
> schema is all about. In the case of the early LLUP service implementations,
> and I should stress that it is just in those implementations, we need to
> nail down the representation of a BLIP within the infrastructure of LLUP.
>
> So there's definitely some work to do on the spec but, like a lot of higher
> level specs, it's really going to benefit from an (example) reference
> implementation (plus that helps us tie those loose ends :)) as then the
> delving questions get asked, such as how we should think of and potentially
> express keywords - or at least what guidelines should we give when it comes
> to keywords (strong semantic ontology, loose definition, or somewhere in
> between).
>
> Does this plan sound good? And does anyone have some experience with
> keyword/semantics definitions that we could potentially apply at the spec
> and example ref implementation level?
>
>
> On 4/18/07, M. David Peterson <xmlh...@gmail.com> wrote:
>> Hey Eric,
>>
>> As per the original driving focus of LLUP,
>>
>> LLUP is a notification system that can (but is not required to) use an XML-based
>>> element and attribute structured format that has been predefined to include:
>>
>> While XML is obviously a format that we want to embrace, I believe both
>> JSON and even CSV given LLUP's generally flat file format make a lot of
>> sense to include as part of the specification.  Ultimately we are working
>> with name/value pairs, and name/value pairs are easy to describe in a
>> plethora of formats, so binding ourselves to XML doesn't make a whole lot of
>> sense, though I do believe it should be the driving focus that defines how
>> the other formats are structured.
>>
>> Russ: I guess we haven't really pushed too hard on this portion of the
>> spec.  I think it would be a good idea to put some thought into.  What do
>> you think?
>>
>>
>> On 4/17/07, Eric Larson <er...@ionrock.org> wrote:
>>>
>>> I kind of assumed this was the case, but I figured I'd ask just to be
>>> sure.
>>>
>>> Thanks!
>>>
>>> Eric
>>>
>>> On 4/17/07, Russell Miles <aoha...@gmail.com> wrote:
>>>> Nice question and had thought of that :) What we'd do is place on the
>>>> boundaries the translation of a BLIP into the format of choice for the
>>>> device (SMS, MMS, VMS, VML, XHTML-MP, WML or whatever) and the
>>> retrieval
>>>> method (push, pull, etc.) so although Blips would look like we intend
>>> within
>>>> the core LLUP system, as soon as we hit the devices certain
>>> translation will
>>>> have to occur (unless the device has the ability to do the translation
>>>> necessary on it- such as client-side XSLT).
>>>>
>>>> This is a pretty standard procedure that is adopted with a number of
>>> other
>>>> protocols, including recent attempts to access feeds from mobile
>>> devices
>>>> (such as ATOM and RSS). There is always a toss-up between client side
>>> grunt
>>>> (in order to do the processing) and server side grunt (in getting the
>>>> translation to occur there) and the dependency is complicated a little
>>>> further by transport. However, with Blips as the common starting
>>> currency
>>>> then we at least have something to convert from and then we can deal
>>> with
>>>> the edge cases as they become most prominent (such as SMS/MMS for
>>> local
>>>> notifications at a particular gig).
>>>>
>>>> Does this sound about right? Or is there something in there that I'm
>>>> missing?
>>>>
>>>> Cheers,
>>>> Russ
>>>>
>>>>
>>>> On 4/17/07, Eric Larson < er...@ionrock.org> wrote:
>>>>> Regarding message size, does the sample on
>>>>>
>>>>> http://www.x2x2x.org/projects/wiki/doku.php?id=llup
>>>>>
>>>>> take into account character limits on mobile devices like cell
>>> phones?
>>>>> I am not sure if this is even a valid concern, but just in case
>>>>> enclosing something like the above example in an SMS message would be
>>>>> problematic I am bringing it up. For example if we wanted to be
>>> central
>>>>> in the SXSW party scene via access through LLUP and cell phones,
>>> then
>>>>> this kind of issue could be important.
>>>>>
>>>>> Please feel free to ignore this line of thought if I am
>>>>> misunderstanding where this protocol would need to be implemented in
>>>>> regard to devices.
>>>>>
>>>>> Eric
>>>>>
>>>>> On 4/17/07, Russell Miles <aoha...@gmail.com > wrote:
>>>>>> Hi everyone,
>>>>>>
>>>>>> Me and Sylvain are working on some loose ends by implementing LLUP
>>>> between
>>>>>> our CMS and some of the early front end services. If you've not
>>> heard of
>>>>>> LLUP before, check out the resources at the bottom of this email.
>>>>>>
>>>>>> Anyway, we are attempting to define the format for a LLUP Blip,
>>> which is
>>>> the
>>>>>> discrete packet of information that encapsulates a notification of
>>> a
>>>> larger
>>>>>> resource by its link element. The Blip is nothing more than a
>>> piece of
>>>> XML,
>>>>>> and the idea is that it can then be transported across any number
>>> of
>>>> systems
>>>>>> (transport-agnostic for the techies) from one LLUP node to
>>> another,
>>>>>> eventually reaching the systems or people that have expressed an
>>>> interest in
>>>>>> receiving blips of that type.
>>>>>>
>>>>>> Please take a look at the schema and example blip XML instances
>>> attached
>>>> and
>>>>>> let us know what you think. Have we forgotten anything that you
>>> can
>>>> think
>>>>>> of, or are we assuming something we shouldn't.
>>>>>>
>>>>>> Particularly we need some feedback on how to accurately specify
>>> the
>>>> keywords
>>>>>> that are attached to a blip (probably Uche's domain, but everyone
>>> is
>>>> more
>>>>>> than welcome to weigh in with an opinion :))  I'm no expert on
>>>> approaches to
>>>>>> taxonomies, but what we need ideally is some way to specify
>>> keywords
>>>> that
>>>>>> can be made sense of by anything from a 'dumb' LLUP node (one that
>>> merely
>>>> does
>>>>>> a keyword search could perhaps fall into this category) to a
>>> smart node
>>>>>> (one that does higher level semantic inferencing, or even a human
>>> set of
>>>>>> filters!).
>>>>>>
>>>>>> Any help will be greatly appreciated, and credited of course.
>>>>>>
>>>>>> Thanks!
>>>>>>
>>>>>> Russ
>>>>>>
>>>>>>
>>>>>> Resources:
>>>>>>
>>>>>>
>>>>>> The LLUP spec wiki -
>>>>>> http://www.x2x2x.org/projects/wiki/doku.php
>>>>>> My blog entry on LLUP working in the Email domain -
>>>>>>
>>>> http://www.russmiles.com/home/2006/10/10/does-this-sound-like-llup-or-what.html
>>>>>> A presentation that attempts to show LLUP in action across a set
>>> of
>>>> nodes. -
>>>> http://www.soaranch.com/storage/presentations/llup-architecture.pdf
>>>>
>>>>  >
>>>>
>>>
>>>
>>>
>>
>> --
>> /M:D
>>
>> M. David Peterson
>> http://mdavid.name | http://www.oreillynet.com/pub/au/2354 |
>> http://dev.aol.com/blog/3155
>
> >
>







--
/M:D

M. David Peterson
http://mdavid.name | http://www.oreillynet.com/pub/au/2354 | http://dev.aol.com/blog/3155

M. David Peterson

Apr 18, 2007, 3:09:34 AM
to llup
Sounds perfect to me...  :D

Sylvain Hellegouarch

Apr 18, 2007, 3:39:03 AM
to ll...@googlegroups.com
All,

We have started having a private discussion around LLUP lately and we
have realized we should discuss it here in public instead.

As with any wanna-be protocol, we often start by saying "see, it would be
just that easy", only to realize that nothing about creating a protocol
is easy.

I will re-use a short paragraph from Russ in that discussion that will
remind you what we mean by LLUP and Blip.

> Anyway, we are attempting to define the format for a LLUP Blip, which
is the discrete packet of information that encapsulates a notification
of a larger resource by its link element. The Blip is nothing more than
a piece of XML, and the idea is that it can then be transported across
any number of systems (transport-agnostic for the techies) from one LLUP
node to another, eventually reaching the systems or people that have
expressed an interest in receiving blips of that type.

So defining LLUP is actually defining the blip messaging ecosystem.
Imagine a blip as the dot traveling on a radar and emitted by a ship or
submarine. It notifies anyone listening for it that it's there and alive.

The discussion we have now is around three points:

1. LLUP will be designed to be application protocol agnostic. We deeply
want LLUP to work in as many contexts as we can. This means that even
though XMPP is my favorite approach to develop and test LLUP now we
should ensure that we can integrate a node that will use Amazon SQS, or
pure HTTP, etc. The point of LLUP is to deliver a message, no matter
how. This should be feasible because what matters is that each node
outputs a valid blip message. If a node can read blips from an SQS queue,
it could emit an XMPP message to the next node containing the blip.

2. How to identify and validate the credentials of a blip sender
throughout the network. That's a difficult one I think. Currently any
blip must have a llup:sender element identifying it. But what Russ and
I are uncertain about is the meaning of that identifier.

3. Taxonomy and web semantics. Since each node of the network should be
able to filter as per what a subscriber requested, we need to provide
some meaning to a blip beyond the simple fact of a heartbeat. LLUP comes
with a llup:keyword element but where Russ and I were uncertain again is
whether or not we should constrain the semantics of that element or
allow for more complex inference like the RDF model could provide.
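
To make point 1 concrete, here is a minimal sketch of a node that does not care which transport it reads from or writes to. Everything in it is an assumption for illustration: the namespace URI, the `Node` class, and the in-memory queue and list standing in for something like an SQS queue and an XMPP connection. The only thing taken from the discussion is that a blip is XML carrying a llup:sender element.

```python
import xml.etree.ElementTree as ET
from collections import deque
from typing import Callable, Optional

# Hypothetical namespace URI; the thread does not state the real one.
LLUP_NS = "http://example.org/llup"


def is_valid_blip(payload: str) -> bool:
    """Accept a payload only if it is well-formed XML with a non-empty llup:sender child."""
    try:
        root = ET.fromstring(payload)
    except ET.ParseError:
        return False
    sender = root.find(f"{{{LLUP_NS}}}sender")
    return sender is not None and bool((sender.text or "").strip())


class Node:
    """Relays blips from one transport to another without knowing what they are."""

    def __init__(self, receive: Callable[[], Optional[str]], send: Callable[[str], None]):
        self.receive = receive  # returns the next payload, or None when nothing is pending
        self.send = send        # hands a payload to the next hop

    def pump(self) -> int:
        """Forward every pending valid blip and return how many were forwarded."""
        forwarded = 0
        while (payload := self.receive()) is not None:
            if is_valid_blip(payload):
                self.send(payload)
                forwarded += 1
        return forwarded


# In-memory stand-ins for an inbound queue and an outbound transport.
inbound = deque([
    f'<llup:blip xmlns:llup="{LLUP_NS}"><llup:sender>node-a</llup:sender></llup:blip>',
    "<not-a-blip/>",
])
outbound = []

node = Node(lambda: inbound.popleft() if inbound else None, outbound.append)
forwarded_count = node.pump()
```

Swapping a transport then only means passing different `receive` and `send` callables; the validation and relay logic stay the same.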

In the end what matters to me is that we keep the specification simple
in order to allow for its rapid integration. But if by simplicity we
lock down its potential then we may fail as well.

The balance as Yoda would say.

Thoughts?
- Sylvain

M. David Peterson

Apr 18, 2007, 3:45:45 AM
to ll...@googlegroups.com
Nice summary, Sylvain!  One thing: Unless there are good reasons to hang onto the keyword element, as per the overview spec from a while back,

Keyword(s) [I propose we change 'keywords/keyword' to singular top level Atom 1.0 element 'category']

Anybody feel that we should stick with llup:keyword instead of moving to atom:category?

On 4/18/07, Sylvain Hellegouarch <s...@defuze.org> wrote:

Sylvain Hellegouarch

Apr 18, 2007, 3:53:19 AM
to ll...@googlegroups.com
M. David Peterson wrote:
> Nice summary, Sylvain! One thing: Unless there are good reasons to hang
> onto the keyword element, as per the overview spec from a while back,
>
> Keyword(s) [I propose we change 'keywords/keyword' to singular top level
>> Atom 1.0 element 'category']
>>
>
> Anybody feel that we should stick with llup:keyword instead of moving to
> atom:category?
>

I would say that we definitely need to look over at what the Atom folks
have defined.
However I have seen some people arguing that from a web semantic point
of view the category element was not always the easiest solution either.

I would like to hear from the web semantic pros here, as I am more
than happy to reuse the atom:category element.

- Sylvain

M. David Peterson

Apr 18, 2007, 4:02:58 AM
to ll...@googlegroups.com
On 4/18/07, Sylvain Hellegouarch <s...@defuze.org> wrote:

>

I would say that we definitely need to look over at what the Atom folks
have defined.

I agree.  We need to reuse as much as we can.  Of course, we don't want to force things that don't make sense, so we need to be careful we don't push ourselves into using elements and attributes for the sake of reuse if they don't make at least some sort of obvious connection to the purpose of the element/attribute.

However I have seen some people arguing that from a web semantic point
of view the category element was not always the easiest solution either.

'keyword' does suggest something a little more "here's the key take away of the subject matter", but category doesn't not (double negative, I realize, but not sure how else to say this) make any sense whatsoever. Given the obvious cross-over with blogging, it seems to me that it makes more sense to focus on what existing implementers of the Atom Syndication Format have already worked into their blogging tools than to attempt to redefine something so it sounds a little nicer to the semantic web aficionado's ear.

I would like to hear from the web semantic pros here, as I am more
than happy to reuse the atom:category element.

I wonder if Danny (Ayers) or Uche are listening in?  Guys?
 

- Sylvain

Sylvain Hellegouarch

Apr 18, 2007, 4:09:15 AM
to ll...@googlegroups.com
> it seems to me that it makes more
> sense to focus on what existing implementers of the Atom Syndication Format
> have already worked into their blogging tools, than attempt to redefine
> something so it sounds a little nicer to the semantic web aficionado's ear.

I agree. The context of the problem we had with Russ is what is the
target of llup:keyword?

It seemed to us that the main aim was helping nodes within the network to
decide how to route a blip message. This means that we need a syntax
that is usable in an automated way rather than a human readable keyword.

Hence, I think we both agree on the fact that the llup:keyword element
is more than merely a "fancy label for the end user to read". That's why
indeed we should pay attention to the web semantic crowd (without
falling in an overly complex RDF model of some sort) by providing the
right hooks for those guys to say "yeap if you define that type of
attribute to that element we can make something out of it in an
automated fashion".
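
A minimal sketch of what such automated routing could look like at a single node. It assumes each keyword reaches the node as a machine-readable pair of a vocabulary identifier and a term; that pair, the `Router` class, and the example URLs are all hypothetical and not part of any LLUP schema.

```python
from collections import defaultdict


class Router:
    """Maps (scheme, term) pairs to the subscribers interested in them."""

    def __init__(self):
        self._subscriptions = defaultdict(set)

    def subscribe(self, subscriber, scheme, term):
        # Terms are compared case-insensitively; the scheme is compared exactly.
        self._subscriptions[(scheme, term.lower())].add(subscriber)

    def route(self, keywords):
        """Return every subscriber matching at least one (scheme, term) keyword of a blip."""
        targets = set()
        for scheme, term in keywords:
            # .get() avoids creating empty entries in the defaultdict.
            targets |= self._subscriptions.get((scheme, term.lower()), set())
        return targets


router = Router()
router.subscribe("alice", "http://example.org/music", "Jazz")
router.subscribe("bob", "http://example.org/food", "jazz")

# A blip tagged "jazz" in the music vocabulary reaches alice but not bob.
targets = router.route([("http://example.org/music", "jazz")])
```

The routing decision never looks at a human readable label, only at the machine key, which is the distinction drawn above.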

- Sylvain

M. David Peterson

Apr 18, 2007, 4:21:29 AM
to ll...@googlegroups.com
On 4/18/07, Sylvain Hellegouarch <s...@defuze.org> wrote:

I agree. The context of the problem we had with Russ is what is the
target of llup:keyword?

It seemed to us that the main aim was helping nodes within the network to
decide how to route a blip message. This means that we need a syntax
that is usable in an automated way rather than a human readable keyword.

I couldn't have provided a clearer definition if you asked -- nay, begged me for one ;-) :D

Hence, I think we both agree on the fact that the llup:keyword element
is more than merely a "fancy label for the end user to read".

Yep!

That's why
indeed we should pay attention to the web semantic crowd (without
falling in an overly complex RDF model of some sort) by providing the
right hooks for those guys to say "yeap if you define that type of
attribute to that element we can make something out of it in an
automated fashion".

I completely agree this is the right direction, and with the scheme, label, and term attributes of atom:category I think we gain everything you get with a typical RPV triple. The URI value of scheme (in this comparison, the resource portion of RPV) is obviously more of a generic "here is the generalized categorization model this term applies to" than it is the physical location of that resource on the web (i.e. the same word can obviously have more than one meaning, so the scheme would be the resource the machine used to map the term to its proper indexed location, whatever that might be). Of course, label can be provided as the more human readable version of what the term actually represents, so if that happens to be "keyword" then so be it: We've obviously killed three birds with one RPV-esque element/attribute combo provided so kindly by the good folks in the land of Atom.
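
For reference, a small sketch of pulling the three atom:category attributes out of a blip. The Atom namespace and the attribute names (term required; scheme and label optional) come from RFC 4287; the `blip` wrapper element, the `Category` tuple, and the example.org scheme are assumptions made up for this example.

```python
import xml.etree.ElementTree as ET
from typing import NamedTuple, Optional

ATOM_NS = "http://www.w3.org/2005/Atom"


class Category(NamedTuple):
    term: str               # machine-usable value
    scheme: Optional[str]   # IRI naming the categorization scheme, if given
    label: Optional[str]    # human readable text, if given


def parse_categories(xml_text: str) -> list:
    """Collect every atom:category in the document, skipping ones without a term."""
    root = ET.fromstring(xml_text)
    found = []
    for element in root.iter(f"{{{ATOM_NS}}}category"):
        term = element.get("term")
        if term is None:  # term is the only required attribute
            continue
        found.append(Category(term, element.get("scheme"), element.get("label")))
    return found


# Hypothetical blip wrapper; only the atom:category elements follow a real spec.
sample = f"""<blip xmlns:atom="{ATOM_NS}">
  <atom:category term="jazz" scheme="http://example.org/music" label="Jazz music"/>
  <atom:category term="jazz"/>
</blip>"""

categories = parse_categories(sample)
```

The second element shows the "dumb" case: a bare term with no scheme still parses, so a node that only does keyword search keeps working.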

Thoughts?

- Sylvain

http://www.oreillynet.com/pub/au/2354 | http://dev.aol.com/blog/3155

M. David Peterson

Apr 18, 2007, 4:47:49 AM
to ll...@googlegroups.com
One additional thought: Something that occurred to me is that the same keyword/category can not only have more than one meaning/context, but more than one language e.g. Spanish, Italian, Portuguese (just to name a group of languages that have a similar foundation, but a completely different end result).  In cases such as this, the easiest way I can think of to differentiate these is to use two or more category elements, each with the same term but with a different scheme applied to each one to represent each scheme the term applies to.  So for example,

<category term="hola" scheme="http://example.org/espanol" />
<category term="hola" scheme="http://example.org/greeting" />

In the cases where @label was applied, then we could use something similar to your typical order of precedence rules in which the final @label became the human readable label that should be applied to the term.
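
One possible reading of that precedence rule, sketched in code: walk the categories in document order and let a later @label overwrite an earlier one for the same term. The tuple layout and the label texts are assumptions for illustration only.

```python
def resolve_labels(categories):
    """Return {term: label}, where the last label seen for a term wins.

    `categories` is a list of (term, scheme, label_or_None) in document order.
    """
    labels = {}
    for term, _scheme, label in categories:
        if label is not None:
            labels[term] = label  # a later label replaces an earlier one
    return labels


categories = [
    ("hola", "http://example.org/espanol", "Hola (Spanish)"),
    ("hola", "http://example.org/greeting", "Greeting"),
    ("adios", "http://example.org/espanol", None),  # no label supplied
]

labels = resolve_labels(categories)
```

A term that never receives a label simply has no entry, so a consumer can fall back to displaying the raw term.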

Thoughts?

On 4/18/07, M. David Peterson <xmlh...@gmail.com> wrote:

- Sylvain


--
/M:D

M. David Peterson

Sylvain Hellegouarch

Apr 18, 2007, 4:59:40 AM
to ll...@googlegroups.com
M. David Peterson wrote:
> One additional thought: Something that occurred to me is that the same
> keyword/category can not only have more than one meaning/context, but more
> than one language e.g. Spanish, Italian, Portuguese (just to name a group of
> languages that have a similar foundation, but a completely different end
> result). In cases such as this, the easiest way I can think of to
> differentiate these is to use two or more category elements, each with the
> same term but with a different scheme applied to each one to represent each
> scheme the term applies to. So for example,
>
> <category term="hola" scheme="http://example.org/espanol" />
> <category term="hola" scheme="http://example.org/greeting" />
>
> In the cases where @label was applied, then we could use something similar
> to your typical order of precedence rules in which the final @label became
> the human readable label that should be applied to the term.
>

Well as far as I understand it, this is the purpose of the scheme+term
combination but I will admit that RFC 4287 doesn't make it clear. So yes
the scheme would be the key in that example.

- Sylvain

M. David Peterson

Apr 18, 2007, 5:04:39 AM
to ll...@googlegroups.com
Yeah, the RFC obviously leaves things "open to interpretation", doesn't it. ;-)

On 4/18/07, Sylvain Hellegouarch < s...@defuze.org> wrote:

Uche Ogbuji

Apr 29, 2007, 11:25:42 PM
to ll...@googlegroups.com

Sorry it's taken me so long to chip in here. Every time I think it
can't get more hectic at work, work surprises me.

So I think in the general idea, Mark, you get the direction I was
heading. I'm not crazy about the specific example because I think
language might be something we specify using xml:lang, but definitely
scheme is the key.
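
A quick sketch of the xml:lang alternative, assuming the attribute is placed directly on each category element (Atom permits xml:lang on its elements). The `blip` wrapper and the example.org scheme are made up, and inheritance of xml:lang from parent elements is deliberately not handled here.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def terms_by_language(xml_text):
    """Group category terms by their own xml:lang; 'und' marks an unspecified language."""
    root = ET.fromstring(xml_text)
    grouped = {}
    for element in root.iter("category"):
        language = element.get(XML_LANG, "und")
        grouped.setdefault(language, []).append(element.get("term"))
    return grouped


sample = """<blip>
  <category term="hola" xml:lang="es" scheme="http://example.org/greeting"/>
  <category term="hello" xml:lang="en" scheme="http://example.org/greeting"/>
  <category term="portishead" scheme="http://example.org/greeting"/>
</blip>"""

grouped = terms_by_language(sample)
```

With language carried by xml:lang, the scheme is free to describe only the vocabulary the term belongs to.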

Some Weblog communities have started creating category schemes, in
effect, Technorati being a good example, with its tags system. We could
have:

<category term="portishead" scheme="http://technorati.com/tag/" />

This is great, except that the URL helps us narrow down the "universe"
for that tag, but doesn't really help with what I call semantic
transparency, which is basically auto-discovery of useful details using
the scheme URL. This is where I think personplacething.info comes in.
I'd like to host a tagging metadata site that includes a bit more
machine-readable data. In effect, RDF, but I might surprise you folks
by not advocating actual RDF/XML. I've long thought that nasty format
is a huge obstacle to adoption of the very sensible ideas that gave rise
to RDF. These days I prefer XML designed to be "harvested" for
metadata. My ideas are along the lines of:

http://www-128.ibm.com/developerworks/xml/library/x-think32.html

The basic idea: use proven syntactic tools such as XSLT and schemata to
express the semantic layer, so the Webmaster can use languages he has
already had to learn, and not have to adopt yet another. Let those of
us who care enough about semantics (and are more sophisticated about
syntax, anyway) accept the burden of semantic transparency. This is the
lesson RDF folks miss when trying to force-feed everyone RDF/XML. The
result: misguided ripostes such as Doctorow's "metacrap".

Anyway, I think we should 100% accept the Technorati scheme, but if someone
decided instead to use:

<category term="music/portishead" scheme="http://personplacething.info/tag/" />


They should gain some benefit. For one thing, look at the support for
context, which should help, e.g. disambiguate the music group from the
town. I would like ppt.info to have a smart system for rolling up unfiled
tags into contextualized tags using a mix of editorial control and
community aggregation. In general, I'd like the site to be a place
where folksonomy meets ontology. It should support sloppiness, while
having a mechanism to build rigor. I could always decide to buck the
system by creating

http://personplacething.info/user/uche/portishead


To mean a character in my science fiction novel I just named
"portishead". If my novel became popular and my instance was used
widely enough in tagging it would float into a more official category.

This is just preliminary thinking about ppt.info, and as you can see,
there is a lot more thinking to be done, but I hope you get the gist of
the potential. Using ppt.info tags, you could tag with

<category term="music/portishead" scheme="http://personplacething.info/tag/" />


And it would automatically also match the category

<category term="music/trip-hop" scheme="http://personplacething.info/tag/" />


This could be signalled by some instruction at

http://personplacething.info/tag/music/portishead


In general a lot opens up once you actually put the scheme URL to use.
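
A rough sketch of how a node might apply that kind of roll-up once it has the data. The `ROLLUPS` table is a hypothetical stand-in for whatever machine-readable instruction the scheme URL would eventually serve; no such format exists yet, and the "music" parent entry is invented for the example.

```python
# Hypothetical roll-up data: each term maps to the broader terms it implies.
ROLLUPS = {
    "music/portishead": {"music/trip-hop"},
    "music/trip-hop": {"music"},
}


def expand(term, rollups):
    """Return the term plus every broader term reachable through the roll-up table."""
    seen = {term}
    pending = [term]
    while pending:
        current = pending.pop()
        for broader in rollups.get(current, ()):
            if broader not in seen:  # guards against cycles in the table
                seen.add(broader)
                pending.append(broader)
    return seen


def matches(blip_term, subscribed_term, rollups=ROLLUPS):
    """True if a blip tagged with blip_term should reach a subscriber of subscribed_term."""
    return subscribed_term in expand(blip_term, rollups)


portishead_hits_trip_hop = matches("music/portishead", "music/trip-hop")
trip_hop_hits_portishead = matches("music/trip-hop", "music/portishead")
```

The match is one-directional: a blip about the band reaches trip-hop subscribers, but a generic trip-hop blip does not reach people who subscribed only to the band.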

I would definitely consider category to be for stuff that doesn't have
other good means for expression. E.g. I'd prefer geocoding was done
using a more direct means.

--
Uche Ogbuji http://uche.ogbuji.net
Linked-in profile: http://www.linkedin.com/in/ucheogbuji
Articles: http://uche.ogbuji.net/tech/publications/

M. David Peterson

Apr 30, 2007, 5:14:33 AM
to ll...@googlegroups.com
Hey Uche,

This is *FANTASTIC*! I agree 100% with what you are suggesting in
regards to the direction we take ppt.info.

re: using xml:lang for the language element > doh! :D Wish I would
have thought about that one for two seconds longer before speaking,
but none-the-less, I'm glad to see our check and balance system
working the way it should. :D

I'll follow-up in the thread regarding getting you access to the
ppt.info machine next.

TO ALL: http://code.google.com/p/personplacething-info/ is where we
will be hosting the code and
http://groups.google.com/group/personplacething for the group
discussion related specifically to this project and the underlying
code base.

M. David Peterson

Apr 30, 2007, 5:15:58 AM
to ll...@googlegroups.com
BTW > I tried to match the group name to the project name, but Google
wouldn't allow me to use the word "info" as part of the group name.

M. David Peterson

Apr 30, 2007, 12:46:27 PM
to ll...@googlegroups.com
Quick Update: As per Uche's suggestion, I killed the personplacething group.  Will be better to keep the focus here on LLUP rather than try to keep track of multiple groups.  The code base will remain in the same place, however ( http://code.google.com/p/personplacething-info/ ).

M. David Peterson

May 1, 2007, 7:17:46 AM
to ll...@googlegroups.com
via Danny Ayers @ http://dannyayers.com/2007/05/01/astoria--microsoft-approaches

For a while now they've been developing an entity-relationship data model (as I've remarked on here more than once - this stuff grew from the remnants of WinFS) plus query language(s). On the other hand, the web's a little hard to ignore. Join the two together, you have (codename) Astoria :

Astoria exposes "data services" that enable applications to access and manipulate data over regular HTTP connections, using URIs to identify pieces of information within the data service, and simple payload formats such as XML and JSON to represent the data exchanged between the client and the server.

What they've got is in many respects isomorphic to some of the Semantic Web technologies we all know and love. It's closest to the scruffy RDF as simple data model approach, I doubt you'll find the word ontology mentioned anywhere. This is all designed to mesh well with Microsoft's existing programming and data tools, and points of direct contact between Astoria and the W3C semweb specs are a little limited. It would be easy to see this as MS doing semweb in their own proprietary fashion.

Danny goes on to describe how he feels this is a lot better than what might appear at first sight.

Interesting stuff!  Seems there could very easily be some cross-over here in regards to the current topic at hand in this thread.


On 4/29/07, Uche Ogbuji <uc...@ogbuji.net> wrote: