Re: [DataPortability-Public] DataPortability and SIOC


John Breslin

Jan 7, 2008, 5:02:51 AM
to dataportabi...@googlegroups.com, sioc...@googlegroups.com
Hi Chris -

Thanks for creating this discussion, I hope it will be useful.

I've been reading up on YADIS [1] / XRDS [2] / WRFS [3] to make sure
that I can get an idea of their potential, as I am not au fait with them
I'm sad to say.

So as I understand, you use YADIS to discover an identity, and get an
XRDS document back indicating which identities they prefer to use and
what services those identities are on. Then you can use WRFS to find out
what containers those identities hold on those services.
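
As a rough illustration of that discovery step, here is a minimal Python sketch that reads an XRDS document and lists its services in preference order. It assumes the XRDS has already been fetched; the sample document, its URIs, and the WRFS service type string are invented for the example and do not come from any spec.

```python
import xml.etree.ElementTree as ET

# Namespace used by XRD elements inside an XRDS document.
XRD = "{xri://$xrd*($v*2.0)}"

# Illustrative document: the URIs and the WRFS type string are invented.
SAMPLE_XRDS = """\
<xrds:XRDS xmlns:xrds="xri://$xrds" xmlns="xri://$xrd*($v*2.0)">
  <XRD>
    <Service priority="10">
      <Type>http://example.org/wrfs/1.0/index</Type>
      <URI>http://wnode.example.org/alice</URI>
    </Service>
    <Service priority="0">
      <Type>http://specs.openid.net/auth/2.0/signon</Type>
      <URI>http://openid.example.org/server</URI>
    </Service>
  </XRD>
</xrds:XRDS>
"""


def list_services(xrds_text):
    """Return (priority, types, uris) tuples, most preferred first.

    Lower priority numbers are preferred; services with no priority
    attribute sort last.
    """
    root = ET.fromstring(xrds_text)
    services = []
    for svc in root.iter(XRD + "Service"):
        prio = svc.get("priority")
        types = [el.text.strip() for el in svc
                 if el.tag == XRD + "Type" and el.text]
        uris = [el.text.strip() for el in svc
                if el.tag == XRD + "URI" and el.text]
        services.append((int(prio) if prio is not None else None, types, uris))
    services.sort(key=lambda s: (s[0] is None, s[0] or 0))
    return services


for priority, types, uris in list_services(SAMPLE_XRDS):
    print(priority, types, uris)
```

A client would then pick the service whose type it understands and follow the URI to ask about containers.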

SIOC [4] is one representation method for describing the content of the
containers and the items, and the structure / connections therein.

For example, on the DP wiki there's an illustration [5] by Josh
Patterson showing how a WRFS prototype workspace could use YADIS+WRFS to
get URIs for an identity's associated data containers so that
applications can access the data in those containers represented in a
format like SIOC or FOAF (for containers of posted items or people
respectively). As Alexandre Passant mentioned earlier, FOAF and SIOC
are being used in his application to export linked data from Flickr
accounts [6]. SIOC data is also being produced from various personal
blogging platforms and microblogging accounts [7].

But SIOC isn't just for personal containers of data. I think another
issue for the DataPortability workgroup is whether methods can be used
to port not just personal sets of data but communities of data. SIOC
was initially intended to provide a way to describe the content from
online communities, like mailing lists, message boards, etc. It was
soon used for people's blogs (since the post+reply structure is very
similar to community discussions; it's just that the first poster is
usually one person in blogs), and more recently for other personal sets
of Web 2.0-type content items. But if I run a community site, and I
decide I want to port my group from one place to another, SIOC can be
used to fully describe the structure (and content if combined with other
vocabularies) of most communities. We have various exporters in place;
importers are the next step (a first demonstrator for WordPress has been
produced [8]).
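
To make the post+reply structure concrete, here is a minimal Python sketch of what a tiny exporter might emit. It writes Turtle by hand using the SIOC terms sioc:Forum, sioc:Post, sioc:has_container, sioc:has_creator and sioc:reply_of; the function name, the input dict layout, and all example.org URIs are assumptions made for illustration and are not taken from any existing exporter.

```python
PREFIXES = (
    "@prefix sioc: <http://rdfs.org/sioc/ns#> .\n"
    "@prefix dcterms: <http://purl.org/dc/terms/> .\n"
)


def turtle_string(text):
    """Quote a Python string as a Turtle literal (minimal escaping)."""
    escaped = (text.replace("\\", "\\\\")
                   .replace('"', '\\"')
                   .replace("\n", "\\n"))
    return '"' + escaped + '"'


def thread_to_turtle(forum_uri, posts):
    """Describe a forum and its posts.

    posts is a list of dicts with keys: uri, author, title and an
    optional reply_to (the URI of the post being answered).
    """
    lines = [PREFIXES, f"<{forum_uri}> a sioc:Forum ."]
    for post in posts:
        lines.append(f"<{post['uri']}> a sioc:Post ;")
        lines.append(f"    sioc:has_container <{forum_uri}> ;")
        lines.append(f"    sioc:has_creator <{post['author']}> ;")
        if post.get("reply_to"):
            lines.append(f"    sioc:reply_of <{post['reply_to']}> ;")
        lines.append(f"    dcterms:title {turtle_string(post['title'])} .")
    return "\n".join(lines)


example_posts = [
    {"uri": "http://example.org/forum/post/1",
     "author": "http://example.org/user/chris",
     "title": "DataPortability and SIOC"},
    {"uri": "http://example.org/forum/post/2",
     "author": "http://example.org/user/john",
     "title": "Re: DataPortability and SIOC",
     "reply_to": "http://example.org/forum/post/1"},
]
print(thread_to_turtle("http://example.org/forum", example_posts))
```

An importer on the destination site would read the same triples back and recreate the forum, the posts, and the reply links.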

What do others think? I know SIOC is just one representation format, of
course; microformats can be used too, and Eran Globen has shown how
SIOC-type structures can be represented using microformats [9].

Thanks,

John.
--
[1] http://yadis.org/
[2] http://en.wikipedia.org/wiki/Yadis#Yadis_capability_document
[3] http://cowbell.floe.tv/WRFS_11_20_2007.html
[4] http://sioc-project.org
[5] http://dataportability.pbwiki.com/WRFS%20Prototype%20Workspace
[6] http://apassant.net/blog/2007/12/18/rdf-export-of-flickr-profiles-with-foaf-and-sioc/
[7] http://rdfs.org/sioc/applications
[8] http://wiki.sioc-project.org/w/SIOC_Import_Plugin
[9] http://web.archive.org/web/20061103031603/http://hellonline.com/blog/?p=91

-----Original Message-----
From: dataportabi...@googlegroups.com on behalf of Chris Saad
Sent: Fri 04/01/2008 23:12
To: DataPortability.Public.General
Subject: [DataPortability-Public] DataPortability and SIOC

Hi John,

I thought we should start a new thread over here to discuss SIOC and
how it fits into the broader picture of DataPortability.

Let's start with the very basic principles.

Let's assume that using the DataPortability reference design (for
example) I am able to find a user's YADIS/XRDS information, and from
there identify the services/data containers they use, and from there
use OAuth to log into those services, and from there use some sort of
standardized WRFS query API to retrieve personal data from that user's
account.

What does SIOC give me in this picture that I don't have already?

Perhaps there is some overlap (and that's ok - maybe some of us need
to adjust the picture).

This is the goal of DP - to work out how this all fits together to
form a complete picture and write down the reference design - so this
is a very worthy discussion.

Look forward to your input.

Chris

JP

Jan 17, 2008, 1:38:34 PM
to SIOC-Dev


On Jan 7, 5:02 am, John Breslin <john.bres...@deri.org> wrote:
> Hi Chris -
>
> Thanks for creating this discussion, I hope it will be useful.
>
> I've been reading up on YADIS [1] / XRDS [2] / WRFS [3] to make sure
> that I can get an idea of their potential, as I am not au fait with them
> I'm sad to say.
>

That's cool. I totally respect that. Hopefully, over time we'll find
more common ground. I'm working a lot with Danny Ayers in the official
"WRFS" group [1], which has now been spun off as its own tech, just like
OAuth and OpenID. If you would like to join and express these
sentiments, then by all means, join up! It can only make these
communities stronger in the long run.

> So as I understand, you use YADIS to discover an identity, and get an
> XRDS document back indicating which identities they prefer to use and
> what services those identities are on. Then you can use WRFS to find out
> what containers those identities hold on those services.
>

WRFS is / will be really just a standardized protocol to manage
access to, updating of, and discovery of RDF, legacy data, or even
SIOC data. I've read many semantic web books, and am actually a big
fan (I could see us using DAML for describing legacy APIs like Flickr's,
for instance). I think we need to let the dust settle, and just find
more common ground. I read the SIOC documents well before I initially
wrote [3].

> SIOC [4] is one representation method for describing the content of the
> containers and the items, and the structure / connections therein.
>

Yup, and WRFS could be a run-time transport layer for a startup to use
that aggregated "message board post" data in, say, a desktop
application or a Google Android device, or it could even serve as a
logical file system for a mobile device. SIOC is actually one of the
things that inspired my initial concept draft.

> For example, on the DP wiki there's an illustration [5] by Josh
> Patterson showing how a WRFS prototype workspace could use YADIS+WRFS to
> get URIs for an identity's associated data containers so that
> applications can access the data in those containers represented in a
> format like SIOC or FOAF (for containers of posted items or people
> respectively). As Alexandre Passant mentioned earlier, FOAF and SIOC
> are being used in his application to export linked data from Flickr
> accounts [6]. SIOC data is also being produced from various personal
> blogging platforms and microblogging accounts [7].
>

There again, I see SIOC as a tremendously useful technology. I also
view WRFS as a sort of TCP entity to SIOC's HTTP entity (this is a
loose metaphor) --- in that WRFS could be the "transport" layer of
data aggregation on the web, and something like RDF or SIOC could be
the linking layer, or the presentation layer. One of the main things I
focused on was working towards a transport model of web data, and just
adding "glue" in places that might help it along.

> But SIOC isn't just for personal containers of data. I think another
> issue for the DataPortability workgroup is whether methods can be used
> to port not just personal sets of data but communities of data. SIOC
> was initially intended to provide a way to describe the content from
> online communities, like mailing lists, message boards, etc. It was
> soon used for people's blogs (since the post+reply structure is very
> similar to community discussions; it's just that the first poster is
> usually one person in blogs), and more recently for other personal sets
> of Web 2.0-type content items. But if I run a community site, and I
> decide I want to port my group from one place to another, SIOC can be
> used to fully describe the structure (and content if combined with other
> vocabularies) of most communities. We have various exporters in place;
> importers are the next step (a first demonstrator for WordPress has been
> produced [8]).
>

Yes, I agree again. And just as TCP isn't that concerned with whether
it's transporting an HTTP GET request or an HTTP 200 OK response, WRFS
can transport SIOC, etc. Or at least it could; we're actually just
building a prototype now with minimalist structure, to show off a
concept demo. No specs, no hard lines in the sand, just working code.

> What do others think? I know SIOC is just one representation format of
> course, microformats can be used and Eran Globen has shown how SIOC-type
> structures can be represented using microformats [9] as well.
>
> Thanks,
>
> John.
> --
> [1]http://yadis.org/
> [2]http://en.wikipedia.org/wiki/Yadis#Yadis_capability_document
> [3]http://cowbell.floe.tv/WRFS_11_20_2007.html
> [4]http://sioc-project.org
> [5]http://dataportability.pbwiki.com/WRFS%20Prototype%20Workspace
> [6]http://apassant.net/blog/2007/12/18/rdf-export-of-flickr-profiles-wit...
> [9]http://web.archive.org/web/20061103031603/http://hellonline.com/blog/...
>
> -----Original Message-----
> From: dataportabi...@googlegroups.com on behalf of Chris Saad
> Sent: Fri 04/01/2008 23:12
> To: DataPortability.Public.General
> Subject: [DataPortability-Public] DataPortability and SIOC
>
> Hi John,
>
> I thought we should start a new thread over here to discuss SIOC and
> how it fits into the broader picture of DataPortability.
>
> Let's start with the very basic principles.
>
> Let's assume that using the DataPortability reference design (for
> example) I am able to find a user's YADIS/XRDS information, and from
> there identify the services/data containers they use, and from there
> use OAuth to log into those services, and from there use some sort of
> standardized WRFS query API to retrieve personal data from that user's
> account.
>
> What does SIOC give me in this picture that I don't have already?
>
> Perhaps there is some overlap (and that's ok - maybe some of us need
> to adjust the picture).
>
> This is the goal of DP - to work out how this all fits together to
> form a complete picture and write down the reference design - so this
> is a very worthy discussion.
>
> Look forward to your input.
>
> Chris

I think the major thrust of WRFS is simply this: there is a lot of
semantic web tech out there. Startups are not yet using it, and I
think the general issue is complexity; HTML was a basic markup
language that most code-monkeys could jump on pretty quickly. However,
the basics of RDF involve the "black magics" of relational algebra. To
you and me, that's not too bad; we had that before we had basic comp sci
classes, or grad school. To "Average Joe Hacker", inferencing really
is "black magic".

In the same way that TCP made it easier for HTTP, and we don't go
re-implementing TCP/IP stacks that much anymore, I'd like to see WRFS
become a transport, discovery, and aggregation mechanic for web data.
RDF has a role to play, as well as other interesting technologies.

I think once we take some of the complexities, and put them "under the
hood" of a tech "stack" like WRFS (or whatever is to come), and make
it simple and turnkey with pre-made open source libraries, so a garage
startup can "get on The Graph" [2] very simply, and their data is (or
can be) pushed into The Graph as relational triples, then the
semantic web vision can grow and become that ultra killer app it has
the potential to be. But until we get the necessary critical mass to
generate those higher level effects of "more than the sum of its
parts", the semantic web will be critically panned as "vapourware". I
would be very humbled if WRFS could be a small part of bringing
the semantic web about by making it accessible to the average hacker.
This may or may not work out the way I've planned, but that's why we
play the game.

Josh Patterson
floe.tv

[1] http://groups.google.com/group/wrfs
[2] http://dig.csail.mit.edu/breadcrumbs/node/215

John Breslin

Jan 25, 2008, 11:50:18 AM
to sioc...@googlegroups.com
Hi Josh -

Great reply, thanks!

> I think the major thrust of WRFS is simply this: there is a lot of
> semantic web tech out there. Startups are not yet using it, and I

...


> classes, or grad school. To "Average Joe Hacker", inferencing really
> is "black magic".

Unfortunately as a non-practising magician, I'm not too well up on my
inferencing spells :) SWEO [1] is currently nearing the end of its
charter, and has focused a lot on practical use cases, but we also need
to get more of the "Semantic Web for Dummies" stuff out there [1] for
average Joe.

> In the same way that TCP made it easier for HTTP, and we don't go
> re-implementing TCP/IP stacks that much anymore, I'd like to see WRFS
> become a transport, discovery, and aggregation mechanic for web data.
> RDF has a role to play, as well as other interesting technologies.

Do you think SPARQL should or could be part of a WRFS transport layer?

I also wanted to ask if there is something that you think we may be
missing in SIOC that could be added to help with the WRFS effort. I
know you mentioned in [2] that SIOC didn't solve some issues and some
parts of the (RDF) puzzle were loosely joined - what I think you meant
is that WRFS was created to bridge some of the obvious gaps? But if we
are missing something obvious in SIOC, let us know!

Thanks again, I have found your contributions and example scenarios very
illustrative and practical so far - great work.

John.
--
[1] web.media.mit.edu/~stefanm/commonsense/SemanticWeb.ppt

XML Customized tags, like:
<dog>Nena</dog>
+ RDF Relations, in triples, like:
(Nena) (is_dog_of) (Kimiko/Stefan)
+ Ontologies Hierarchies of concepts, like
mammal -> canine -> Cotton de Tulear -> Nena
+ Inference rules Like:
If (person) (owns) (dog), then (person) (cares_for) (dog)

= Semantic Web!
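
For anyone who would rather see the "spell" written out, here is a minimal Python sketch of the last two ingredients: an ontology hierarchy and one inference rule, applied repeatedly until nothing new can be derived. It is a toy forward-chaining loop over plain tuples, not how a real RDF reasoner works; the ("Stefan", "owns", "Nena") starting fact and all the names in the code are assumptions made for the example.

```python
# Ontology: each concept points to its parent concept.
SUBCLASS_OF = {
    "Cotton de Tulear": "canine",
    "canine": "mammal",
}

# Inference rules: if (s, premise, o) holds then (s, conclusion, o) holds.
RULES = {"owns": "cares_for"}

# Starting facts, written as (subject, predicate, object) triples.
FACTS = {
    ("Nena", "type", "Cotton de Tulear"),
    ("Stefan", "owns", "Nena"),
}


def infer(facts, subclass_of, rules):
    """Return facts plus everything derivable from the hierarchy and rules."""
    facts = set(facts)
    while True:
        new = set()
        for s, p, o in facts:
            # Something of a given type also has the parent type.
            if p == "type" and o in subclass_of:
                new.add((s, "type", subclass_of[o]))
            # Apply the simple one-premise rules.
            if p in rules:
                new.add((s, rules[p], o))
        if new <= facts:  # nothing new was derived, so we are done
            return facts
        facts |= new


for triple in sorted(infer(FACTS, SUBCLASS_OF, RULES)):
    print(triple)
```

The derived facts include that Nena is a mammal and that Stefan cares for Nena, neither of which was stated directly.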

[2] http://groups.google.com/group/dataportability-public/browse_frm/thread/efa8071fd5ad3b65?

Josh Patterson

Jan 25, 2008, 3:21:50 PM
to SIOC-Dev


On Jan 25, 11:50 am, John Breslin <john.bres...@deri.org> wrote:
> Hi Josh -
>
> Great reply, thanks!
>
> > I think the major thrust of WRFS is simply this: there is a lot of
> > semantic web tech out there. Startups are not yet using it, and I
> ...
> > classes, or grad school. To "Average Joe Hacker", inferencing really
> > is "black magic".
>
> Unfortunately as a non-practising magician, I'm not too well up on my
> inferencing spells :) SWEO [1] is currently nearing the end of its
> charter, and has focused a lot on practical use cases, but we also need
> to get more of the "Semantic Web for Dummies" stuff out there [1] for
> average Joe.
>

I do believe that one day the semantic web will go mainstream; I just
think it will go mainstream in the way that TCP is mainstream, or
HTTP is mainstream --- under the hood, employed by the masses via
libraries, but enjoyed by all.

And really, that's the basis of WRFS --- get data to people how they
want it, make the web's data come to the user, as opposed to making
the user jump through { Facebook's, MySpace's, Google's } hoops and
terms to get their data.

> > In the same way that TCP made it easier for HTTP, and we don't go
> > re-implementing TCP/IP stacks that much anymore, I'd like to see WRFS
> > become a transport, discovery, and aggregation mechanic for web data.
> > RDF has a role to play, as well as other interesting technologies.
>
> Do you think SPARQL should or could be part of a WRFS transport layer?
>

Definitely. Zef and I have talked about that. With a traditional
database system, you take a query string, parse that into a query
tree, and then execute that against the indexes, starting at the bottom
left for the joins. Why can't we do something like that with the web as
a whole (completely distributed --- I appreciate some of these single
store RDF solutions, but I'm more interested in the internet
abstraction as a single disk / db hybrid)? Two things seem to
continuously pop up on the internet:

1. Data Stores { myspace, facebook, flickr, hotmail }
2. Index of Data Stores { google, digg, slashdot }

A major part of WRFS is the "modeled" wNode entity (the wNode being
analogous to the traditional FS index, the inode, or a traditional DB
index): a personal index service that is a RESTful web resource and
allows access and updating, with OAuth for security. Sort of like FOAF,
but with a RESTful interface to allow 3rd parties to update it (say Flickr
PUTs an entry for you that says "John has data with flickr", so that a
third party app could later find and use your Flickr images for you).
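
To make the wNode idea a bit more tangible, here is a minimal in-memory Python sketch. It is only a stand-in: the class name, method names, and permission strings are hypothetical, the opaque tokens merely play the role that OAuth would play, and nothing here is taken from the WRFS prototype or does any HTTP.

```python
class WNode:
    """Toy personal index: which services hold which data containers."""

    def __init__(self, owner):
        self.owner = owner
        self._grants = {}   # token -> set of permissions ("read", "write")
        self._entries = {}  # (service, container URI) -> media type

    def grant(self, token, *permissions):
        """Owner-side call: let the holder of token read and/or write."""
        self._grants.setdefault(token, set()).update(permissions)

    def _require(self, token, permission):
        if permission not in self._grants.get(token, set()):
            raise PermissionError(f"token lacks {permission!r} permission")

    def put_entry(self, token, service, container_uri, media_type):
        """What a service would do with a PUT: record 'owner has data here'."""
        self._require(token, "write")
        self._entries[(service, container_uri)] = media_type

    def find(self, token, media_type):
        """What a third-party app would do: look up containers by type."""
        self._require(token, "read")
        return sorted(uri for (_svc, uri), kind in self._entries.items()
                      if kind == media_type)


node = WNode("john")
node.grant("flickr-token", "write")
node.grant("app-token", "read")
node.put_entry("flickr-token", "flickr",
               "http://flickr.example.org/photos/john", "image")
print(node.find("app-token", "image"))
```

In a real deployment the two methods would sit behind PUT and GET on a web resource, and the token check would be replaced by OAuth signature verification.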

That brings us back to SPARQL: what if we took a SPARQL query, parsed
it, and then queried the wNode web resource for Data Containers of
type image (backed up by a media type ontology), along with some other
parameters? It could be queried with multiple OpenIDs (and return the
results as FOAF? RDF? XRDS?), which would hit each wNode resource
pointed at, respectively, negotiate the permissions automatically to
access the resources, and then begin aggregating the results and
passing them back up to the application layer.
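
A minimal Python sketch of that fan-out, under heavy assumptions: the SPARQL parsing step is skipped entirely and the query is reduced to a media type plus a list of OpenIDs, each wNode is just a list of dicts held in memory, and "negotiating permissions" is reduced to checking a readers set. All names, URLs, and the data layout are hypothetical.

```python
# Each OpenID maps to that person's wNode entries (held in memory here).
WNODES = {
    "http://john.example.org/": [
        {"service": "flickr", "media_type": "image",
         "uri": "http://flickr.example.org/photos/john",
         "readers": {"photo-app"}},
        {"service": "blog", "media_type": "post",
         "uri": "http://blog.example.org/john",
         "readers": {"photo-app", "feed-app"}},
    ],
    "http://josh.example.org/": [
        {"service": "flickr", "media_type": "image",
         "uri": "http://flickr.example.org/photos/josh",
         "readers": {"feed-app"}},
    ],
}


def query_wnodes(wnodes, openids, media_type, requester):
    """Collect matching containers across several identities.

    Entries the requester is not allowed to read are silently skipped.
    """
    results = []
    for openid in openids:
        for entry in wnodes.get(openid, []):
            if entry["media_type"] != media_type:
                continue
            if requester not in entry["readers"]:
                continue
            results.append({"owner": openid,
                            "service": entry["service"],
                            "uri": entry["uri"]})
    return results


found = query_wnodes(WNODES,
                     ["http://john.example.org/", "http://josh.example.org/"],
                     "image", "photo-app")
for row in found:
    print(row)
```

The result rows could just as well be serialized as FOAF, RDF, or XRDS; the sketch leaves that choice open, as the question above does.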

The focus of WRFS is controlled identity access, a DNS for identity.
It doesn't try to replace RDF or FOAF or OWL --- really it takes the
linked data principle, and implements a version of it aimed at a
specific task --- a personal web resource that indexes our web data
for us in a controlled way, and negotiates access for 3rd parties via
OAuth + Discovery.


> I also wanted to ask if there is something that you think we may be
> missing in SIOC that could be added to help with the WRFS effort. I
> know you mentioned in [2] that SIOC didn't solve some issues and some
> parts of the (RDF) puzzle were loosely joined - what I think you meant
> is that WRFS was created to bridge some of the obvious gaps? But if we
> are missing something obvious in SIOC, let us know!
>


Now, the FOAF principles are very interesting, and initially I was
pushing our startup project to use it. However, the more I looked at
how OpenID and OAuth were having success, the more I realized that
control of the FOAF file was going to be an issue, and I've yet to find a
standardized way to control access and updates to a FOAF file (some
people have responded that "oh, well, you could just do *this* or
*that*", but I still haven't seen a standard way to do it).

So let's think about it for a minute. Really, a FOAF file is a set of
records in a database, or a set of triples / assertions, about a
resource, and then they get written out by a web server { php,
asp.net, perl } into the FOAF format. So really, if you are generating
a FOAF file from your webserver, then you already have a "wNode", at
least partially. I think the next phase is to add security features,
and the ability to grant a third party access to only portions of
your FOAF file, your wNode data, and that's where OAuth comes in.
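
Here is a minimal Python sketch of that "only portions" idea: the same stored records are written out as FOAF (in Turtle), but each record carries a scope and only the scopes granted to the caller are included. The scope names, the record layout, and the function name are assumptions for illustration; a real setup would derive the granted scopes from an OAuth token instead of a function argument.

```python
FOAF_PREFIX = "@prefix foaf: <http://xmlns.com/foaf/0.1/> .\n"

# (property, value already in Turtle syntax, scope needed to see it)
RECORDS = [
    ("foaf:name", '"John Example"', "public"),
    ("foaf:mbox", "<mailto:john@example.org>", "contact"),
    ("foaf:knows", "<http://example.org/people/jane#me>", "social"),
]


def render_foaf(subject_uri, records, granted_scopes=()):
    """Write out only the records the caller has been granted."""
    visible = [(prop, value) for prop, value, scope in records
               if scope == "public" or scope in granted_scopes]
    lines = [FOAF_PREFIX, f"<{subject_uri}> a foaf:Person"]
    for prop, value in visible:
        lines[-1] += " ;"           # continue the statement
        lines.append(f"    {prop} {value}")
    lines[-1] += " ."               # close the statement
    return "\n".join(lines)


me = "http://example.org/people/john#me"
print(render_foaf(me, RECORDS))                # anonymous view
print(render_foaf(me, RECORDS, ("contact",)))  # app granted 'contact'
```

The anonymous view shows only the name, while an application granted the contact scope also sees the mailbox but still not the social links.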

So really -- we are talking about the same thing. What I meant about
the "gaps" was the security and updating protocol: taking that,
putting that into a spec, and making it accessible to the hackers at
large, the projects like dp.org, projects like Chris Messina's DiSo.
So now we could also start controlling at the Data Container level who
has access to our SIOC data, and allow WRFS, the wNode, and OAuth to
find and authorize access to that information on our behalf.

I think really, at this point, the various data groups are finding a
common "ontology", if you will --- a common way to relate, and
figuring out ways to work together. I see a future for The Graph (web
of data, TBL) that involves decentralized discovery of RDF, OWL, ATOM,
SIOC, etc, and WRFS is just a "standardized mechanism" that uses some
open identity and authorization protocols to bring linked data to life
and hopefully help realize its full potential.

Josh Patterson


> Thanks again, I have found your contributions and example scenarios very
> illustrative and practical so far - great work.
>
> John.
> --
> [1] web.media.mit.edu/~stefanm/commonsense/SemanticWeb.ppt
>
> XML Customized tags, like:
> <dog>Nena</dog>
> + RDF Relations, in triples, like:
> (Nena) (is_dog_of) (Kimiko/Stefan)
> + Ontologies Hierarchies of concepts, like
> mammal -> canine -> Cotton de Tulear -> Nena
> + Inference rules Like:
> If (person) (owns) (dog), then (person) (cares_for) (dog)
>
> = Semantic Web!
>
> [2]http://groups.google.com/group/dataportability-public/browse_frm/thre...