Introducing myself

Alexandre Passant

Jan 5, 2008, 7:18:59 PM

to dataportabi...@googlegroups.com

Hi all,

My name is Alexandre Passant. I'm a PhD student in the Semantic Web
and Social Software field, affiliated with the LaLIC laboratory
(Université Paris 4), and working at EDF R&D (a French energy company),
where I'm studying "Semantic Web 2.0" corporate information systems.
I'm working on SIOC, which John Breslin previously introduced on the
list, as a co-author of some documents (and tools), so I'm mainly
interested in data portability from a Semantic Web point of view. I
recently wrote a flickr exporter to RDF, using mainly the FOAF and SIOC
ontologies [1].
I really believe that SW technologies, especially FOAF and SIOC, can
solve the social graph portability / merging / querying issues. Most
of my experiments and views on those topics are described on my blog
at [2].

Best,

Alex

[1] http://apassant.net/blog/2007/12/18/rdf-export-of-flickr-profiles-with-foaf-and-sioc/
[2] http://apassant.net/blog/

Josh Patterson

Jan 6, 2008, 9:32:41 PM

to DataPortability.Public.General

Alex,
Hi, I'm Josh Patterson and I work on the internal Data Portability
WRFS design group. I've followed SIOC quite a bit, and I believe it
has a role to play in the future of data portability. One of the major
engineering pushes I've been involved with is how to treat all of this
data as a logical whole (as well as linking it together) --- which is
not entirely new, many people are working on that. Quite a few people
are talking about this, but many just say "use X format, it's the
best!". The problem is that just exposing a format is only part of the
solution, and there are more mechanics at play, which OAuth +
Discovery, wNode, and the WRFS stack address.

The WRFS design group
is dedicated to implementing basically a "data aggregation node"
called a "wNode" (analogous to a traditional FS inode), which we're
writing a spec for. It points to all the places a user has data (which
can be of a number of types and formats; we are more about discovery
and aggregation), and can be updated securely by "data
containers" (think "writing entries into your foaf file"). The other
part of WRFS involves aggregating data from multiple Data Containers
(think: flickr and photobucket) and allowing that composite recordset
to be used in a third party application. We didn't quite feel SIOC solved
all of those issues, so that's a major reason why we created an
engineering group of coders (who are part of startups and have a real
need to solve these problems --- to compete!), but we always welcome
perspective (and strong coders).

If you want, look me ( jpatteson @
floe.tv ) up and we can chat more. I am a major proponent of not only
solving our engineering issues, but making our design palatable to the
RDF crowd as well (there's a reason WRFS stands for Web *Relational*
File System). My ultimate goal is to be able to satisfy a number of
"political" sides in terms of formats, while unifying their data in an
OSI Network Model-like "stack" (called the WRFS stack) that is
completely decentralized and platform independent. I'm doing my best
to reach out to as many groups as possible (openID, OAuth, RDF, etc) in terms of
making this a community effort and finding as much common ground as
possible, but I also understand that sometimes we just have to push
forward and make tradeoffs to accomplish our design goals. I look
forward to hearing from you.
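
To make the wNode idea above a bit more concrete, here is a minimal Python sketch of an index that points at a user's data containers and merges their records into one composite recordset. The class names, field names, and URLs are hypothetical and are not taken from the WRFS spec; the secured update step (OAuth + Discovery) is reduced to a plain method call.

```python
from dataclasses import dataclass, field


@dataclass
class DataContainer:
    """A service holding some of a user's data (hypothetical model)."""
    name: str
    records: list  # each record is a dict, e.g. {"title": "Sunset"}


@dataclass
class WNode:
    """Hypothetical wNode: an index of where one user's data lives."""
    owner: str
    containers: dict = field(default_factory=dict)

    def register(self, container: DataContainer) -> None:
        # In the design described above this step would be secured
        # (OAuth + Discovery); here it is a plain dictionary insert.
        self.containers[container.name] = container

    def aggregate(self) -> list:
        # Build one composite recordset, tagging each record with its source.
        merged = []
        for name, container in self.containers.items():
            for record in container.records:
                merged.append({**record, "source": name})
        return merged


wnode = WNode(owner="http://example.org/people/alice")
wnode.register(DataContainer("flickr", [{"title": "Sunset"}]))
wnode.register(DataContainer("photobucket", [{"title": "Beach"}]))
composite = wnode.aggregate()
```

A third party application would then work on `composite` without needing to know which container each record came from, apart from the `source` tag.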

Josh Patterson


Alexandre Passant

Jan 7, 2008, 5:52:29 AM

to dataportabi...@googlegroups.com

Hi Josh,

On Jan 7, 2008 3:32 AM, Josh Patterson <jpatt...@floe.tv> wrote:
>
> Alex,
> Hi, I'm Josh Patterson and I work on the internal Data Portability
> WRFS design group. I've followed SIOC quite a bit, and I believe it
> has a role to play in the future of data portability.

We hope so :)

> One of the major
> engineering pushes I've been involved with is how to treat all of this
> data as a logical whole (as well as linking it together) --- which is
> not entirely new, many people are working on that. Quite a few people
> are talking about this, but many just say "use X format, it's the
> best!". The problem is that just exposing a format is only part of the
> solution, and there are more mechanics at play, which OAuth +
> Discovery, wNode, and the WRFS stack address. The WRFS design group
> is dedicated to implementing basically a "data aggregation node"
> called a "wNode" (analogous to a traditional FS inode), which we're
> writing a spec for. It points to all the places a user has data (which
> can be of a number of types and formats, we are more about discovery
> and aggregation), and can be updated securely by "data
> containers" (think "writing entries into your foaf file"). The other
> part of WRFS involves aggregating data from multiple Data Containers
> (think: flickr and photobucket) and allowing that composite recordset
> to be used in a third party application. We didn't quite feel SIOC solved
> all of those issues, so that's a major reason why we created an
> engineering group of coders (who are part of startups and have a real
> need to solve these problems --- to compete!), but we always welcome
> perspective (and strong coders).

If I understood correctly, what you want is for that "wNode" to link to
all the data and information / networks of a given user, so that we can
retrieve all information about a user from it (as explained in [1]).

I think those issues are covered by SIOC and FOAF, but more generally
by the nature of RDF and decentralized information management in the
SW field.
I have profiles on flickr, twitter, facebook ... Since there are RDF
exporters for those tools ([2] [3] [4]), I have one file for each
service, each one defining (at least) a URI for myself as a
foaf:Agent. Using SIOC, I can assign a sioc:User to those URIs (done
by the flickr exporter [2]), and link to the content I provided (see
John Breslin's post at [5]), as the twitter exporter [3] does natively.
So, I have various sets of foaf:Person / sioc:User that represent my
online activity on those services (blog posts, pictures ...) and also
my network (using the foaf:knows property).

Then, the challenge is to get a unique entry point. This can be
achieved by defining a new URI for myself, a kind of
"reference URI", and stating that this URI identifies the same
person as the ones defined in my previously created profiles. That
can be done using the owl:sameAs property, which says that two
resources are identical, see [6]. So I have different URIs, but in a
way, they're all the same. Thus I'll have one entry point, linking to
all my profiles, networks, and data. I can query that entry point to
ask for all my pictures from flickr, my recent tweets, or my facebook friends.
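
The owl:sameAs entry point described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: triples are plain tuples rather than a real RDF store, all URIs are made up, and owl:sameAs is treated as a simple equivalence relation (resolved with a small union-find), which is much less than a real OWL reasoner does.

```python
SAME_AS = "owl:sameAs"

# Triples as (subject, predicate, object); all URIs below are made up.
triples = [
    ("http://example.org/me", SAME_AS, "http://flickr.example/people/alex"),
    ("http://example.org/me", SAME_AS, "http://twitter.example/alex"),
    ("http://flickr.example/people/alex", "foaf:made",
     "http://flickr.example/photo/1"),
    ("http://twitter.example/alex", "foaf:knows",
     "http://twitter.example/john"),
]


def find(parent, uri):
    # Follow parent links to the representative of uri's equivalence class.
    while parent.setdefault(uri, uri) != uri:
        uri = parent[uri]
    return uri


def describe(entry_point, triples):
    """Return all (predicate, object) pairs for any URI equal to entry_point."""
    parent = {}
    for s, p, o in triples:
        if p == SAME_AS:
            # Merge the two equivalence classes.
            parent[find(parent, s)] = find(parent, o)
    root = find(parent, entry_point)
    return sorted(
        (p, o) for s, p, o in triples
        if p != SAME_AS and find(parent, s) == root
    )


facts = describe("http://example.org/me", triples)
```

Querying the reference URI returns the statements made about the flickr and twitter identities, which is the "one entry point" behaviour described above.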

Then, regarding how services can use it, I can have an OpenID page
that links to my FOAF profile and so to my main FOAF "reference URI".
Services should be able to retrieve it (see some of my experiments
in [7]), and then retrieve other information about me that was
created by external services (or maybe retrieve only one service's
information, etc.).
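
As a rough illustration of the retrieval step above, the sketch below looks for a FOAF link in the HTML of an OpenID page. It assumes the page advertises the profile with the common FOAF autodiscovery pattern (a `<link rel="meta" type="application/rdf+xml">` element); the experiments in [7] may rely on a different mechanism, and the page content and URLs here are invented.

```python
from html.parser import HTMLParser


class FoafLinkFinder(HTMLParser):
    """Collect href values of <link> elements that advertise an RDF profile."""

    def __init__(self):
        super().__init__()
        self.foaf_urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        is_rdf_meta = (
            attributes.get("rel") == "meta"
            and attributes.get("type") == "application/rdf+xml"
        )
        if is_rdf_meta and attributes.get("href"):
            self.foaf_urls.append(attributes["href"])


# Invented OpenID page; a real service would fetch this over HTTP.
page = """
<html><head>
  <link rel="openid.server" href="http://openid.example/server">
  <link rel="meta" type="application/rdf+xml" title="FOAF"
        href="http://example.org/foaf.rdf">
</head><body></body></html>
"""

finder = FoafLinkFinder()
finder.feed(page)
foaf_urls = finder.foaf_urls
```

A service would then fetch the discovered FOAF file and follow its links to the other profiles.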

That's how, I think, OpenID + FOAF + SIOC + owl:sameAs can be used to
solve some of the WRFS steps. If I understand the WRFS abstract draft
[8], what you want to achieve with it is to define a meta-level for
this scenario? I.e., what I described is one implementation of an
OpenID + semweb way of doing WRFS?

Hope that helps, and looking forward to continuing the discussion,

Best,

Alex.

[1] http://dataportability.pbwiki.com/WRFS%20Prototype%20Workspace
[2] http://apassant.net/blog/2007/12/18/rdf-export-of-flickr-profiles-with-foaf-and-sioc/
[3] http://tools.opiumfield.com/twitter/terraces/rdf
[4] http://www.dcs.shef.ac.uk/~mrowe/foafgenerator.html
[5] http://www.johnbreslin.com/blog/2008/01/04/dataportabilityorg-web-standards-sioc-and-foaf/
[6] http://sites.wiwiss.fu-berlin.de/suhl/bizer/pub/LinkedDataTutorial/
[7] http://apassant.net/blog/2007/09/23/retrieving-foaf-profile-from-openid/
[8] http://cowbell.floe.tv/WRFS_11_20_2007.html

Zef Hemel

Jan 7, 2008, 6:33:22 AM

to dataportabi...@googlegroups.com

Hi Alex,

I think the SIOC work can be of great help with the meta-data aspect
of data portability. One issue of WRFS it does not address (I think)
is a standardized interface to actually get access to this data.
Retrieval of data is kind of obvious (HTTP GETs), although, how does
it deal with privacy? I might want to give a certain app access to
some of my private photos on flickr, does SIOC deal with that? Is
there a notion of access control? And then of course there's the
actual data manipulation (PUT and DELETE) of data. These are the
actual storage needs of WRFS, but for quite a bit of the rest we might
be able to use quite some existing RDF/OWL vocabularies.

I must say that you're talking me back into the idea of using RDF as a
means to represent relationships between data items. This is how I
started off thinking about (and in fact implementing) WebFS. Later I
thought it might be better to put data in Atom collections and
represent the meta data in atom feeds and use the AtomPub protocol
[1], because you get so much for free (all the PUT, POST, GET and
DELETE stuff is already specified). But now I'm not so sure.

Interesting.

Zef

[1]: http://tools.ietf.org/html/rfc5023

--
Zef Hemel
E-Mail: z...@zefhemel.com
Phone: (+31) (0)6 156 19 280
Web: http://www.zefhemel.com

Alexandre Passant

Jan 8, 2008, 3:09:59 AM

to dataportabi...@googlegroups.com

Hi,

On Jan 7, 2008 12:33 PM, Zef Hemel <zefh...@gmail.com> wrote:
>
> Hi Alex,
>
> I think the SIOC work can be of great help with the meta-data aspect
> of data portability. One issue of WRFS it does not address (I think)
> is a standardized interface to actually get access to this data.
> Retrieval of data is kind of obvious (HTTP GETs), although, how does
> it deal with privacy? I might want to give a certain app access to
> some of my private photos on flickr, does SIOC deal with that? Is
> there a notion of access control? And then of course there's the
> actual data manipulation (PUT and DELETE) of data. These are the
> actual storage needs of WRFS, but for quite a bit of the rest we might
> be able to use quite some existing RDF/OWL vocabularies.

An RDF or SIOC file by itself won't deal with that, since it exposes all
the data, but it can if you decide to export only a small part (as
I just mentioned in another thread [1], by exporting files
dynamically).

Regarding data manipulation, if the service natively stores the data
in RDF in a triple store, you can query it with SPARQL [2],
update it with SPARUL [3], and deal with those issues at
query time.
But as that's not the case, data manipulation and access control have
to be taken into account somewhere. So here, WRFS could offer a
standardised abstraction layer that can then be mapped to those
languages / protocols, but also to other querying / updating APIs (e.g.
the flickr API)?
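
One way to picture such an abstraction layer is a single interface with one backend per underlying protocol. The Python sketch below is only an assumption about what that could look like: the class names, the endpoint URLs, and the REST method name are hypothetical, and the backends only build request descriptions rather than sending anything.

```python
from abc import ABC, abstractmethod


class PhotoSource(ABC):
    """Hypothetical uniform interface a WRFS-like layer might expose."""

    @abstractmethod
    def photos_request(self, user: str) -> dict:
        """Describe the backend request needed to list a user's photos."""


class SparqlSource(PhotoSource):
    """Backend for services that store their data in a triple store."""

    def __init__(self, endpoint: str):
        self.endpoint = endpoint

    def photos_request(self, user: str) -> dict:
        query = (
            "PREFIX sioc: <http://rdfs.org/sioc/ns#>\n"
            "SELECT ?photo WHERE { ?photo sioc:has_creator <%s> }" % user
        )
        return {"url": self.endpoint, "params": {"query": query}}


class RestApiSource(PhotoSource):
    """Backend for services that only offer their own web API."""

    def __init__(self, base_url: str, method_name: str):
        self.base_url = base_url
        self.method_name = method_name  # hypothetical API method name

    def photos_request(self, user: str) -> dict:
        params = {"method": self.method_name, "user_id": user}
        return {"url": self.base_url, "params": params}


sources = [
    SparqlSource("http://triples.example/sparql"),
    RestApiSource("http://photos.example/api", "photos.list"),
]
requests_to_send = [s.photos_request("http://example.org/me") for s in sources]
```

The caller asks every source the same question and the mapping to SPARQL or to a service-specific API stays inside each backend.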

Alex.

[1] http://groups.google.com/group/dataportability-public/browse_thread/thread/ce40688c7dd8ad86
[2] http://www.w3.org/TR/rdf-sparql-query/
[3] http://www.hpl.hp.com/techreports/2007/HPL-2007-102.html?mtxs=rss-hpl-tr

leo...@gmail.com

Jan 8, 2008, 1:03:15 PM

to DataPortability.Public.General

Hi,

As also posted on top of http://groups.google.com/group/dataportability-public/web/WRFS%20-%20Web%20Inode%20Overview
I would say that RDF has the onboard features to do this, as Alex
also said before.

Alex, you missed rdfs:seeAlso for pointing to related URIs
containing more information about the same resource (it's nearly the
same as owl:sameAs anyway).

Zef,

On Jan 7, 12:33 pm, "Zef Hemel" <zefhe...@gmail.com> wrote:
> Hi Alex,
>
> I think the SIOC work can be of great help with the meta-data aspect
> of data portability. One issue of WRFS it does not address (I think)
> is a standardized interface to actually get access to this data.
> Retrieval of data is kind of obvious (HTTP GETs), although, how does
> it deal with privacy? I might want to give a certain app access to
> some of my private photos on flickr, does SIOC deal with that? Is
> there a notion of access control? And then of course there's the
> actual data manipulation (PUT and DELETE) of data. These are the
> actual storage needs of WRFS, but for quite a bit of the rest we might
> be able to use quite some existing RDF/OWL vocabularies.
>
> I must say that you're talking me back into the idea of using RDF as a
> means to represent relationships between data items. This is how I
> started off thinking about (and in fact implementing) WebFS. Later I
> thought it might be better to put data in Atom collections and
> represent the meta data in atom feeds and use the AtomPub protocol
> [1], because you get so much for free (all the PUT, POST, GET and
> DELETE stuff is already specified). But now I'm not so sure.

SIOC and RDF are on the level of an "XML format", but more clever in terms of
web scalability (they are also a database format, etc).
For manipulating the data, any RESTafarian interface is good.

[1] does not say much about editing anyway (conflict resolution, what
if you use non-defined XML entities, etc.):
http://tools.ietf.org/html/rfc5023#section-5.4.2

It just says to "PUT" a new XML document to the URI, which is the same as REST
does anyway.
Whether you PUT ATOM/XML or RDF/XML does not matter much.

I didn't dig deep into RFC 5023, but it looks like normal REST to me,
so just use it, but don't restrict yourself to ATOM as the data format;
also allow RDF, and then you are open to the world.

Note that with RDF (and maybe ATOM too) you may have a problem:
what if you change one value? There is no UPDATE.
example:
PUT (some RDF/ATOM)
<Person rdf:about="http://www.leobard.net/rdf/foaf.xml#me">
  <name>Leo</name>
  <mbox rdf:resource="mailto:leo...@gmail.com"/>
</Person>

then the user only edits the mbox and PUTs this:
<Person rdf:about="http://www.leobard.net/rdf/foaf.xml#me">
  <mbox rdf:resource="mailto:leobard...@gmail.com"/>
</Person>

Would this second call delete the <name>Leo</name>?

In RDF, this is solved by allowing individual elements
(called statements) to be deleted or added.

Maybe you want to extend the REST thing with an UPDATE methodology to
be able to transfer only the changes made, not a new version of the
old data..... but alas, that's just the fine-tuning bit you realize
after a longer time... you don't have to do it for starters
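
The difference between the two behaviours can be shown with a small Python sketch that models the stored resource as a set of statements. This is an illustration only: the URI and mailbox values are placeholders, and `put` / `patch` are hypothetical helper names, not part of AtomPub, REST, or any RDF API.

```python
FOAF_ME = "http://example.org/foaf.rdf#me"  # placeholder URI

# The stored resource, as a set of (subject, predicate, object) statements.
stored = {
    (FOAF_ME, "foaf:name", "Leo"),
    (FOAF_ME, "foaf:mbox", "mailto:old@example.org"),
}


def put(_current, new_statements):
    # Whole-resource replacement: anything the client omits is gone.
    return set(new_statements)


def patch(current, delete=(), add=()):
    # Statement-level update: only the listed statements change.
    return (set(current) - set(delete)) | set(add)


old_mbox = (FOAF_ME, "foaf:mbox", "mailto:old@example.org")
new_mbox = (FOAF_ME, "foaf:mbox", "mailto:new@example.org")

after_put = put(stored, [new_mbox])
after_patch = patch(stored, delete=[old_mbox], add=[new_mbox])
```

With `put`, the name statement is lost because the client did not resend it; with `patch`, the name survives and only the mailbox changes.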

>
> Interesting.
>
> Zef
>
> [1]:http://tools.ietf.org/html/rfc5023
>


Josh Patterson

Jan 8, 2008, 11:07:46 PM

to DataPortability.Public.General

Alex,
That's a lot to address, so let me just say this in one shot: This
project has been a work in motion for quite a while. It was
started out of a need for certain functionality in our little startup
project (floe.tv - media mixer, UGC), and then we found common ground
with faraday, as they wanted the same things, and said "hey, let's work
on this". And now it's becoming this huge political thing played out in
the media outlets, which, I guess, is how it goes.

The post below ended up being fairly big, but I'm really trying here
to balance being transparent and telling people where we're
coming from and where we want to go.

Now, for quite a while before I wrote that 11/20 document, I read
semweb books, read TBL's blog and writing on RDF, and studied SIOC, RDF,
and, more so, inferencing. In fact, early on I told our team "I wanna use
SIOC, cause from what I've skimmed of it, it's gonna do a lot of
things". Once I watched certain outlets, etc, I began to do my own
homework, and saw how there were a lot of those pieces in place, but
they were so loosely joined that there wasn't much traction there (at
least from what I saw). From my experience in the commercial realm, I
know that it always takes a "little more" to get that cohesiveness to
a new concept, to get it rolling. Something else I've come to realize
in discussing RDF with programmers and startups at large is --- they
mostly just don't get relational algebra. RDF, inferencing, and
relational algebra are computer science things, and we have a
grassroots culture of hackers out there who just wanna hack. And that's
cool, the market has spoken --- but at the same time, I still believe
in the future of inferencing and the web as this huge, distributed,
relational database in the sky.

How do we make this approachable, really? RDF probably needs to have a
role, but it's going to have to be under the hood in some respects,
because we just can't warp some people's brains with it --- it simply is
fundamentally different from HTML or even XML, and I think that's where
it gets a LOT of people. They go "ok, I got html, and then I got xml,
RDF will be the same thing" and it doesn't work out that way. I mean, I
literally have had that conversation with some people --- they don't
really get what to do with RDF, but when I say "hey, we could use WRFS
to write a future web data permission system based on the open social
graph, which allowed you to share, say, 'photo A with only my immediate
family', and have that relationship 'inferred'", they instantly
light up and go "now I want that".

So I figure RDF has a role to play under the hood, but how? First I
figured we had to get to the data. That's more work than most people
think to actually "put in play". So we put on our hardhats and started
hacking and designing.

And that's why I went to the drawing board, and said "ok. the internet
is the OS. we need data. it's spread out, just like a filesystem throws
blocks around, and uses inodes to track them. we need an inode, we'll
call it a 'web inode' (which evolved to wNode to respect the past
technique, while saying it's not the same thing, exactly) and use that
conceptually as a piece, but we won't freak out about how it's
implemented, cause we got a good model, and it will work."

I think the major thing that I wanted to add in our early sketches
(and why we strayed from that stock RDF / FOAF design) is that we
needed to allow certain parties certain things. OAuth does that, and
Eran and his group have kicked ass in that regard, and we're working
closely with them to take advantage of all the tremendous work they've
laid down --- as well as the openID team. Another thing was that I
never saw a good way for a third party to write to the FOAF file. Yes,
I could write a PHP script that takes some params and an openID token
and updates a flat file, but there was no "this is a standard protocol" way
(doesn't mean it doesn't exist, just means I couldn't find it). So I said
"good designs need a model, make the abstraction, the model, make a
stack, fill in the layers".

The key things to the wNode resource are: a data container has to be
able to update it securely (three party transaction, OAuth +
Discovery) on your behalf so other apps can find it (linking is nothing
new, but we simply tried to standardize the WHOLE process), and then
you have to be able to query the wNode: "who can see what?" With most FOAF
files I've seen in the wild, it's just a flat file sitting on a
webserver. I don't want certain groups seeing certain things. A key
tenet of what we wanted to do was, and Chris Messina stated it best
http://factoryjoe.com/blog/2007/11/26/data-banks-data-brokers-and-citizen-bargaining-power/
http://factoryjoe.com/blog/2007/11/26/data-portability-and-thinking-ahead-to-2008/

in his blog posts, that we need to be able to "own our own identity".
Doc Searls also had a couple of really good posts on controlling one's
own data,

http://blogs.law.harvard.edu/doc/2007/11/25/time-to-write-our-own-rules/

so we knew we had to have something that allowed secured management of
that wNode, of that FOAF file, of that personal identity index. wNode
became that resource.
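
The two requirements named above (only authorized data containers may update the wNode, and the wNode can answer "who can see what?") can be sketched as follows. This is a hypothetical Python model, not the WRFS spec: the `authorize` call stands in for the owner's OAuth grant, and group names and URLs are invented.

```python
from dataclasses import dataclass, field


@dataclass
class SecuredWNode:
    """Hypothetical wNode with write authorization and per-group visibility."""
    owner: str
    authorized_containers: set = field(default_factory=set)
    entries: list = field(default_factory=list)  # (resource_url, allowed_groups)

    def authorize(self, container: str) -> None:
        # Stand-in for the owner's grant (OAuth in the design described above).
        self.authorized_containers.add(container)

    def add_entry(self, container: str, resource_url: str, allowed_groups) -> None:
        # Only containers the owner has authorized may write to the wNode.
        if container not in self.authorized_containers:
            raise PermissionError(f"{container} may not update this wNode")
        self.entries.append((resource_url, frozenset(allowed_groups)))

    def visible_to(self, group: str) -> list:
        # Answer "who can see what?" for one group.
        return [url for url, groups in self.entries if group in groups]


node = SecuredWNode(owner="http://example.org/people/alice")
node.authorize("flickr")
node.add_entry("flickr", "http://flickr.example/photo/A", {"family"})
node.add_entry("flickr", "http://flickr.example/photo/B", {"family", "public"})

family_view = node.visible_to("family")
public_view = node.visible_to("public")
```

Here photo A is listed only for the family group, which mirrors the "photo A with only my immediate family" example earlier in this message.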

Could a wNode be a web resource and render itself as a FOAF file?
Yeah, sure it could. Will we make that part of the spec? We're not
sure yet, although it wouldn't be that hard to do both XRDS and FOAF.
Do we want to make it that complicated out of the box? I'm not sure
about that yet. Paul and I have been hacking on this for a while,
so up until all this attention, it's just been a few dudes with a sketch
and some spare time going "let's solve this problem, it's pesky, it
helps our startups, and is sorta fun".

However, overall I guess the key to this group really is ---
"don't panic. chill out, it's gonna work, let the politics roll how they
may, we can't control the blogosphere" (ok, well, maybe Saad can, but
I can't).

So now that we've sorta entered this phase of intense scrutiny (a
little earlier than I had expected; seems like the Scoble drama has been a
catalyst), it puts a little more pressure on things. The main
engineering group (Me, JLewis, Paul, Zef) has been hammering away at
a core skeleton spec, draft 1. I am committed to getting this "right",
but at the same time, getting something out "there", as Messina has
advocated, for people to chew on. --- "Release Often".

Here's what I propose to you: I know our groups have some ideas, and a
lot more common ground than we both realize. As much as I hate to use
a "Bush-ism" --- "I'm a uniter, not a divider!". Right now, in the
spec sitting in subversion (which is very, very rough), the wNode is
listed as rendering itself as XRDS. In a few days, it's gonna get
released publicly, draft 1, in all its sketch-ness.

If you would, please take a hard look at it, and just objectively see
what you think.

I think you'll see that we've given quite a bit of consideration to
the RDF camp. quite a bit. Also, we've given quite a bit of thinking
to the non-RDF camp, etc, too. I think we have something that more
people than not can live with.

I just want everyone in this group to commit to one thing --- being
able to make the right compromises to get the job done. Regardless, I
just felt like I should make my pitch to the RDF camp and say "we can
make this work. let's do this", so gimme a few days, and then we'll see
what happens.

Josh
