WRFS - An Idea


Chris Saad

Nov 11, 2007, 11:29:27 PM
to DataPortability

Click on http://groups.google.com/group/dataportability/web/wrfs---an-idea
- or copy & paste it into your browser's address bar if that doesn't
work.

Paul Jones

Nov 12, 2007, 3:46:41 PM
to DataPortability
I've actually wondered about this idea myself a couple of times.
Unfortunately, I keep getting stuck on one part of the implementation
- binding the user's data to their identity.

As far as I've seen, the OpenID protocol doesn't require the server to
be able to store arbitrary textual data for an account. The Simple
Registration extension has a fixed set of fields, none of which lend
themselves to being repurposed.

So I'm curious whether there are any ideas floating around about how
it would be possible to bind this data to an account without having
the account provider store it - while keeping to the goal of making it
decentralized...
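For concreteness, here is a small Python sketch of the field set that the OpenID Simple Registration (sreg) 1.0 extension defines. The point Paul is making falls straight out of it: every field name is fixed, so a hypothetical pointer to a user's data store (the `inode_url` name below is invented for illustration) has nowhere to travel.

```python
# The complete set of fields defined by the OpenID Simple Registration
# (sreg) 1.0 extension -- all fixed, none free-form.
SREG_FIELDS = {
    "nickname", "email", "fullname", "dob", "gender",
    "postcode", "country", "language", "timezone",
}

def can_carry(field_name):
    """Return True if sreg could transport a value under this field name."""
    return field_name in SREG_FIELDS

print(can_carry("email"))       # True
print(can_carry("inode_url"))   # False -- sreg has no extensible slot
```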

Paul


Josh Patterson

Nov 12, 2007, 4:42:09 PM
to DataPortability
Paul,
I think the major thing we found in our "design exercise" was that the
identity provider needed only to point at the "web inode" entity. I
haven't read the complete OpenID spec, but the demand is there for
"data portability", so maybe someone will be motivated to write an
extension to the spec, or create a new identity provider (although
I'd hope we could just get a field in the OpenID provider to point
towards the user's data aggregation store / "web inode").

Josh Patterson

Paul Jones

Nov 12, 2007, 4:54:21 PM
to datapor...@googlegroups.com
Unfortunately, adding even just one piece of data is a change... My concern is that there are already quite a significant number of different (and, I'd hazard to guess, custom-implemented) OpenID servers in the wild, and getting a decent majority of them to implement such changes could be quite difficult.

As an end user, I would certainly find it disappointing if I had an OpenID, went to log into a service connected to this, and was promptly told "sorry, your OpenID isn't good enough". Sure, I get that you can request that your provider implement support, but it would definitely hinder uptake.

My apologies for not being able to back this up with a strong alternate solution... The only idea that comes to mind is a bit left-field: we could consider using something like a DHT protocol (as seen in your local BitTorrent client), and allowing various web servers to implement "adapters" to this service, whereby they provide HTTP front ends to this non-HTTP protocol.
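To make the left-field DHT idea slightly more concrete, here is a hedged Python sketch (node names are hypothetical) of the core lookup primitive such adapters would share: hash an identity URL onto a ring of nodes, so any HTTP front end can answer "which node holds this user's pointer?" the same way.

```python
import hashlib
from bisect import bisect_right

# Hypothetical HTTP adapter nodes fronting the DHT.
NODES = ["node-a.example", "node-b.example", "node-c.example"]

def _ring_pos(key):
    """Map an arbitrary string to a position on a 32-bit hash ring."""
    return int(hashlib.sha1(key.encode()).hexdigest(), 16) % (2 ** 32)

# Precompute each node's ring position, sorted for binary search.
RING = sorted((_ring_pos(n), n) for n in NODES)

def responsible_node(identity_url):
    """Return the first node clockwise from the identity's ring position."""
    pos = _ring_pos(identity_url)
    idx = bisect_right([p for p, _ in RING], pos) % len(RING)
    return RING[idx][1]

# Every adapter computes the same answer, so any entry point works:
print(responsible_node("https://alice.example.com/"))
```

This is only the routing half, of course; replication and the actual HTTP adapter interface would be the real work.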

Paul.

Josh Patterson

Nov 12, 2007, 9:36:22 PM
to DataPortability
Paul,
Most definitely, yes: change is generally difficult, and there is
always a high cost to it (even more so with entrenched brands,
products, or technologies). I think we could agree, however, that this
is a direction that would be interesting to go in. With that stated,
we can begin to lay out the optional paths that take us towards our
desired utility or function, and then evaluate the cost of each path.
I think you will find that the cost of change (here, an update to a
spec) will in some cases be exceeded by what there is to gain. So,
just like we wrote out our thoughts to get a clearer picture of what
we were up against, I say we simply say "we want to go in this
direction, let's map out a few paths". Once the potential for growth
and gain becomes apparent, I'd say the needed change won't be so hard
to get people to buy into, especially since there are always
tremendous market pressures to adapt and grow.

In the end I think it will simply come down to this: "I want what the
other guy has, and I have to work harder to get it. So what's it gonna
take?"

I feel this group has the potential to figure out what it's gonna
take.

Josh


Josh Patterson

Nov 12, 2007, 9:42:46 PM
to DataPortability
Paul,
But at the same time, your point is very valid, and something I will
do is sit down and take that point of view. I'll add a new section to
our writeup and just write it out. Basically: "What if we can't change
the identity provider to point at the 'web inode'? What are the other
options there? How does that affect the utility provided?" I'll use
it as an opportunity to make the writeup stronger by listing the pros
and cons of using an updated identity provider service vs. not, and
what design tradeoffs can be made. I think once we write that out and
sketch out a new chapter, our next steps will become clearer. But
let's just commit to keeping the ideas and the analysis of the idea
going, and I think we'll get somewhere good if we do.

Josh


Josh Patterson

Nov 13, 2007, 11:25:14 PM
to DataPortability
Can anyone tell me if the YADIS protocol would completely fulfill the
"web inode" functionality/requirement? It seems like a very good
candidate, and I'd love to hear people's opinions on it.

Josh


Paul Jones

Nov 14, 2007, 7:20:12 AM
to datapor...@googlegroups.com
Looking at it, I don't really see any reason why not. It also provides a fairly good integration point with existing OpenID providers - they just add a service into their existing XRDS files.

A secondary integration point could be allowing users to link to some other YADIS file on a different server, separate from their OpenID. That would solve the problem where the user has a delegate address and an OpenID provider that doesn't yet support this stuff.
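As a sketch of what that integration might look like in practice, here is a hedged Python example that parses an XRDS document (the format YADIS discovery serves) and pulls out a "web inode" service endpoint. The service Type URI and the URLs below are invented for illustration; only the XRDS envelope itself follows the real format.

```python
import xml.etree.ElementTree as ET

# Namespace used by XRD elements inside an XRDS document.
XRD_NS = "xri://$xrd*($v*2.0)"

# Hypothetical Type URI a provider might use to advertise the web inode.
INODE_TYPE = "http://example.org/wrfs/1.0/inode"

XRDS_DOC = """<?xml version="1.0" encoding="UTF-8"?>
<xrds:XRDS xmlns:xrds="xri://$xrds" xmlns="xri://$xrd*($v*2.0)">
  <XRD>
    <Service priority="10">
      <Type>http://example.org/wrfs/1.0/inode</Type>
      <URI>https://storage.example.com/users/alice/inode</URI>
    </Service>
  </XRD>
</xrds:XRDS>"""

def find_inode_uri(xrds_text):
    """Return the URI of the first service matching INODE_TYPE, or None."""
    root = ET.fromstring(xrds_text)
    for service in root.iter("{%s}Service" % XRD_NS):
        types = [t.text for t in service.findall("{%s}Type" % XRD_NS)]
        if INODE_TYPE in types:
            uri = service.find("{%s}URI" % XRD_NS)
            return uri.text if uri is not None else None
    return None

print(find_inode_uri(XRDS_DOC))
# -> https://storage.example.com/users/alice/inode
```

The appeal of this route is exactly what Paul notes: an existing OpenID provider only has to add one `<Service>` element to a file it already serves.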

I'm still actually quite into my idea of using a distributed protocol as a third mechanism for discovering the inode - it could allow us to get complete coverage whilst the providers come up to spec.

Paul.

Josh Patterson

Nov 14, 2007, 10:52:33 PM
to DataPortability
Yeah, breaking out the inode is very doable --- and really, that's why
we went with an abstract model first: to sort of get our design, needs,
constraints, and thoughts in order, so moving pieces around is easier.
As long as that logical unit/function exists somewhere, bundled with
the OpenID entity or not, it still fits the model and maintains
"design integrity". I think really at this point the only thing we are
missing is a "standard web API toolkit" for data providers, at least
for a reference implementation. Or --- at least a way to expose a
commonly known interface to query for the WSDL-type information
from the "storage container entity". Maybe this is exposed via RDF?
(Can you tell that I'm big on semantic web tech? lol)

>

Chris Saad

Dec 30, 2007, 1:15:02 AM
to DataPortability Workgroup
Does FTP have a role to play in this at all?

Maybe not... Just a thought

David P. Novakovic

Dec 30, 2007, 3:23:08 AM
to datapor...@googlegroups.com
To throw something more on the client end into the discussion, I think
a unix technology called FUSE is very relevant here.

http://en.wikipedia.org/wiki/Filesystem_in_Userspace

FUSE basically allows you to run a filesystem driver in userspace. For
example, people wrote a wrapper around the Windows NTFS binary driver
and used it as a way for Unix systems to mount NTFS filesystems.

In essence it is a way of being able to take ANYTHING and represent it
as a filesystem. I could write an adapter for say, a MySQL database,
and then browse folders as tables and files as records.

Let's take it one step further: I could represent my Facebook account
as a series of folders, with files for photos, notes, etc. I could
browse other people's profiles in much the same way, except with
different permissions.
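A hedged sketch of the mapping layer such a driver needs (pure Python, no actual FUSE binding; the table and record names are invented): the heart of any "X as a filesystem" adapter is a pair of functions from paths to backing-store operations, which a real driver would wire into FUSE's `readdir`/`read` callbacks.

```python
# Sketch of the path -> backing-store mapping at the heart of a
# FUSE-style adapter. A dict stands in for the MySQL database here.
FAKE_DB = {
    "users":  {1: "alice", 2: "bob"},
    "photos": {1: "sunset.jpg"},
}

def readdir(path):
    """List directory entries: '/' -> tables, '/<table>' -> record ids."""
    if path == "/":
        return sorted(FAKE_DB)
    table = path.strip("/")
    return [str(pk) for pk in sorted(FAKE_DB[table])]

def read(path):
    """Read a 'file': '/<table>/<pk>' -> the record's contents."""
    table, pk = path.strip("/").split("/")
    return FAKE_DB[table][int(pk)]

print(readdir("/"))       # ['photos', 'users']
print(readdir("/users"))  # ['1', '2']
print(read("/users/2"))   # 'bob'
```

Swap the dict for SQL queries (or Facebook API calls) and the rest of the driver is mostly plumbing.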

FUSE has bindings in many languages. I'm a Pythonista, so I know for
sure there are Python bindings; I'm not sure if there is an equivalent
for Windows-based systems...

Anyway, FUSE is cool, I just thought I'd throw it out there. I've been
thinking about doing something like this with FUSE for a while.

David


Josh Patterson

Dec 30, 2007, 2:47:47 PM
to DataPortability Workgroup
If you take a look at the WRFS doc, you'll see that this would fit on
top of the existing design abstraction stack quite easily. It's built
in a layered manner, a la the OSI network model, so you could basically
treat the web as a logical drive on your PC. However, consider, as
Danny Ayers has pointed out, that web data is more complex than
traditional filesystem data --- it's interlinked and relational in
nature via URIs, etc. Filesystem data is simply treated as blocks on
disk referenced by an inode, and so I began to address the relational
aspect of that with the first draft of the model.

Josh

>

David P. Novakovic

Dec 30, 2007, 5:11:34 PM
to datapor...@googlegroups.com
Agreed, FUSE is just the technology; how it maps to web data is
completely up to whoever writes the FUSE driver. Unix filesystems are
interlinked in nature, and many relational operations can be
represented by a hierarchy, especially URI-based ones; it depends how
you implement the driver. Blocks are irrelevant to FUSE: it is not
about abstracting blocks (themselves an abstraction over cylinders and
heads), it is about abstracting anything into a file-like structure.
Relational requirements have not stopped people from doing useful
things with XML.

For example, GmailFS is a FUSE driver written in python:
http://en.wikipedia.org/wiki/GmailFS

or

iPodDisk, which uses the MacFUSE system to display the iPod's hidden
and obfuscated file system as if it were a well-organized music
directory, also allowing users to copy files from an iPod to another
disk.

Though as you say, the interesting thing is that once it is written,
any application can use the file-like structure since it exists at the
OS level, not in some custom application.

Like I said, this is very much at the client end, and very specific to
a particular set of platforms, so much work is needed in the
abstraction.

David

Josh Patterson

Dec 30, 2007, 10:19:00 PM
to DataPortability Workgroup
Yeah, when I was studying traditional filesystems in Unix in grad
school, I began to take a look at how those are evolving with respect
to wide-area networks, and did a little chart in one of my design
drafts:

http://groups.google.com/group/dataportability-public/web/WRFS%20-%20Web%20Inode%20Overview

GmailFS is in there; it's pretty neat.

Josh

>

Zef Hemel

Dec 31, 2007, 4:03:57 AM
to datapor...@googlegroups.com
Hi all,

Before I join the WRFS discussion, I'll first give you a short
overview of what I had in mind with WebFS. Then we can see if there
are things you like and we can move from there, good idea?

I believe the ideas behind WRFS and WebFS are quite similar, although
I never really thought much about the data discovery bit; it's easy,
though, to integrate with the WebFS ideas I already had (with OpenID).
So here we go.

One of the main principles behind WebFS is simplicity. It should be a
snap to implement a WebFS storage point and a snap to manipulate data
on storage points (from a client). The reason is adoption: if it takes
a lot of engineering to implement WebFS/WRFS, developers will think
twice about whether it's worth the effort. Adoption, I think, is a big
challenge with standards, so it should be taken into account. How do
we make things simple? A few ways:
1. Make the protocol extremely simple, so it's easy to implement
2. Provide developers with libraries in their language of choice (PHP,
Python, Ruby, Java, .NET ...) -- a lot of work
3. Base WebFS on standards that already exist (= are well known and
well-supported with libraries)

A short while ago I decided option 3 was the best way. The standard
that seemed to fit best is Atom and the Atom Publishing Protocol. That
means that:
1. Every data item gets its own URI (I would even say URL)
2. Items are aggregated in Atom feeds (or "collections" as I call
them), which are similar to directories in FS talk. Entries can be
data items (as in 1) or other feeds/collections (as in 2). All items
are referred to by URIs themselves, of course, so you can have a
complex hierarchy, even jumping from one storage point to another
(you're basically doing inter-server symlinking all the time).
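The collection-as-directory idea can be sketched in a few lines of Python. Everything here is illustrative: the URLs are hypothetical, and marking sub-collections with a `kind` attribute is an invented shorthand (a real design would more likely use an Atom category or a namespaced extension element, as discussed below for metadata generally).

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"

def make_collection(title, entries):
    """Build a minimal Atom feed acting as a WebFS 'directory'.

    entries: list of (title, href, kind) tuples, where kind is 'item'
    for a data item or 'collection' for a nested feed (sub-directory).
    """
    feed = ET.Element("{%s}feed" % ATOM_NS)
    ET.SubElement(feed, "{%s}title" % ATOM_NS).text = title
    for etitle, href, kind in entries:
        entry = ET.SubElement(feed, "{%s}entry" % ATOM_NS)
        ET.SubElement(entry, "{%s}title" % ATOM_NS).text = etitle
        link = ET.SubElement(entry, "{%s}link" % ATOM_NS)
        link.set("href", href)
        entry.set("kind", kind)  # invented marker so a client can recurse
    return feed

root = make_collection("alice's root feed", [
    ("Photos", "https://storage.example.com/alice/photos.atom", "collection"),
    ("resume.pdf", "https://storage.example.com/alice/resume.pdf", "item"),
])
kinds = [e.get("kind") for e in root.findall("{%s}entry" % ATOM_NS)]
print(kinds)   # ['collection', 'item']
```

A client walks the tree exactly like a directory listing: recurse into `collection` entries, fetch `item` entries.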

When you think about it, this is a very natural and good fit. There
is, however, the issue of metadata. Atom provides a number of standard
metadata fields (in the feed), such as title, author, description,
category and so on. For WebFS/WRFS this may not be enough. But that's
no problem at all: as it's an XML format, we can define other metadata
attributes in our own namespace and include them in the feeds. One
piece of metadata that is crucial is "type" or "kind", which defines
the kind of item we are dealing with ("Image", "Document", "Email" and
so on; we should probably predefine a few of those).
At first I thought RDF was the answer to all our metadata needs, but
I'm not sure there's that much to be won there (that's up for debate);
of course, we could embed RDF in an Atom feed as well.

Now, what do we gain from basing this on Atom? We get some cool things for free:
* We could browse websites with Atom feeds (= almost all interesting
ones) as WebFS storage points and manipulate data right there (if they
support AtomPub, which will happen more and more)
* There are already libraries around that can do AtomPub (both client
and server), so we get that for free.
* Modern blogging clients (that support AtomPub) are essentially
already WebFS clients, without knowing it.

Now some of the issues addressed by WRFS but not really by WebFS:
* Authentication
I'm no expert on this, but I guess OAuth can be used. Authorization is
another issue. I'm not a security expert, so I have no answers here
yet.

* Data discovery
If we use OpenID, like in WRFS, we could attach a "root feed" to an
OpenID account, which could, like in Chris' screenshot, be a list of
all the services used (Flickr, Zooomr, Google Calendar, etc.) or
aggregate feeds (see "Querying" below). The question is where this
feed comes from, but I guess it could be administered by the OpenID
provider (or delegated to a, I don't know, WebFS/WRFS discovery
provider).

* Querying
I thought about this a lot and hadn't really decided anything on this
yet. As my main goal was simplicity, I don't want to force WebFS
storage providers to implement a query engine; it would be an optional
thing. If it's not provided, external services could provide this
service for them -- a WebFS indexer. But application-specific
implementation aside, the interface would probably be a query URL
(http://www.blogger.com/atom/feeds/2322321232132/search?q=category:personal
to return all "personal" blog posts for instance) that would return
yet another Atom feed containing the results. If we standardize the
query language it won't be difficult to create WebFS aggregate feeds
that return all images (everything of type "Image") from all the
user's storage points.
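A hedged sketch of that optional query side (the URL shape and sample data are invented; "type" is the metadata field proposed above): an indexer answers a query URL by filtering entries against the criteria and returning the matches as yet another Atom feed. The filtering core is trivial.

```python
from urllib.parse import urlencode

# Sample index: (title, metadata) pairs, with the proposed "type" field.
ENTRIES = [
    ("sunset.jpg", {"type": "Image", "category": "personal"}),
    ("invoice.pdf", {"type": "Document", "category": "work"}),
    ("beach.png", {"type": "Image", "category": "work"}),
]

def query_url(base, **criteria):
    """Build a search URL like .../search?q=type:Image (shape invented)."""
    q = " ".join("%s:%s" % kv for kv in sorted(criteria.items()))
    return "%s/search?%s" % (base, urlencode({"q": q}))

def run_query(entries, **criteria):
    """Server side: return titles of entries matching every criterion."""
    return [t for t, meta in entries
            if all(meta.get(k) == v for k, v in criteria.items())]

print(query_url("https://storage.example.com/alice", type="Image"))
print(run_query(ENTRIES, type="Image"))         # ['sunset.jpg', 'beach.png']
print(run_query(ENTRIES, category="personal"))  # ['sunset.jpg']
```

An aggregate "all my images" feed is then just this query fanned out across the user's storage points, with the result lists merged into one feed.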

Opinions? Thoughts?

In case you hadn't read it, there are some more notes on
http://www.webfilesystem.org

Zef

--
Zef Hemel
E-Mail: z...@zefhemel.com
Phone: (+31) (0)6 156 19 280
Web: http://www.zefhemel.com

Josh Patterson

Dec 31, 2007, 2:29:26 PM
to DataPortability Workgroup
It looks like you've done quite a bit of looking at the "disk block"
level, and that's good. For now, our engineering focus is on the first
draft of WRFS, with OpenID -> YADIS -> web inode, which points to the
locations of data around the web for that particular user (a critical
tenet of WRFS). I'd say the best way for you to get involved at an
engineering level right now would be to work on how data is exposed
from a data container (since you've worked so much with Atom in your
design), and how that interacts with the web inode (root feed, as you
put it).
http://groups.google.com/group/dataportability-public/web/WRFS%20-%20Web%20Inode%20Overview

I'm a big fan of Occam's razor, so the Atom idea is really good for
simplicity (and it's an area that I can value, but know little of),
and as you have pointed out, it provides some basic triples for
inferencing. Keep in mind that we (Paul, JLewis, and myself) have our
own engineering efforts at hand, and if you have ideas to add that's
great --- but we are really looking for people who want to write code.
For the time being, in terms of consistency, I'd really like to keep
this prototype named WRFS, but if you want to put on your hardhat and
go to work, then you will be fully credited for your designs and
ideas. I think you could really be an asset in an area where we need
some more work: basically, exposing data at the container level, and
the mechanics of how best to do that --- and Atom sounds like a good
start. I am continuing to read through the WebFS documents.

Josh

Chris Saad

Dec 31, 2007, 10:01:18 PM
to DataPortability Workgroup
That does sound like a good strategy, Josh - Zef focusing on the 'disk
access' side, since we have focused so much on discovery so far.

Also, while code is critical and important, I think with this sort of
thing it will be just as important to document the ideas publicly as
we go, so everyone can see movement and momentum and we can build
consensus along the way.

We've started here:
http://groups.google.com/group/dataportability-public/web

Chris


Zef Hemel

Jan 2, 2008, 12:20:51 PM
to datapor...@googlegroups.com
Hi Josh,

> It looks like you've done quite a bit of looking at the "disk block"
> level, and that's good. For now, our engineering focus is on the first
> draft of WRFS, with OpenID -> YADIS -> web inode, which points to the
> locations of data around the web for that particular user (a critical
> tenet of WRFS). I'd say the best way for you to get involved at an
> engineering level right now would be to work on how data is exposed
> from a data container (since you've worked so much with Atom in your
> design), and how that interacts with the web inode (root feed, as you
> put it).

Ok great, I'm glad to hear we haven't been doing exactly the same
thing -- that would have been a waste.

> I'm a big fan of Occam's razor, so the Atom idea is really good for
> simplicity (and it's an area that I can value, but know little of),
> and as you have pointed out, it provides some basic triples for
> inferencing. Keep in mind that we (Paul, JLewis, and myself) have our
> own engineering efforts at hand, and if you have ideas to add that's
> great --- but we are really looking for people who want to write code.

You are writing code? Aren't we merely coming up with protocols here,
which first have to be designed? Or is most of that work done and have
you moved on to a reference implementation? If you simply need a
programmer I'm not your guy. I write code during the day already
(albeit somewhat different code) and am more interested in the design
than coding. I am willing to do some prototyping work of course.

> For the time being, in terms of consistency, I'd really like to keep
> this prototype named WRFS, but if you want to put on your hardhat and
> go to work, then you will be fully credited for your designs and
> ideas. I think you could really be an asset in an area where we need
> some more work: basically, exposing data at the container level, and
> the mechanics of how best to do that --- and Atom sounds like a good
> start.

To keep calling it WRFS for now seems fine, although I have some
opinions of my own (I think WebFS sounds catchier) -- but I can get
over that. We should talk so we can see what has been done and what's
still left to do.
