Integrating with Facebook through their nascent external API.

Stephane Daury

Jan 26, 2008, 11:58:02 AM
to DiSo Project

http://tekartist.org/blog/2008/01/26/facebook-pulling-an-open-social/

Chris mentioned a TBD integration between DiSo and OpenSocial in the
existential interview. It turns out the same might be worth
considering with Facebook, should it prove technically and legally
feasible through the new JS lib(s).

Any preliminary thoughts (just came out last night)?

Personally, and as much as I love JS, I'm still a bit saddened that
these all live in the client-side realm, but hey, I'll take what they
give, for now. ;)

Josh Patterson

Jan 26, 2008, 2:53:11 PM
to DiSo Project
I'd say the major reason a lot of "web data" / "graph" APIs live on
the client side is that if you start making those calls server side,
that's going to end up killing a server a lot more quickly than
pushing the work to the client. As web apps continue to become more
complex and interlinked, the client is going to have to do more from a
simple scalability standpoint, as server capacity will not increase at
the same rate as demand for interlinked web data.

Most of what we have planned for WRFS, at least in my mind, has always
been from a third-party app at runtime -- on the client. I think one
of the simple examples of this is how Google ads, an early "killer
web app", are pulled from Google at the client via JS -> HTML
injection, and not pulled at the server.
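
(To make that pattern concrete, a rough TypeScript sketch of the
ads-style flow: the server ships the page and closes the connection,
then the browser fetches the fragment and injects the HTML. The
endpoint URL and payload shape are hypothetical.)

  interface GraphFragment {
    html: string; // pre-rendered markup for this piece of the graph
  }

  // Runs in the browser after the page has already been served; the
  // origin server never blocks on this call.
  async function injectFragment(containerId: string, endpoint: string) {
    const container = document.getElementById(containerId);
    if (!container) return;
    const fragment: GraphFragment = await (await fetch(endpoint)).json();
    container.innerHTML = fragment.html; // JS -> HTML injection
  }

  window.addEventListener("load", () => {
    void injectFragment("social-graph", "https://graph.example.com/me/fragment");
  });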

Josh Patterson

Stephane Daury

Jan 26, 2008, 3:57:46 PM
to diso-p...@googlegroups.com

Hey Josh,

I wholeheartedly agree with you on the values of client-side
integration, but I would welcome parallel server side hooks to open
extended development avenues.

Think Digg API or CouchDB, using REST/JSON as an exchange mechanism,
allowing for both client and server side communications.

From a scalability perspective on the server side, I'd say XMPP or
even Facebook's own OSS Thrift (http://developers.facebook.com/thrift/)
could be elegantly leveraged to this end.
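
(As a rough sketch of what such a server-side hook could look like:
the same kind of REST/JSON endpoint a browser could hit, consumed from
a server process instead. The URL and response shape below are
hypothetical, loosely Digg-API-flavored; a modern Node runtime
provides the global fetch used here.)

  interface StoryList {
    stories: Array<{ id: string; title: string; diggs: number }>;
  }

  // Plain REST/JSON over HTTP, called from the server side.
  async function fetchTopStories(): Promise<StoryList> {
    const response = await fetch("https://services.example.com/stories/top?type=json");
    if (!response.ok) throw new Error(`upstream returned ${response.status}`);
    return (await response.json()) as StoryList;
  }

  fetchTopStories()
    .then((list) => console.log(`got ${list.stories.length} stories`))
    .catch((err) => console.error("server-side call failed:", err));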

---
Stephane Daury - http://tekartist.org/

Josh Patterson

Jan 26, 2008, 4:31:27 PM
to DiSo Project
I guess what I'm really getting at here is that, no matter the
protocol, if you expect the server to render a page composited
from multiple external sources (say, your social graph fragments from
3 different locations/servers), that server now has to make 3
separate socket calls for each XML/SOAP/REST/XMPP payload. That's 50ms
on average for one round trip (if we are hitting 3, then even with
overlap it might be more like 90ms?) during which the server is
holding those resources hostage and keeping the connection open to the
client who requested the original composited page render. That's bad
news --- 50ms is an eternity in computer time.

The alternative, which is my guess as to why they are pushing JS
libraries at least for now, is that the server can respond immediately
to the original request, dump the HTML into the response stream, close
the connection, and be done. The client then doesn't negotiate through
that server; it talks directly to the 3 servers holding the social
graph fragments.
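
(A minimal sketch of that client-side alternative, with the browser
overlapping the three round trips itself; the fragment URLs are made
up.)

  const FRAGMENT_SOURCES = [
    "https://serverA.example.com/graph/me",
    "https://serverB.example.com/graph/me",
    "https://serverC.example.com/graph/me",
  ];

  // Promise.all overlaps the three round trips, so total wall-clock
  // time approaches the slowest single call rather than the sum of all
  // three -- and the origin server closed its connection long ago.
  async function loadGraphFragments(): Promise<string[]> {
    const responses = await Promise.all(FRAGMENT_SOURCES.map((url) => fetch(url)));
    return Promise.all(responses.map((r) => r.text()));
  }

  loadGraphFragments().then((fragments) =>
    fragments.forEach((html, i) => console.log(`fragment ${i}: ${html.length} bytes`))
  );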

The other alternative is to have some sort of RSS-like periodic
server-side caching of your linked information, but I think that's a
dead end as well, since we are heading towards having a LOT more
information out there about us. Linked data is coming, and caching a
lot of it will not bode well for your rank-and-file web servers.

I guess what I'm trying to say is --- REST, JSON, SOAP --- whoever's
API, it doesn't matter. It's still about the resource economics of
your web servers, and how second-tier "external" calls will make your
"algorithm" scale more poorly. (Unless I'm missing some other tech?
But HTTP is HTTP, generally.)

Josh Patterson

On Jan 26, 3:57 pm, Stephane Daury <stephane.da...@gmail.com> wrote:
> Hey Josh,
>
> I wholeheartedly agree with you on the values of client-side
> integration, but I would welcome parallel server side hooks to open
> extended development avenues.
>
> Think Digg API or CouchDB, using REST/JSON as an exchange mechanism,
> allowing for both client and server side communications.
>
> From a scalability perspective on the server side, I'd say XMPP or
> even Facebook's own OSS Thrift (http://developers.facebook.com/thrift/)
> could be elegantly leveraged to this end.
>
> ---
> Stephane Daury - http://tekartist.org/

Josh Patterson

Jan 26, 2008, 4:39:03 PM
to DiSo Project
OK --- however, the one place the server-side stuff might be OK is if
you are running a personal web server and using it to run a blog or
something. Then the server (outside of being highly trafficked) could
afford to wait on the added resource hit. But if you were running,
say, a web app that had 10k users a day and was heavily dependent on
other sources for data, this is where your problems lie in terms of
your server's scalability.

For instance, on one of our development sites, one of our developers
made a web service call at the server to our SharePoint site to get
some bug list information and display it on the page (rendered at
the server) --- it killed the page's performance (render time went
from around 20ms to around 750ms in some cases, depending on how
sluggish SharePoint decided to be that day). That simply would not
hold up for our production application under any kind of load, but it
"worked" in terms of being a simple debug app (although I was less
than thrilled when I first saw it).

Josh

On Jan 26, 3:57 pm, Stephane Daury <stephane.da...@gmail.com> wrote:
> Hey Josh,
>
> I wholeheartedly agree with you on the values of client-side
> integration, but I would welcome parallel server side hooks to open
> extended development avenues.
>
> Think Digg API or CouchDB, using REST/JSON as an exchange mechanism,
> allowing for both client and server side communications.
>
> From a scalability perspective on the server side, I'd say XMPP or
> even Facebook's own OSS Thrift (http://developers.facebook.com/thrift/)
> could be elegantly leveraged to this end.
>
> ---
> Stephane Daury - http://tekartist.org/

Chris Messina

Jan 26, 2008, 5:31:54 PM
to diso-p...@googlegroups.com
On Jan 26, 2008 1:39 PM, Josh Patterson <jpatt...@floe.tv> wrote:
>
> OK --- however, the one place the server-side stuff might be OK is if
> you are running a personal web server and using it to run a blog or
> something. Then the server (outside of being highly trafficked) could
> afford to wait on the added resource hit. But if you were running,
> say, a web app that had 10k users a day and was heavily dependent on
> other sources for data, this is where your problems lie in terms of
> your server's scalability.

Well, this more closely resembles the DiSo model, where there are
bite-sized chunks of information, stored as references on one's own
blog, that are cobbled together at run-time and cached. JavaScript
can certainly be the delivery vehicle for such calls, and perhaps an
XMPP-over-HTTP stack (perhaps using JS?) would work for a lot of
basic tasks, like filling out a profile on demand or populating
one's personal newsfeed... Paging would probably be the greatest
challenge, but at least the initial build of pages would be fairly
simple, even when drawn from distributed sources.
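
(In the spirit of that XMPP-over-HTTP idea, a loose long-polling
sketch -- BOSH-flavored, not the actual XEP-0124 wire format; the
endpoint and payload here are hypothetical.)

  // The client holds an HTTP request open; the server answers only
  // when new activity arrives (or times out), so the newsfeed fills
  // in on demand.
  async function pollNewsfeed(endpoint: string, onItems: (items: string[]) => void) {
    for (;;) {
      const response = await fetch(endpoint, { method: "POST" });
      if (response.ok) {
        const items: string[] = await response.json();
        if (items.length > 0) onItems(items);
      } else {
        // Back off briefly on errors instead of hammering the server.
        await new Promise((resolve) => setTimeout(resolve, 5000));
      }
    }
  }

  void pollNewsfeed("https://idp.example.com/stream", (items) =>
    items.forEach((item) => console.log("new activity:", item))
  );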

I'm very eager to dive into the platform models of the Facebook JS
Client library [1] and OpenSocial's Shindig effort [2].

Chris

[1] http://wiki.developers.facebook.com/index.php/JavaScript_Client_Library
[2] http://svn.apache.org/repos/asf/incubator/shindig/trunk/

--
Chris Messina
Citizen-Participant &
Open Source Advocate-at-Large
Work: http://citizenagency.com
Blog: http://factoryjoe.com/blog
Cell: 412.225.1051
IM: factoryjoe
This email is: [ ] bloggable [X] ask first [ ] private

Stephane Daury

Jan 27, 2008, 6:41:17 PM
to diso-p...@googlegroups.com

On Jan 26, 2008, at 17:31, Chris Messina wrote:

> On Jan 26, 2008 1:39 PM, Josh Patterson <jpatt...@floe.tv> wrote:
>>
>> OK --- however, the one place the server-side stuff might be OK is if
>> you are running a personal web server and using it to run a blog or
>> something. Then the server (outside of being highly trafficked) could
>> afford to wait on the added resource hit. But if you were running,
>> say, a web app that had 10k users a day and was heavily dependent on
>> other sources for data, this is where your problems lie in terms of
>> your server's scalability.

For an app relying extensively on external data, all such APIs combine
both dynamic and immutable data objects, and therefore lend themselves
nicely to leveraging both real-time and queue/cache-based information
exchange. This is just like leveraging memcached to avoid hitting your
database too often.

With this in mind, it is technically more efficient/reliable for all
other parties if your server connects to the source once in a while
and caches what's current, rather than having each user fetch it
externally on every page load. An example is how Google suggests using
server-side caching for geo coordinates when integrating with the
Google Maps API, then making them available through JS to the client-
side mashup.

An activity stream is indeed very dynamic in nature from a
transactional point of view, but once something is committed to
"history", it theoretically will never change again (beyond being
deleted/restricted). So there's no real need to pull everything from
the external source, every time, for every user the site/app serves,
whether from the client or server side.
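
(A small sketch of that two-speed cache, memcached-style: committed
history is cached indefinitely, while the live head of the stream gets
a short TTL. The types and names are invented for illustration.)

  interface ActivityItem {
    id: string;
    committed: boolean; // true once the item is immutable "history"
    payload: string;
  }

  const cache = new Map<string, { item: ActivityItem; expires: number }>();
  const HEAD_TTL_MS = 60_000; // dynamic data: re-fetch after a minute

  function cachePut(item: ActivityItem): void {
    cache.set(item.id, {
      item,
      // History effectively never expires; live items expire quickly.
      expires: item.committed ? Number.POSITIVE_INFINITY : Date.now() + HEAD_TTL_MS,
    });
  }

  function cacheGet(id: string): ActivityItem | undefined {
    const entry = cache.get(id);
    if (!entry || entry.expires < Date.now()) return undefined;
    return entry.item;
  }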

>>
> Well, this more closely resembles the DiSo model, where there are
> bite-sized chunks of information, stored as references on one's own
> blog, that are cobbled together at run-time and cached. JavaScript
> can certainly be the delivery vehicle for such calls, and perhaps an
> XMPP-over-HTTP stack (perhaps using JS?) would work for a lot of
> basic tasks, like filling out a profile on demand or populating
> one's personal newsfeed... Paging would probably be the greatest
> challenge, but at least the initial build of pages would be fairly
> simple, even when drawn from distributed sources.

Yup, agreed.


> I'm very eager to dive into the platform models of the Facebook JS
> Client library [1] and OpenSocial's Shindig effort [2].
>
> Chris
>
> [1] http://wiki.developers.facebook.com/index.php/JavaScript_Client_Library
> [2] http://svn.apache.org/repos/asf/incubator/shindig/trunk/


Same here. :)

Stephane

Chris Messina

Jan 30, 2008, 2:58:15 AM
to diso-p...@googlegroups.com
On Jan 27, 2008 3:41 PM, Stephane Daury <stephan...@gmail.com> wrote:

> An activity stream is indeed very dynamic in nature from a
> transactional point of view, but once something is committed to
> "history", it theoretically will never change again (beyond being
> deleted/restricted). So there's no real need to pull everything from
> the external source, every time, for every user the site/app serves,
> whether from the client or server side.

One thing I'd like us to think about is how much "resolution" we need.
And therefore how much activity stream data we really need to hold on
to...

Put another way, we don't really need to store ALL remote data locally
(that would be insane). Instead, if we can grab data from many
services at the time of the request, we greatly limit the resources
required to make this system work. I don't know how this all scales,
but I do worry that we'll be grabbing all this data that people will
never see just because it's available.

I mean, when I don't check in to Twitter, I NEVER go back and see what
I missed (granted, I follow many people). I guess the point is, I let
a lot of data flow by me that doesn't need to be stored locally...
other people are different and do read every single update from their
friends and in their feed readers.

Do we have any ideas on this? Or how this scales on a person-to-person
basis? If we imagine social networks becoming distributed, what's the
likely average size of a personal social network? Five people? Thirty?
200?

Chris

Stephane Daury

Jan 30, 2008, 9:46:20 AM
to diso-p...@googlegroups.com

On Jan 30, 2008, at 2:58, Chris Messina wrote:

>
> On Jan 27, 2008 3:41 PM, Stephane Daury <stephan...@gmail.com>
> wrote:
>
>> An activity stream is indeed very dynamic in nature from a
>> transactional point of view, but once something is committed to
>> "history", it theoretically will never change again (beyond being
>> deleted/restricted). So there's no real need to pull everything from
>> the external source, every time, for every user the site/app serves,
>> whether from the client or server side.
>
> One thing I'd like us to think about is how much "resolution" we need.
> And therefore how much activity stream data we really need to hold on
> to...
>
> Put another way, we don't really need to store ALL remote data locally
> (that would be insane). Instead, if we can grab data from many
> services at the time of the request, we greatly limit the resources
> required to make this system work. I don't know how this all scales,
> but I do worry that we'll be grabbing all this data that people will
> never see just because it's available.
>
> I mean, when I don't check in to Twitter, I NEVER go back and see what
> I missed (granted, I follow many people). I guess the point is, I let
> a lot of data flow by me that doesn't need to be stored locally...
> other people are different and do read every single update from their
> friends and in their feed readers.

That makes total sense to me.

The data I'd personally like to accumulate ad nauseam, and please
correct me if I'm missing DiSo's point (n00b here), is my own
distributed social activity.

Basically, the equivalent of FB's mini-feed as opposed to the news feed.

When I join groups/communities or submit data (media, posts, etc.), I'd
really like this to follow the trackback model back to my central
install (blog, through my OpenID identity) to map out my life stream.
Similar to what Steve and you were mentioning in "Re: xmpp-based
bookmark sharing / twitter tool from Sam Ruby" in regards to
del.icio.us and Ma.gnolia.

Person-to-person activity is what I see as flowing freely with no
"memory" (storage) involved beyond the basics. Maybe a simple system
would let me flag (and potentially choose to share) the ones I do want
to archive out of that flow (i.e., port free-flowing objects to my
archived lifestream as a social event).
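
(Something like this sketch, where only flagged items get ported from
the ephemeral flow into the archive; all the names are invented.)

  interface FlowItem {
    id: string;
    actor: string;
    payload: string;
  }

  const lifestreamArchive: FlowItem[] = [];

  // Person-to-person activity flows past with no storage beyond what
  // the UI needs in the moment; only explicitly flagged items become
  // archived social events.
  function onFlowItem(item: FlowItem, flagged: boolean): void {
    if (flagged) lifestreamArchive.push(item);
  }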


> Do we have any ideas on this? Or how this scales on a person-to-person
> basis? If we imagine social networks becoming distributed, what's the
> likely average size of a personal social network? Five people? Thirty?
> 200?

This implies a much smaller overall data set and far fewer resources.

Am I disillusioned? Missing the point? ;)

Stephane


Chris Messina

Jan 30, 2008, 11:48:47 AM
to diso-p...@googlegroups.com
On Jan 30, 2008 6:46 AM, Stephane Daury <stephan...@gmail.com> wrote:

> The data I'd personally like to accumulate ad nauseam, and please
> correct me if I'm missing DiSo's point (n00b here), is my own
> distributed social activity.
>
> Basically, the equivalent of FB's mini-feed as opposed to the news feed.
>
> When I join groups/communities or submit data (media, posts, etc.), I'd
> really like this to follow the trackback model back to my central
> install (blog, through my OpenID identity) to map out my life stream.
> Similar to what Steve and you were mentioning in "Re: xmpp-based
> bookmark sharing / twitter tool from Sam Ruby" in regards to
> del.icio.us and Ma.gnolia.

Precisely!

In fact, I would maintain that the individual stores all the data
about her activities in a complete activity stream, either delegated
to a third-party site (like Facebook, should they decide to implement
support for incoming activities a la a generalized Beacon API) or
stored locally on her identity provider or personal DiSo site.

The personal activity stream should be shareable and republishable on
a per-item, per-site, and per-class basis (e.g., per item: "share this
book review publicly"; per site: "share all my Last.fm listens with
just my friends"; per class: "share all my blog-like posts with my
business colleagues").

If each individual maintains her own activity stream and decides which
activities are to be published, you could theoretically always page
through the data she has made available, either remotely from the
sources (as in, from the view of your own WordPress dashboard, which
would become, in effect, your personal newsfeed) or at the sources
themselves (say I want to see what Steve's been up to in the past 36
hours; rather than staying on my own site, I'd visit his redmonk.net
activity stream and sign in with my OpenID to see all the updates he's
chosen to share with me).

This model is essentially what I've been thinking about all along, but
this articulation helps advance the conversation, especially as it is
in line with Joseph Smarr's concept of individuals maintaining their
identifiers and services publishing friends lists in open formats.
With a clear delineation of responsibilities (in this case, the
individual is responsible for storing and publishing her own activity
stream, and the DiSo model is to aggregate, at runtime, the most
recent activities from your friends), I think we can begin to design
the system right away, with less concern about massive storage demands
put on the individual.

How does this sound?

Stephane Daury

Jan 30, 2008, 2:42:06 PM
to diso-p...@googlegroups.com

On Jan 30, 2008, at 11:48, Chris Messina wrote:

> With a clear delineation of responsibilities (in this case, the
> individual is responsible for storing and publishing her own activity
> stream, and the DiSo model is to aggregate, at runtime, the most
> recent activities from your friends), I think we can begin to design
> the system right away, with less concern about massive storage
> demands put on the individual.
>
> How does this sound?

That not only sounds great to me, it sounds viable, which is always a
great place to be. :)

Stephane

