Hi Kartik,
Sorry for the slow response; there's a lot going on here. What
you're doing sounds very exciting and like a good fit for Fluidinfo.
On Thu, Jul 26, 2012 at 1:24 PM, Kartik Subbarao
<kartik....@gmail.com> wrote:
> I've read up on the fluidinfo data model and played around with the
> fluidinfo explorer and some of the blog examples, but I'm still fairly
> new to the environment. My general sense is that I want to maximize
> the use of the extensive tagging/namespace capabilities, and minimize
> the amount of content that I have to manage as opaque text strings.
>
> Would it be possible to request an application account for this
> project? (I guess in the meantime I could create an okfn.org/annotator
> namespace under my account and add the tags there).
You can create a domain user using the new user form:
https://fluidinfo.com/accounts/new/
You'll be asked to make a file accessible via your domain. During the
account confirmation process Fluidinfo will attempt to download this
file to verify that you control the domain. If you have any issues,
please let me know. That said, using a namespace for testing should
be workable.
> At first glance, I think I need to deal with two types of objects:
>
> 1) webpage objects. Here, I'd like to tap into what's already there.
> Is there a conventional tag name for the URL of a webpage object (url?
> uri?)
We use the URL in our own applications. In the case of browsers, we
use the 'document.location' value (without any sanitization) as the
about value for the object. I guess you know this, but in FluidDB all
objects have a UUID that uniquely represents them. They also
(optionally) have a fluiddb/about value, which can only be set once
when the object is created. It's a string that uniquely identifies
the object. So we say that the object for a webpage is the one where
fluiddb/about is the URL of that page.
A subtle detail about this approach is that the about values
'http://fluidinfo.com' and 'http://fluidinfo.com/' refer to different
objects.
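To make the trailing-slash point concrete, here is a minimal Python
sketch. The helper name about_path is made up for illustration, and the
/about/<value> path layout is my assumption about how an object is
addressed by its about value, so please check it against the API docs.
It makes no network calls; it only shows that the two strings never
compare equal because nothing normalizes them.

```python
from urllib.parse import quote


def about_path(about):
    """Build an assumed /about/<value> path for an about value.

    No normalization is applied, so every character of the value matters.
    """
    # safe="" percent-encodes ':' and '/' so the value is one path segment.
    return "/about/" + quote(about, safe="")


without_slash = about_path("http://fluidinfo.com")
with_slash = about_path("http://fluidinfo.com/")

print(without_slash)
print(with_slash)
print(without_slash == with_slash)
```

If Annotator needs both forms to land on the same object, it would have
to pick one form itself before using the URL as an about value.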
> 2) annotation objects. I'd be creating these objects and adding tags
> to them based on the JSON format above.
>
> I'm thinking that I may want to add a tag to webpage objects that has
> a set of all of the annotation object IDs that reference it.
You can do that, but you'll quickly run into race conditions and
possibly lose data (the last writer wins in these kinds of updates). It's
better to use a tag on the annotation object to refer to the web
page object. For example, you might have an
'okfn.org/annotator/related-url' tag whose value is the about value of
the web page object the annotation is about.
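As a rough illustration of the lost update, here is a small in-memory
Python sketch. Nothing in it talks to Fluidinfo; the lists and
dictionaries are stand-ins, and the annotation IDs 'a1' to 'a3' are
invented for the example.

```python
# Layout A: one tag on the web page object holds all annotation IDs.
page_tag = ["a1"]

# Two clients read the current value at about the same time...
seen_by_client_1 = list(page_tag)
seen_by_client_2 = list(page_tag)

# ...then each appends its own ID and writes the whole list back.
page_tag = seen_by_client_1 + ["a2"]
page_tag = seen_by_client_2 + ["a3"]  # last writer wins, "a2" is lost

# Layout B: each annotation object carries its own pointer to the page,
# so writers never touch a shared value.
page = "http://fluidinfo.com"
annotations = {
    "a1": {"related-url": page},
    "a2": {"related-url": page},
    "a3": {"related-url": page},
}
for_page = sorted(
    name for name, tags in annotations.items() if tags["related-url"] == page
)

print(page_tag)
print(for_page)
```

With the first layout the page ends up knowing about two annotations;
with the second, all three are found by looking at the annotations
themselves.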
You can then fetch all the annotations with a query like:
okfn.org/annotator/related-url = "http://fluidinfo.com"
And you'll get all the annotations for the 'http://fluidinfo.com'
object. Note, this convention is described here:
http://fluidinfo.com/cookbook/#related
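In case it helps, this is roughly how that query could be turned into a
request URL from Python. Treat it as a sketch under assumptions: the
API_ROOT host, the use of GET /values with 'query' and 'tag' parameters,
and the 'okfn.org/annotator/text' and 'okfn.org/annotator/quote' tag
names are mine, not something settled in this thread. It only builds and
decodes the URL and does not send anything.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

API_ROOT = "https://fluiddb.fluidinfo.com"  # assumed API host


def annotations_query_url(page_about, tags):
    """Build a /values URL selecting annotations that point at a page."""
    # Note: a page_about containing a double quote would need escaping.
    query = 'okfn.org/annotator/related-url = "%s"' % page_about
    # One 'tag' parameter per tag whose value should be returned.
    params = [("query", query)] + [("tag", tag) for tag in tags]
    return API_ROOT + "/values?" + urlencode(params)


url = annotations_query_url(
    "http://fluidinfo.com",
    ["okfn.org/annotator/text", "okfn.org/annotator/quote"],  # hypothetical
)
decoded = parse_qs(urlsplit(url).query)

print(url)
print(decoded["query"][0])
```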
> Does this approach make sense? Are there better options? If any of you
> could take a few minutes to look at the annotation format and storage
> interface, and recommend any particularly fluidinfo-savvy approaches
> to store the annotation-related information, I'd greatly appreciate
> it!
It looks like converting the JSON format for annotations into FluidDB
tags is going to be pretty straightforward. You can use namespaces
and tags to achieve basically the same layout and you can write these
with a single request by using the /values API:
http://api.fluidinfo.com/html/api.html#values_PUT
I suspect you'll be relying on /values a fair amount, since you can
perform batch operations with it. We've found that network latency
dominates the time it takes to perform most requests, so trying to
keep network calls to a minimum is encouraged.
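Here is a hedged Python sketch of what the conversion could look like.
The annotation fields, the 'annotation:1234' about value and the
flatten helper are all invented for illustration. The payload shape (a
'queries' list of [query, {tag: {"value": ...}}] pairs) is how I would
expect a /values PUT body to look, but please confirm it against the
documentation linked above before relying on it. The code only builds
the JSON body; it doesn't send it.

```python
import json


def flatten(data, prefix):
    """Map nested JSON keys to tag paths under the given namespace."""
    tags = {}
    for key, value in data.items():
        path = prefix + "/" + key
        if isinstance(value, dict):
            # Nested objects become nested namespaces.
            tags.update(flatten(value, path))
        else:
            tags[path] = {"value": value}
    return tags


# Hypothetical annotation, not the real Annotator format.
annotation = {
    "text": "Nice point",
    "quote": "all objects have a UUID",
    "related-url": "http://fluidinfo.com",
    "user": {"name": "kartik"},
}

tags = flatten(annotation, "okfn.org/annotator")
# One query selects the annotation object; all its tags go in one request.
payload = {"queries": [['fluiddb/about = "annotation:1234"', tags]]}

print(json.dumps(payload, indent=2, sort_keys=True))
```

Several annotations can go into the same 'queries' list, which is where
the saving in network round trips would come from.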
Another issue is authentication. Are you going to store all
user data using an okfn.org tag? If that's the case, will calls from
a user's browser go through a proxy of some kind (i.e., to hide that
user's credentials)?
So far we've been talking about storing comments as tag values (along
with the other metadata that goes with the comment text), which is
generally a good approach. We've recently been working on an
application called loveme.do which aggregates comments from social
networks about hashtags and URLs and presents them in a nice uniform
way. For example,
http://loveme.do/about/%23bigdata
Probably most of the comments you see there will be from Twitter, but
they can also come from Disqus, Tumblr and Facebook (and in the
future, other services). It could be interesting for Annotator to be
one of the services that's pushing comments into Fluidinfo. In that
case, we create an object for each comment and we use a custom (so far
internal) API for creating them (which does some special analysis and
linking of the content). If this is interesting we could talk more
about how that API works and you could add your additional metadata to
the comment objects created by that API. A way to think about this is
that objects have two kinds of data: structured tag/value and a
comment stream.
Anyway, I hope this is useful and at least gives you the ability to
ask more questions to get closer to what you need to know. Please ask
questions. Also, if you hop into #fluidinfo on Freenode, we can talk
directly.
Thanks,
J.