[crisiscamppdx] Fwd: Early look at Annotations

M. Edward (Ed) Borasky

Apr 17, 2010, 4:32:31 PM

to crisis...@googlegroups.com, crisis...@googlegroups.com, tweak-t...@googlegroups.com, crisis...@googlegroups.com, swift...@googlegroups.com

For those of you who didn't make it to the Chirp Twitter developers'
conference, one of the exciting new capabilities Twitter is building
is called "annotations". The "engineering discussion" is taking place
on the Twitter Development Talk Google Group if any of you want to
join in. It looks to me like this is a golden opportunity for quite a
few of the Twitter-related "force for good" projects. Here's the
announcement:

---------- Forwarded message ----------
From: Marcel Molina <mar...@twitter.com>
Date: Apr 16, 10:54 am
Subject: Early look at Annotations
To: Twitter Development Talk

Hey everyone. One of the things we talked about at Chirp is the new
Annotations feature we're working on. In short, it allows you to
annotate a tweet with structured metadata. We're still working on
Annotations, but I wanted to share with a wider audience beyond those I
was able to talk to in person at Chirp about how we're thinking of
doing Annotations.

* What is an annotation, more exactly?

First off let's be clearer about what an annotation is. An annotation
is a namespace, key, value triple. A tweet can have one or more
annotations. Namespaces can have one or more key/value pairs.
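
As a rough illustration of the triple described above, here is a minimal
Python sketch. The class name `Annotation`, the helper
`group_by_namespace` and the example.com URL are made up for this
example; nothing here is part of the proposed API.

```python
from collections import defaultdict
from typing import NamedTuple


class Annotation(NamedTuple):
    """One (namespace, key, value) triple. The class name is illustrative."""
    namespace: str
    key: str
    value: str


def group_by_namespace(annotations):
    """Collect triples into a {namespace: {key: value}} mapping."""
    grouped = defaultdict(dict)
    for ann in annotations:
        grouped[ann.namespace][ann.key] = ann.value
    return dict(grouped)


# Two annotations on one tweet, each in its own namespace.
annotations = [
    Annotation("iso", "isbn", "030759243X"),
    Annotation("amazon", "url", "http://example.com/book"),  # placeholder URL
]
grouped = group_by_namespace(annotations)
```

Grouping by namespace gives the same nested shape as the JSON payload
sketched later in the announcement.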

* How do I specify what annotations a tweet should have?

Annotations are specified for a tweet when the tweet is created. When
submitting a POST to /statuses/update, you'll include an "annotations"
parameter with your annotations. We're thinking we'll provide two
mechanisms for specifying what a tweet's annotations are:

1. JSON
2. form encoded parameters
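
A hedged sketch of what the JSON mechanism might look like from the
client side. The announcement only names the "annotations" parameter;
carrying it as a JSON string inside a form-encoded POST body is an
assumption here, and `build_update_body` is a hypothetical helper. No
request is actually sent.

```python
import json
from urllib.parse import parse_qs, urlencode


def build_update_body(status, annotations):
    """Build a form-encoded body for POST /statuses/update.

    Assumption: the "annotations" parameter holds a JSON string of the
    form {namespace: {key: value}}. The final wire format was not fixed
    in the announcement.
    """
    return urlencode({
        "status": status,
        "annotations": json.dumps(annotations, separators=(",", ":")),
    })


body = build_update_body(
    "Just got a new book in the mail.",
    {"iso": {"isbn": "030759243X"}},
)

# Decoding the body again shows what a server would receive.
fields = parse_qs(body)
decoded = json.loads(fields["annotations"][0])
```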

* How big can an annotation be and how many annotations can I attach to
a tweet?

There is no limit on the size of any given namespace, key or value but
the entire set of all annotations for a given tweet can not exceed some
fixed byte size. That size isn't set in stone yet. We will be starting
small (probably 512 bytes) and growing it gradually as we incrementally
roll out the feature so we can gauge its scalability at various sizes.
We'd like to (no promises) have it end up around 2K. How you use that
2K is up to you. You can attach one honking annotation, or a thousand+
tiny ones. You can attach one namespace with hundreds of key/value
pairs, or hundreds of namespaces with just one key/value pair. We want
to keep things as flexible and open ended as possible.
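
A client could check the budget before posting. The sketch below
assumes the limit is measured on the UTF-8 bytes of the compact JSON
serialization; the announcement does not say how the bytes will be
counted, so treat `annotations_size` and `fits_budget` as hypothetical
helpers.

```python
import json


def annotations_size(annotations):
    """Return the UTF-8 byte length of the compact JSON form.

    Assumption: the server measures the serialized payload. The real
    accounting rule was not specified.
    """
    text = json.dumps(annotations, separators=(",", ":"), ensure_ascii=False)
    return len(text.encode("utf-8"))


def fits_budget(annotations, limit=512):
    """True if the whole annotation set fits in `limit` bytes."""
    return annotations_size(annotations) <= limit


small = {"iso": {"isbn": "030759243X"}}
large = {"notes": {"body": "x" * 600}}

small_ok = fits_budget(small)             # well under 512 bytes
large_ok = fits_budget(large)             # over the initial 512-byte limit
large_ok_2k = fits_budget(large, 2048)    # fits if the limit grows to 2K
```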

* What kind of data can go into an annotation?

We'd like to allow for any arbitrary data to be stored in an
annotation. Arbitrary Unicode? Sure. MIDI? Go for it. Emoji? Yes
please! There might be some tricky edge cases though. Skip the rest of
this paragraph if you don't care about the details of edge cases... For
one, since these annotations will be serialized to, among other
formats, XML, and we'd like to keep the XML succinct, the namespace and
key components of an annotation triple would likely be an XML tag with
its value as, well, its value. If that's the case then the data of the
key must be a valid XML tag. This greatly limits what it can contain
(not even spaces for example). If allowing all three elements of the
triple to contain any arbitrary data is more important than a succinct
XML payload then we'll design a more verbose XML payload. Up to you all
really. I've included examples of both options below. Make a case for
another proposal if you have strong opinions.
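
To make the edge case concrete, this sketch tests whether a namespace
or key could double as an XML tag name. The pattern is a deliberately
conservative ASCII-only approximation of the XML Name rule (the real
rule also admits many non-ASCII characters and the colon), and
`is_xml_tag_safe` is a hypothetical helper.

```python
import re

# ASCII-only subset of the XML Name production: a letter or underscore,
# followed by letters, digits, underscores, hyphens or periods.
_XML_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_.\-]*")


def is_xml_tag_safe(name):
    """True if `name` could be used directly as an XML element name."""
    return _XML_NAME.fullmatch(name) is not None


checks = {
    "isbn": is_xml_tag_safe("isbn"),              # plain identifier
    "book title": is_xml_tag_safe("book title"),  # contains a space
    "1st": is_xml_tag_safe("1st"),                # starts with a digit
    "": is_xml_tag_safe(""),                      # empty
}
```

Under the succinct XML option, only names passing a check like this
could be used; the verbose option would place no such restriction on
them.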

* What constitutes a valid annotation?

Aside from the size and data type restrictions listed above, another
requirement is that namespaces and keys be non-empty values. Values, on
the other hand, may be empty. In this way the namespace/key pair can be
treated like a flag of sorts. It should be noted: I'd encourage
everyone to always think of a namespace as a namespace, to think of a
key as a key and to think of a value as a value. Don't take the fact
that a value can be empty to mean that you can skip out on the whole
namespace thing and morph the namespace into a key and the key into a
value. While open-endedness and flexibility are qualities of the
Annotations feature that I'm most excited about for the developer
community, this kind of approach seems prone to causing confusion by
undermining namespaces.
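
The validity rule above is small enough to sketch directly. The
function name and the "crisis"/"verified" flag are invented for
illustration only.

```python
def is_valid_annotation(namespace, key, value=""):
    """Namespace and key must be non-empty; the value may be empty."""
    return bool(namespace) and bool(key) and value is not None


# An empty value turns the namespace/key pair into a flag.
flag_ok = is_valid_annotation("crisis", "verified", "")

# Dropping the namespace (or the key) is not allowed.
no_namespace = is_valid_annotation("", "verified", "yes")
no_key = is_valid_annotation("crisis", "", "yes")
```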

* What namespaces can I write to? What namespaces can I read from?

Anyone can write to or read from any namespace. We aren't planning on
enforcing any policy that restricts someone else from adding an
annotation with "your" namespace or seeing annotations only if they are
logged in with a certain account. In the absence of some really
compelling reason to do that, we want to err on the side of making this
feature as flexible and open ended as possible. Namespaces aren't
intended as a way for people to claim their little slice of the tweet
space. Rather they are intended to dramatically increase the possible
significance of a given key/value pair. If you want a given key to mean
one thing and someone else wants that same key to mean something else,
and someone else still wants another meaning, consumers of your
annotations are put in a tricky spot trying to figure out how to
interpret a given annotation without the disambiguation of a namespace.

* How do we consume annotations?

For convenience, we plan on including annotations for a tweet directly
embedded into that tweet's payload. The XML payload of a tweet I just
inspected at random came out to about 2K in size. The "worst case"
annotation would a little more than double that payload to probably
about 5k. We're erring on the side of thinking that the moderate
increase in payload size for tweets with annotations, even on slow
connections, is both more convenient and faster than the latency and
inconvenience incurred by adding another HTTP round trip. Though we'd
like to provide an embedded and non embedded option, the maintenance
cost and fragment cache space increase makes supporting both likely
unrealistic so we're going with what we think satisfies the 80% case.
Push back as appropriate.

* What will the payloads look like?

This isn't final. The payloads could end up wildly different after we
noodle around in things like RDF and the semantic web's literature and
all that kind of stuff. You can't see me but my hands are waving
vigorously.

Given a hypothetical tweet, "Just got 'Although Of Course You End Up
Becoming Yourself' in the mail. Hopeful. Heart broken."

JSON

"annotations":
{
  "iso":
  {
    "isbn": "030759243X"
  },
  "amazon":
  {
    "url": "http://www.amazon.com/Although-Course-You-Becoming-Yourself/dp/030759..."
  }
}

XML option #1 which is succinct but restricts the possible values of
namespaces and keys

<annotations>
  <iso>
    <isbn>030759243X</isbn>
  </iso>
  <amazon>
    <url>http://www.amazon.com/Although-Course-You-Becoming-Yourself/dp/030759...</url>
  </amazon>
</annotations>

XML option #2 which is more verbose but allows for namespaces and keys
to contain arbitrary data

<annotations>
  <annotation>
    <namespace>iso</namespace>
    <key>isbn</key>
    <value>030759243X</value>
  </annotation>
  <annotation>
    <namespace>amazon</namespace>
    <key>url</key>
    <value>http://www.amazon.com/Although-Course-You-Becoming-Yourself/dp/030759...</value>
  </annotation>
</annotations>

If we went with XML option #2 it may or may not be a problem that it
isn't "symmetrical" with the JSON representation. On the other hand,
JSON and XML tend to be culturally at opposite sides of the Pithiness
Spectrum.
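
For anyone who wants to compare the two XML shapes side by side, here
is a small sketch using Python's standard xml.etree.ElementTree. The
function names are made up, and the element layout simply mirrors the
two draft payloads above, which were explicitly not final.

```python
import xml.etree.ElementTree as ET


def to_xml_succinct(annotations):
    """Option #1: namespaces and keys become element names."""
    root = ET.Element("annotations")
    for namespace, pairs in annotations.items():
        ns_el = ET.SubElement(root, namespace)
        for key, value in pairs.items():
            ET.SubElement(ns_el, key).text = value
    return ET.tostring(root, encoding="unicode")


def to_xml_verbose(annotations):
    """Option #2: one <annotation> element per triple."""
    root = ET.Element("annotations")
    for namespace, pairs in annotations.items():
        for key, value in pairs.items():
            ann = ET.SubElement(root, "annotation")
            ET.SubElement(ann, "namespace").text = namespace
            ET.SubElement(ann, "key").text = key
            ET.SubElement(ann, "value").text = value
    return ET.tostring(root, encoding="unicode")


data = {"iso": {"isbn": "030759243X"}}
succinct = to_xml_succinct(data)
verbose = to_xml_verbose(data)
```

Note that ElementTree does not validate element names, so option #1
would silently emit malformed XML for a key such as "book title",
whereas option #2 carries the same key as escaped text.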

* Can I add annotations to a tweet after the tweet has been created?

No. Like the text of a tweet, its annotations are also immutable. They
can only be specified when the tweet they are being attached to is
created. For talking purposes, though, if you want to add annotations
to a tweet after the fact, you could retweet the original tweet and
attach annotations to the retweet.

* Ok, great. What should I use annotations for though?

We don't know! That's the cool thing. Annotations are a blank slate
that lend themselves to myriad divergent use cases. We want to provide
open-ended utility for all the developers to innovate on top of. Some
of us have initial ideas of cool potential use cases that I'm sure
we'll start to share just to seed the conversation as we get closer to
launch. Developers will experiment with annotations. Certain ideas and
approaches will catch on. Certain annotations will become standards
democratically because everyone agrees. Some might have diverging
opinions. It's something that we hope will grow organically and be
driven by sociological and cultural forces.

* Ok, great. How are we going to figure out what Joe Random's
annotations actually mean?

That's something we need to figure out as a community. But here is an
early idea: People could add some agreed upon "meta-annotation" that
points to something which *describes* the annotation or annotations
that person is using. Think something sort of like XML DTD, though not
necessarily machine readable. This meta annotation could point to a URL
that simply has an HTML document that gives a description with some
examples of the various annotations you're experimenting with or
standardizing on.

* Will it be in search? Streaming? Mobile? My toaster?

We hope so! When we launch you will at minimum be able to attach
annotations to a tweet and consume annotations from a tweet's payload
via the REST API. Of course it would be awesome to be able to say to
search or the streaming API, "give me all tweets with this namespace",
or "give me all tweets with this namespace and key", etc. We're working
with the Search, Streaming and other teams to make all this happen. We
can't promise it'll be ready by launch but we know it's killer and a
must have and are trying to get it ready soon.
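
Until search or streaming support exists, the same questions could be
answered client-side over payloads fetched from the REST API. This is
only a sketch: the tweet dictionaries below are a guess at the eventual
JSON shape, and `has_annotation` is a hypothetical helper.

```python
def has_annotation(tweet, namespace, key=None):
    """True if the tweet carries the namespace (and key, if given)."""
    pairs = tweet.get("annotations", {}).get(namespace)
    if pairs is None:
        return False
    return key is None or key in pairs


# Hypothetical payloads; real tweets would have many more fields.
tweets = [
    {"text": "Got the book.", "annotations": {"iso": {"isbn": "030759243X"}}},
    {"text": "No metadata here."},
    {"text": "A link.", "annotations": {"amazon": {"url": "http://example.com"}}},
]

# "Give me all tweets with this namespace"
with_iso = [t["text"] for t in tweets if has_annotation(t, "iso")]

# "Give me all tweets with this namespace and key"
with_amazon_url = [t["text"] for t in tweets if has_annotation(t, "amazon", "url")]
```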

* When is it going to launch?

This is, pretty much, the only thing a couple of us are going to be
working on until it's launched. We really can't wait to get it in your
hands to see all the cool things you'll do with it, so we're cranking
to get it out as soon as possible. If I had to provide a guesstimate,
I'd wave my hands in the direction of 2 months for an early,
incremental roll out. We not only need to implement all the
functionality, but we also need to productionize it in a measured and
responsible way to ensure its quality of service is high.

In closing:

We're really excited about Annotations. Annotations mark one of our
first of ...

--
Subscription settings: http://groups.google.com/group/crisiscamppdx/subscribe?hl=en

M. Edward (Ed) Borasky

Apr 18, 2010, 10:57:48 PM

to twitter-deve...@googlegroups.com, crisis...@googlegroups.com, twitte...@googlegroups.com, tweak-t...@googlegroups.com, crisis...@googlegroups.com, swift...@googlegroups.com

On 04/16/2010 10:54 AM, Marcel Molina wrote:
> Hey everyone. One of the things we talked about at Chirp is the new
> Annotations feature we're working on. In short, it allows you to annotate a
> tweet with structured metadata. We're still working on Annotations, but I
> wanted to share with a wider audience beyond those I was able to talk to in
> person at Chirp about how we're thinking of doing Annotations.

[snip]

Marshall Kirkpatrick, a noted Portland tech reporter, RSS geek and
ReadWriteWeb lead writer, has posted a request for discussion of this
topic on his blog at

http://marshallk.com/help-me-articulate-the-potential-of-twitters-annotations

Consider yourselves invited to join in. ;-)
