Status IDs are changing on 21st September

Matt Harris

Aug 23, 2010, 6:45:38 PM

to twitter-deve...@googlegroups.com, twitter-ap...@googlegroups.com

Hey Developers!

A while ago we let you know about the new Tweet ID generation service
we developed called Snowflake and published the source code so you
could get familiar with how it works. Today, we're announcing that at
10am PDT on Tuesday September 21st, 2010 Snowflake will be in use on
our production systems and that status IDs will no longer be
sequential.

Snowflake still uses 64-bit unsigned integers, but instead of being
sequential the IDs will be based on time and composed of a
timestamp, a worker number and a sequence number. For the majority of
you this change will go unnoticed and your applications will continue
to function without the need for any changes. In addition, the API is
ready for Snowflake and parameters such as max_id and since_id will
work as expected. Snowflake does mean Tweet IDs will no longer be
useful for data analysis, and things like counting Tweets by
subtracting status IDs will not be possible.
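
As a rough illustration of an ID built from a timestamp, a worker
number and a sequence number, here is a minimal Python sketch. The bit
widths (41/10/12) are an assumption taken from the open-source
Snowflake project, not something stated in this announcement, and the
names compose_id and decompose_id are hypothetical. The timestamp is
treated as milliseconds since some custom epoch that is left
unspecified here.

```python
# Assumed layout (not stated in this thread): 41 bits of millisecond
# timestamp, 10 bits of worker number, 12 bits of sequence number.
TIMESTAMP_BITS = 41
WORKER_BITS = 10
SEQUENCE_BITS = 12

WORKER_SHIFT = SEQUENCE_BITS
TIMESTAMP_SHIFT = SEQUENCE_BITS + WORKER_BITS


def compose_id(timestamp_ms, worker, sequence):
    """Pack the three fields into one integer, timestamp in the high bits."""
    if not 0 <= timestamp_ms < (1 << TIMESTAMP_BITS):
        raise ValueError("timestamp out of range")
    if not 0 <= worker < (1 << WORKER_BITS):
        raise ValueError("worker out of range")
    if not 0 <= sequence < (1 << SEQUENCE_BITS):
        raise ValueError("sequence out of range")
    return (timestamp_ms << TIMESTAMP_SHIFT) | (worker << WORKER_SHIFT) | sequence


def decompose_id(status_id):
    """Split an ID back into (timestamp_ms, worker, sequence)."""
    sequence = status_id & ((1 << SEQUENCE_BITS) - 1)
    worker = (status_id >> WORKER_SHIFT) & ((1 << WORKER_BITS) - 1)
    timestamp_ms = status_id >> TIMESTAMP_SHIFT
    return timestamp_ms, worker, sequence
```

Because the timestamp sits in the high bits, an ID from a later
millisecond is always larger than one from an earlier millisecond,
whatever the worker and sequence values are. It also shows why
subtracting two IDs no longer counts Tweets: under this layout, IDs one
millisecond apart differ by about four million even if nothing was
posted in between.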

We listened when you told us about sorting Tweets by ID and knew that
we needed to keep the ID roughly sortable. With Snowflake if two
Tweets are posted within 1 second of each other they will be within a
second of each other in the ID space too. This means although Tweets
will no longer be sorted, they will be k-sorted to approximately 1
second.
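
To make "k-sorted to approximately 1 second" concrete, here is a small
Python sketch. The record shape (a dict with "id" and "created_at" in
seconds) and the function names are hypothetical, and the sample values
are made up for illustration. It sorts by ID and then checks that no
Tweet appears more than the tolerance out of chronological order.

```python
def sort_by_id(tweets):
    """Return the tweets ordered by status ID (ascending)."""
    return sorted(tweets, key=lambda t: t["id"])


def is_k_sorted(tweets, tolerance_s=1.0):
    """True if ID order matches time order to within tolerance_s seconds."""
    latest_seen = float("-inf")
    for tweet in sort_by_id(tweets):
        # A tweet may be slightly older than one with a lower ID,
        # but never by more than the tolerance.
        if tweet["created_at"] < latest_seen - tolerance_s:
            return False
        latest_seen = max(latest_seen, tweet["created_at"])
    return True


# Made-up sample: IDs 101 and 102 are swapped in time by 0.2 seconds.
sample = [
    {"id": 103, "created_at": 11.5},
    {"id": 101, "created_at": 10.4},
    {"id": 100, "created_at": 10.0},
    {"id": 102, "created_at": 10.2},
]
```

With this sample, is_k_sorted(sample) holds at the default 1 second
tolerance but not at a tolerance of 0.1 seconds, which is the kind of
small reordering the announcement describes.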

The key points:
* Status IDs will be unique
* Status IDs will continue to increase - Tweets created later in the
day will have a higher ID than those created in the morning
* Order will be maintained for Tweets allowing you to sort by Status
ID. The accuracy of the sort will be to approximately 1 second,
meaning Tweets created within a second of each other have no
guaranteed order.
* All existing API methods will continue to work the same as before
* Previous status IDs will be unchanged
* There will be a noticeable jump in the numerical value of status IDs
when we change.

You can read more about Snowflake on the Twitter Engineering blog:
http://bit.ly/announcing-snowflake

Best

Matt Harris
Developer Advocate, Twitter
http://twitter.com/themattharris

D. Smith

Aug 23, 2010, 11:03:55 PM

to Twitter Development Talk

Another one hits the MySQL brick wall.
I'm surprised someone with as much data as you have has managed to
stay with MySQL for as long as you have.
It must have been a real pain to constantly fight the losing MySQL
optimization battle.

It would be very interesting to know what made you choose Cassandra
over other NoSQL solutions.
I hope you will post a nice blog post about this: why did you choose
Cassandra? What alternatives did you consider?

John Kalucki

Aug 23, 2010, 11:17:16 PM

to twitter-deve...@googlegroups.com

We're not using Cassandra to store tweets just yet. See: http://engineering.twitter.com/2010/07/cassandra-at-twitter-today.html

I don't think we've announced our approach for tweet storage as yet.

-John Kalucki

http://twitter.com/jkalucki

Twitter, Inc.

D. Smith

Aug 25, 2010, 9:59:10 PM

to Twitter Development Talk

Ok, so what column type should we make the status_id now in MySQL?

By the way, you are not planning to also change format or user_id, are
you?

On Aug 23, 11:17 pm, John Kalucki <j...@twitter.com> wrote:
> We're not using Cassandra to store tweets just yet. See: http://engineering.twitter.com/2010/07/cassandra-at-twitter-today.html
>
> I don't think we've announced our approach for tweet storage as yet.
>
> -John Kalucki
> http://twitter.com/jkalucki

John Kalucki

Aug 25, 2010, 10:08:01 PM

to twitter-deve...@googlegroups.com

It'll still be a long int.

I don't know what format is. User_id generation will, someday, be similar.

-John Kalucki

http://twitter.com/jkalucki

Twitter, Inc.


zn...@borasky-research.net

Aug 25, 2010, 10:30:19 PM

to John Kalucki, twitter-deve...@googlegroups.com

2**64 - 1 users? Can you say "Internet of things?" ;-)


Spode

Aug 26, 2010, 6:58:19 AM

to Twitter Development Talk

I must admit, I'm a little confused on some aspects of this.

Will the new IDs continue on from the old IDs sequentially? Or will
they be completely incompatible with the old IDs?

I have a database of several million tweets that JournoTwit users use.
I don't want to have to start differentiating between two ID types and/
or having to completely clear the database out.

What I haven't seen amongst any of this documentation - is an example
of the new status ids in comparison to the old? That would probably
answer a few questions :D And I assume - Direct Messages will be
undergoing the same transformation?

Spode

Matt Harris

Aug 26, 2010, 9:51:32 PM

to twitter-deve...@googlegroups.com

Hi Spode,

Thanks for your questions, I've answered them inline.

> Will the new IDs continue on from the old IDs sequentially? Or will
> they be completely incompatible with the old IDs?

All existing IDs will stay the same. The new IDs will be greater than
the old ones and there is likely to be a gap between the old and new.
I'm not sure what you mean by being incompatible though - an ID is a
unique identifier for an object and the new IDs will continue to be
unique identifiers.

> I have a database of several million tweets that JournoTwit users use.
> I don't want to have to start differentiating between two ID types and/
> or having to completely clear the database out.

You won't have to clear your database out or change the datatype.
The new status IDs are still 64-bit integers, with newer Tweets having
numerically higher IDs.
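
For anyone who wants to sanity-check stored values against a 64-bit
column, a minimal Python sketch follows. The helper names are
hypothetical, and whether your column is signed or unsigned depends on
your own schema, which this thread does not specify.

```python
# Largest values representable in signed and unsigned 64-bit integers.
SIGNED_64_MAX = 2 ** 63 - 1
UNSIGNED_64_MAX = 2 ** 64 - 1


def fits_signed_64(status_id):
    """True if a non-negative ID fits in a signed 64-bit integer."""
    return 0 <= status_id <= SIGNED_64_MAX


def fits_unsigned_64(status_id):
    """True if the ID fits in an unsigned 64-bit integer."""
    return 0 <= status_id <= UNSIGNED_64_MAX
```

An ID that passes fits_unsigned_64 but not fits_signed_64 would need an
unsigned column. IDs stored in a 32-bit column type would be at risk
regardless of Snowflake, since they are 64-bit values.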

> What I haven't seen amongst any of this documentation - is an example
> of the new status ids in comparison to the old? That would probably
> answer a few questions :D And I assume - Direct Messages will be
> undergoing the same transformation?

The new status IDs apply to Tweets, ReTweets and Mentions (so
basically anything that can show up in the home timeline).
The code that generates the IDs has been shared and you can read more
about Snowflake from our engineering team on their blog.
http://bit.ly/announcing-snowflake