Twitter Update, 8/10 noon PST

Ryan Sarver

Aug 10, 2009, 2:57:52 PM
to twitter-deve...@googlegroups.com
Wanted to send out a status update and let everyone know where the situation stands as of today at noon.

- Most developers are reporting being back in operation as of noon on Sunday
- We have changed our defenses to make sure API developers are better supported. As such the system has more general strain on it and thus will produce some more 502/503 errors. If you see them, you should do a geometric back off instead of just sending a new request.
- OAuth should be fully operational 
- If you continue to have unexpected errors, please produce a packet trace so we can help debug and define the issue.
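
The back-off advice in the list above might be sketched as follows. This is only an illustration: `call_with_backoff`, the `send` callable, and the values for `first_delay`, `ratio`, and `max_tries` are assumptions chosen for the example, not figures given in the update.

```python
import time

RETRYABLE = {502, 503}  # the status codes named in the update


def call_with_backoff(send, first_delay=1.0, ratio=2.0, max_tries=5, sleep=time.sleep):
    """Call send() until it returns something other than 502/503, or tries run out.

    send is a hypothetical zero-argument callable that returns an HTTP status code.
    first_delay, ratio and max_tries are illustrative guesses, not Twitter guidance.
    """
    delay = first_delay
    status = None
    for attempt in range(max_tries):
        status = send()
        if status not in RETRYABLE:
            return status
        if attempt < max_tries - 1:
            sleep(delay)    # wait before retrying instead of re-sending at once
            delay *= ratio  # geometric growth of the wait: 1, 2, 4, ...
    return status
```

Passing `sleep` as a parameter keeps the sketch easy to try without real waiting, for example `call_with_backoff(my_request, sleep=print)` with a hypothetical `my_request`.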

I will continue to give periodic updates throughout the day as we know more, but as most apps are back in action, further updates will come mainly when there is new information. Please continue to let us know of any unexpected issues you may have.

Thanks again for your continued patience and support.

Best, Ryan

Dewald Pretorius

Aug 10, 2009, 3:36:02 PM
to Twitter Development Talk
On Aug 10, 3:57 pm, Ryan Sarver <rsar...@twitter.com> wrote:
> As such the system has more general strain on it and thus will
> produce some more 502/503 errors. If you see them, you should do a geometric
> back off instead of just sending a new request.

Ryan,

What starting value and what common ratio of a geometric back off
would you recommend?

Dewald

Michael Chang

Aug 10, 2009, 8:54:45 PM
to twitter-deve...@googlegroups.com

One issue with back off (geometric or otherwise) is that if everyone uses the same values, it won't work.

Think about it -- let's say 10 000 users all access the system simultaneously and all of them get 502/503 errors. Then let's say they all wait five seconds before retrying. Once those five seconds are up, they will all access the site again simultaneously, and likely get the same 502/503 errors again. This causes them all to back off again, say, for 25 seconds. Then they will all contact the server again at the same time, and so on until either they all give up or the end of time, whichever comes first.

(Yes, this is a simplified example, but it should get the point across. In practice, at least a few users might get through every time, and eventually, yes, everyone would get served if they are patient enough. But if everyone uses different back-off values, then the traffic becomes somewhat more even, and thus the servers can cope with the load more easily.)
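
A toy simulation can make the point concrete. It is a rough sketch under stated assumptions: the 10 000 clients and the five-second wait come from the example above, while `peak_load`, the one-second slots, and the 0 to 10 second range for the varied waits are made up for illustration.

```python
import random
from collections import Counter


def peak_load(delays):
    """Largest number of clients whose retry lands in the same one-second slot."""
    slots = Counter(int(d) for d in delays)
    return max(slots.values())


clients = 10_000
rng = random.Random(0)  # fixed seed so the run is repeatable

# Everyone waits exactly five seconds before retrying.
same = [5.0] * clients
# Everyone picks their own wait somewhere between 0 and 10 seconds.
spread = [rng.uniform(0.0, 10.0) for _ in range(clients)]

print(peak_load(same))    # every retry lands in the same second
print(peak_load(spread))  # retries are spread over about ten seconds
```

With identical waits the server sees the whole crowd again in one slot; with varied waits the busiest slot holds only a fraction of the clients.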

--
Thanks,

Michael Chang

I may not be able to open heavily-formatted Word, Powerpoint, or Excel documents. Send at your own risk.

jim.renkel

Aug 10, 2009, 10:18:04 PM
to Twitter Development Talk
Yup, when you do back-offs, ya can't do them deterministically, ya
gotta do them for a random amount, generally uniformly distributed
between some upper and lower bounds.

It's the bounds that increase geometrically or exponentially, up to
some limit, but each back-off should be random between the bounds.

If the back-offs are not randomized, it leads to synchronicity, as
you noted.

BTW, all standardized back-offs of which I am aware specify randomized
back-off.
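
A minimal sketch of this kind of randomized back-off, assuming a uniform draw between bounds that grow geometrically up to a cap. The function name `randomized_backoff` and the numbers (`base`, `ratio`, `cap`, and a lower bound of half the upper bound) are illustrative choices, not values taken from any standard.

```python
import random


def randomized_backoff(attempt, base=1.0, ratio=2.0, cap=60.0, rng=random):
    """Delay in seconds before retry number `attempt` (0-based).

    The upper bound grows geometrically (base * ratio**attempt) until it
    reaches cap; the lower bound is assumed to be half the upper bound.
    The actual delay is drawn uniformly between the two bounds.
    """
    upper = min(cap, base * ratio ** attempt)
    lower = upper / 2.0
    return rng.uniform(lower, upper)


# Two clients that fail at the same moment will usually pick different delays.
delays = [randomized_backoff(n) for n in range(8)]
```

Because each client draws its own value, clients that failed together drift apart on later retries rather than returning in lockstep.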

Jim Renkel

hansamann

Aug 11, 2009, 1:01:16 AM
to Twitter Development Talk
Can someone post a link to some online resources explaining more about
geometric back-offs? Did a search, did not find a whole lot.

Thx
Sven

Chris Babcock

Aug 11, 2009, 1:39:21 AM
to twitter-deve...@googlegroups.com
On Mon, 10 Aug 2009 22:01:16 -0700 (PDT)
hansamann <sven....@googlemail.com> wrote:

> Can someone post a link to some online resources explaining more about
> geometric back-offs? Did a search, did not find a whole lot.

Retry intervals grow in a geometric progression:

http://en.wikipedia.org/wiki/Geometric_progression

A start value of 1 second that doubles on each subsequent retry is
common, as are caps on the length of time to continue attempts.
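
As a rough sketch of such a schedule: the one-second start and the doubling come from the description above, while the name `backoff_schedule`, the 64-second ceiling on a single wait, and the 300-second limit on total waiting are assumptions picked for the example.

```python
def backoff_schedule(start=1.0, ratio=2.0, max_delay=64.0, give_up_after=300.0):
    """Yield retry delays until their running total would exceed give_up_after seconds."""
    delay, elapsed = start, 0.0
    while elapsed + delay <= give_up_after:
        yield delay
        elapsed += delay
        # Grow geometrically, but never wait longer than max_delay at once.
        delay = min(delay * ratio, max_delay)


print(list(backoff_schedule()))
# [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 64.0, 64.0, 64.0]
```

With these assumed limits a client makes nine retries over 255 seconds and then gives up.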

Chris Babcock


Naveen Ayyagari

Aug 11, 2009, 2:27:27 AM
to twitter-deve...@googlegroups.com

Just wanted to report that we are back up and running for the most
part as well, BUT quite a number of our servers are still experiencing
some BlackOut periods where twitter fails to respond and connections
time out. They seem to last about 5-10 minutes each. We are running
quite a few servers during this time to mitigate the issue propagating
to our users, but it is a costly proposition to continue running this
many servers for extended periods of time.

Here is a traceroute to twitter.com on one of the servers during the
"Blackout" when it is not getting any response from twitter:

traceroute to twitter.com (168.143.162.100), 30 hops max, 40 byte packets
1 67-207-128-2.slicehost.net (67.207.128.2) 0.000 ms 0.000 ms 0.000 ms
2 209-20-79-2.slicehost.net (209.20.79.2) 0.000 ms 0.000 ms 0.000 ms
3 ge-6-10-193.car1.StLouis1.Level3.net (4.53.160.189) 0.000 ms 0.000 ms 0.000 ms
4 ae-11-11.car2.StLouis1.Level3.net (4.69.132.186) 0.000 ms 0.000 ms 0.000 ms
5 ae-4-4.ebr2.Chicago1.Level3.net (4.69.132.190) 7.999 ms 7.999 ms 7.999 ms
6 ae-2-54.edge3.Chicago3.Level3.net (4.68.101.116) 7.999 ms
  ae-2-52.edge3.Chicago3.Level3.net (4.68.101.52) 7.999 ms
  ae-2-54.edge3.Chicago3.Level3.net (4.68.101.116) 7.999 ms
7 4.68.63.198 (4.68.63.198) 7.999 ms 7.999 ms 8.000 ms
8 ae-1.r21.chcgil09.us.bb.gin.ntt.net (129.250.3.8) 8.000 ms 8.000 ms 8.000 ms
9 as-5.r20.snjsca04.us.bb.gin.ntt.net (129.250.3.77) 51.996 ms 51.996 ms 51.996 ms
10 xe-1-3.r02.mlpsca01.us.bb.gin.ntt.net (129.250.5.61) 55.995 ms 55.995 ms 55.995 ms
11 mg-1.c00.mlpsca01.us.da.verio.net (129.250.24.202) 55.995 ms 55.995 ms 59.995 ms
12 128.121.150.245 (128.121.150.245) 55.995 ms 51.995 ms 51.995 ms
13 128.121.150.245 (128.121.150.245) 51.996 ms !X * *

jim.renkel

Aug 11, 2009, 9:09:10 PM
to Twitter Development Talk
Geometric backoffs are more generally known as exponential backoffs. If
ya google that, ya get a couple of useful and interesting things:

http://en.wikipedia.org/wiki/Exponential_backoff
http://en.wikipedia.org/wiki/Truncated_binary_exponential_backoff
http://dthain.blogspot.com/2009/02/exponential-backoff-in-distributed.html
etc.

Hope this helps.

Jim
