An odd request for Twitter - Please stop fixing bugs in the API


Kee Hinckley

Jun 10, 2008, 2:34:59 PM
to twitter-deve...@googlegroups.com
I'm not saying the API doesn't need bug fixes, not to mention enhancements--quite the contrary. But I'm going to put on my CTO hat here and recommend that it's time for Twitter to take a deep breath, put the Web 2.0 dev model on the shelf, and realize that you're past the "get some customers and financing" stage, and it's time to apply more structure to things. The existing model is great when the #1 goal is to encourage third-party developers to get on board and to keep them happy. But right now the things that will make developers happy have less to do with speed of turnaround, and more to do with a consistent architecture and a reliable system. That requires a different development model.

Please don't read this as a criticism of the Twitter developers either. You guys have done a fantastic job. All I'm saying is that companies go through phases as they grow, and the transitions are a bear. A lot of company failures happen at those transition points (been there, done that). It's hard on developers as well, as new processes and structures make the system less efficient. But you've got to do it.

Specifically, I recommend two things.

1. Process-Process-Process

I'm sure you have some of this, but it's the decision and release stuff that is critical now.

a. Formal bug report/enhancement request system
I wouldn't replace the discussion list. If you have the resources, it's far better to let people discuss and then at some point in the discussion wrap it up with "I'm going to file this as a bug/enhancement request with the following text, ..., does that sound right?" More work on your part, but better quality entries in the system, and you'll get some things that people wouldn't otherwise go to the trouble of posting. Not to say you can't have people submit directly, and certainly comment on open bugs, just that I find the discussion beforehand can be helpful.

b. Internal Prioritization Process
Right now the simple stuff gets done immediately (which often means with bugs), and the complicated stuff gets queued. Simple stuff needs to go through a bit more review. The simple solution isn't always the correct one. I don't want to start a tangential argument, so let's not go into details on this thread, but here are two recent examples that I think should have had an architectural review before release.
- rate limit call
  - should it return the current rate limit, the calls left, or both?
  - what should it return for accounts with no limit?
  - should it indicate the time left before reset?
- friendships exists call
  - is a specific call a good idea? Why not give two users and get back none/followed/follower/both?
  - is this a good idea at all? A comparison of friend lists would be more work for the client, but
    far less work for Twitter than having clients making this call over and over.
As I said, let's not start a discussion on these particular examples in this thread--I'm merely pointing out that there are valid questions that should have been reviewed--both with developers and internally at Twitter, prior to making any changes, no matter how trivial they are to implement.
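
Purely to illustrate the kind of questions a review would have to settle for the rate limit call, here is a minimal sketch of a status value that answers all three at once. Everything in it is hypothetical (the `RateLimitStatus` type, the field names, and the `rate_limit_status` helper are invented for illustration and are not part of the Twitter API); it assumes a fixed-length window and uses `None` to mean "no limit".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RateLimitStatus:
    """Hypothetical response shape; not the actual Twitter API."""
    limit: Optional[int]             # calls allowed per window; None = unlimited account
    remaining: Optional[int]         # calls left in this window; None = unlimited account
    reset_in_seconds: Optional[int]  # time until the window resets; None = unlimited account


def rate_limit_status(limit: Optional[int], used: int,
                      window_seconds: int, elapsed_seconds: int) -> RateLimitStatus:
    """Build a status answering: what is the limit, what is left, when does it reset?"""
    if limit is None:
        # Accounts with no limit get an explicit, unambiguous answer.
        return RateLimitStatus(None, None, None)
    remaining = max(limit - used, 0)
    reset_in = max(window_seconds - elapsed_seconds, 0)
    return RateLimitStatus(limit, remaining, reset_in)
```

Whether an unlimited account should report `None`, a sentinel number, or omit the fields entirely is exactly the sort of decision that is cheap to make before release and expensive to change afterwards.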

c. Release Process
Come up with one. At a minimum I think it should include:
- private (selected group of alpha testers) review process for proposed API changes
- public review process for proposed API changes
- private alpha test (alternative server and/or restricted to particular accounts/apps(1))
- public beta test
- release
Definitely consider an alpha-test list.  The signal-to-noise ratio on this list is good, but you'll get more focused feedback and better participation from a closed list that gets first pass on new stuff. Brent Simmons, the developer of NetNewsWire, has an alpha-test mailing list that I salivate over as a developer. Not only do they really test stuff, but they make very concrete development and design suggestions, and have very active debates over features and changes. The right set of alpha testers, with the right mix of developers, designers and technical users can be a huge win, giving you benefits you'd never otherwise get access to.

d. Test Process
Every change, every bug patched, every enhancement made, needs to end up in a test suite. And that suite needs to be run prior to every software deployment, no matter how trivial.  I would also encourage API library writers to build test suites, and for you to provide a sandbox for libraries and apps they can use for testing. If you provide the framework, then library authors can give you their test suites to augment your own. That makes it far less likely that some minor change (e.g. < to ≤) will slip out and unexpectedly break applications.
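
As a rough sketch of the kind of regression test that catches a boundary change like that, the following pins down the behavior at the exact edge of a comparison. The `newer_than` helper is a hypothetical stand-in for whatever server-side filter is being protected, and the tweets are made-up in-memory data; only the standard `unittest` module is used.

```python
import unittest


def newer_than(tweets, since_id):
    """Hypothetical filter under test: tweets strictly newer than since_id."""
    return [tweet for tweet in tweets if tweet["id"] > since_id]


class SinceIdBoundaryTest(unittest.TestCase):
    TWEETS = [{"id": 10}, {"id": 11}, {"id": 12}]

    def test_boundary_tweet_is_excluded(self):
        # If someone changes > to >=, tweet 11 reappears and this test fails.
        ids = [tweet["id"] for tweet in newer_than(self.TWEETS, since_id=11)]
        self.assertEqual(ids, [12])

    def test_everything_returned_below_range(self):
        self.assertEqual(len(newer_than(self.TWEETS, since_id=9)), 3)

    def test_nothing_returned_at_newest_id(self):
        self.assertEqual(newer_than(self.TWEETS, since_id=12), [])


def run_suite():
    """Run the suite programmatically, as a pre-deployment check might."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(SinceIdBoundaryTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

A deployment script could call `run_suite()` and refuse to ship unless `wasSuccessful()` is true on the result.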


2. Re-Architect the API
Yep, from scratch. We all know that Twitter wasn't intended to be a messaging system, and the API wasn't written with that in mind either. The data is in on how people are using Twitter. Now that information needs to be applied to the API. I'm betting that a new API, written with current developer goals in mind (which are not the same as current developer usage), would go a long way toward removing the need for rate limiting and paging.

a. Address Consistency Issues (which can have far worse implications than just annoyance)
A few examples.
- The calls have inconsistent names. 
- Arguments with the same name ("id" is the worst) mean different things in different places.
- The API is deliberately loose ("id can be either a screenname or an id") when there is no need for it to be, and when doing so only runs the risk of introducing bugs and security risks. DWIM is fine for users, bad idea for APIs.(2)
- Return values are inconsistent (e.g. the element names in a direct message are different than those in a normal message, even though the values are the same and the fields directly correspond to each other).

b. Figure out the goal, and redo it from scratch.
The old API can stay in place for a while, but a new one should be designed without worrying about compatibility. And this one can be designed (with input from the developer community) to specifically address the needs of the developers, but without putting undue load on Twitter.

Again, a particular example.

We've just had a discussion about the use of "since".  People were using it to get the most recent tweets since the last call, and it turns out that it's not a good way to do that (and furthermore, it's probably not a good query on the Twitter side either, unless they've indexed the date field, and the database is able to optimize <> comparisons on indexed dates). since_id is a better way to pick up messages since the last call (so long as you track it per-call, not just per-user). Does that mean that "since" can go away?  No.  Even if Tweet fetching switches to PubSub, applications are going to need to pick up a backlog of tweets. Never mind what happens when there is a server glitch (app developer or Twitter). "since" is the perfect way to fill in the gaps. In fact, what that call  might really want is the ability to take a date range, and also a "before_id". Assuming the call is used and not abused (registered apps only, for instance), that would actually have less of a load on Twitter than the current multiple-pages, rate-limited model. 
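
To make the idea concrete, here is a minimal in-memory sketch of a backlog query that accepts `since_id`, a `before_id`, and a date range together. The function name `fetch_backlog`, the `until` parameter, and the dictionary layout of a tweet are all assumptions made for illustration; this is not how the Twitter API behaves today.

```python
from datetime import datetime
from typing import Dict, List, Optional


def fetch_backlog(tweets: List[Dict],
                  since_id: Optional[int] = None,
                  before_id: Optional[int] = None,
                  since: Optional[datetime] = None,
                  until: Optional[datetime] = None) -> List[Dict]:
    """Return tweets matching every bound that was supplied, oldest first.

    since_id and before_id are exclusive bounds on the tweet id;
    since and until are inclusive bounds on the creation time.
    """
    selected = []
    for tweet in tweets:
        if since_id is not None and tweet["id"] <= since_id:
            continue
        if before_id is not None and tweet["id"] >= before_id:
            continue
        if since is not None and tweet["created_at"] < since:
            continue
        if until is not None and tweet["created_at"] > until:
            continue
        selected.append(tweet)
    return sorted(selected, key=lambda tweet: tweet["id"])
```

A client that was offline between two known tweets could then fill the gap in one request by passing the last id it saw as `since_id` and the first id it saw after reconnecting as `before_id`.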

So there's an example of how stepping back from the development process, stepping back from the "I need this right now" requests, and looking at the big picture and the real question ("Never mind the API, what is it you actually want to do?"), can benefit everyone.

One more simple example.

Right now, to get all of the messages that a user can see on their twitter page, I need to call:
account/archive
direct_messages
statuses/replies
direct_messages/sent
statuses/user_timeline
Never mind that I have to juggle all of that within the rate limits. It's a set of arbitrary distinctions that I really don't care about. Yes, "direct" messages are somewhat different, but otherwise the distinctions are ones that impact how I display the messages; they shouldn't impact how I fetch them.  A simple "class" or "type" attribute on the tweets to distinguish between "update, direct and reply" would tell me everything I needed to know.  What I really want is one single call that simply gets me everything that's happened since my last call.  No paging, no since, not even a since_id. Simply a matter of "here's that opaque handle you gave me last time, give me everything since then". One call, all the data. If my app could make that call, and didn't have to worry about very quickly making it again so as not to lose any information, then I'd probably be making a lot fewer calls to Twitter. The net result would be a lower system load on Twitter, and less of a need for rate limiting and paging.
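
A minimal sketch of that single call, assuming an in-memory event log on the server side. The `Timeline` class, its method names, and the handle encoding are all hypothetical; the point is only that the client treats the handle as opaque and that each item carries a "type" of update, direct, or reply.

```python
import base64
from typing import Dict, List, Optional, Tuple

VALID_TYPES = ("update", "direct", "reply")


class Timeline:
    """Hypothetical per-user event log; not the actual Twitter API."""

    def __init__(self) -> None:
        self._events: List[Dict[str, str]] = []

    def post(self, kind: str, text: str) -> None:
        if kind not in VALID_TYPES:
            raise ValueError(f"unknown message type: {kind!r}")
        self._events.append({"type": kind, "text": text})

    def everything_since(self, handle: Optional[str] = None) -> Tuple[List[Dict[str, str]], str]:
        """Return all events after the given handle, plus a new handle.

        The handle happens to encode a position in the log, but clients
        must not rely on that; the server is free to change the encoding.
        """
        start = 0 if handle is None else int(base64.urlsafe_b64decode(handle.encode()).decode())
        items = self._events[start:]
        new_handle = base64.urlsafe_b64encode(str(len(self._events)).encode()).decode()
        return items, new_handle
```

The client's whole polling loop becomes: call `everything_since` with the last handle, display the items according to their "type", store the new handle.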

To make those decisions, you first need to know what people really are trying to do, and engage the developers in a discussion about what they want. Then you need to go back, hash it out, figure out what you can and can't do, and come back and circulate a proposal (see "Process Process Process" up above). When you're all done, everyone is going to be happier; users, developers, and especially your IT staff :-).


So, to get back to my Subject line...

I would like Twitter to stop fixing bugs in the API, stop enhancing the API, and stop even discussing enhancements to the API. Instead, I think you should lock yourself in a secret offsite location somewhere, possibly with some invited third-party developers, turn off the phones, and hash out two documents--one defining the process (including decision-making) to be followed from bug-report/enhancement-request through final release, the other defining an architecture for third party access to the system. In the meantime, you can station a few third party developers at the virtual door on the mailing list, and anytime someone posts a request for some feature, we'll tell them to take a hike--you're doing more important things.


Footnotes

1. A completely random aside--at some point you're going to want to use a crypto-key for application identification. Once you do that you can get rid of restrictions like "one page at a time" and simply allow multi-page (or page-less) fetching for registered applications only.


2. WRT allowing multiple interpretations of the "id" field, take for example, the user I just created with the screenname "2084821".

% ./twitter -v -u xxx -o raw=1 -o format=xml relationship_exists user_a=nazgul user_b=2084821
                     Tue, 10 Jun 2008 16:51:08 GMT, 200, OK
                     9.871835 seconds
<friends>false</friends>

Odd, nazgul and 2084821 are friends, why does it say false?

Let's do a show_user on 2084821.

% ./twitter -v -u xxx -o raw=1 -o format=xml show_user id=2084821
           Tue, 10 Jun 2008 16:52:20 GMT, 200, OK
           11.427647 seconds
<?xml version="1.0" encoding="UTF-8"?>
<user>
  <id>2084821</id>
  <name>Kee Hinckley</name>
  <screen_name>nazgul</screen_name>

Oops, that's not the user '2084821', that's the user 'nazgul'--who just happens to have the *ID* 2084821.  Or in other words, it is impossible to do any API-based queries on any user whose screenname "happens" to be the ID of an existing user. Another useful tool for spamming, if nothing else. The only good news is that the database id takes precedence. At least it's not possible to mask out an existing user that way!
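
For contrast, a minimal sketch of a strict lookup in which the caller must say which kind of identifier is being passed. The `UserDirectory` class and the `user_id`/`screen_name` parameter names are hypothetical, and the numeric ID given to the user with screenname "2084821" is made up for the example.

```python
from typing import Dict, List, Optional, Tuple


class UserDirectory:
    """Hypothetical strict lookup; not the actual Twitter API."""

    def __init__(self, users: List[Tuple[int, str]]) -> None:
        self._name_by_id: Dict[int, str] = {uid: name for uid, name in users}
        self._id_by_name: Dict[str, int] = {name: uid for uid, name in users}

    def show_user(self, *, user_id: Optional[int] = None,
                  screen_name: Optional[str] = None) -> Dict[str, object]:
        # Exactly one identifier must be given, so nothing is ever guessed.
        if (user_id is None) == (screen_name is None):
            raise ValueError("pass exactly one of user_id or screen_name")
        if user_id is not None:
            return {"id": user_id, "screen_name": self._name_by_id[user_id]}
        return {"id": self._id_by_name[screen_name], "screen_name": screen_name}


directory = UserDirectory([
    (2084821, "nazgul"),
    (99999999, "2084821"),  # made-up ID for the user whose screenname is "2084821"
])
```

With this shape, `directory.show_user(screen_name="2084821")` and `directory.show_user(user_id=2084821)` reach two different users, and neither can mask the other.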

Jesse Stay

Jun 10, 2008, 3:57:33 PM
to twitter-deve...@googlegroups.com
Well put, Kee - you've summarized just about everything I've been thinking for a while.  Again, not a criticism of the Twitter developers, but more a suggestion to get things back in order.  Might I also suggest a bug tracking system like Bugzilla or similar to put bugs away for a while until you have time to fix them?  This will also allow us as developers to know the status without having to "bug" you on the mailing list.

--Jesse

Dossy Shiobara

Jun 10, 2008, 4:02:05 PM
to twitter-deve...@googlegroups.com
"Make it work, then make it work better."

Lately, things haven't been "working" - can we get back there, first?

:-) ha ha, only serious.

--
Dossy Shiobara | do...@panoptic.com | http://dossy.org/
Panoptic Computer Network | http://panoptic.com/
"He realized the fastest way to change is to laugh at your own
folly -- then you can let go and quickly move on." (p. 70)

Alex Payne

Jun 10, 2008, 4:11:29 PM
to twitter-deve...@googlegroups.com
I actually really like the idea of a public bug tracker for the API,
something akin to what the Google App Engine team uses with a voting
system to prioritize fixes. I keep all pending API fixes in our
internal bug tracker, but I have no problem exposing that list
publicly. Does anybody know of a hosted solution for a simple bug
tracker with voting features?

Generally, I do plan to re-architect the API along with our backend
architectural changes. I've got an eye towards exactly the issues
you've mentioned regarding consistency and the practicality of some
API options.

I'm also in favor of many of the process-oriented changes you suggest.
Thanks for taking the time to write all that up!

--
Alex Payne
http://twitter.com/al3x

Jacques Marneweck

Jun 10, 2008, 4:33:13 PM
to Twitter Development Talk
Hi Alex,

You can always use the php.net bug tracking system which allows users
to vote that a feature affects them as well.

Regards
--jm