Search API to require HTTP Referrer and/or User Agent


Doug Williams

Jun 16, 2009, 12:33:33 PM
to twitter-ap...@googlegroups.com, Twitter Development Talk
Hi all,
The Search API will begin to require a valid HTTP Referrer, or at the very least, a meaningful and unique user agent with each request. Any request not including this information will receive a 403 Forbidden response from our web server.

This change will be effective within the next few days, so please check your applications using the Search API and make any necessary code changes.

Thanks,
Doug

Justyn Howard

Jun 16, 2009, 12:51:37 PM
to twitter-deve...@googlegroups.com
Thanks Doug - Any additional info to help us know if we comply? My dev is out of the country on vacation and I want to make sure we don’t miss anything.

Doug Williams

Jun 16, 2009, 12:56:43 PM
to twitter-deve...@googlegroups.com
All we ask is that you include a valid HTTP Referrer and/or a User Agent with each request, which is easy to do in almost every language. Both would be helpful but we only require one at this time. We simply want to be able to identify apps and have the ability to communicate with the authors.

Thanks,
Doug
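
Read literally, the rule above ("both would be helpful but we only require one") amounts to something like the check sketched below. This is only an illustration of the stated requirement, not Twitter's server code: the function name is hypothetical, and it makes no attempt to judge whether a value is "meaningful". Note that the HTTP header itself is spelled `Referer`.

```python
def check_identification(headers):
    """Return the HTTP status a request would get under the announced rule.

    Hypothetical helper: at least one of Referer / User-Agent must be
    present and non-empty, otherwise the request is refused with 403.
    """
    # HTTP header names are case-insensitive, so normalize before looking up.
    normalized = {name.lower(): value.strip() for name, value in headers.items()}
    has_referer = bool(normalized.get("referer"))
    has_user_agent = bool(normalized.get("user-agent"))
    return 200 if (has_referer or has_user_agent) else 403
```

Under this reading, a request carrying only `User-Agent: myapp.com` passes, while one with neither header (or only blank values) is rejected.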

funkatron

Jun 16, 2009, 12:57:37 PM
to Twitter Development Talk
Indeed, some clearer criteria would be most appreciated.

--
Ed Finkler
http://funkatron.com
Twitter:@funkatron
AIM: funka7ron
ICQ: 3922133
XMPP:funk...@gmail.com


On Jun 16, 12:51 pm, Justyn Howard <justyn.how...@gmail.com> wrote:
> Thanks Doug - Any additional info to help us know if we comply? My dev is
> out of the country on vacation and want to make sure we don't miss anything.
>

Justyn Howard

Jun 16, 2009, 12:59:45 PM
to twitter-deve...@googlegroups.com
Thanks, pretty sure we do both. Will this new (or newly enforced) policy help clean up some garbage?

Naveen Kohli

Jun 16, 2009, 12:58:17 PM
to twitter-deve...@googlegroups.com
Why would you make a decision based on "Referrer", which is an OPTIONAL header field in the HTTP protocol? Making a decision based on something that is "REQUIRED" may be more appropriate.
--
Naveen K Kohli
http://www.netomatix.com

Stuart

Jun 16, 2009, 1:04:20 PM
to twitter-deve...@googlegroups.com
The logical thing would be to set the referrer to the domain name of
your application. If it doesn't have one I'd say use your Twitter user
URL (i.e. http://twitter.com/stut).

Most HTTP libs in most languages will set a default user agent, and
it's usually pretty easy to override it. I'd suggest appname/0.1 where
appname is something that identifies your app and is a valid user
agent - Google can help you there. I doubt the version number is
important to anyone but you.

-Stuart

--
http://stut.net/projects/twitter

2009/6/16 funkatron <funk...@gmail.com>:

Stuart

Jun 16, 2009, 1:05:46 PM
to twitter-deve...@googlegroups.com
It's optional in the HTTP spec, but mandatory for the Twitter Search
API. I don't see a problem with that.

Doug: Presumably the body of the 403 response will contain a suitable
descriptive error message in the usual format?

-Stuart

--
http://stut.net/projects/twitter

2009/6/16 Naveen Kohli <navee...@gmail.com>:

Chad Etzel

Jun 16, 2009, 1:10:29 PM
to twitter-deve...@googlegroups.com
On Tue, Jun 16, 2009 at 1:05 PM, Stuart<stu...@gmail.com> wrote:
>
> It's optional in the HTTP spec, but mandatory for the Twitter Search
> API. I don't see a problem with that.

Erm, for sites like TweetGrid, TweetChat, etc, which are all
browser-based client-side driven sites, the users' browser will make
the request. In this case the HTTP Referrer can be (and often is)
unset. The user-agent, however, is usually set for all browsers, but
sometimes people use plugins to mask or delete that, even. Just an
FYI that not all of us have control over this.

-Chad

funkatron

Jun 16, 2009, 1:10:47 PM
to Twitter Development Talk
Totally understand the need. I asked for clearer criteria because in
message one, you state you'll require

"a valid HTTP Referrer" or "a meaningful and unique user agent"

I can probably define a valid HTTP Referrer as containing a URL that
exists, but a meaningful/unique user agent is somewhat in the eye of
the beholder. In the second message, you say you'll require

"a valid HTTP Referrer and/or a User Agent"

I'm not sure how to define a "valid" user agent. That's why I'd like
to see *your* definition for these things so we can meet your
criteria.

--
Ed Finkler
http://funkatron.com
Twitter:@funkatron
AIM: funka7ron
ICQ: 3922133
XMPP:funk...@gmail.com



Matt Sanford

Jun 16, 2009, 1:25:15 PM
to twitter-deve...@googlegroups.com
Hi all,

Let me clarify a bit. For server-side processing please set the
User-Agent header. I recommend using your domain name, or if you don't
have one (which is odd) your appname. Something like "myapp.com" or
"myapp". By using domain name we'll be able to check out the site and
reach out to contact people if we suspect them of abuse. Spammers
often don't respond to questions from the services they abuse, and if
someone is using your user-agent falsely you'll have the possibility
of saying "That's not me, I'm not on app engine". For client-side
processing like TweetGrid the browser will send a User-Agent and
referrer unless you're doing something exceedingly odd, so you should
be fine.

This change is mostly to combat an increasing amount of spam
coming from "cloud" services like EC2 and AppEngine. At first we'll
only be applying this restriction to those IP addresses but it may
need to be broadened as time goes on. If you're writing client
software please add a user-agent in case we end up having to widen
this in the future. This seems like a better plan than the Media
Temple fiasco we went through last time we blocked a shared service for
hosting spammers [1].

Thanks;
– Matt Sanford / @mzsanford
Twitter Dev

[1] - https://twitter.com/mzsanford/status/1924718435

Brooks Bennett

Jun 16, 2009, 1:46:48 PM
to Twitter Development Talk
I checked and TweetGrid was setting a referrer (on the page I tested,
it was http://tweetgrid.com/grid?l=0), and as Matt said all should be
fine for us Client-side Search API peeps.

Brooks

On Jun 16, 12:10 pm, Chad Etzel <jazzyc...@gmail.com> wrote:

Brooks Bennett

Jun 16, 2009, 1:22:35 PM
to Twitter Development Talk
Thanks for chiming in on this Chad!

On Jun 16, 12:10 pm, Chad Etzel <jazzyc...@gmail.com> wrote:

Stuart

Jun 16, 2009, 2:33:08 PM
to twitter-deve...@googlegroups.com
2009/6/16 Chad Etzel <jazz...@gmail.com>
Where a request is made from one page to another, even if it's via JS most browsers will set the referrer to the current URL.

Besides I wasn't claiming that it wouldn't be an issue for anyone, I was just commenting on the fact that just because it's optional in HTTP doesn't in any way stop Twitter from making it mandatory for their APIs. That's the only point I was trying to make. It's like saying "my car doesn't make me wear a seatbelt so neither can you".

-Stuart

-- 

burton

Jun 16, 2009, 1:54:37 PM
to Twitter Development Talk
Hey guys.

This has already been banged out in the RSS wars (of which I'm a
veteran and have the battle scars).

Don't use a Referrer unless it's literally a page with a link or
search page.

You should use a User-Agent here (which is what it is designed for).

The browser should generally send the Referrer ......

We send a User-Agent....

Kevin


Naveen Kohli

Jun 16, 2009, 3:04:54 PM
to twitter-deve...@googlegroups.com
Redefining HTTP spec, eh :-)
Whatever makes Twitter's boat float. Let's hope for the best. I'm just concerned that some firewalls or proxies tend to remove "referrer".


On Tue, Jun 16, 2009 at 1:05 PM, Stuart <stu...@gmail.com> wrote:

Stuart

Jun 16, 2009, 3:39:18 PM
to twitter-deve...@googlegroups.com
2009/6/16 Naveen Kohli <navee...@gmail.com>

Redefining HTTP spec, eh :-)
Whatever makes Twitter's boat float. Let's hope for the best. I'm just concerned that some firewalls or proxies tend to remove "referrer".

What a completely ridiculous thing to say. It's not "redefining" anything. If Twitter want to require something in order to access their service they absolutely have that right. It's not like they're saying every HTTP server should start requiring these headers.

It's true that some firewalls and proxies remove the referrer header, and some also remove the user agent header.

I'm somewhat unclear on exactly how this stuff is supposed to help. If an application sets out to abuse the system they'll simply set the headers so they look like a normal browser. I don't see what purpose requiring these headers to be something useful will actually serve. IMHO you might as well "require" the source parameter for all API requests that use basic auth which is simple for all apps to implement; OAuth clearly carries identification with it already.

-Stuart

-- 

Michael Ivey

Jun 16, 2009, 4:42:07 PM
to twitter-deve...@googlegroups.com
If the User-Agent/Referrer says "Twitpay", and it's really me, when Twitter contacts me, I'll answer, and we'll work it out.

If the User-Agent/Referrer says "Twitpay", and it's not really me, when Twitter contacts me, I'll tell them, and they'll block the IP.

It's a starting point for figuring things out, not an authorization scheme.

 -- ivey

Marco Kaiser

Jun 16, 2009, 4:49:21 PM
to twitter-deve...@googlegroups.com
I agree with Stuart, this might be tricky for client applications that are running behind firewalls / proxies that might remove both header fields, and neither the app author nor the user might have any control over this. Finally, that means you'll lock out those people from using search in their preferred twitter apps.

Marco

2009/6/16 Stuart <stu...@gmail.com>

Jonas

Jun 16, 2009, 5:04:07 PM
to Twitter Development Talk
How does one set the HTTP referrer and user agent?
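
For reference, a minimal sketch of one way to set both headers using Python's standard library. The endpoint URL, app name, and app URL are placeholders (assumptions for illustration, not values given in this thread). The header name on the wire is spelled `Referer`, even though it is usually written "Referrer" in prose.

```python
import urllib.parse
import urllib.request

# Placeholder endpoint: an assumption for illustration, not taken from this thread.
SEARCH_URL = "http://search.example.com/search.json"


def build_search_request(query, app_name, app_url):
    """Build a search request that identifies the calling application."""
    url = SEARCH_URL + "?" + urllib.parse.urlencode({"q": query})
    headers = {
        # Identify the app, e.g. "myapp.com" or "myapp/0.1".
        "User-Agent": app_name,
        # The HTTP header name is spelled "Referer".
        "Referer": app_url,
    }
    return urllib.request.Request(url, headers=headers)


if __name__ == "__main__":
    req = build_search_request("twitter api", "myapp.com", "http://myapp.com/")
    # urllib stores header names capitalized as "User-agent" and "Referer".
    print(req.header_items())
    # To actually send the request: urllib.request.urlopen(req)
```

Most HTTP libraries in other languages offer an equivalent way to override the default user agent, as noted earlier in the thread.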

Matt Sanford

Jun 16, 2009, 5:05:21 PM
to twitter-deve...@googlegroups.com
Hi there,

    While all of this flame is keeping my feet warm it's not really productive. This isn't Slashdot comments, let's try and remain on topic rather than getting into RFC debates. To be even more explicit than my previous email: Use the user-agent. Referrer will be taken care of by browsers and I see it as a fallback for client-side JSON users rather than a replacement for a user-agent.

    The subsequent reply from Michael Ivey about how this helps is dead on. With no context at all I'm forced to block all of EC2/AppEngine/Yahoo Pipes if one person misbehaves. Nobody likes that. Since search is not authenticated OAuth does not really help here. We may be forced to make search authenticated if we can't find a reasonable way to sort the good from the bad. This is a first attempt at helping us cut out poorly built spam scripts and shorten the time I spend researching each abuser. It saves time and lets me fix more bugs, assuming I don't spend the newly saved time in RFC debates, that is :)
    
Thanks;
 – Matt Sanford / @mzsanford
     Twitter Dev

Chad Etzel

Jun 16, 2009, 5:34:31 PM
to twitter-deve...@googlegroups.com
On Tue, Jun 16, 2009 at 5:05 PM, Matt Sanford<ma...@twitter.com> wrote:
> Hi there,
>     While all of this flame is keeping my feet warm it's not really productive.

Are you sure this is a flame war as defined by RFC 1855 [1]?

...sorry, had to :)
-Chad

[1] http://www.faqs.org/rfcs/rfc1855.html

Marco Kaiser

Jun 16, 2009, 5:36:49 PM
to twitter-deve...@googlegroups.com
Matt,

far from getting into RFC debates, but really concerned for the non-server apps out there, which may not have full control over the network infrastructure they run on. If I set up my own server(s) at a data center, I sure can take care of sending you the right referrer and user-agent, but unfortunately that's not the case in many environments behind firewalls and / or proxies.

What's your point on that? I fully understand your intention and the need for getting some identification - so happy to discuss anything that'll also work through restricted network access.

Thanks,
Marco

2009/6/16 Matt Sanford <ma...@twitter.com>

Chad Etzel

Jun 16, 2009, 5:47:15 PM
to twitter-deve...@googlegroups.com
Perhaps some sort of signature/app value in the URL request query
string? That will make it through proxies and firewalls, and is just
as easily spoofed as HTTP-Referrer and User-Agents...

-Chad
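
A rough sketch of the query-string idea, assuming a hypothetical parameter named `app` (the thread does not define one, and nothing here says the Search API would honor it). The URL in the usage note is a placeholder as well.

```python
import urllib.parse


def add_app_identifier(url, app_name, param="app"):
    """Return url with an identifying query parameter appended.

    The parameter name "app" is hypothetical; the thread does not define one.
    """
    parts = urllib.parse.urlsplit(url)
    # Preserve any existing query parameters and append the identifier last.
    query = urllib.parse.parse_qsl(parts.query, keep_blank_values=True)
    query.append((param, app_name))
    return urllib.parse.urlunsplit(parts._replace(query=urllib.parse.urlencode(query)))
```

For example, `add_app_identifier("http://search.example.com/search.json?q=twitter", "myapp")` appends `&app=myapp` to the existing query. Unlike a header, the value travels in the URL itself, so intermediaries that strip headers would leave it alone.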

Doug Williams

Jun 16, 2009, 6:04:54 PM
to twitter-deve...@googlegroups.com
As you have determined, we just want a better way to track who is making requests and at what volume. If you are doing janky things and we don't know who you are (no referrer or user agent) then we have no contact for your application. We will block the IP address and move on.

However if you would like to give us a chance to work with you before terminating your access unexpectedly, please provide us with enough of a hint (through a HTTP Referrer and/or User Agent) to determine who you are so we can have any necessary conversations.

We do not feel that this is an unreasonable request. Low volume clients will not be affected. Anyone doing anything that bubbles to the top of logs however may be subject to scrutiny.

Thanks,
Doug


--
Do you follow me? http://twitter.com/dougw

Marco Kaiser

Jun 16, 2009, 6:19:16 PM
to twitter-deve...@googlegroups.com
You are still missing my point - desktop clients may not be able to send a User Agent or Referrer, based on the network infrastructure the user is locked into. Nothing in your response addressed this issue.

I am fully willing to send the requested data in the clients (and I already do), but I have no means to make sure they reach you. So if they don't, even though I am doing all you ask me to do, you'll still lock out the user from search in his client. I am not worried about being blocked or whatever, it's merely that the requirement to provide one of the two HTTP headers may not be possible for client apps. So low volume clients (in terms of client-per-IP numbers, not overall) clearly WILL be affected.

Marco

2009/6/17 Doug Williams <do...@twitter.com>

Doug Williams

Jun 16, 2009, 8:31:11 PM
to twitter-deve...@googlegroups.com
For most applications, enforcement of this requirement will be subject to manual review. We want a marker (Referrer and/or User Agent) to help understand who top searchers are when problems arise and if we can determine a better data access plan for their needs. End-users and clients never hit our trip-wires as they are not individually querying the API with enough frequency to warrant a manual review. For your needs, Marco, a best effort to include the requested data is sufficient on our end and will not cause any problems if the data is removed by network gear.

Services that are in cloud-based hosts, such as EC2 and AppEngine will however be subject to programmatic enforcement of this policy. Additionally, we reserve the right to add hosts to this if we find that a host is being used to exploit our service. This is to protect the service against abuse which often comes from shared hosts such as these.

Thanks,
Doug

Brian Gilham

Jun 16, 2009, 9:22:50 PM
to twitter-deve...@googlegroups.com
R


From: Doug Williams
Date: Tue, 16 Jun 2009 17:31:11 -0700
To: <twitter-deve...@googlegroups.com>
Subject: [twitter-dev] Re: Search API to require HTTP Referrer and/or User Agent

Andrew Badera

Jun 17, 2009, 5:19:31 AM
to twitter-deve...@googlegroups.com
I think you misspelled "Arrrrrrrrr," matey!

Marco Kaiser

Jun 17, 2009, 6:32:02 AM
to twitter-deve...@googlegroups.com
Doug,

citing from your original mail:


"Any request not including this information will be returned a 403 Forbidden response code by our web server."

How does it map to what you say now, that "a best effort is sufficient", if you reject any request without those header(s) with a 403 response? Again, I am not fearing an IP or User-Agent ban because of not-sent header data; what I fear is a rejection of search requests when the header data is removed by network gear. At least that's how I read your announcement for this change - or am I wrong? Will you only reject requests for certain IPs that have high volume based on the Referrer/User-Agent requirement, but in general the Search API doesn't require it to be present?

Doug Williams

Jun 17, 2009, 12:40:30 PM
to twitter-deve...@googlegroups.com
Marco,
I was giving us breathing room. In 6 days, we will require this data but enforcement will be manual in most cases. My strict language above is to ensure that developers know we reserve the right to terminate their applications without warning if they are abusing the system and not including this required information.

Thanks,
Doug

Craig Hockenberry

Jun 17, 2009, 1:09:02 PM
to Twitter Development Talk
Matt & Doug,

Here's some more information to help fingerprint search requests:

The MGTwitterEngine library sends the following X headers by default:

X-Twitter-Client: MGTwitterEngine
X-Twitter-Client-Url: http://mattgemmell.com/source
X-Twitter-Client-Version: 1.0

These can be overridden by the developer. For Twitterrific, we're
using:

X-Twitter-Client: Twitterrific
X-Twitter-Client-Url: http://iconfactory.com/twitterrific
X-Twitter-Client-Version: iPhone 2.0

In addition, connections initiated from an iPhone will likely be going
through CFNetwork. This API adds a user agent that contains the
application name, version as well as the version of the framework and
operating system. For example:

User-Agent: Twitterrific/2.1a3 CFNetwork/445.6 Darwin/10.0.0d3

Hope this helps!

-ch
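
A small sketch of assembling the X headers listed above for a non-CFNetwork client. The helper name is hypothetical, and the header set simply mirrors the examples in this post; it is not a documented Twitter requirement.

```python
def client_headers(name, url, version, user_agent=None):
    """Build identification headers in the style described above."""
    headers = {
        "X-Twitter-Client": name,
        "X-Twitter-Client-Url": url,
        "X-Twitter-Client-Version": version,
    }
    # Only set User-Agent when the caller supplies one; some stacks
    # (CFNetwork, per the post above) add their own automatically.
    if user_agent:
        headers["User-Agent"] = user_agent
    return headers
```

Calling `client_headers("MGTwitterEngine", "http://mattgemmell.com/source", "1.0")` reproduces the library defaults quoted above; the resulting dict can be passed to whatever HTTP library the client uses.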



Matt Sanford

Jun 17, 2009, 1:15:10 PM
to twitter-deve...@googlegroups.com
Hi Craig,

I didn't know about the X-Twitter-Client headers, thanks for the
info.

Thanks;
– Matt Sanford / @mzsanford
Twitter Dev

Doug Williams

Jun 17, 2009, 1:44:48 PM
to twitter-deve...@googlegroups.com
Craig,
That is an excellent example of what we would like to see. You've identified your application and given us the URL to learn about it. Perfect.

Thanks for sharing.

Doug

Marco Kaiser

Jun 17, 2009, 5:19:49 PM
to twitter-deve...@googlegroups.com
Doug,

thanks for the clarification. So we'll continue to send a reasonable User Agent header from our apps, and assume that for now you won't reject all calls without them. We can of course also add the X-Twitter-* headers, as Craig suggests, in our next releases.

Thanks,

Dewald Pretorius

Jun 17, 2009, 7:41:09 PM
to Twitter Development Talk
Setting the user agent is not only in the best interest of Twitter.
It's in your best interest as well.

I've been setting my user agent from almost day #1 of my service, and
on several occasions it has helped me to get quick response and issue
resolution from the API team for both REST and Search API issues,
because they could easily locate and isolate my calls in their access
logs.

This is something only the spammers need to worry about. For bona fide
services this is a big plus.

Best,

Dewald

feesta

Jun 24, 2009, 1:00:01 PM
to Twitter Development Talk
Hi all,
I'm fairly new to app development and am working with Google Appengine
at the moment. My app (http://www.twitwheel.com/) makes two calls to
the search API for each page view. I've just added the user agent to
my urlfetch calls. Do I still need to worry about the 100/hour rate
limit? I've only just started testing and am open to alternatives and
suggestions.
thanks!
-Jeff Easter


On Jun 17, 4:41 pm, Dewald Pretorius <dpr...@gmail.com> wrote:
> Setting the user agent is not only in the best interest of Twitter.
> It's in your best interest as well.
>
> I've been setting my user agent from almost day #1 of my service, and

Matt Sanford

Jun 24, 2009, 4:07:04 PM
to twitter-deve...@googlegroups.com
Hi Jeff,

If you have added a User-Agent header you shouldn't have any
problems with the new restriction.

Thanks;
– Matt Sanford / @mzsanford
Twitter Dev
