Removing Authorization header value from future archives

dar...@runscope.com

Mar 24, 2015, 1:39:48 AM
to httpa...@googlegroups.com
Considering the Authorization header value is not something that should be shared, nor is it anything that can provide particularly useful statistics, would it be possible to exclude the value of this field from future archives?

I recognize that doing this will not remove all the credential information in the archive, as there are plenty of X-Authorization headers, X-Apikey headers, api-key query string values, etc.  Trying to remove them all would be a game of whack-a-mole.  However, removing the value from the official place where HTTP recommends putting credentials would have the secondary benefit of adding weight to the argument that you shouldn't invent your own header for the purpose. :-)

Thanks,

Darrel Miller

Patrick Meenan

Mar 24, 2015, 8:42:57 AM
to httpa...@googlegroups.com
Is there a particular reason?  All the HTTP Archive does is load the landing page for each site in the list.  It doesn't do anything itself to authenticate or authorize, and if the requests themselves carry Authorization headers, that is no more than can be discovered just by running curl or phantom.  The pages that the HTTP Archive loads should not have any special privileges.

If an API key is embedded in an Ajax request or something like that then the site has a lot bigger problems.


Darrel Miller

Mar 25, 2015, 2:23:38 PM
to httpa...@googlegroups.com
Patrick,

I understand that any site that delivers API keys down to a client is allowing those keys to be captured by anyone who really wants to get them.  However, HttpArchive makes it exceptionally easy to find the organizations that are taking this risk, and it enables gathering many API providers' keys in one place.

We know the Authorization header is designed to contain credentials, therefore if HttpArchive sees that header, it would be prudent not to store it, regardless of the wisdom of the site that put it there in the first place.  I guess I'm saying two wrongs don't make a right.

Is it a significant change to the HttpArchive process to do this kind of filtering?
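
To make the kind of filtering in question concrete, here is a minimal sketch. It assumes the captured data is available as HAR-style JSON (request headers stored as a list of name/value pairs under log.entries[].request.headers); the function name redact_authorization and the "[REDACTED]" placeholder are made up for illustration and are not part of any HttpArchive tooling.

```python
import copy

# Hypothetical placeholder written in place of the credential.
REDACTED = "[REDACTED]"


def redact_authorization(har):
    """Return a copy of a HAR-style dict with Authorization request
    header values replaced by a placeholder.

    The header name is kept so that counts of how often the header
    appears remain available; only the value is dropped.
    """
    cleaned = copy.deepcopy(har)
    for entry in cleaned.get("log", {}).get("entries", []):
        for header in entry.get("request", {}).get("headers", []):
            # HTTP header names are case-insensitive.
            if header.get("name", "").lower() == "authorization":
                header["value"] = REDACTED
    return cleaned
```

This only covers the standard Authorization request header. Custom headers such as X-Authorization or X-Apikey, and credentials in query strings, would pass through untouched, which is the whack-a-mole problem mentioned at the start of the thread.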


Darrel