Good to know, Ronnie!
It looks like most of the time is spent following the redirects from Twitter. In fact, when you click one of those links, you can see that it takes a while to reach the final page.
I've run some benchmarks; take a look. It would be great to see your results in production as well:
Here I'm running each report 10 times to get an average. As you can see, it takes almost 5 seconds just to initialize the document, because it has to make a request to t.co and then follow the redirects to get to the final page.
On the second line, I'm doing the same but with the resolved URL; in that case it takes just 1 second!
On the third line, I initialize it by passing the document contents (so no request is made); in that case it takes just 0.14 seconds.
On the remaining lines, I measure how long several parsing operations take once the document has been initialized. You can see that this time is very small compared to what a request takes.
Sure, we could optimize the parsing code, but that is not going to change much as long as we have to request pages and resolve redirects.
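
For anyone who wants to try this kind of comparison themselves, here is a minimal Python sketch of the measurement approach: run each step several times and average the wall-clock time. It is only an illustration, not the benchmark above. `fake_fetch`, `parse_title`, and the delay values are hypothetical stand-ins, and the network is simulated with a sleep so the script runs offline.

```python
import time
from html.parser import HTMLParser


def average_seconds(fn, runs=10):
    """Return the mean wall-clock time of calling fn() `runs` times."""
    start = time.perf_counter()
    for _ in range(runs):
        fn()
    return (time.perf_counter() - start) / runs


class TitleParser(HTMLParser):
    """Tiny stand-in for a parsing step: collects the <title> text."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def parse_title(html):
    """Parse an HTML string and return the text of its <title> element."""
    parser = TitleParser()
    parser.feed(html)
    return parser.title


def fake_fetch(delay, html):
    """Hypothetical stand-in for a network request: sleep, then return html."""
    time.sleep(delay)
    return html


HTML = "<html><head><title>Example</title></head><body></body></html>"


def main():
    # The delays are arbitrary placeholders, not measured values.
    cases = {
        "short URL (simulated redirects)": lambda: parse_title(fake_fetch(0.05, HTML)),
        "resolved URL (simulated request)": lambda: parse_title(fake_fetch(0.01, HTML)),
        "document passed in (no request)": lambda: parse_title(HTML),
    }
    for label, fn in cases.items():
        print(f"{label}: {average_seconds(fn, runs=3):.4f}s")


if __name__ == "__main__":
    main()
```

To measure real pages, `fake_fetch` would be replaced by an actual HTTP call, once with the short URL and once with the resolved one, while the averaging helper stays the same.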