Streaming filter


blue.pira...@gmail.com

Aug 28, 2015, 4:58:57 AM
to Phirehose Users
I'm currently doing an internship in which I'm building a Twitter density map. I've used 140dev.com's free source code for the Twitter database server (http://140dev.com/free-twitter-api-source-code-library/twitter-database-server/).

As I said, my aim is to collect all the tweets in a specific area (Dublin and its suburbs) and display them on a map at the place they are associated with.
In the file get_tweet.php from 140dev.com, I've replaced the line "$stream->setTrack(array(..." with "$stream->setLocations(array...". So I use a function from the Phirehose library to filter tweets according to their geographical origin.
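
For illustration, the rectangle handed to setLocations can be sketched as below. Twitter's `locations` filter expects each box as longitude/latitude pairs with the south-west corner first, and Phirehose's setLocations takes an array of such four-number arrays. This is a minimal Python sketch, not code from 140dev or Phirehose: the `BoundingBox` type, the `DUBLIN_BOX` name and the corner values (a rough rectangle around Dublin) are all assumptions to be replaced with your own.

```python
from typing import List, NamedTuple


class BoundingBox(NamedTuple):
    """Hypothetical helper: a rectangle given by its SW and NE corners."""
    west: float   # south-west longitude
    south: float  # south-west latitude
    east: float   # north-east longitude
    north: float  # north-east latitude

    def as_locations(self) -> List[float]:
        """Return [sw_lon, sw_lat, ne_lon, ne_lat], longitude first."""
        if not -180.0 <= self.west < self.east <= 180.0:
            raise ValueError("west must be < east, both within [-180, 180]")
        if not -90.0 <= self.south < self.north <= 90.0:
            raise ValueError("south must be < north, both within [-90, 90]")
        return [self.west, self.south, self.east, self.north]


# Assumed, approximate rectangle around Dublin; adjust to your study area.
DUBLIN_BOX = BoundingBox(west=-6.45, south=53.20, east=-6.04, north=53.45)

# The same four numbers, in the same order, are what the PHP side would
# pass as one inner array of $stream->setLocations(array(array(...))).
locations = DUBLIN_BOX.as_locations()
```

A common mistake is to write latitude first; the range checks catch some, but not all, swapped values, so it is worth plotting the box once before collecting data.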

I'd like to catch all the tweets that are geolocated in, or associated with a place in, the Dublin area (which I defined as a rectangle with lat/long coordinates in the setLocations call). After searching the Internet, I found that the Streaming API only gives access to a sample of the tweets currently being created. Some sources claim that this sample is about 1% of all tweets worldwide.
But I found a post (https://twittercommunity.com/t/limit-on-streaming-tweets/8453/3) on the Twitter dev forum where a Twitter staff member explains that if a filter is set, and the tweets matching the filter parameters amount to less than 1% of all tweets worldwide, then I'll get the matching tweets in their entirety.

My first question may seem stupid: will I get all the tweets that are geolocated in the Dublin area with my code?
I think that the total number of tweets matching my request (that is, geolocated tweets in the Dublin area or tweets associated with a place in the Dublin area) is under 1% of all tweets worldwide, but I'm not completely sure that setLocations is the kind of filter referred to in the forum link. You know, often in science, especially when someone tries to popularise a theory, some words are misused.

Finally, I'm using a MySQL database, but I was advised to use a PostgreSQL database with pgAdmin3 in order to work with map data and place tweets on a map. The problem is that PostgreSQL is more difficult to use; I have never used it, and I found no tutorial about using it with the Twitter Streaming API.
My second question is: do you think it's possible to continue my project with a MySQL database (I'm supposed to use Leaflet to work with map and geo data, but I haven't begun this part yet), or should I migrate to a PostgreSQL database? In either case, do you know of any tutorials, software or anything else that could help me?
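
On the database question, a point map does not strictly need spatial database features: plain latitude and longitude columns can be turned into GeoJSON, which Leaflet's GeoJSON layer can display. The sketch below assumes rows have already been fetched from the database as dictionaries; the column names `tweet_id`, `text`, `lat` and `lon` and the function name are hypothetical and not taken from the 140dev schema.

```python
import json
from typing import Any, Dict, Iterable


def rows_to_geojson(rows: Iterable[Dict[str, Any]]) -> Dict[str, Any]:
    """Convert rows with lat/lon columns into a GeoJSON FeatureCollection."""
    features = []
    for row in rows:
        features.append({
            "type": "Feature",
            # GeoJSON positions are [longitude, latitude], in that order.
            "geometry": {
                "type": "Point",
                "coordinates": [float(row["lon"]), float(row["lat"])],
            },
            "properties": {"tweet_id": row["tweet_id"], "text": row["text"]},
        })
    return {"type": "FeatureCollection", "features": features}


# Made-up example row standing in for a database result.
sample_rows = [
    {"tweet_id": 1, "text": "Hello from O'Connell Street", "lat": 53.3498, "lon": -6.2603},
]
geojson_text = json.dumps(rows_to_geojson(sample_rows))
```

The resulting text can be served to the browser and loaded into the map; a spatial database mainly becomes useful once queries such as "all tweets within this polygon" have to run on the server.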

Adam Green

Aug 28, 2015, 6:42:19 AM
to phireho...@googlegroups.com
"All" can never be expected with the streaming API since it filters out some tweets for various reasons, and some accounts are invisible to it. You can expect a high percentage, approaching 98%. You should expect some false positives outside your range when using location. 

You will be requesting much less than 1% of the total tweet stream, so volume won't be an issue. 
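
The false positives mentioned above can be removed after collection by re-checking each tweet against the requested rectangle. This is a rough Python sketch, assuming tweets are already decoded into dictionaries in the v1.1 JSON layout, where an exact location sits in a GeoJSON `coordinates` field as [longitude, latitude]. The box values and the function names are assumptions.

```python
from typing import Any, Dict, Iterable, List, Optional, Tuple

# Assumed rectangle (west, south, east, north); replace with your own box.
DUBLIN_BOX = (-6.45, 53.20, -6.04, 53.45)


def exact_point(tweet: Dict[str, Any]) -> Optional[Tuple[float, float]]:
    """Return (lon, lat) from the tweet's `coordinates` field, if present."""
    geo = tweet.get("coordinates")
    if not geo or geo.get("type") != "Point":
        return None
    lon, lat = geo["coordinates"]
    return float(lon), float(lat)


def inside(box: Tuple[float, float, float, float], lon: float, lat: float) -> bool:
    """True when the point lies within the rectangle, edges included."""
    west, south, east, north = box
    return west <= lon <= east and south <= lat <= north


def drop_outside(tweets: Iterable[Dict[str, Any]],
                 box: Tuple[float, float, float, float] = DUBLIN_BOX) -> List[Dict[str, Any]]:
    """Discard tweets whose exact point is outside the box.

    Tweets without an exact point are kept, since they cannot be judged here.
    """
    kept = []
    for tweet in tweets:
        point = exact_point(tweet)
        if point is None or inside(box, *point):
            kept.append(tweet)
    return kept
```

Whether to keep or drop tweets that carry no exact point is a design choice; this sketch keeps them so that nothing is lost before the mapping step.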

The 140dev library contains complete, documented code for MySQL, which is a fine choice for a database.

Placing the tweets on a map will be your biggest challenge. Traditionally only about 1% of tweets are geocoded. So while most of the tweets will be in your requested area, they will not have data that lets you map them to a specific lat and long. 
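
One rough workaround for tweets that lack an exact point, offered here only as an assumption and not as something the posters describe, is to fall back on the centre of the tweet's `place` bounding box when one is present. The sketch assumes the v1.1 JSON layout, where `place.bounding_box.coordinates` holds one ring of [longitude, latitude] corners. The function names are hypothetical.

```python
from typing import Any, Dict, Optional, Tuple


def place_centre(tweet: Dict[str, Any]) -> Optional[Tuple[float, float]]:
    """Return the (lon, lat) centre of the tweet's place bounding box, if any."""
    place = tweet.get("place") or {}
    bbox = place.get("bounding_box") or {}
    rings = bbox.get("coordinates") or []
    if not rings or not rings[0]:
        return None
    lons = [corner[0] for corner in rings[0]]
    lats = [corner[1] for corner in rings[0]]
    return (min(lons) + max(lons)) / 2.0, (min(lats) + max(lats)) / 2.0


def map_position(tweet: Dict[str, Any]) -> Optional[Tuple[float, float, bool]]:
    """Return (lon, lat, is_exact), preferring the exact point over the place."""
    geo = tweet.get("coordinates")
    if geo and geo.get("type") == "Point":
        lon, lat = geo["coordinates"]
        return float(lon), float(lat), True
    centre = place_centre(tweet)
    if centre is None:
        return None
    return centre[0], centre[1], False
```

Positions flagged as not exact are only as precise as the place itself, which may be a whole city, so they are better suited to a coarse density map than to street-level markers.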


Fenn Bailey

Aug 30, 2015, 8:46:44 PM
to phireho...@googlegroups.com
As Adam said, the biggest problem you're going to have is that the streaming API location function only returns geocoded tweets, which are a very small percentage of overall tweets.

That said, if you read the Search API docs (different API entirely), you'll see that it will "fall back" to user profile locations in cases where tweets aren't geocoded. It's possible they have updated the streaming API to act the same way, but it's not something I'm familiar with either way.

I would recommend posting both a geocoded and a non-geocoded tweet yourself and checking whether each appears in the search and streaming APIs, to determine the actual behaviour.
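
To read the outcome of such an experiment, it can help to tally what kind of location data the delivered tweets actually carry. The following is a small Python sketch, assuming tweets decoded from the v1.1 JSON layout with `coordinates` and `place` fields; the category labels and function names are made up for illustration.

```python
from collections import Counter
from typing import Any, Dict, Iterable


def location_kind(tweet: Dict[str, Any]) -> str:
    """Classify a tweet by the most precise location data it carries."""
    if tweet.get("coordinates"):
        return "exact point"
    if tweet.get("place"):
        return "place only"
    return "no location"


def tally_location_kinds(tweets: Iterable[Dict[str, Any]]) -> Counter:
    """Count how many tweets fall into each location category."""
    return Counter(location_kind(tweet) for tweet in tweets)
```

If the stream never delivers anything in the "no location" category, that suggests the location filter does not fall back to profile locations.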

Good luck!

blue.pira...@gmail.com

Sep 1, 2015, 4:21:04 AM
to Phirehose Users, blue.pira...@gmail.com
Thanks a lot!!!