RDSTK coordinates2politics: is there a limit on the number or rate of lookups?


Yuying Song

Mar 25, 2015, 4:21:31 PM
to dstk-...@googlegroups.com
Hi,

I am an R user, and thank you very much for the great API! I have a few questions:

1. I have quite a large dataset containing about 170,000,000 longitude/latitude pairs. I am using coordinates2politics in RDSTK to return the state, county and city. However, when I run too many lookups from my local R session in one day, or call coordinates2politics locally more than 20,000 times, the function fails with the error message "Recv failure: Connection reset by peer". Is there a limit on the number of lookups?

2. I then moved to a cluster to process the dataset in parallel. There the function is quite unstable: sometimes it works and sometimes it doesn't. When I rerun it, it works again, but I can't tell when it will fail next.

3. Are there limits on the number or rate of lookups?

Thank you very much!!

Yuying

Pete Warden

Mar 26, 2015, 11:26:10 AM
to dstk-...@googlegroups.com
Hi Yuying,
I do have some throttling on the main www.datasciencetoolkit.org server, since it's designed to be a shared resource for prototyping and experimenting rather than a service to be used at large scale. I try to keep the limits pretty high, so the server can also get overwhelmed at times.
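Because the shared server throttles requests and can be overwhelmed, a client that does use it should expect occasional connection resets and retry politely. The following is a minimal sketch of retrying with exponential backoff, written in Python purely for illustration; the same pattern could wrap the RDSTK call in R. The helper name `call_with_backoff` and all of its parameters are hypothetical, and the delay values are assumptions, not limits published by the service.

```python
import random
import time


def call_with_backoff(fn, *args, max_attempts=5, base_delay=1.0,
                      max_delay=60.0, sleep=time.sleep):
    """Call fn(*args), retrying on ConnectionError with exponential backoff.

    Hypothetical helper: fn stands in for whatever function performs a
    single lookup. The delays are illustrative assumptions only.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fn(*args)
        except ConnectionError:
            # ConnectionResetError ("Connection reset by peer") is a subclass
            # of ConnectionError, so it is caught here.
            if attempt == max_attempts:
                raise  # give up after the final attempt
            # Double the wait on each failure, capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Add up to 10% jitter so parallel workers do not retry in lockstep.
            sleep(delay + random.uniform(0, delay / 10))
```

Backoff only makes a client better behaved on a shared server; it does not raise the server's limits, so it is unlikely to be enough on its own for a dataset of this size.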

For your scale of usage, I'd recommend spinning up your own EC2 instance so that you have it all to yourself, and potentially can spin up multiple servers behind a load-balancer to increase your speed even more.

Does that help?

Pete

CTO Jetpac
Follow me on twitter @petewarden

Yuying Song

Mar 26, 2015, 11:34:38 PM
to dstk-...@googlegroups.com, pe...@jetpac.com
Dear Pete,

Thank you very much!! I am not familiar with EC2 at all. When you suggested having my own EC2 instance, does that mean I need to have my own database? But how?

I am truly sorry for the trouble.

Sincerely,

Yuying

Pete Warden

Mar 27, 2015, 11:35:33 AM
to Yuying Song, dstk-...@googlegroups.com
Hi Yuying,
Yes, you would need to learn how to start a virtual machine on Amazon Web Services (AWS). Once you've learned that, you can pick the ID of a DSTK copy and start up a duplicate of www.datasciencetoolkit.org on your own machine.

Pete
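
Once a private copy of the server is running, client code only needs to send its requests to that machine instead of www.datasciencetoolkit.org. Below is a minimal sketch in Python of building the request URL from a configurable base address. It assumes the copy exposes the same `/coordinates2politics/<latitude>,<longitude>` path as the public server, with the comma percent-encoded. The function name and the host `my-dstk.example.com` are hypothetical placeholders. RDSTK users should check the package documentation for how to change the server it talks to.

```python
from urllib.parse import quote


def coordinates2politics_url(base_url, latitude, longitude):
    """Build a coordinates2politics request URL for a given DSTK server.

    Assumption: the server accepts a percent-encoded "lat,lon" pair as the
    last path segment, as the public server is understood to do.
    """
    pair = f"{latitude},{longitude}"
    # safe="" forces the comma to be percent-encoded as %2C.
    return f"{base_url.rstrip('/')}/coordinates2politics/{quote(pair, safe='')}"


# Hypothetical host name standing in for your own instance or load balancer.
url = coordinates2politics_url("http://my-dstk.example.com/", 37.769456, -122.429128)
```

Keeping the base address in one place means that moving from a single instance to several servers behind a load balancer only requires changing that one value.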