Large-Scale CREST Scraping

Jet Balsa

Jun 12, 2016, 10:53:39 AM
to EVE Market Data Relay
I've been reading a lot lately about the current status of EMDR, and I think we should work on a plan for large-scale CREST scraping. The local EVE caches are going away soon, so we should have a plan ready.

I may be working from out-of-date information here, since I only started researching this topic last week.

Scraping the CREST API is not especially hard, and there seem to be a ton of personal projects floating around. What if someone made a proper distributed uploader? Something with a master server that assigns each client a region of space to scan, with the client then uploading the data. It could be a desktop app of some kind, so we can get the general masses in on it and get as much data out of the system as we can.
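To make the master-server idea above concrete, here is a minimal sketch of how a coordinator might hand out regions to clients. Everything in it is an assumption for illustration: the `RegionCoordinator` class, its `assign` and `complete` methods, and the lease timeout are hypothetical names, not part of EMDR or any existing uploader, and the networking layer is left out entirely.

```python
import time
from collections import deque


class RegionCoordinator:
    """Hypothetical master: leases each region to one client at a time."""

    def __init__(self, region_ids, lease_seconds=300.0):
        self._queue = deque(region_ids)   # regions waiting to be scanned
        self._leases = {}                 # region_id -> (client_id, expiry time)
        self._lease_seconds = lease_seconds

    def assign(self, client_id, now=None):
        """Return the next region for this client, or None if all are leased."""
        now = time.time() if now is None else now
        # Reclaim regions whose client never reported back in time.
        for region_id, (_, expires) in list(self._leases.items()):
            if expires <= now:
                del self._leases[region_id]
                self._queue.append(region_id)
        if not self._queue:
            return None
        region_id = self._queue.popleft()
        self._leases[region_id] = (client_id, now + self._lease_seconds)
        return region_id

    def complete(self, client_id, region_id):
        """Client reports a finished scan; the region is queued for the next pass."""
        lease = self._leases.get(region_id)
        if lease is None or lease[0] != client_id:
            return False  # unknown lease, or it expired and was reassigned
        del self._leases[region_id]
        self._queue.append(region_id)
        return True
```

The lease expiry is there because desktop clients run by the general public will disappear without warning; without it, a region assigned to a client that went offline would never be scanned again.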

zweiz...@element-43.com

Jun 12, 2016, 4:15:12 PM
to EVE Market Data Relay
Hey there - no need for distributed high-volume scraping - we've got the new bulk endpoint now :) In fact I just finished the uploader - here's the code: https://github.com/zweizeichen/market_scraper I'm currently soak-testing it, and it's been feeding correctly formatted UUDIF messages directly into EMDR for about a day now. There are tests for most parts of the application, although there are still some rough edges (e.g. search for FIXME comments). I would greatly appreciate contributions, and please tell me if you encounter any problems with the data feed.

Gregory Taylor

Jun 12, 2016, 4:40:59 PM
to eve-...@googlegroups.com
This is really cool, zwei.

I haven't played with Erlang or Elixir, so this is probably a dumb question, but: I see there is one region monitor per region. Does this correspond to a process, or is this some kind of co-routine/cooperative multi-tasking thing?

James Muscat

Jun 13, 2016, 5:59:53 AM
to eve-...@googlegroups.com
FWIW I've been operating a CREST trawler (https://github.com/jamesremuscat/CRESTMarketTrawler) using the new bulk market endpoint for a week or so and feeding EMDR - the only problem reported to me is that the sheer volume of data it's pushing has required some work from one of EMDR's consumers (not to mention my own database)...!

Sebastian / zweizeichen

Jun 13, 2016, 7:48:10 AM
to eve-...@googlegroups.com
Thank you for operating the trawler! IIRC your data does not include solarSystemIDs - could you please fix that, if it is still an issue? In my opinion we would benefit from having multiple uploaders - we could even play around with signing the messages (JWT).

@gtaylor: yup - each region has its own process, which keeps the hashes of the last run's orders so it can submit only a diff instead of the whole market. I need to write better docs, I kinda rushed the release yesterday :) Also, I gave a talk about the old high-volume scraper's architecture if you're interested: https://github.com/zweizeichen/talks/raw/master/2016-06-hhex%20Taming%20the%20API%20Scraper%20from%20Hell.pdf
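As a rough illustration of the hash-and-diff idea described above (not the actual market_scraper code, which is written in Elixir), the following Python sketch keeps one hash per order from the previous run and reports only orders that are new or changed. The function names and the order fields `id`, `price` and `volume` are assumptions made for the example.

```python
import hashlib
import json


def order_hash(order):
    """Stable hash of one order dict (keys sorted so field order does not matter)."""
    payload = json.dumps(order, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def diff_orders(previous_hashes, current_orders):
    """Compare this run's orders against the hashes kept from the last run.

    Returns (changed, removed, current_hashes):
      changed        - orders that are new or whose content changed
      removed        - IDs of orders seen last run but missing now
      current_hashes - mapping to keep for the next run
    """
    current_hashes = {}
    changed = []
    for order in current_orders:
        digest = order_hash(order)
        current_hashes[order["id"]] = digest
        if previous_hashes.get(order["id"]) != digest:
            changed.append(order)
    removed = sorted(set(previous_hashes) - set(current_hashes))
    return changed, removed, current_hashes


# Usage: on the first run everything is "changed"; later runs only carry the diff.
run1 = [{"id": 1, "price": 10.0, "volume": 5}, {"id": 2, "price": 11.0, "volume": 1}]
changed, removed, hashes = diff_orders({}, run1)

run2 = [{"id": 1, "price": 10.0, "volume": 3}, {"id": 3, "price": 9.5, "volume": 7}]
changed2, removed2, hashes2 = diff_orders(hashes, run2)
```

Keeping only hashes rather than full orders keeps the per-region state small, at the cost of not being able to say which field changed.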
