-------- Original Message --------
Subject: [gcd-contact] Script for ComicRack
Date: Mon, 7 Mar 2011 19:08:55 +0100
From: Maurizio <miz...@gmail.com>
Reply-To: gcd-c...@googlegroups.com
To: con...@comics.org
Hi guys,
I developed a quick script to scrape data from GCD. So far it has only
been used by myself and a friend for testing. Before publishing it on the CR
forum I'd like to have your green light.
It uses no API, since there isn't one, but reads chunks of the pages
directly.
Are you interested in the code? I'm not a pro, though, so don't laugh ;-)
Let me know, I'll wait for your answer.
ciao,
Maurizio
--
GCD-Contact mailing list - gcd-c...@googlegroups.com
To unsubscribe send email to gcd-contact...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/gcd-contact
Not that I'm a tech or anything, but web scraping is almost broken by
design. Then again, if we don't have an API I guess it's the only method.
It may cause additional load etc., but I think overall it could help
as long as we get a noteworthy reference. After all, it is targeting
comic readers and collectors.
Mark
We already offer the entire database contents as a free download. We would
love to have an API and support other data download formats (among many other
features), and would welcome assistance in improving the site. Please feel free
to join us on the gcd-...@googlegroups.com mailing list if you are interested.
thanks,
-henry
---
Henry Andrews
Lead Developer / Board Member, Grand Comics Database
> -- GCD-Tech mailing list - gcd-...@googlegroups.com
> To unsubscribe send email to gcd-tech+u...@googlegroups.com
> For more options, visit this group at http://groups.google.com/group/gcd-tech
>
Following up on this.
What I am afraid of is that all ComicRack users (of whom I assume there
are quite a number) would, at the push of a button, try to automatically
fetch the data for all their comics. This would surely impact the
performance and stability of our server.
If the data were fetched on a per-issue basis, as in you would
try to get the data only for a given issue, the situation would be somewhat
different, since the impact on our server would be similar to normal web user
behaviour. Even then, by scraping the website you download much more
data than needed. There might be ways to get just the data of an issue
in some form; we could talk about this further.
Jochen
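To make the per-issue idea above concrete, here is a minimal Python sketch of a client that fetches one issue at a time, remembers what it has already fetched, and waits between requests. It is only an illustration: the class name PoliteIssueFetcher, the min_interval default of 2 seconds, and the fetch callable are all assumptions made for this example, and nothing here reflects how the ComicRack script or the GCD site actually works.

```python
import time
from typing import Callable, Dict, Optional


class PoliteIssueFetcher:
    """Fetch one issue at a time, with a cache and a minimum delay between requests.

    Hypothetical helper: `fetch` is whatever function actually retrieves the
    data for a single issue identifier. It is supplied by the caller.
    """

    def __init__(
        self,
        fetch: Callable[[str], str],
        min_interval: float = 2.0,  # assumed value, not a GCD requirement
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self._fetch = fetch
        self._min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._cache: Dict[str, str] = {}
        self._last_request: Optional[float] = None

    def get(self, issue_id: str) -> str:
        # Serve repeated requests from the local cache, so the server is
        # contacted at most once per issue.
        if issue_id in self._cache:
            return self._cache[issue_id]

        # Space out real requests so a large library cannot hammer the server.
        if self._last_request is not None:
            wait = self._min_interval - (self._clock() - self._last_request)
            if wait > 0:
                self._sleep(wait)

        self._last_request = self._clock()
        data = self._fetch(issue_id)
        self._cache[issue_id] = data
        return data
```

A caller would wrap its own retrieval function, for example PoliteIssueFetcher(my_fetch).get("some-issue"), where my_fetch is a placeholder. With a cache and a delay like this, even a "fetch everything" button would produce load closer to that of a single person browsing than to a burst of simultaneous requests.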
On 09.03.2011 04:48, Henry Andrews wrote:
> Hi Maurizio,
> We would prefer that people not scrape the site as it increases the load on our