(adding jbrout group to Cc so *this* message is archived)
On 09/16/2013 09:17 AM, mana...@gmail.com wrote:
> Yes I understand now !
> I thought that gmane index google groups automatically ?!? no ?
> how can I extract messages from google group ?
No, gmane.org automatically archives only the messages which pass
through it (i.e., everything will be archived from now on). However,
gmane.org has no way to get old messages from Google Groups, because
there is no API for getting messages out of there. Actually, gmane.org
never fetches *old* messages from other systems. Most other archiving
systems (including Yahoo! Groups, mailman, etc.) give at least the
owners of the group some way to get messages out of their system (and
then you can import the mbox to gmane.org following
http://gmane.org/import.php). It is not so with Google Groups.
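If you do get an mbox export out of some other system, it may be worth
checking that the file parses at all before sending it for import. A
minimal sketch using Python's standard mailbox module; the function name
summarize_mbox is made up for illustration and is not part of the
gmane.org import procedure:

```python
import mailbox


def summarize_mbox(path):
    """Return (message count, list of Subject headers) for an mbox file.

    Hypothetical helper for a quick sanity check; raises
    mailbox.NoSuchMailboxError if the file does not exist.
    """
    box = mailbox.mbox(path, create=False)
    try:
        # Iterating the mailbox yields one message object per "From " block.
        subjects = [msg.get("Subject", "(no subject)") for msg in box]
        return len(subjects), subjects
    finally:
        box.close()
```

A count of zero or a list of odd-looking subjects usually means the
export is not really in mbox format.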
Google doesn't provide any API for accessing messages in Groups
(https://developers.google.com/apps-script/reference/groups/ is just for
administering membership etc., not for accessing messages). Moreover, it
is not possible to scrape messages from Groups anymore, because Google in
their infinite wisdom now requires JavaScript for accessing Google
Groups (so, e.g., lynx no longer works there either). The scripts for
downloading messages from Google Groups that I've found
(http://search.cpan.org/~xern/WWW-Google-Groups-0.09/ and
http://saturnboy.com/2010/03/scraping-google-groups/) are obsolete and
don't work anymore.
It would theoretically be possible to write a Jetpack add-on
(https://addons.mozilla.org/en-US/developers/docs/sdk/latest/dev-guide/index.html)
which would do the scraping from inside Firefox (or a Google Chrome
equivalent), but it would be very complicated (e.g., scripts cannot
write to the local disk, so one would also have to write a small HTTP
server for storing the scraped messages, and it is quite probable that
Google limits the number of accesses per second to the Google Groups
web interface), and certainly nobody has ever written it.
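Just to illustrate the storage half of that idea, here is a rough
sketch of such a small HTTP server in Python: it accepts a raw message
in the body of a POST request and appends it to a local mbox file. All
names (make_handler, serve, scraped.mbox, port 8025) are made up for
illustration, the browser side is not shown, and this is a sketch under
those assumptions, not something that has been run against Google
Groups:

```python
import mailbox
from http.server import BaseHTTPRequestHandler, HTTPServer


def make_handler(mbox_path):
    """Build a request handler class which appends POSTed messages to mbox_path."""

    class StoreHandler(BaseHTTPRequestHandler):
        def do_POST(self):
            # The request body is assumed to be one complete RFC 822 message.
            length = int(self.headers.get("Content-Length", 0))
            raw = self.rfile.read(length)
            # HTTPServer handles one request at a time, so writes to the
            # mbox are serialized within this process.
            box = mailbox.mbox(mbox_path)
            try:
                box.add(raw)
            finally:
                box.close()  # close() also flushes pending changes to disk
            self.send_response(204)
            self.end_headers()

        def log_message(self, fmt, *args):
            pass  # keep the console quiet

    return StoreHandler


def serve(mbox_path="scraped.mbox", port=8025):
    """Listen on localhost only and store every POSTed message (hypothetical defaults)."""
    server = HTTPServer(("127.0.0.1", port), make_handler(mbox_path))
    server.serve_forever()
```

The scraping script in the browser would then POST each message to
http://127.0.0.1:8025/ and would have to pause between page loads
because of the probable rate limiting mentioned above.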
Best,
Matěj
--
http://www.ceplovi.cz/matej/, Jabber: mc...@ceplovi.cz
GPG Finger: 89EF 4BC6 288A BF43 1BAB 25C3 E09F EF25 D964 84AC
"Push to test." (click) "Release to detonate..."
-- from a bugzilla quip list