Hi all,
I'm a junior at Peking University majoring in Computer Science. I only learned about the GSoC project recently... so sorry for posting this so late.
During my time at university, I've done a lot of work on web crawlers and have joined the "Search Engine and Web Mining" group at my school to learn more about crawling data. I've always found crawling websites very interesting, because it demands that I delve into the internals of particular websites and trace the data to see when and where it is produced, much like detective work : )
I've gone through the list of ideas you provided and found "Support for Spiders in other languages" very fascinating! I have used several open-source crawlers in Java, like Heritrix and Nutch. Honestly, they are difficult to configure and not as extensible as Scrapy... so it would be good news for Java developers if this idea came true! Besides, I have some experience with Hadoop as well. Hadoop Streaming provides a convenient interface for developers to write their Mapper/Reducer in languages other than Java. I think it's an awesome and feasible idea to make Scrapy's spiders as flexible as Hadoop's Mapper/Reducer.
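For context, here is a minimal sketch of the Hadoop Streaming contract I'm referring to: the framework feeds raw input lines to the mapper on stdin and expects tab-separated key/value pairs on stdout, so any language that can read stdin and write stdout can participate. The function names and the word-count example below are my own illustration, not part of any Scrapy or Hadoop API:

```python
import sys

def map_line(line):
    """Word-count mapper step: emit a (word, 1) pair for each word in a line."""
    return [(word, 1) for word in line.strip().split()]

def run_mapper(stream):
    # In a real streaming job, `stream` would be sys.stdin and Hadoop would
    # shuffle and group the emitted pairs by key before the reducer runs.
    for line in stream:
        for word, count in map_line(line):
            sys.stdout.write(f"{word}\t{count}\n")

if __name__ == "__main__":
    # Demonstration on a sample line instead of real stdin input.
    run_mapper(["to be or not to be"])
```

A spider interface for Scrapy along these lines could, in principle, exchange requests and scraped items over the same kind of line-oriented protocol.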
So could anyone tell me whom I should contact next if I want to participate in this project? I hope it's not too late...
Thanks a lot!
- Leo