web mining python scrapy job opportunity, Paris, France

Paul Girard

Jan 31, 2012, 1:11:05 PM
to scrapy...@googlegroups.com
Dear scrapy community,

I am the digital manager of médialab Sciences Po, a research lab based in Paris, France <http://medialab.sciences-po.fr>.
We are looking for a Python developer to help us develop our open source software project, the Hypertext Corpus Initiative.
This web mining tool, aimed at social sciences researchers, is just getting started and is based on scrapy.

We asked Pablo and Shane from scrapinghub for help laying the foundations of our new crawler.

To carry on with this crazy project, we need some help.
Feel free to ask me directly for more information, or to apply at medialab (at) sciences-po.fr

Below is the job offer in RST style ;-)

I hope this might interest some of you or your friends,



**Job Description**

Junior Python developer for a web mining research lab in social sciences.
Full time, 3-year contract with possible extension, Paris, France.
The first assignment for this position is to join the Hypertext Corpus Initiative (see http://jiminy.medialab.sciences-po.fr/hci).


* great coding skills
* Python enthusiast
* interested in web mining
* interested in social sciences research
* 0 to 5 years of experience
* willing to learn basic French
* willing to move to Paris, France

**About the company**
The médialab Sciences Po is a research lab dedicated to inventing new digital methodologies and tools for social sciences research.
We integrate many different data mining algorithms in the context of social sciences, wrapping them in nicely designed interfaces.
We build research tools based on information technologies.

**What Python is used for**

We use Python for our data mining programs.
This involves scraping the web using scrapy, building a server with Twisted, exporting and analysing networks using NetworkX, and talking to a Java Lucene index engine through Thrift (Twisted mode)…
We also use Django for web apps and NLTK for natural language processing...

**Contact Info:**

* **Contact**: Paul Girard, digital manager, médialab
* **E-mail contact**: medialab (at) sciences-po.fr
* **No telecommuting**

