Scrapy + Eclipse + PyDev: this is a working tutorial based on various
sources.
Hope it helps.
Sources:
http://techcolleague.com/2011/02/install-eclipse-and-pydev-on-ubuntu/
http://www.linoob.com/2011/09/starting-with-python-on-eclipse-in-ubuntu/
http://groups.google.com/group/scrapy-users/browse_thread/thread/4234de3bbc723f14/f9011bbffd3005d2?lnk=gst&q=eclipse&pli=1
1). Install Eclipse.
1. Install Eclipse and Sun's Java by running the following:
sudo apt-get install eclipse sun-java6-jdk
2. Make Sun's Java the default by running the following:
sudo update-java-alternatives -s java-6-sun
3. Edit the JVM configuration file:
sudo vi /etc/jvm
and add the following to the top of the file:
/usr/lib/jvm/java-6-sun
4. If you have 1GB or more of memory, you may want to increase the
heap size for better performance.
The -Xms argument sets the minimum heap size and -Xmx sets
the maximum heap size.
Edit the Eclipse script file:
sudo vi /usr/bin/eclipse
and change VMARGS="" to the following (all on one line):
VMARGS="-Xms512m -Xmx512m -XX:PermSize=128m -XX:MaxPermSize=128m"
2). Install PyDev.
1. Run Eclipse.
2. For Eclipse Galileo:
Help -> Install New Software -> Add:
Name: PyDev
Location: http://pydev.org/updates
Agree to the license and install.
3. Configure the Python interpreter by going to Window |
Preferences | PyDev | Interpreter - Python.
4. In the Python interpreters section, click New.
5. Locate your Python interpreter (e.g., /usr/bin/python).
6. When prompted, select your System PYTHONPATH. You can just
leave the default checkboxes selected and click OK.
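If you are not sure which interpreter to point PyDev at, a small script like the one below can help. It is only a sketch: interpreter_info is a made-up helper name, and it reports what the interpreter running it sees, using nothing but the standard sys module.

```python
import sys


def interpreter_info():
    """Return the path, version, and import search path of the running interpreter."""
    return {
        "executable": sys.executable,        # the path to enter in PyDev's "New" dialog
        "version": sys.version.split()[0],   # e.g. "2.6.5"
        "pythonpath": list(sys.path),        # roughly what PyDev offers as System PYTHONPATH
    }


if __name__ == "__main__":
    info = interpreter_info()
    print("Interpreter: %s" % info["executable"])
    print("Version:     %s" % info["version"])
    for entry in info["pythonpath"]:
        print("  path: %s" % entry)
```

Run it with the python you intend to use (e.g., /usr/bin/python) and compare the output with what PyDev shows after step 6.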
3). Install Scrapy by following the official documentation:
http://doc.scrapy.org/
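To confirm that Scrapy is importable, and to find where its cmdline.py lives (the path needed in step 4), something along these lines may help. This is a sketch and not part of Scrapy: module_file is a hypothetical helper name, and it relies only on importlib.import_module from the standard library, which assumes Python 2.7 or later.

```python
import importlib


def module_file(name):
    """Return the file a module was loaded from, or None if it cannot be imported."""
    try:
        module = importlib.import_module(name)
    except ImportError:
        return None
    # Built-in modules have no __file__, so fall back to None for those.
    return getattr(module, "__file__", None)


if __name__ == "__main__":
    path = module_file("scrapy.cmdline")
    if path is None:
        print("Scrapy does not appear to be installed for this interpreter")
    else:
        print("scrapy.cmdline is at: %s" % path)
```

On older Pythons the reported file may end in .pyc; the cmdline.py next to it is the one to use as the Main Module.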
4). Run Scrapy in debug mode from Eclipse:
- Right-click one of the files in your project, say setup.py,
and choose
Debug As -> Python Run
- The run will fail, but it creates a new run/debug
configuration. Open that configuration for editing:
Run -> Debug Configurations
- Then change the Main Module (Run -> Debug Configurations ->
Main)
to the location of Scrapy's cmdline.py. Mine, for example, is at:
/usr/local/lib/python2.6/dist-packages/Scrapy-0.12.0.2542-py2.6.egg/scrapy/cmdline.py
or
/usr/lib/pymodules/python2.6/scrapy/cmdline.py
That is, replace the default value, which looks something like the
following, with the path to cmdline.py:
${workspace_loc:my_test/dmoz/setup.py}
- Last, set up the Scrapy arguments in the Arguments tab, for
example:
Program Arguments:
crawl spider_1 -a param1=val1 -a param2=val2
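An alternative to pointing the Main Module at cmdline.py is a small launcher script inside the project that passes the same arguments to Scrapy, so that Debug As -> Python Run works on it directly. The sketch below assumes that scrapy.cmdline.execute accepts an argv list whose first element is the program name (check this against your Scrapy version). build_argv and main are hypothetical helpers, and spider_1, param1, and param2 are just the example names used above.

```python
import sys


def build_argv(spider, params=None):
    """Build the argument list for 'scrapy crawl <spider> -a key=value ...'."""
    argv = ["scrapy", "crawl", spider]
    # Sort the keys so the resulting command line is predictable.
    for key in sorted(params or {}):
        argv.extend(["-a", "%s=%s" % (key, params[key])])
    return argv


def main(args):
    """Run a spider named in args[0], with optional key=value spider arguments."""
    if not args:
        print("usage: launcher.py SPIDER [key=value ...]")
        return
    # Imported here so build_argv can be used without Scrapy installed.
    from scrapy.cmdline import execute

    # Each extra argument must look like key=value.
    params = dict(arg.split("=", 1) for arg in args[1:])
    execute(build_argv(args[0], params))


if __name__ == "__main__":
    main(sys.argv[1:])
```

With this approach the Main Module stays inside the workspace, and the Program Arguments become: spider_1 param1=val1 param2=val2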
enjoy.
--vs