scrapy server misbehaves when webservices defined in project


Jordi Burguet Castell

Jul 20, 2012, 4:35:15 AM
to scrapy...@googlegroups.com
Hi all,

I might have hit a bug in scrapy. When I run "scrapy server" and then
add a job with curl, it always fails with:
AssertionError: Scrapy settings already loaded
and that only happens when in my project's scrapyd.conf I use a
service in that same project directory tree.

That is, I have a service defined in scrapyd.conf that looks like:

[services]
info = myproj.scrapyd.webservice.Info

where myproj/scrapyd/webservice.py has a class

from scrapyd.webservice import WsResource

class Info(WsResource):
    ...


The full error log I get when I do
curl http://localhost:6800/schedule.json -d project=default -d spider=myspider
is:

2012-07-19 14:33:22+0200 [Launcher] Process started:
project='default' spider='myspider'
job='ecd13ed8d19d11e1a78a78843cebf6da' pid=14199
log='/mnt/sda6/scrapy/myproj/.scrapy/scrapyd/logs/default/myspider/ecd13ed8d19d11e1a78a78843cebf6da.log'
items='/mnt/sda6/myproj/.scrapy/scrapyd/items/default/myspider/ecd13ed8d19d11e1a78a78843cebf6da.jl'
2012-07-19 14:33:22+0200 [Launcher,14199/stderr] Traceback (most recent call last):
  File "/usr/lib/python2.7/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/home/jordi/programs/lib/python2.7/site-packages/Scrapy-0.15.1-py2.7.egg/scrapyd/runner.py", line 39, in <module>
    main()
  File "/home/jordi/programs/lib/python2.7/site-packages/Scrapy-0.15.1-py2.7.egg/scrapyd/runner.py", line 34, in main
2012-07-19 14:33:22+0200 [Launcher,14199/stderr]     with project_environment(project):
  File "/usr/lib/python2.7/contextlib.py", line 17, in __enter__
    return self.gen.next()
  File "/home/jordi/programs/lib/python2.7/site-packages/Scrapy-0.15.1-py2.7.egg/scrapyd/runner.py", line 26, in project_environment
    assert 'scrapy.conf' not in sys.modules, "Scrapy settings already loaded"
AssertionError: Scrapy settings already loaded



This comes from an "assert 'scrapy.conf' not in sys.modules" in
scrapy/scrapyd/runner.py (line 26), which was introduced in commit
048044c1f8536b1e4d04712bba179053855294a8:
* assert that scrapy configuration hasn't been loaded in scrapyd.runner

I'm not sure why this is so. I have tried to change it this way though:

     try:
-        assert 'scrapy.conf' not in sys.modules, "Scrapy settings already loaded"
+        if 'scrapy.conf' in sys.modules:
+            sys.modules.pop('scrapy.conf')
         yield

and it works well for me. Is this a good idea? (Is it an
appropriate patch for scrapy?) Or should I handle it differently?
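
For comparison, here is a minimal, self-contained sketch of the two behaviours discussed above: the original assert and the patched "pop" variant. It does not import scrapy or scrapyd. The names strict_environment, lenient_environment and demo are hypothetical, and an empty placeholder module stands in for an already-imported 'scrapy.conf'.

```python
import sys
import types
from contextlib import contextmanager

CONF_MODULE = "scrapy.conf"  # module name checked by the assert in the traceback


@contextmanager
def strict_environment(name=CONF_MODULE):
    # Original behaviour: refuse to continue if the module is already imported.
    assert name not in sys.modules, "Scrapy settings already loaded"
    yield


@contextmanager
def lenient_environment(name=CONF_MODULE):
    # Patched behaviour: drop the cached entry so a later import re-executes it.
    sys.modules.pop(name, None)
    yield


def demo(name=CONF_MODULE):
    """Return (strict_ok, lenient_ok) after simulating an early import."""
    saved = sys.modules.get(name)
    # Assumption: a placeholder module object stands in for the real one.
    sys.modules[name] = types.ModuleType(name)
    try:
        try:
            with strict_environment(name):
                strict_ok = True
        except AssertionError:
            strict_ok = False
        with lenient_environment(name):
            lenient_ok = name not in sys.modules
        return strict_ok, lenient_ok
    finally:
        # Restore whatever was in sys.modules before the demo ran.
        sys.modules.pop(name, None)
        if saved is not None:
            sys.modules[name] = saved
```

One caveat with the lenient variant: removing an entry from sys.modules only affects later imports. Code that already holds a reference to the old module object (for example through an earlier "from ... import ...") keeps using it, so the pop may hide the symptom rather than give every caller freshly loaded settings.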

Thanks,
Jordi

Pablo Hoffman

Sep 4, 2012, 1:59:31 PM
to scrapy...@googlegroups.com
Hi Jordi,

Are you able to reproduce this problem with the testspiders project?




Gilles Vandelle

Sep 7, 2012, 7:14:45 AM
to scrapy...@googlegroups.com
Hi,

There are two types of web services (WS) available:
* The WS described in the conf under [services] are activated in the main scrapy process, by default on port 6800.
* The WS defined in your project are only activated when you schedule a spider, and they get a different port (608x) if you schedule multiple projects.

If you define your WS under [services], you should not add it in the project as well.

-Gilles