scrapyd - Cannot import scrapy settings module


Mitja Kramberger

Aug 7, 2011, 8:00:51 AM
to scrapy-users
I'm running scrapyd. When I try to deploy my scraper with

scrapy deploy default -p scraper

I get this error:

{"status": "error", "message": " warnings.warn(\"Cannot import scrapy settings module %s\" % scrapy_module)"}

With debug on I also get this:

Deploying scraper-1312715642 to http://localhost:6800/addversion.json
Server response (200):
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/Scrapy-0.12.0.2542-py2.7.egg/scrapyd/webservice.py", line 17, in render
    return JsonResource.render(self, txrequest)
  File "/usr/local/lib/python2.7/dist-packages/Scrapy-0.12.0.2542-py2.7.egg/scrapy/utils/txweb.py", line 10, in render
    r = resource.Resource.render(self, txrequest)
  File "/usr/local/lib/python2.7/dist-packages/Twisted-11.0.0-py2.7-linux-x86_64.egg/twisted/web/resource.py", line 216, in render
    return m(request)
  File "/usr/local/lib/python2.7/dist-packages/Scrapy-0.12.0.2542-py2.7.egg/scrapyd/webservice.py", line 46, in render_POST
    spiders = get_spider_list(project)
  File "/usr/local/lib/python2.7/dist-packages/Scrapy-0.12.0.2542-py2.7.egg/scrapyd/utils.py", line 59, in get_spider_list
    raise RuntimeError(msg.splitlines()[-1])
RuntimeError: warnings.warn("Cannot import scrapy settings module %s" % scrapy_module)

Mitja Kramberger

Aug 7, 2011, 3:55:54 PM
to scrapy-users
If I add an assert in get_spider_list (this is where the error happens), I get this:

Deploying scraper-1312746799 to http://localhost:6800/addversion.json
Server response (200):
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/Scrapy-0.12.0.2542-py2.7.egg/scrapyd/webservice.py", line 17, in render
    return JsonResource.render(self, txrequest)
  File "/usr/local/lib/python2.7/dist-packages/Scrapy-0.12.0.2542-py2.7.egg/scrapy/utils/txweb.py", line 10, in render
    r = resource.Resource.render(self, txrequest)
  File "/usr/local/lib/python2.7/dist-packages/Twisted-11.0.0-py2.7-linux-x86_64.egg/twisted/web/resource.py", line 216, in render
    return m(request)
  File "/usr/local/lib/python2.7/dist-packages/Scrapy-0.12.0.2542-py2.7.egg/scrapyd/webservice.py", line 46, in render_POST
    spiders = get_spider_list(project)
  File "/usr/local/lib/python2.7/dist-packages/Scrapy-0.12.0.2542-py2.7.egg/scrapyd/utils.py", line 57, in get_spider_list
    assert False, (out, err)
AssertionError: ('Scrapy 0.12.0.2542 - no active project\n\nUnknown command: list\n\nUse "scrapy" to see available commands\n\nMore commands are available in project mode\n', '/usr/local/lib/python2.7/dist-packages/Scrapy-0.12.0.2542-py2.7.egg/scrapy/utils/project.py:17: UserWarning: Cannot import scrapy settings module scraper.settings\n  warnings.warn("Cannot import scrapy settings module %s" % scrapy_module)\n')
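
A small sketch of why the reported message is so unhelpful. The first traceback shows get_spider_list raising `RuntimeError(msg.splitlines()[-1])`, i.e. only the last line of the captured output. Assuming `msg` holds the stderr text shown in the AssertionError above (the variable names `err`, `last_line` and `informative` are illustrative, and the exact whitespace inside the string is a guess), the last line is the echoed source line, while the line that actually names the module comes before it:

```python
# stderr text copied from the AssertionError output above
# (leading whitespace on the second line is an assumption).
err = (
    "/usr/local/lib/python2.7/dist-packages/Scrapy-0.12.0.2542-py2.7.egg/"
    "scrapy/utils/project.py:17: UserWarning: Cannot import scrapy settings "
    "module scraper.settings\n"
    '  warnings.warn("Cannot import scrapy settings module %s" % scrapy_module)\n'
)

# What ends up in the JSON error: only the final line of the output.
last_line = err.splitlines()[-1]
print(last_line.strip())

# The line that names the module that failed to import.
informative = [line for line in err.splitlines() if "UserWarning:" in line]
print(informative[0])
```

So the module scrapyd could not import here is scraper.settings, even though the JSON response only shows the unformatted `%s` placeholder.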

Thanks for any info ...


Daniel

Aug 8, 2011, 11:23:49 PM
to scrapy-users
I'm in the very same situation and can't make it work either.

I'd appreciate it if anyone could help.

Thanks!


Daniel

Aug 10, 2011, 4:51:24 PM
to scrapy-users
Mitja, I solved this by doing the following:

- In scrapy.cfg, use the full dotted path to the settings module (I was using just myproject.settings):
[settings]
default = full.path.of.packages.myproject.settings

- Set the SCRAPY_SETTINGS_MODULE environment variable to the same value: full.path.of.packages.myproject.settings

- Add the root folder of the above path to the Python path with a .pth file: echo /home/me/root_folder > SITE-PACKAGES-DIR/root_folder.pth
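
A minimal sketch of what the .pth step does, using the standard library's `site.addsitedir` on throwaway directories so nothing is written to the real site-packages. The two temporary directories are hypothetical stand-ins for /home/me/root_folder and SITE-PACKAGES-DIR; at interpreter startup Python processes .pth files in the real site-packages directory in a similar way.

```python
import os
import site
import sys
import tempfile

# Hypothetical stand-ins for /home/me/root_folder and SITE-PACKAGES-DIR.
root_folder = tempfile.mkdtemp(prefix="root_folder_")
site_dir = tempfile.mkdtemp(prefix="site_dir_")

# Equivalent of: echo /home/me/root_folder > SITE-PACKAGES-DIR/root_folder.pth
with open(os.path.join(site_dir, "root_folder.pth"), "w") as handle:
    handle.write(root_folder + "\n")

# Process the .pth files in site_dir; each line naming an existing
# directory is appended to sys.path.
site.addsitedir(site_dir)

print(root_folder in sys.path)
```

Once the root folder is on sys.path, packages directly inside it can be imported by their dotted name.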

Maybe the docs could be clearer about this?

Hope it helps :)

Regards,

Daniel

Alan

Oct 8, 2011, 1:34:15 PM
to scrapy-users
I'm having this same problem. Could you clarify what is meant by "full.path.of.packages"? Is this the path of the project on the local machine?

Any help would be greatly appreciated.

Thanks!

Alan

mandrare

Oct 10, 2011, 10:50:14 AM
to scrapy...@googlegroups.com
Yes, I'm running into the same issue. Please clarify what the full path is. Thanks.

Steven Almeroth

Oct 11, 2011, 3:47:00 PM
to scrapy...@googlegroups.com
The full path usually means the absolute path from the root directory: slash (/) on Unix and C:\ on Windows. For a project on my Ubuntu machine located at /srv/scrapy/dirbot (notice the leading slash, meaning the full path from the root), the config file might set `default = srv.scrapy.dirbot.settings`.

But this would imply that each directory is a Python package, so the following files would be required:

    /srv/__init__.py
    /srv/scrapy/__init__.py

Not only that, the Python import search path would have to include the root directory ('/'), which most installations, by default, certainly do not.  On my system it looks more like:

    >>> import sys
    >>> sys.path
    ['/usr/lib/python2.7', '/usr/local/lib/python2.7/dist-packages', '/usr/lib/python2.7/dist-packages', '/usr/lib/pymodules/python2.7', '']

There is essentially no notion of an absolute (full) path or a relative path (no leading slash) in Python's module import syntax. Only when Python goes to the operating system to access the file does the concept of absolute/relative path come into play. Python just walks the module search path (sys.path, which includes any PYTHONPATH entries) and imports the first module it finds matching the names you give.

Assuming our current working directory is /srv/scrapy, and using the following directory structure from the Scrapy documentation sample project:

    /srv/scrapy/
    ├── dirbot
    │   ├── __init__.py
    │   ├── items.py
    │   ├── pipelines.py
    │   ├── settings.py
    │   └── spiders
    │       ├── dmoz.py
    │       ├── googledir.py
    │       └── __init__.py
    ├── README.rst
    └── scrapy.cfg

Let's look at the config file:

scrapy.cfg

    [settings]
    default = dirbot.settings

Using the sys.path shown above, importing is attempted in this order:

    /usr/lib/python2.7/dirbot/settings.py
    /usr/local/lib/python2.7/dist-packages/dirbot/settings.py
    /usr/lib/python2.7/dist-packages/dirbot/settings.py
    /usr/lib/pymodules/python2.7/dirbot/settings.py
    dirbot/settings.py

Given the import expression `dirbot.settings`, the first four absolute paths are searched first and will normally fail to find the module. The fifth path is actually a relative path, which is where our settings file is located. Note: based on our current working directory, the relative path dirbot/settings.py maps to the absolute path /srv/scrapy/dirbot/settings.py.
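
The search order above can be reproduced with a short sketch. This is a simplification for illustration only: the helper `candidate_files` is a made-up name, and the real import machinery also checks for packages, compiled files and other importers.

```python
import posixpath


def candidate_files(dotted_name, search_path):
    """Return the .py file each search-path entry would be checked for,
    in order (simplified; real imports also consider packages etc.)."""
    relative = posixpath.join(*dotted_name.split(".")) + ".py"
    return [posixpath.join(entry, relative) for entry in search_path]


# The sys.path listed earlier in this message.
search_path = [
    "/usr/lib/python2.7",
    "/usr/local/lib/python2.7/dist-packages",
    "/usr/lib/python2.7/dist-packages",
    "/usr/lib/pymodules/python2.7",
    "",
]

for path in candidate_files("dirbot.settings", search_path):
    print(path)
```

The empty string entry stands for the current working directory, which is why the last candidate is the relative path dirbot/settings.py.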

If your project does not have a scrapy.cfg file, or your scrapyd service is not started from inside a project, you can set the environment variable SCRAPY_SETTINGS_MODULE using the same Python syntax (e.g. dirbot.settings). Rather than using a long dotted string that simulates an absolute path and adding __init__.py files to every directory down to the root, keep the regular dirbot.settings value, which is the module location relative to the project directory, and add the project directory to the PYTHONPATH environment variable, whose entries are added to the module search path.

On Unix, we set these environment variables in a shell like so:

    export SCRAPY_SETTINGS_MODULE=dirbot.settings
    export PYTHONPATH=/srv/scrapy
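
To check that these two variables are enough, one option is to start a child Python process with them set and see whether the import succeeds. This is a hedged sketch for Python 3.7+: `can_import_settings` is a hypothetical helper, not part of Scrapy or scrapyd, and it only tests importability, not that the settings are valid.

```python
import os
import subprocess
import sys


def can_import_settings(project_dir, settings_module):
    """Return True if a child Python process can import settings_module
    when project_dir is the only PYTHONPATH entry."""
    env = dict(os.environ)
    env["SCRAPY_SETTINGS_MODULE"] = settings_module
    env["PYTHONPATH"] = project_dir
    child_code = (
        "import importlib, os; "
        "importlib.import_module(os.environ['SCRAPY_SETTINGS_MODULE'])"
    )
    result = subprocess.run(
        [sys.executable, "-c", child_code],
        env=env,
        capture_output=True,
        text=True,
    )
    return result.returncode == 0


if __name__ == "__main__":
    # Paths from the example above; adjust to your own project.
    print(can_import_settings("/srv/scrapy", "dirbot.settings"))
```

Running the check in a separate process mirrors how the variables are inherited by whatever is launched from the shell where they were exported.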

Denis K.

Oct 12, 2011, 3:19:19 PM
to scrapy...@googlegroups.com
If your settings file imports some other module (e.g. django), you have to install that module on the server before deploying.
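
Since the warning does not say which import actually failed, a rough way to tell the two cases apart is to import the settings module by hand on the server and look at the name attached to the exception. A minimal sketch, assuming Python 3.6+ (for `ModuleNotFoundError`); `diagnose_settings_import` is a hypothetical helper, not a Scrapy API:

```python
import importlib


def diagnose_settings_import(settings_module):
    """Import settings_module and describe what failed, if anything."""
    try:
        importlib.import_module(settings_module)
    except ModuleNotFoundError as exc:
        missing = exc.name or ""
        # The missing name is the settings module itself or one of its
        # parent packages: the project is not on the module search path.
        if settings_module == missing or settings_module.startswith(missing + "."):
            return "settings module not found on sys.path: %s" % missing
        # Otherwise the settings file was found but imports something
        # that is not installed.
        return "settings module needs a missing dependency: %s" % missing
    except ImportError as exc:
        return "import failed: %s" % exc
    return "ok"


if __name__ == "__main__":
    print(diagnose_settings_import("dirbot.settings"))
```

If the result names a dependency such as django, installing it on the server is the fix; if it names the settings module or its package, the problem is the search path.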

Alan

Oct 13, 2011, 4:35:00 AM
to scrapy-users
Thanks - that was the issue. Can't believe I didn't think of it.
(Though the error message could have been clearer :) )