Re: How to use project settings when calling Scrapy spiders from a python script

Anderson Caco

Mar 26, 2013, 1:31:19 PM
to scrapy...@googlegroups.com
I think that your project settings are not being loaded. I ran a small test, and defining SCRAPY_SETTINGS_MODULE didn't work for me either; I don't know why. Instead of:

from scrapy.settings import Settings

I use something like:

from scrapy.utils.project import get_project_settings as Settings
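
To illustrate why that swap might matter, here is a toy model of the difference, written for Python 3 with only the standard library. It is not Scrapy's actual code: the Settings class, the get_project_settings function, the DEFAULTS dict and the myprojectdir_settings module below are all hypothetical stand-ins. The assumption is that a bare Settings() only carries the built-in defaults, while get_project_settings() reads SCRAPY_SETTINGS_MODULE and overlays the values from that module.

```python
import importlib
import os
import sys
import types

# Hypothetical stand-in for the framework's built-in defaults.
DEFAULTS = {"USER_AGENT": "Scrapy (default)"}


class Settings(dict):
    """Toy settings object: starts from the defaults and nothing else."""

    def __init__(self):
        super().__init__(DEFAULTS)


def get_project_settings():
    """Toy loader: overlay the module named by SCRAPY_SETTINGS_MODULE."""
    settings = Settings()
    module_path = os.environ.get("SCRAPY_SETTINGS_MODULE")
    if module_path:
        module = importlib.import_module(module_path)
        for name in dir(module):
            if name.isupper():  # only upper-case names count as settings
                settings[name] = getattr(module, name)
    return settings


# Register a fake project settings module so the example runs on its own.
fake = types.ModuleType("myprojectdir_settings")
fake.USER_AGENT = "my-custom-agent"
sys.modules["myprojectdir_settings"] = fake
os.environ["SCRAPY_SETTINGS_MODULE"] = "myprojectdir_settings"

plain = Settings().get("USER_AGENT")
loaded = get_project_settings().get("USER_AGENT")
print(plain)   # value from the defaults only
print(loaded)  # value overlaid from the settings module
```

In this model, setting the environment variable has no effect on Settings() by itself, which would match the symptom described in the quoted message.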


PS: Hoffman, is this the recommended way? Also, since this seems to be a common issue with the example given in the Common Practices page, should the docs provide more info about it?

2013/3/25 Scrapy_user <b.hees...@gmail.com>
Hello,

The docs give the following example:

import os
os.environ.setdefault('SCRAPY_SETTINGS_MODULE', 'myprojectdir.settings')
from twisted.internet import reactor
from scrapy.crawler import Crawler
from scrapy.settings import Settings
from scrapy import log
from testspiders.spiders.followall import FollowAllSpider

spider = FollowAllSpider(domain='scrapinghub.com')
crawler = Crawler(Settings())
crawler.configure()
crawler.crawl(spider)
crawler.start()
log.start()
reactor.run() # the script will block here


I defined a custom USER_AGENT in the settings, and when I do this in the script:
print Settings().get('USER_AGENT')

it gives me the default Scrapy user agent, not the one I defined in the myprojectdir.settings file.

The current directory is also in my PYTHONPATH, so myprojectdir.settings can be imported from an interactive Python shell.

What am I doing wrong here?

Thanks for this awesome software,

Scrapy_user.

--

Anderson Ferraz