overridden settings


Malik Rumi

May 27, 2017, 2:02:49 PM
to scrapy-users

LOG:

2017-05-27 17:54:40 [scrapy.utils.log] INFO: Overridden settings: {'NEWSPIDER_MODULE': 'acquire2.spiders', 'FEED_EXPORT_ENCODING': 'utf-8', 'FEED_FORMAT': 'json', 'FEED_URI': 'new_vol350feeds.json', 'SPIDER_MODULES': ['acquire2.spiders'], 'ROBOTSTXT_OBEY': True, 'BOT_NAME': 'acquire2'}



1. Why am I getting this ‘Overridden settings’ notice? I get it every time, whether I run a spider with scrapy crawl or work in the shell. I have not created any settings anywhere other than the ones that came with Django and Scrapy. I infer that the spiders are overriding my settings, but why, and how? I never put any custom settings on them, and I have never seen an example of a spider that imports settings. The docs don’t do it on this page: https://doc.scrapy.org/en/latest/topics/spiders.html#topics-spiders or in the tutorials. That leaves the possibility that the spider overrides them by default. Setting aside that this makes no sense to me and seems to violate the ‘explicit is better than implicit’ rule, am I supposed to import my settings in every spider from now on?
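
To make the question concrete, here is a minimal sketch of what the notice appears to report, assuming it simply lists every setting whose current value differs from Scrapy's built-in default, whatever the source of the change. The function name and the sample default values are illustrative assumptions, not Scrapy's actual code.

```python
# Illustrative stand-ins for a few built-in defaults (assumed values, not read from Scrapy).
ASSUMED_DEFAULTS = {
    "BOT_NAME": "scrapybot",
    "ROBOTSTXT_OBEY": False,
    "SPIDER_MODULES": [],
    "DOWNLOAD_DELAY": 0,
}

# Values as a generated project settings file might define them.
PROJECT_SETTINGS = {
    "BOT_NAME": "acquire2",
    "ROBOTSTXT_OBEY": True,
    "SPIDER_MODULES": ["acquire2.spiders"],
    "DOWNLOAD_DELAY": 0,
}


def find_overridden(defaults, current):
    """Return the settings in `current` whose values differ from `defaults`."""
    return {
        name: value
        for name, value in current.items()
        if defaults.get(name) != value
    }


overridden = find_overridden(ASSUMED_DEFAULTS, PROJECT_SETTINGS)
print("Overridden settings:", overridden)
```

If that reading is right, the notice would reflect the values in the generated project settings file rather than anything the spiders themselves do.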



2. I get that I need an environment variable, SCRAPY_SETTINGS_MODULE, pointing to my settings module. I see that being talked about all over the place. What I don’t see is where to put it.


Should it go in my scrapy.cfg? That already has the correct default in there, although it does not say ‘SCRAPY_SETTINGS_MODULE’.


Should it go in the settings file? I’m not sure that makes sense either, since it would be pointing to itself, and doesn’t Scrapy already know where its settings file is?


On the spider? (See first part of this question) Each one? In the __init__?


How about in the Django settings file? I would be loath to do that for fear of screwing things up, but as it stands now my spider settings do point to the Django settings, so it might make sense to reciprocate, if that doesn’t give me a circular import.


On PYTHONPATH? The whole project is already on the PYTHONPATH, although, again, as with scrapy.cfg, there’s nothing there that explicitly says SCRAPY_SETTINGS_MODULE = scrapy.settings
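
For reference, one common pattern is to set the variable in the process environment from whatever script starts the crawl, before Scrapy loads its settings. The sketch below only manipulates a plain mapping so it can be run anywhere. The helper name is hypothetical, and the module path acquire2.settings is an assumption guessed from the BOT_NAME in the log above.

```python
def ensure_settings_module(env, module_path="acquire2.settings"):
    """Set SCRAPY_SETTINGS_MODULE in `env` unless it is already defined.

    `env` is any mutable mapping; a launcher script would pass os.environ.
    `module_path` is an assumed dotted path to the project's settings module.
    """
    env.setdefault("SCRAPY_SETTINGS_MODULE", module_path)
    return env["SCRAPY_SETTINGS_MODULE"]


# Demonstration with a throwaway dict instead of the real environment.
demo_env = {}
chosen = ensure_settings_module(demo_env)
print(chosen)
```

Using setdefault means a value already exported in the shell is left alone, so the script only supplies a fallback.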



3. I read in the docs that:

a) The settings attribute is set in the base Spider class after the spider is initialized.


Ok, what is it set to? The global defaults? And that automatically overrides my project settings?

and


b) If you want to use the settings before the initialization…

i. First, I’d have to know what the settings were to know if I wanted them or not.

ii. I thought that was what the project settings were for.

iii. How can you use a setting before the object is initialized?
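
On that last point, a generic Python pattern (not taken from the docs passage quoted above) is a classmethod factory: it runs before the instance exists, reads the configuration, and passes values into __init__. The class and method names below are hypothetical and this is not Scrapy code; Scrapy spiders do have a from_crawler classmethod that plays a comparable role.

```python
class SketchSpider:
    """Hypothetical stand-in for a spider; not a Scrapy class."""

    def __init__(self, delay):
        # __init__ receives a plain value that was read from settings beforehand.
        self.delay = delay

    @classmethod
    def from_settings(cls, settings):
        # Runs before any instance exists, so settings can feed __init__.
        return cls(delay=settings.get("DOWNLOAD_DELAY", 0))


spider = SketchSpider.from_settings({"DOWNLOAD_DELAY": 2})
print(spider.delay)
```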


So, my friends, clearly I don’t understand what is happening here, even if I read it in the docs. Your patient and detailed clarification is deeply appreciated.


