Re: Access to Scrapy command line arguments or command context


William Yang

Aug 27, 2012, 10:08:19 AM
to scrapy...@googlegroups.com

Command-line arguments are passed to the spider using the -a switch. You then also need to override the __init__ of your spider, I think. Sorry, I can't paste any code, I am on mobile. :(

happy googling
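
For anyone landing here later, a minimal sketch of what the reply above describes, assuming a reasonably recent Scrapy and Python 3. The argument name `blog` and the fixed spider name `livejournal` are illustrative choices, not anything from the original post. The small fallback class exists only so the snippet can be tried without Scrapy installed; it is not part of Scrapy.

```python
try:
    from scrapy import Spider
except ImportError:
    # Stand-in so the sketch can be run without Scrapy installed; it mimics
    # only how the real base class stores its keyword arguments.
    class Spider:
        def __init__(self, name=None, **kwargs):
            if name is not None:
                self.name = name
            self.__dict__.update(kwargs)


class LivejournalSpider(Spider):
    # A fixed spider name; the blog to crawl arrives as an argument instead.
    name = "livejournal"

    def __init__(self, blog=None, *args, **kwargs):
        super().__init__(*args, **kwargs)
        if blog is None:
            raise ValueError("pass the blog name with: -a blog=<name>")
        # "blog" is a hypothetical argument name chosen for this example.
        self.blog = blog
        self.allowed_domains = [blog + ".livejournal.com"]
        self.start_urls = ["http://" + blog + ".livejournal.com/"]
```

With a spider along these lines, the blog would be chosen at run time with something like `scrapy crawl livejournal -a blog=someblog`, and no command-line parsing is needed inside the spider module itself.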

On 2012-8-27 at 3:49 AM, "Иван Клешнин" <ivan.k...@gmail.com> wrote:
How can I access parsed command-line arguments within a spider module?
-----------------------------------------------------------------------------------------------------------------------------

Suppose I want to parse any blog from http://livejournal.com with a single spider.
I wrote something like this:

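import argparse

# On older Scrapy releases this lives in scrapy.contrib.spiders
from scrapy.spiders import CrawlSpider

# Parse the blog name from the command line when the module is imported
parser = argparse.ArgumentParser()
parser.add_argument('name', help='The name for the spider')
args = parser.parse_args()
name = args.name

class LivejournalSpider(CrawlSpider):
    # Spider name, domain and start URL are all derived from the blog name
    name = name
    allowed_domains = [name + '.livejournal.com']
    start_urls = ['http://' + name + '.livejournal.com/']
    ...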

This works well with:
$ scrapy crawl someblog

But to support other Scrapy commands, e.g. "parse", and command-line flags, I would need to handle every possible situation manually right there...
So I need access to Scrapy's native arguments, which are already parsed, or (even better) to the command context.
I tried to debug the whole application to find the answer myself, but my Python skills are not enough. My brain is ready to explode when I see those managers and recursions ^_^




Pablo Hoffman

Sep 4, 2012, 1:54:54 PM
to scrapy...@googlegroups.com
See this section about spider arguments which I've just added to the doc: