Query: running a Scrapy project from outside the project directory, and the settings required for it


Jaiprakash Singh

Jan 31, 2014, 8:37:46 AM1/31/14
to scrapy...@googlegroups.com
Please help me, I am really stuck here.
My code runs properly when I run it from its own directory, i.e. where the code is present, but I am unable to run it properly from another directory.


code3forlinkparsing
└── code3forlinkparsing
    └── spiders
        ├── brand_info_by_date
        │   └── hand-bags31012014
        ├── brands_htmls
        │   └── hand-bags31012014
        ├── code1_brandcollection
        │   └── code1_brandcollection
        │       └── spiders
        ├── code2_scrolling
        │   └── code2_scrolling
        │       └── spiders
        │           └── dumpsss
        ├── item_details_csv 31012014
        ├── project2
        │   └── project2
        │       └── spiders
        └── project3
            └── project3
                └── spiders

I want to run  code3forlinkparsing.spiders.project2/project2/spiders/mycode.py  from the topmost directory.

I tried

 scrapy  runspider  code3forlinkparsing.spiders.project2/project2/spiders/mycode.py 

but it skips its own settings, i.e. the settings in project2/project2/settings.py




In settings.py I am using:

BOT_NAME = 'project2'

SCRAPY_SETTINGS_MODULE = "project2.spiders"

SPIDER_MODULES = ['project2.spiders']
NEWSPIDER_MODULE = 'project2.spiders'

# Crawl responsibly by identifying yourself (and your website) on the user-agent
#USER_AGENT = 'project2 (+http://www.yourdomain.com)'


DOWNLOADER_MIDDLEWARES = {
    'scrapy.contrib.downloadermiddleware.httpproxy.HttpProxyMiddleware': 110,
    'project2.proxymiddle.ProxyMiddleware': 100,
}



Rolando Espinoza La Fuente

Jan 31, 2014, 9:32:53 AM1/31/14
to scrapy...@googlegroups.com
You have to set SCRAPY_SETTINGS_MODULE in the shell before running the scrapy command. If you are using Linux/OS X, and assuming the project2 package is on the PYTHONPATH, try this:

    export SCRAPY_SETTINGS_MODULE=project2.settings
    scrapy crawl yourspider

This will work provided that you can import project2.settings in Python, which you can verify with this:

    python
    >>> import project2.settings




--
You received this message because you are subscribed to the Google Groups "scrapy-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to scrapy-users...@googlegroups.com.
To post to this group, send email to scrapy...@googlegroups.com.
Visit this group at http://groups.google.com/group/scrapy-users.
For more options, visit https://groups.google.com/groups/opt_out.

Jaiprakash Singh

Feb 1, 2014, 12:03:47 AM2/1/14
to scrapy...@googlegroups.com
Hey,

thank you very much for the reply. I did some experimenting based on your advice. I made a very basic project, as shown in http://doc.scrapy.org/en/0.20/intro/tutorial.html, and then did the following. Yes, I can import the settings, but I am unable to run the crawler.

=================================
tree -d tutorial

tutorial/
`-- tutorial
    `-- spiders



At the topmost directory, i.e. the first tutorial/:


Step 1:

>>> import os, sys
>>> os.system("export SCRAPY_SETTINGS_MODULE=tutorial.settings")
>>> sys.path.append("/home/user/tutorial/")
>>> import tutorial.settings

>>> os.system("scrapy crawl dmoz")

error: ImportError: No module named tutorial.settings


>>> os.system("scrapy runspider dmoz_spider.py")

error: ImportError: No module named tutorial.settings



===============================================

Next step:

on the shell:  $ export SCRAPY_SETTINGS_MODULE=tutorial.settings

on the interpreter:

>>> import os, sys
>>> sys.path.append("/home/user/tutorial/")
>>> import tutorial.settings

>>> os.system("scrapy crawl dmoz")
error: ImportError: No module named tutorial.settings

>>> os.system("scrapy runspider dmoz_spider.py")
error: ImportError: No module named tutorial.settings

===================================================

Next step:

added
SCRAPY_SETTINGS_MODULE=tutorial.settings in tutorial/settings.py,

ran the same process,
and the same problem occurred.


Please advise me, what should I do now?



Rolando Espinoza La Fuente

Feb 1, 2014, 8:40:30 AM2/1/14
to scrapy...@googlegroups.com
When you use os.system("scrapy ...") you start a new Python process which does not inherit the sys.path modification. And when you do os.system("export ...") you are not modifying the current process's environment variables; that is done through os.environ, and changes made there are inherited by subprocesses started afterwards.

You should do something like this:

$ export SCRAPY_SETTINGS_MODULE=tutorial.settings
$ export PYTHONPATH=/home/user/tutorial/
$ scrapy crawl dmoz

If you need to start the scrapy process from Python, then you have to set up the environment variables for the subprocess, see [1] or [2].
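A minimal sketch of setting up the environment for a subprocess from Python (the tutorial.settings module and /home/user/tutorial path are the ones from this thread; a small Python one-liner stands in for the actual `scrapy crawl dmoz` command):

```python
import os
import subprocess
import sys

# Copy the parent environment and add the variables the child needs.
env = os.environ.copy()
env["SCRAPY_SETTINGS_MODULE"] = "tutorial.settings"
# Prepend the project directory so the child can import tutorial.settings.
env["PYTHONPATH"] = "/home/user/tutorial" + os.pathsep + env.get("PYTHONPATH", "")

# Unlike os.system("export ..."), passing env= makes the variables
# visible inside the child process. Replace the one-liner below with
# ["scrapy", "crawl", "dmoz"] in a real project.
result = subprocess.run(
    [sys.executable, "-c",
     "import os; print(os.environ['SCRAPY_SETTINGS_MODULE'])"],
    env=env,
    capture_output=True,
    text=True,
)
print(result.stdout.strip())
```

The key point is that each os.system() call spawns a fresh shell, so an `export` in one call is gone by the next; building the env dict once and passing it explicitly avoids that.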


Jaiprakash Singh

Feb 4, 2014, 3:40:06 AM2/4/14
to scrapy...@googlegroups.com
Yes sir, this method really works. Thank you very, very much.