0 spiders when deploying with scrapyd / scrapyd-client

Arnaud Knobloch

Feb 16, 2017, 2:01:15 PM

to scrapy-users

Hi there,

I created my first Scrapy project. I have an Ubuntu 16.04 server, where I installed scrapyd and scrapyd-client with pip (I had dependency problems with apt-get).

When I deploy, there is no spider available...

scrapyd-deploy tre -p m_scrapy

fatal: No names found, cannot describe anything.

Packing version r14-master

Deploying to project "m_scrapy" in http://IP:6800/addversion.json

Server response (200):

{"status": "ok", "project": "m_scrapy", "version": "r14-master", "spiders": 0, "node_name": "Tre"}

The "fatal: No names found, cannot describe anything." message does not seem important: I'm using version = GIT in my scrapy.cfg and I don't have any annotated tags.

curl http://IP:6800/schedule.json -d project=m_scrapy -d spider=m_scrapy

{"status": "error", "message": "spider 'm_scrapy' not found"}

curl http://IP:6800/listprojects.json

{"status": "ok", "projects": ["m_scrapy"], "node_name": "Tre"}

curl http://IP:6800/listspiders.json?project=m_scrapy

{"status": "ok", "spiders": [], "node_name": "Tre"}
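The same check can be scripted. Below is a minimal Python sketch that parses scrapyd JSON bodies like the ones above and flags a deploy that was accepted but registered no spiders. The helper name `check_deploy` is made up for illustration and is not part of scrapyd or scrapyd-client; the sample bodies are copied from the output above.

```python
import json


def check_deploy(response_text):
    """Return (ok, message) for an addversion.json, listspiders.json
    or schedule.json response body.

    Hypothetical helper, not part of scrapyd or scrapyd-client.
    """
    data = json.loads(response_text)
    if data.get("status") != "ok":
        return False, data.get("message", "unknown error")
    spiders = data.get("spiders")
    # addversion.json reports a count, listspiders.json reports a list of names
    count = spiders if isinstance(spiders, int) else len(spiders or [])
    if count == 0:
        return False, "deploy accepted but no spiders were found"
    return True, "%d spider(s) available" % count


# Bodies copied from the thread
addversion_body = '{"status": "ok", "project": "m_scrapy", "version": "r14-master", "spiders": 0, "node_name": "Tre"}'
listspiders_body = '{"status": "ok", "spiders": [], "node_name": "Tre"}'
schedule_body = '{"status": "error", "message": "spider \'m_scrapy\' not found"}'

for body in (addversion_body, listspiders_body, schedule_body):
    print(check_deploy(body))
```

A "status": "ok" response only means the egg was uploaded; the spider count is what tells you whether the server could actually load the spiders.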

scrapy.cfg

[settings]
default = m_scrapy.settings

[deploy:local]
url = http://localhost:6800/
project = m_scrapy
version = GIT

[deploy:tre]
url = http://IP:6800/
project = m_scrapy
version = GIT

settings.py

import time

BOT_NAME = 'm_scrapy'

SPIDER_MODULES = ['m_scrapy.spiders']
NEWSPIDER_MODULE = 'm_scrapy.spiders'

ITEM_PIPELINES = {
    'm_scrapy.pipelines.MPhoneImagePipeline': 100,
    'm_scrapy.pipelines.MAdItemPipeline': 200,
    'm_scrapy.pipelines.MAdImagesPipeline': 300,
}

MPHONEIMAGEPIPELINE_IMAGES_URLS_FIELD = 'phone_image_url'
MPHONEIMAGEPIPELINE_RESULT_FIELD = 'phone_image'

COOKIES_DEBUG = True

LOG_ENABLED = True
LOG_LEVEL = 'WARNING'
LOG_STDOUT = False
# time must be imported for the dated log file name
LOG_FILE = "%s_%s.log" % (BOT_NAME, time.strftime('%d-%m-%Y'))

IMAGES_EXPIRES = 0
MIMAGESPIPELINE_IMAGES_EXPIRES = 0

When I run the crawler on my own computer, it works fine.

I already tried deleting project.egg-info, setup.py and the build folder.

Another question: I don't have any .egg file in the build folder. Is that normal?

Thanks!

Nikolaos-Digenis Karagiannis

Feb 17, 2017, 11:08:18 AM

to scrapy-users

To keep the egg, you have to pass the debug argument to scrapyd-deploy.
Try it and check whether the spider is put in the egg at all.
Does the command `scrapy list` work?

Arnaud Knobloch

Feb 18, 2017, 5:05:21 AM

to scrapy-users

Hi,

scrapyd-deploy tre -p m_scrapy -d

fatal: No annotated tags can describe '852e945dcf15dc47652aa0164511bb36e9dd7ae3'.

However, there were unannotated tags: try --tags.

Packing version r15-master

Deploying to project "m_scrapy" in http://IP:6800/addversion.json

Server response (200):

{"status": "ok", "project": "m_scrapy", "version": "r15-master", "spiders": 0, "node_name": "Tre"}

Output dir not removed: /var/folders/p3/tbhnqdq9019dbfwjtkrbb50r0000gn/T/scrapydeploy-19AIrF

Where is the egg? I can't find anything.

The command `scrapy list` returns the name of my spider.

Arnaud Knobloch

Feb 18, 2017, 5:19:55 AM

to scrapy-users

I deleted the folder tbhnqdq9019dbfwjtkrbb50r0000gn and tried again. This time the output was:

Server response (200):

{"status": "ok", "project": "m_scrapy", "version": "r15-master", "spiders": 0, "node_name": "Tre"}

Output dir not removed: /tmp/scrapydeploy-njQRX9

I went to this folder, renamed the .egg to .zip, extracted it, and I can see my spider in the spiders folder. But the response still says "spiders": 0.
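An egg is a zip archive, so its contents can also be listed without renaming or extracting it. A minimal sketch, assuming the spiders live under `m_scrapy/spiders/` as SPIDER_MODULES suggests; the function name `list_spider_files` is made up for illustration.

```python
import zipfile


def list_spider_files(egg_path, spiders_dir="m_scrapy/spiders/"):
    """Return the .py files found under spiders_dir inside the egg.

    spiders_dir is an assumption based on SPIDER_MODULES in this thread.
    """
    with zipfile.ZipFile(egg_path) as egg:
        return sorted(
            name for name in egg.namelist()
            if name.startswith(spiders_dir) and name.endswith(".py")
        )
```

Calling `list_spider_files("/tmp/scrapydeploy-njQRX9/project-1.0-py2.7.egg")` (path pieced together from the output in this thread; adjust to your own) should show m_spider.py if it was packaged. Note that this only proves the file is in the egg, not that the server can import it.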

Arnaud Knobloch

Feb 19, 2017, 1:03:10 PM

to scrapy-users

I tried several things but didn't find a way to deploy my spider. Calling http://IP:6800/addversion.json directly with curl gives me 0 spiders too.

Here is some more info in case someone can help me.

Stderr

'build/scripts-2.7' does not exist -- can't clean it

zip_safe flag not set; analyzing archive contents...

Stdout

copying build/lib/m_scrapy/pipelines.py -> build/bdist.macosx-10.11-x86_64/egg/m_scrapy

copying build/lib/m_scrapy/settings.py -> build/bdist.macosx-10.11-x86_64/egg/m_scrapy

creating build/bdist.macosx-10.11-x86_64/egg/m_scrapy/spiders

copying build/lib/m_scrapy/spiders/__init__.py -> build/bdist.macosx-10.11-x86_64/egg/m_scrapy/spiders

copying build/lib/m_scrapy/spiders/m_spider.py -> build/bdist.macosx-10.11-x86_64/egg/m_scrapy/spiders

byte-compiling build/bdist.macosx-10.11-x86_64/egg/m_scrapy/__init__.py to __init__.pyc

byte-compiling build/bdist.macosx-10.11-x86_64/egg/m_scrapy/items.py to items.pyc

byte-compiling build/bdist.macosx-10.11-x86_64/egg/m_scrapy/pipelines.py to pipelines.pyc

byte-compiling build/bdist.macosx-10.11-x86_64/egg/m_scrapy/settings.py to settings.pyc

byte-compiling build/bdist.macosx-10.11-x86_64/egg/m_scrapy/spiders/__init__.py to __init__.pyc

byte-compiling build/bdist.macosx-10.11-x86_64/egg/m_scrapy/spiders/m_spider.py to m_spider.pyc

creating build/bdist.macosx-10.11-x86_64/egg/EGG-INFO

copying project.egg-info/PKG-INFO -> build/bdist.macosx-10.11-x86_64/egg/EGG-INFO

copying project.egg-info/SOURCES.txt -> build/bdist.macosx-10.11-x86_64/egg/EGG-INFO

copying project.egg-info/dependency_links.txt -> build/bdist.macosx-10.11-x86_64/egg/EGG-INFO

copying project.egg-info/entry_points.txt -> build/bdist.macosx-10.11-x86_64/egg/EGG-INFO

copying project.egg-info/top_level.txt -> build/bdist.macosx-10.11-x86_64/egg/EGG-INFO

creating '/var/folders/p3/tbhnqdq9019dbfwjtkrbb50r0000gn/T/scrapydeploy-4p418b/project-1.0-py2.7.egg' and adding 'build/bdist.macosx-10.11-x8$

removing 'build/bdist.macosx-10.11-x86_64/egg' (and everything under it)

Nikolaos-Digenis Karagiannis

Mar 8, 2017, 5:51:03 AM

to scrapy...@googlegroups.com

Hi,

Sorry for the late reply.

It looks like the spider is indeed packaged in the egg.

There's a bug when defining LOG_STDOUT=True, but this is not the case for you.

Does it work if you remove the line `LOG_FILE =...`?

--
You received this message because you are subscribed to a topic in the Google Groups "scrapy-users" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/scrapy-users/JlfoQ5LlAl4/unsubscribe.
To unsubscribe from this group and all its topics, send an email to scrapy-users+unsubscribe@googlegroups.com.
To post to this group, send email to scrapy...@googlegroups.com.
Visit this group at https://groups.google.com/group/scrapy-users.
For more options, visit https://groups.google.com/d/optout.

Arnaud Knobloch

Mar 8, 2017, 7:29:23 AM

to scrapy-users

I finally found what the issue was. Posting it here in case it helps someone else.

I have "import pytesseract" in my spider file, and I forgot to install pytesseract on my server. When the spider file itself raises an import error (pytesseract, PIL or anything else), the deploy reports 0 spiders and you get no logs at all. If the import error is somewhere else (in the pipelines, for example), you do get some logs to help you.
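One way to surface this kind of silent failure is to import every module under the spiders package by hand, on the server, and print whatever breaks. A minimal sketch using only the standard library; the function name `find_broken_modules` is made up for illustration, and `m_scrapy.spiders` is the package name from SPIDER_MODULES above.

```python
import importlib
import pkgutil
import traceback


def find_broken_modules(package_name):
    """Import every module under package_name and collect import failures.

    Returns a dict mapping module name to the formatted traceback.
    """
    failures = {}
    try:
        package = importlib.import_module(package_name)
    except Exception:
        return {package_name: traceback.format_exc()}

    search_path = getattr(package, "__path__", [])
    # Failures are recorded in the loop below, so walk_packages' own
    # import errors for subpackages are deliberately ignored here.
    walker = pkgutil.walk_packages(
        search_path, package_name + ".", onerror=lambda name: None
    )
    for info in walker:
        module_name = info[1]  # (finder, name, ispkg) on Python 2 and 3
        try:
            importlib.import_module(module_name)
        except Exception:
            failures[module_name] = traceback.format_exc()
    return failures
```

Run from the project directory with the same Python environment scrapyd uses, `find_broken_modules("m_scrapy.spiders")` would be expected to report a missing dependency such as pytesseract with a full traceback, where the deploy only showed "spiders": 0.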


Nikolaos-Digenis Karagiannis

Mar 8, 2017, 11:20:30 AM

to scrapy...@googlegroups.com

Hi,

This is definitely a bug, and if it works as you just described, there should be many users stumbling on it.

Thank you for the update and for discovering it.

Do you mind reporting it in the issue tracker?
https://github.com/scrapy/scrapyd


Paul Tremberth

Mar 9, 2017, 6:36:40 AM

to scrapy...@googlegroups.com

Hello Arnaud,

What version of Scrapy are you using?
1.3.0 includes this change: https://github.com/scrapy/scrapy/pull/2433, which swallows exceptions raised when importing spider modules, and which very unfortunately is not mentioned in the release notes (totally my bad, I missed it).

It looks like this change was a bad move, since users do rely on import errors, for `scrapy list` for example.

And that seems to be the case here, for scrapyd especially.

Could you check with Scrapy 1.2.2?

There is a proposed change to make the warning-only behavior optional, and the default only for the Scrapy commands that do not need to load spiders (e.g. scrapy version):

https://github.com/scrapy/scrapy/pull/2632

If it gets approved, I think I'll backport it to the 1.3 branch.

Best,

Paul.
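To make the trade-off concrete, here is a small standalone sketch of the two behaviors being discussed: warning and skipping a module that fails to import, versus letting the ImportError propagate. It is illustrative only and is not Scrapy's spider loader; the function `load_modules` and its `warn_only` flag are hypothetical names.

```python
import importlib
import warnings


def load_modules(module_names, warn_only):
    """Import each module; on failure either warn and skip it, or re-raise.

    Illustrative only: this is not Scrapy's spider loader.
    """
    loaded = []
    for name in module_names:
        try:
            loaded.append(importlib.import_module(name))
        except ImportError as exc:
            if not warn_only:
                raise
            warnings.warn("could not load %s: %s" % (name, exc), RuntimeWarning)
    return loaded


names = ["json", "module_that_is_not_installed_xyz"]

# Warn-only: the broken module is skipped and the caller sees a shorter list
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    modules = load_modules(names, warn_only=True)
print(len(modules), len(caught))

# Strict: the import error reaches the caller
try:
    load_modules(names, warn_only=False)
except ImportError as exc:
    print("import failed:", exc)
```

With the warn-only path, a caller that only counts the loaded modules sees fewer of them and nothing else, which would match the "spiders": 0 symptom when the warning does not end up anywhere visible.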


mbkv

Apr 27, 2017, 11:29:35 AM

to scrapy-users

Hello guys,

Please check the comments at https://github.com/scrapy/scrapyd-client/issues/43

regards,

Mani
