Running is Invalid when using config.json


bind...@gmail.com

May 4, 2019, 8:20:41 AM
to pyspider-users
Hello,

I'm trying to build my result_worker, but it doesn't work.
When I click "run", it only shows "data:,on_start" and never executes index_page or detail_page.
P.S. The code works in the debug UI, and it also works when I don't use config.json (i.e. just running "pyspider" in cmd).

Thx.

scheduler.log
[I 190504 20:06:03 scheduler:360] ETTODAY on_get_info {'min_tick': 0, 'retry_delay': {}, 'crawl_config': {'itag': 'v1.1', 'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8', 'Accept-Encoding': 'gzip, deflate, sdch', 'Accept-Language': 'zh-CN,zh;q=0.8', 'Cache-Control': 'max-age=0', 'Connection': 'keep-alive', 'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.101 Safari/537.36'}}
[I 190504 20:06:03 scheduler:782] scheduler.xmlrpc listening on 0.0.0.0:23333
[I 190504 20:06:06 scheduler:808] new task ETTODAY:on_start data:,on_start
[I 190504 20:06:15 scheduler:808] new task ETTODAY:on_start data:,on_start

fetcher.log
[W 190504 20:06:03 __init__:67] redis DB must zero-based numeric index, using 0 instead
[I 190504 20:06:03 tornado_fetcher:638] fetcher starting...
[I 190504 20:06:03 tornado_fetcher:188] [200] ETTODAY:_on_get_info data:,_on_get_info 0s
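As a side note, the "redis DB must zero-based numeric index" warning comes from the `/db` suffix in the `message_queue` URL: the path segment of a redis URL is expected to be a numeric database index (e.g. `/0`), and a non-numeric value falls back to 0. A minimal illustration of that fallback (my own sketch, not pyspider's actual parsing code):

```python
from urllib.parse import urlparse

# The message_queue URL from config.json; "db" is not a numeric index.
url = "redis://127.0.0.1:6379/db"

db_part = urlparse(url).path.lstrip("/")       # -> "db"
db = int(db_part) if db_part.isdigit() else 0  # non-numeric falls back to 0
print(db)  # -> 0
```

Using `redis://127.0.0.1:6379/0` would silence the warning, though by itself that shouldn't stop index_page from running.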

processor.log
[I 190504 20:06:02 processor:211] processor starting...
[D 190504 20:06:03 project_module:145] project: ETTODAY updated.
[I 190504 20:06:03 processor:202] process ETTODAY:_on_get_info data:,_on_get_info -> [200] len:12 -> result:None fol:0 msg:0 err:None

config.json
{
  "taskdb": "mysql+taskdb://admin:12...@127.0.0.1:3306/spider",
  "projectdb": "mysql+projectdb://admin:12...@127.0.0.1:3306/spider",
  "resultdb": "mysql+resultdb://admin:12...@127.0.0.1:3306/spider",
  "message_queue": "redis://127.0.0.1:6379/db",
  "phantomjs-proxy": "127.0.0.1:25555",
  "result_worker": {
    "result_cls": "my_result_worker.MyResultWorker"
  }
}
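For context, `result_cls` points pyspider at a class in my_result_worker.py (not shown here). A hypothetical sketch of its shape; a stub stands in for pyspider's `ResultWorker` base class so the sketch runs standalone, and the field names are illustrative:

```python
# Hypothetical sketch of what my_result_worker.py might look like.
# In the real file this stub would be replaced by:
#   from pyspider.result import ResultWorker
class ResultWorker:  # stub standing in for pyspider.result.ResultWorker
    def on_result(self, task, result):
        raise NotImplementedError

class MyResultWorker(ResultWorker):
    def on_result(self, task, result):
        # Called once per finished task; task and result are dicts.
        if not result:
            return None
        # Illustrative handling only: echo the taskid and the result payload.
        return {'taskid': task.get('taskid'), 'result': result}
```

Note that the result_worker only receives results after the processor has run index_page/detail_page, so it wouldn't explain the crawl stopping at on_start.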

run.sh
nohup pyspider -c config.json scheduler  >> log/scheduler.log 2>&1 &
nohup pyspider -c config.json phantomjs  >> log/phantomjs.log 2>&1 &
nohup pyspider -c config.json --phantomjs-proxy="127.0.0.1:25555" fetcher >> log/fetcher.log 2>&1 &
pyspider -c config.json processor >> log/processor.log 2>&1 &
pyspider -c config.json result_worker >> log/result_worker.log 2>&1 &
pyspider -c config.json webui >> log/webui.log 2>&1 &