Details page giving this FAILED error


ecom4...@gmail.com

Aug 18, 2020, 4:35:19 PM
to pyspider-users
Hi.
I was successfully scraping some pages from Yelp, but now I receive this error:

Any idea what it is about?

Thanks.

============================

taskid
    c7e23a0d223db82febf4d2b1c4920217
lastcrawltime
    1597782099.006718 (5 seconds ago)
updatetime
    1597782099.0067499 (5 seconds ago)
exetime
    1597782129.006716 (1 second ago)
track.fetch 2848.18ms
    {
      "content": "<!DOCTYPE HTML>\n\n<!--[if lt IE 7 ]> <html xmlns:fb=\"http://www.facebook.com/2008/fbml\" class=\"ie6 ie ltie9 ltie8 no-js\" lang=\"en\"> <![endif]-->\n<!--[if IE 7 ]>    <html xmlns:fb=\"http://www.facebook.com/2008/fbml\" class=\"ie7 ie ltie9 ltie8 no-js\" lang=\"en\"> <![endif]-->\n<!--[if IE 8 ]>    <html xmlns:fb=\"http://www.facebook.com/2008/fbml\" class=\"ie8 ie ltie9 no-js\" lang=\"en\"> <![endif]-->\n<!--[if IE 9 ]>    <html xmlns:fb=\"http://www.facebook.com/2008/fbml\" class=\"ie9 ie no-js\" lang=\"en\"> <![end",
      "encoding": "UTF-8",
      "error": null,
      "headers": {
        "Accept-Ranges": "bytes,bytes",
        "Age": "0,0",
        "Cache-Control": "private, no-transform",
        "Connection": "keep-alive",
        "Content-Encoding": "gzip",
        "Content-Security-Policy": "report-uri https://www.yelp.com/csp_block?id=5c86e55864b8b6ca&page=enforced_by_default_directives&policy_hash=7b6f2d6630868fdb2698dac44731677c&site=www&timestamp=1597782096; object-src 'self'; base-uri 'self' https://*.yelpcdn.com https://*.adsrvr.org https://6372968.fls.doubleclick.net; font-src data: 'self' https://*.yelp.com https://*.yelpcdn.com https://fonts.gstatic.com https://connect.facebook.net https://cdnjs.cloudflare.com https://apis.google.com https://www.google-analytics.com https://use.typekit.net https://player.ooyala.com https://use.fontawesome.com https://maxcdn.bootstrapcdn.com https://fonts.googleapis.com",
        "Content-Security-Policy-Report-Only": "report-uri https://www.yelp.com/csp_report_only?id=5c86e55864b8b6ca&page=csp_report_frame_directives%2Cfull_site_ssl_csp_report_directives&policy_hash=3275ba4c5b0741fb6e8d1b21e9975e80&site=www&timestamp=1597782096; frame-ancestors 'self' https://*.yelp.com; default-src https:; img-src https: data: https://*.adsrvr.org; script-src https: data: 'unsafe-inline' 'unsafe-eval' blob:; style-src https: 'unsafe-inline' data:; connect-src https:; font-src data: 'self' https://*.yelp.com https://*.yelpcdn.com https://fonts.gstatic.com https://connect.facebook.net https://cdnjs.cloudflare.com https://apis.google.com https://www.google-analytics.com https://use.typekit.net https://player.ooyala.com https://use.fontawesome.com https://maxcdn.bootstrapcdn.com https://fonts.googleapis.com; frame-src https: yelp-webview://* yelp://* data:; child-src https: yelp-webview://* yelp://*; media-src https:; object-src 'self'; worker-src blob: https:; base-uri 'self' https://*.yelpcdn.com https://*.adsrvr.org https://6372968.fls.doubleclick.net; form-action https: 'self'",
        "Content-Type": "text/html; charset=UTF-8",
        "Date": "Tue, 18 Aug 2020 20:21:38 GMT",
        "Link": "https://s3-media0.fl.yelpcdn.com; rel=preconnect,https://www.google-analytics.com; rel=preconnect",
        "Referrer-Policy": "origin-when-cross-origin",
        "Server": "Apache",
        "Set-Cookie": "pid=; Domain=.yelp.com; Max-Age=0; Path=/; expires=Wed, 31-Dec-97 23:59:59 GMT,bse=1863b39b59ad467bb365fea645869fc8; Domain=.yelp.com; Path=/; HttpOnly,hl=en_US; Domain=.yelp.com; Max-Age=630720000; Path=/; expires=Mon, 13-Aug-2040 20:21:38 GMT,sc=232c577697; Path=/,wdi=1|857349249E4BF1E9|0x1.7cf0e13fdab4bp+30|59ec28c1588fd0ff; Domain=.yelp.com; Path=/; Max-Age=630720000; Expires=Mon, 13 Aug 2040 20:21:38 GMT; HttpOnly; SameSite=Lax",
        "Strict-Transport-Security": "max-age=31536000; includeSubDomains; preload",
        "Transfer-Encoding": "chunked",
        "Vary": "User-Agent, Accept-Encoding",
        "Via": "1.1 varnish",
        "X-B3-Sampled": "0",
        "X-Cache": "MISS",
        "X-Cache-Hits": "0",
        "X-Content-Type-Options": "nosniff",
        "X-Extlb": "10-65-66-213-useast1aprod",
        "X-Http-Reason": "OK",
        "X-Mode": "ro",
        "X-Node": "www_all,10-65-186-97-useast1bprod-46dee827-e178-11ea-885b-024237c818",
        "X-Proxied": "10-65-66-213-useast1aprod",
        "X-Routing-Service": "routing-main--useast1-5c677f9676-mr4jq; site=www",
        "X-Served-By": "cache-ams21063-AMS",
        "X-Timer": "S1597782096.913728,VS0,VE2409",
        "X-Xss-Protection": "1; report=https://www.yelp.com/xss_protection_report",
        "X-Zipkin-Id": "7282a5f685420a19"
      },
      "ok": true,
      "redirect_url": null,
      "status_code": 200,
      "time": 2.8481838703155518
    }
track.process 117.67ms
    ''
    [E 200818 16:21:38 base_handler:203] ''
        Traceback (most recent call last):
          File "/Users/USER/anaconda3/lib/python3.6/site-packages/pyspider/libs/base_handler.py", line 196, in run_task
            result = self._run_task(task, response)
          File "/Users/USER/anaconda3/lib/python3.6/site-packages/pyspider/libs/base_handler.py", line 176, in _run_task
            return self._run_func(function, response, task)
          File "/Users/USER/anaconda3/lib/python3.6/site-packages/pyspider/libs/base_handler.py", line 155, in _run_func
            ret = function(*arguments[:len(args) - 1])
          File "<FR_04>", line 264, in detail_page
        KeyError: ''

    {
      "exception": "''",
      "follows": 0,
      "logs": "[E 200818 16:21:38 base_handler:203] ''\n    Traceback (most recent call last):\n      File \"/Users/USER/anaconda3/lib/python3.6/site-packages/pyspider/libs/base_handler.py\", line 196, in run_task\n        result = self._run_task(task, response)\n      File \"/Users/USER/anaconda3/lib/python3.6/site-packages/pyspider/libs/base_handler.py\", line 176, in _run_task\n        return self._run_func(function, response, task)\n      File \"/Users/USER/anaconda3/lib/python3.6/site-packages/pyspider/libs/base_handler.py\", line 155, in _run_func\n        ret = function(*arguments[:len(args) - 1])\n      File \"<FR_04>\", line 264, in detail_page\n    KeyError: ''\n",
      "ok": false,
      "result": null,
      "time": 0.11767077445983887
    }

schedule
    {
      "exetime": 1597782129.006716,
      "priority": 2,
      "retried": 1
    }
fetch
    {}
process
    {
      "callback": "detail_page"
    }

============================



When I test the script in the pyspider debugger, I get this:

[E 200818 16:34:37 base_handler:203] ''
    Traceback (most recent call last):
      File "/Users/USER/anaconda3/lib/python3.6/site-packages/pyspider/libs/base_handler.py", line 196, in run_task
        result = self._run_task(task, response)
      File "/Users/USER/anaconda3/lib/python3.6/site-packages/pyspider/libs/base_handler.py", line 176, in _run_task
        return self._run_func(function, response, task)
      File "/Users/USER/anaconda3/lib/python3.6/site-packages/pyspider/libs/base_handler.py", line 155, in _run_func
        ret = function(*arguments[:len(args) - 1])
      File "<PL_02>", line 264, in detail_page
    KeyError: ''



Roy Binux

Aug 18, 2020, 5:09:22 PM
to Joseph, pyspider-users
The error is in your own code, at line 264 (inside detail_page).

--
You received this message because you are subscribed to the Google Groups "pyspider-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pyspider-user...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/pyspider-users/fe5a1598-0d03-43c8-a488-5549d62f0c7eo%40googlegroups.com.

efca...@gmail.com

Aug 18, 2020, 5:47:23 PM
to pyspider-users
Thank you.

Yes, I started analyzing from there and found that it is not getting the HTML from the website.
I am making some adjustments now and parsing some results.
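
For anyone who hits the same KeyError: '': the thread never shows what line 264 does, so the following is only a guess at the pattern. It assumes detail_page uses a value scraped from the page as a dictionary key, and that the value comes back empty when the site returns different HTML than expected. The names safe_lookup, CATEGORY_IDS and build_result are hypothetical and only illustrate guarding the lookup so the task does not fail outright.

```python
# Minimal sketch, not the original script: guard a dictionary lookup whose key
# was scraped from the page and may be empty when the expected HTML is missing.

# Hypothetical lookup table standing in for whatever line 264 indexes into.
CATEGORY_IDS = {"Restaurants": 1, "Bars": 2}


def safe_lookup(mapping, key, default=None):
    """Return mapping[key], or default when the key is empty or absent."""
    if not key:
        # An empty string is what produces KeyError: '' with mapping[key].
        return default
    return mapping.get(key, default)


def build_result(url, scraped_category):
    """Build the result dict a detail_page callback might return."""
    return {
        "url": url,
        "category_id": safe_lookup(CATEGORY_IDS, scraped_category),
        # Flag the record so pages that came back without the expected HTML
        # can be inspected or re-crawled later.
        "html_missing": not scraped_category,
    }
```

With this, build_result(url, "") returns a record with category_id set to None and html_missing set to True instead of raising, which makes it easier to see which pages stopped returning the expected markup.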

Siroj bobojonov

May 31, 2022, 6:27:00 AM
to pyspider-users
Hi, I get an error with pyspider at this line:
(from pyspider.libs.base_handler import *)
ModuleNotFoundError: No module named 'pyspider.libs'; 'pyspider' is not a package
How do I solve this problem?
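
The "'pyspider' is not a package" part of that message usually means Python resolved the name pyspider to a single module file rather than to the installed package. A common cause, though not confirmed in this thread, is a script named pyspider.py in the working directory shadowing the real package. The sketch below uses only the standard library to show where a name resolves from; describe_module is a hypothetical helper name.

```python
import importlib.util


def describe_module(name):
    """Report where Python would import `name` from and whether it is a package."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        # The name cannot be imported at all (for example, it is not installed).
        return {"found": False, "origin": None, "is_package": False}
    return {
        "found": True,
        # Path of the file that would be loaded for this name.
        "origin": spec.origin,
        # Packages have submodule search locations; plain modules do not.
        "is_package": spec.submodule_search_locations is not None,
    }
```

Running print(describe_module("pyspider")) from the directory where the error occurs shows the file being picked up. If is_package is False and origin points at a pyspider.py of your own, renaming that file (and deleting any stale pyspider.pyc or __pycache__ entry next to it) should let the import reach the installed package.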


