Exception thrown while scraping some sites


Muthukumar K

May 12, 2014, 5:15:09 AM
to zomb...@googlegroups.com


I have started using Zombie.js and have successfully scraped some simple web pages.

However, some web pages throw an exception, so I am stuck trying to scrape the page I actually need.
I have reproduced a very similar case here.

For example, when I scrape the Google search page, the following exception is thrown.

JS Code:
var Browser = require("zombie");
var assert = require("assert");

var browser = new Browser();

browser.visit("https://www.google.com/", {debug: false, runScripts: true}, function () {

  // Submit a query
  browser.fill("q", "tiger");
  browser.pressButton("btnG", function () {
    // console.log(browser.html());
  });

});

Exception:
Cannot read property 'value' of undefined TypeError: Cannot read property 'value' of undefined
    at Object.l [as Xf] (<anonymous>:256:367)
    at _.k.$ (<anonymous>:193:426)
    at new ni (<anonymous>:182:546)
    at Object.C.Ag (<anonymous>:315:499)
    at Object.a (<anonymous>:321:424)
    at https://www.google.co.in/search?hl=en-IN&source=hp&q=tiger&gbv=2&btnG=Google%20Search:script:1:22
    at Contextify.sandbox.run (C:\Users\185577\Downloads\NODEJS_SOFTWARE\test\node_modules\zombie\node_modules\jsdom\node_modules\contextify\lib\contextify.js:12:24)
    at DOMWindow.window._evaluate (C:\Users\185577\Downloads\NODEJS_SOFTWARE\test\node_modules\zombie\lib\zombie\window.js:188:25)
    at Object.HTML.languageProcessors.javascript (C:\Users\185577\Downloads\NODEJS_SOFTWARE\test\node_modules\zombie\lib\zombie\scripts.js:23:21)
    at define.proto._eval (C:\Users\185577\Downloads\NODEJS_SOFTWARE\test\node_modules\zombie\node_modules\jsdom\lib\jsdom\level2\html.js:1480:47)

Can someone explain how to resolve this error, or am I missing something?

Thanks
Muthukumar

Assaf Arkin

May 12, 2014, 11:35:50 AM
to Muthukumar K, zomb...@googlegroups.com
FYI Zombie.js is designed for web app testing, not for scraping.



Ngoc Dao

May 12, 2014, 5:01:10 PM
to zomb...@googlegroups.com, Muthukumar K
> FYI Zombie.js is designed for web app testing, not for scraping.

What does it imply?

Assaf Arkin

May 13, 2014, 6:40:17 PM
to zomb...@googlegroups.com
You're going to run into problems which Zombie is not designed to solve.


Matt Spendlove

Jun 23, 2014, 5:54:55 AM
to zomb...@googlegroups.com
Ok, so this thread indicates I might have chosen the wrong tool!

I followed a tutorial that used Zombie because I need to do a one-time scrape across a few URLs. It works fine, repeatedly, when scraping a single URL. The problem arises as soon as I ask for three or four or more, which suggests some kind of concurrency problem. I get an error that seems to indicate I'm trying to walk the DOM before the page has fully loaded:

Cannot use 'in' operator to search for 'compareDocumentPosition' in null


I'm new to Node, so maybe I've got something fundamentally wrong?
The entry point to my script is on line 98, and I thought I was telling Zombie to wait for pageLoad on line 40.

Would really appreciate any insight.

Thanks.
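
One way to test the concurrency suspicion described above is to process the URLs strictly one at a time, starting the next page only after the previous one has called back. The following is a minimal sketch, not a confirmed fix for this error. The names `scrapeSequentially` and `scrapeOne` are hypothetical: `scrapeOne(url, callback)` is assumed to be a Node-style function supplied by the caller that loads a single page and calls back with `(error, result)`.

```javascript
// Hypothetical helper: runs scrapeOne over each URL in order, never in parallel.
// scrapeOne(url, callback) is an assumed caller-supplied function that calls
// callback(error, result) once it has finished with that URL.
function scrapeSequentially(urls, scrapeOne, done) {
  var results = [];
  var index = 0;

  function next() {
    // All URLs processed: hand back the collected results.
    if (index >= urls.length) {
      return done(null, results);
    }
    var url = urls[index++];
    scrapeOne(url, function (error, result) {
      // Stop at the first failure and report what was collected so far.
      if (error) {
        return done(error, results);
      }
      results.push(result);
      next();
    });
  }

  next();
}
```

With Zombie, `scrapeOne` might create its own `Browser` instance for each URL and read the DOM only inside the `browser.visit` callback, so that no two page loads share state. Whether that resolves the `compareDocumentPosition` error here is an assumption, not something this thread confirms.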