AbotX JavaScript Rendering - Crawling multiple sites concurrently


vij...@gmail.com

Feb 14, 2020, 3:50:33 AM
to Abot Web Crawler
Hi,

I've tried crawling a site with JavaScript rendering using AbotX (30-day free trial), and it works fine.

Is it possible to crawl multiple sites concurrently with JavaScript rendering? I've tried to do it, but unfortunately I get an "invalid license" error. Is this happening because multiple threads are trying to validate the license at the same time? Is there a way to fix this?

[Attachment: Invalid license.PNG]


sjdi...@gmail.com

Feb 14, 2020, 7:49:40 PM
to vij...@gmail.com, Abot Web Crawler
Hi, 

Are you sure that you are using the AbotX2 (not AbotX) namespace? If you aren't, that is likely your issue. If you are, keep reading...

I was able to poke around based on your error and make a small change (NuGet v2.1.3) that might help you, but I can't be sure without a reproducible scenario. Can you install v2.1.3 and see if that fixes it? If it doesn't, can you also share a code snippet that reproduces the error, please?

--
You received this message because you are subscribed to the Google Groups "Abot Web Crawler" group.
To unsubscribe from this group and stop receiving emails from it, send an email to abot-web-crawl...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/abot-web-crawler/91e5a005-b82e-48c4-8af3-f14737b3a2c9%40googlegroups.com.

Vijai

Feb 15, 2020, 7:11:12 AM
to sjdi...@gmail.com, Abot Web Crawler
Hi Steven,

Thanks for the quick fix :) I don't get the invalid license error anymore. Now I can crawl sites concurrently with JavaScript rendering, which is awesome.

I really like AbotX and the flexibility it gives, but right now I'm having issues with some websites where the rendered HTML is not complete.
I've tried setting various values for the JavascriptRenderingWaitTimeInMilliseconds parameter, but the rendered HTML is still incomplete compared to what Electron (nightmarejs) produces.
Is it possible to use Electron or another JavaScript renderer instead of PhantomJS?

Here's the code sample that I'm trying out: https://gist.github.com/vijai-durairaj/95245776df82f0307375fa25a7bc8a4a
The gist also contains some of the URLs that don't get rendered properly.
--
Regards,
Vijai.

sjdi...@gmail.com

Feb 15, 2020, 12:19:00 PM
to Vijai, Abot Web Crawler
Hi, 

Try setting IsCookiesSendingEnabled to true. Sometimes sites need cookies sent back with each page request.

I'm working on a headless Chrome implementation, but that will take some time.

vij...@gmail.com

Feb 15, 2020, 1:23:54 PM
to Abot Web Crawler
Unfortunately, setting IsCookiesSendingEnabled didn't help. Do you have any other suggestions?

Thanks for your help. 


--
Regards,
Vijai.

sjdi...@gmail.com

Feb 15, 2020, 1:25:31 PM
to Vijai, Abot Web Crawler
It could be many things. You could try some other headless browsers (headless Chrome, Selenium WebDriver, CasperJS, etc.) to see if they give you the same result as AbotX/PhantomJS. Try a Chrome incognito window with cookies on and off to see whether the page matches what you see in the headless browsers. Every site could be a different set of circumstances. Wish I could help more.
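
One way to make that comparison concrete is to save the HTML each renderer produces for the same URL and compare tag counts. The following is a minimal sketch, not part of AbotX, and it assumes you already have both snapshots as strings. It uses only the Python standard library; the names TagCounter, summarize and missing_tags and the sample snippets are made up for illustration.

```python
from collections import Counter
from html.parser import HTMLParser


class TagCounter(HTMLParser):
    """Counts start tags and the amount of visible (non-script) text."""

    def __init__(self):
        super().__init__()
        self.tags = Counter()
        self.text_chars = 0
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        self.tags[tag] += 1
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Ignore script/style bodies; count only text a user could see.
        if self._skip_depth == 0:
            self.text_chars += len(data.strip())


def summarize(html):
    """Return (tag counts, visible text length) for one HTML snapshot."""
    parser = TagCounter()
    parser.feed(html)
    parser.close()
    return parser.tags, parser.text_chars


def missing_tags(reference_html, candidate_html):
    """Tags that occur more often in the reference than in the candidate."""
    ref_tags, _ = summarize(reference_html)
    cand_tags, _ = summarize(candidate_html)
    # Counter subtraction keeps only positive differences.
    return dict(ref_tags - cand_tags)


if __name__ == "__main__":
    # Hypothetical snapshots: a fully rendered page vs. an empty app shell.
    reference = "<html><body><div id='app'><ul><li>a</li><li>b</li></ul></div></body></html>"
    candidate = "<html><body><div id='app'></div></body></html>"
    print(missing_tags(reference, candidate))
    print(summarize(reference)[1], summarize(candidate)[1])
```

If the candidate snapshot is missing whole groups of tags (for example every list item inside the app container), the page's scripts probably had not finished running, or failed, in that renderer. If only a few tags differ, cookies or user-agent checks are a more likely cause.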

On Sat, Feb 15, 2020 at 10:17 AM Vijai <vij...@gmail.com> wrote:
Unfortunately setting IsCookiesSendingEnabled doesn’t seem to help. Do you have any other suggestions?

Thanks for your help. 

--
Regards,
Vijai.