Extracting Images that are lightbox popups

Mark Willett

Dec 18, 2014, 4:55:20 PM
to web-s...@googlegroups.com
Hi,

I am scraping a site where the main image on the product page opens in a lightbox popup. I want to collect both the image shown on the page before the lightbox opens and the image shown once it has popped up.

I have tried several different methods and am getting nowhere fast.

Any help on which selector(s) to use would be appreciated.

Thanks very much.

Mark.

Scott

Dec 19, 2014, 3:02:09 PM
to web-s...@googlegroups.com
Please post the sitemap.

Mark Willett

Dec 22, 2014, 6:07:14 PM
to web-s...@googlegroups.com
Here is the sitemap. I appreciate the help.

{"startUrl":"http://www.leesgolddiamond.com/anklets.html","selectors":[{"parentSelectors":["_root"],"type":"SelectorLink","multiple":true,"id":"productlink","selector":"h2.product-name a","delay":""},{"parentSelectors":["productlink"],"type":"SelectorText","multiple":false,"id":"title","selector":"h1","regex":"","delay":""},{"parentSelectors":["productlink"],"type":"SelectorText","multiple":false,"id":"catnumber","selector":"tr.first:contains('Catalog Number') td.data","regex":"","delay":""},{"parentSelectors":["productlink"],"type":"SelectorText","multiple":false,"id":"type","selector":"tr.odd:contains('Material Type') td.data","regex":"","delay":""},{"parentSelectors":["productlink"],"type":"SelectorText","multiple":false,"id":"color","selector":"tr.even:contains('Material Color') td.data","regex":"","delay":""},{"parentSelectors":["productlink"],"type":"SelectorText","multiple":false,"id":"weight","selector":"tr.odd:contains('Avg. Weight') td.data","regex":"","delay":""},{"parentSelectors":["productlink"],"type":"SelectorText","multiple":false,"id":"clasp","selector":"tr.even:contains('Clasp Type') td.data","regex":"","delay":""},{"parentSelectors":["productlink"],"type":"SelectorText","multiple":false,"id":"length","selector":"tr.last:contains('Length') td.data","regex":"","delay":""},{"parentSelectors":["productlink"],"type":"SelectorLink","multiple":false,"id":"imagelink","selector":"p.product-image a.lightbox-group","delay":""},{"parentSelectors":["imagelink"],"type":"SelectorImage","multiple":false,"id":"image","selector":"img.cboxPhoto","downloadImage":true,"delay":""}],"_id":"leesgold-anklets"}

Mārtiņš Balodis

Dec 23, 2014, 1:24:59 PM
to Mark Willett, web-scraper
Hi,
For images you should use the Image selector, which can also download the images. Here is your sitemap with an Image selector:

{"selectors":[{"parentSelectors":["_root"],"type":"SelectorLink","multiple":true,"id":"productlink","selector":"h2.product-name a","delay":""},{"parentSelectors":["productlink"],"type":"SelectorText","multiple":false,"id":"title","selector":"h1","regex":"","delay":""},{"parentSelectors":["productlink"],"type":"SelectorText","multiple":false,"id":"catnumber","selector":"tr.first:contains('Catalog Number') td.data","regex":"","delay":""},{"parentSelectors":["productlink"],"type":"SelectorText","multiple":false,"id":"type","selector":"tr.odd:contains('Material Type') td.data","regex":"","delay":""},{"parentSelectors":["productlink"],"type":"SelectorText","multiple":false,"id":"color","selector":"tr.even:contains('Material Color') td.data","regex":"","delay":""},{"parentSelectors":["productlink"],"type":"SelectorText","multiple":false,"id":"weight","selector":"tr.odd:contains('Avg. Weight') td.data","regex":"","delay":""},{"parentSelectors":["productlink"],"type":"SelectorText","multiple":false,"id":"clasp","selector":"tr.even:contains('Clasp Type') td.data","regex":"","delay":""},{"parentSelectors":["productlink"],"type":"SelectorText","multiple":false,"id":"length","selector":"tr.last:contains('Length') td.data","regex":"","delay":""},{"parentSelectors":["productlink"],"type":"SelectorImage","multiple":false,"id":"imagelink","selector":"a.cloud-zoom img","delay":"","downloadImage":true},{"parentSelectors":["imagelink"],"type":"SelectorImage","multiple":false,"id":"image","selector":"img.cboxPhoto","downloadImage":true,"delay":""}],"startUrl":"http://www.leesgolddiamond.com/anklets.html","_id":"leesgold-anklets"}
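
For reference, the only selector that differs from the sitemap you posted is `imagelink`. It is now a `SelectorImage` on `a.cloud-zoom img` with `downloadImage` enabled, instead of a `SelectorLink` on `p.product-image a.lightbox-group`. Here it is on its own, pretty-printed for readability. This is just an excerpt of the sitemap above, not something to import separately:

```json
{
  "parentSelectors": ["productlink"],
  "type": "SelectorImage",
  "multiple": false,
  "id": "imagelink",
  "selector": "a.cloud-zoom img",
  "delay": "",
  "downloadImage": true
}
```

All the other selectors, including the `image` selector on `img.cboxPhoto`, are unchanged from your version.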
