Can't select multiple page links?


vigovigo

May 30, 2016, 6:24:48 AM
to Web Scraper
I am trying to define a selector for a site with multiple results pages (it is a directory site that I need to scrape). However, when I try to select multiple pagination links for this selector, I can't: the browser follows the link before I can click DONE SELECTING. Has this happened to anyone? It seems to happen only on some pages for me. Any advice would be appreciated...


Screen Shot 2016-05-30 at 12.08.57 PM.png

Mārtiņš Balodis

May 30, 2016, 8:12:27 AM
to vigovigo, Web Scraper
Hi,
That means the pagination is made of buttons rather than links. If clicking a pagination button doesn't reload the page, try the Element click selector. If the page does reload, Web Scraper cannot handle this kind of button at the moment.


vigovigo

May 30, 2016, 8:33:17 AM
to Web Scraper, joaqui...@gmail.com

Thanks for the reply, Mārtiņš. The problem is that when I try to use the Link type and click on the buttons, the browser follows the link. If I use the Element type instead, the resulting selector is ul.m-results-pagination a, but when I run the scrape nothing happens, i.e. the scrape stops on the first attempt. If I change it to the Link type with the same selector, ul.m-results-pagination a, it only scrapes the pages that are linked from the initial page. In other words, where the pagination shows 1 2 3 4 5 6 7 8 9 10 >> it only scrapes pages 1-10 and does not continue past >>.

Now, if I want to add the selectors manually, can I combine them like this, separated by commas?
li.is-active.first a, ul.m-results-pagination a, ul.m-results-pagination a li.last a

Thanks a million!

Youssef Alaoui

May 31, 2016, 9:37:25 AM
to Web Scraper, joaqui...@gmail.com
Hi,

You can try the jQuery "Next Adjacent Selector": https://api.jquery.com/next-adjacent-selector/

So something like this: "li.is-active.first + li a"
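
To illustrate what the "+" combinator in that selector picks out, here is a rough Python sketch using only the standard library. The HTML is made-up sample markup (the real page's markup isn't shown in this thread), and `NextPageLinkFinder` and `next_page_links` are hypothetical helper names. It only handles a flat list of `<li>` siblings, so treat it as a toy model of the selector, not a CSS engine.

```python
from html.parser import HTMLParser

# Hypothetical pagination markup; class names are taken from the selectors
# quoted in this thread, everything else is an assumption.
SAMPLE_HTML = """
<ul class="m-results-pagination">
  <li class="is-active first"><a href="?page=1">1</a></li>
  <li><a href="?page=2">2</a></li>
  <li><a href="?page=3">3</a></li>
  <li class="last"><a href="?page=11">&gt;&gt;</a></li>
</ul>
"""


class NextPageLinkFinder(HTMLParser):
    """Collects hrefs matching roughly 'li.is-active.first + li a'.

    Assumes the <li> elements are flat siblings (no nested lists).
    """

    def __init__(self):
        super().__init__()
        self.prev_li_matched = False  # previous <li> had both classes
        self.in_target_li = False     # currently inside the <li> right after it
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "li":
            classes = set((attrs.get("class") or "").split())
            # This <li> is a target if its previous sibling matched.
            self.in_target_li = self.prev_li_matched
            self.prev_li_matched = {"is-active", "first"} <= classes
        elif tag == "a" and self.in_target_li:
            href = attrs.get("href")
            if href:
                self.hrefs.append(href)

    def handle_endtag(self, tag):
        if tag == "li":
            self.in_target_li = False


def next_page_links(html):
    finder = NextPageLinkFinder()
    finder.feed(html)
    return finder.hrefs


print(next_page_links(SAMPLE_HTML))  # ['?page=2']
```

Only the link in the `<li>` directly after the active one is returned, which is why the selector yields a single "next page" link rather than every pagination link.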

joaquin bueno

May 31, 2016, 9:38:09 AM
to Youssef Alaoui, Web Scraper

Thanks, I will try and report back!


Gar Lee

Jun 1, 2016, 5:20:39 PM
to Web Scraper, joaqui...@gmail.com


Hi,


I'm running into a similar issue. I'm trying to run web scraper on this page: http://www.usyouthsoccer.org/ClubDirectorySearch/


but I'm having issues moving from page to page. I'm trying to grab all the data from the table on each page.


Additionally, I’m having trouble setting up the selectors for this page so that the results show up in the same row:


http://www.calsouth.com/en/league-club-locator/?493[zip]=90002&493[range]=500&341[search][__fulltext]=&341[search][submit]=Submit+Search


I can’t seem to make the association between the columns. 


Thanks for your help!

Mārtiņš Balodis

Jun 7, 2016, 2:34:48 AM
to Gar Lee, Web Scraper, Joaquin Bueno
Hi Gar Lee,
The usyouthsoccer.org site uses pagination links that submit a form and reload the page. Web Scraper doesn't support this kind of pagination at the moment.

On the calsouth.com site you need an Element selector to separate the records. Here is an example sitemap to get you started:

{"_id":"calsouth","startUrl":"http://www.calsouth.com/en/league-club-locator/?493[zip]=90002&493[range]=500&341[search][__fulltext]=&341[search][submit]=Submit+Search","selectors":[{"parentSelectors":["_root"],"type":"SelectorElement","multiple":true,"id":"item","selector":"tr.package:nth-of-type(n+2)","delay":""},{"parentSelectors":["item"],"type":"SelectorText","multiple":false,"id":"title","selector":"h3.clTitle a.iframe","regex":"","delay":""},{"parentSelectors":["item"],"type":"SelectorText","multiple":false,"id":"contact-name","selector":"div.clContactName","regex":"","delay":""}]}
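
To see how this sitemap ties the columns together, here is a small Python sketch that loads the same JSON and groups the selectors by their parent. The helper name `children_by_parent` is made up for illustration; the sitemap content is copied from above, only with line breaks added.

```python
import json
from collections import defaultdict

# The sitemap from this post, reflowed over several lines (same content).
SITEMAP_JSON = r'''
{"_id":"calsouth",
 "startUrl":"http://www.calsouth.com/en/league-club-locator/?493[zip]=90002&493[range]=500&341[search][__fulltext]=&341[search][submit]=Submit+Search",
 "selectors":[
  {"parentSelectors":["_root"],"type":"SelectorElement","multiple":true,
   "id":"item","selector":"tr.package:nth-of-type(n+2)","delay":""},
  {"parentSelectors":["item"],"type":"SelectorText","multiple":false,
   "id":"title","selector":"h3.clTitle a.iframe","regex":"","delay":""},
  {"parentSelectors":["item"],"type":"SelectorText","multiple":false,
   "id":"contact-name","selector":"div.clContactName","regex":"","delay":""}
 ]}
'''


def children_by_parent(sitemap):
    """Map each parent selector id to the ids of its child selectors."""
    tree = defaultdict(list)
    for sel in sitemap["selectors"]:
        for parent in sel["parentSelectors"]:
            tree[parent].append(sel["id"])
    return dict(tree)


sitemap = json.loads(SITEMAP_JSON)
tree = children_by_parent(sitemap)
print(tree)  # {'_root': ['item'], 'item': ['title', 'contact-name']}
```

The "item" element selector matches one table row per record, and both text selectors sit underneath it, so title and contact-name are extracted from the same row instead of as unrelated lists.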