extracting links


Ben Edwards

Aug 7, 2017, 3:23:55 PM
to nokogiri-talk
Hi, I've done a handful of scrapers, and this one has me stumped.

main_page = Nokogiri::HTML( HTTParty.get('https://www.thelouisiana.net/tickets') )

then

main_page.css('.content').css('.EventListingListItem-performances').map { |link| p link }

gives me

 #(Element:0x1584750 {
   name = "div",
   attributes = [ #(Attr:0x15846ec { name = "class", value = "EventListingListItem-performances" })],
   children = [
     #(Text "\n      "),
     #(Element:0x15841d8 {
       name = "a",
       attributes = [ #(Attr:0x1584174 { name = "href", value = "/events/2017-08-12-highlives-the-louisiana" })],
       children = [ #(Text "\n        \n\n        \n          Highlives\n        \n      ")]
       }),
     #(Text "\n    ")]
   }),
 #(Element:0x1578464 {
   name = "div",
   attributes = [ #(Attr:0x1578400 { name = "class", value = "EventListingListItem-performances" })],
   children = [
     #(Text "\n      "),
     #(Element:0x1577e74 {
       name = "a",
       attributes = [ #(Attr:0x1577dfc { name = "href", value = "/events/2017-08-13-twin-arrow-the-louisiana" })],
       children = [ #(Text "\n        \n\n        \n          Twin Arrow\n        \n      ")]
       }),
     #(Text "\n    ")]
   }),
 .......


but

main_page.css('.content').css('.EventListingListItem-performances').map { |link| p link['href'] }

just gives me nil values

nil
nil
......
nil
nil
nil
=> [nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil]


Any ideas?

Ben

Aaron Patterson

Aug 7, 2017, 3:29:08 PM
to nokogi...@googlegroups.com
Looks like you're selecting the div tags, not the a tags that are inside the div tags. The divs only have a class attribute, so asking them for "href" returns nil. Maybe something like this (note I haven't tested):

main_page.css('.content').css('.EventListingListItem-performances a').map { |link| p link["href"] }


Ben Edwards

Aug 7, 2017, 7:42:12 PM
to nokogiri-talk
Thanks, that was it.
