Scrape from expanding list

Mama Mama

Dec 21, 2021, 2:38:26 PM
to Web Scraping
Hi,

I am trying to get data from this website, but the challenge is that there are no page numbers; the list just expands as you scroll down. I tried the scroll option in ParseHub, but it didn't work.


Thanks

Andrew11

Dec 21, 2021, 3:49:38 PM
to Web Scraping
This one needs a little hacking to get into, because ParseHub's browser again can't handle the JavaScript. If you go to
https://infrastructurepipeline.org/api/projects/1
and scroll down in the box marked "Response body", you'll see an entry called "liveUrl", which gives you the address of the individual project page.

My idea is that you can run a loop in ParseHub over $createArray(346) (or whatever the total number of projects is), and inside it use Go to template with the URL above, minus the trailing 1: put that in quotes and then, after it, type ($index + 1). If all goes well, it'll spit back some JSON, which you can load using an Extract command under the main page entry in the new template; set the dropdown list in the Extract command to JSON object. Then you can get the project URL and load it in Go to template the usual way. Kind of complicated, I know. Let me know if you get confused!
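
If ParseHub gets too fiddly, the same loop can be sketched outside it in plain Python. This is only a rough sketch under a few assumptions: that the API answers a plain GET without any login or special headers, that project IDs run from 1 up to the total, and that "liveUrl" sits somewhere in the returned JSON (I haven't mapped the full structure, so the code just searches for the key). The function names are made up for illustration.

```python
import json
from urllib.request import urlopen

# Base address from the post above, with the trailing "1" removed.
API_BASE = "https://infrastructurepipeline.org/api/projects/"


def fetch_json(url):
    """Download a URL and parse the body as JSON (assumes no auth is needed)."""
    with urlopen(url, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


def find_key(data, key):
    """Return the first value stored under `key` anywhere in nested JSON, or None."""
    if isinstance(data, dict):
        if key in data:
            return data[key]
        children = list(data.values())
    elif isinstance(data, list):
        children = data
    else:
        return None
    for child in children:
        found = find_key(child, key)
        if found is not None:
            return found
    return None


def collect_live_urls(total, fetch=fetch_json):
    """Visit project IDs 1..total and collect each project's liveUrl."""
    urls = []
    for index in range(total):
        project_id = index + 1  # same idea as ($index + 1) in ParseHub
        try:
            data = fetch(API_BASE + str(project_id))
        except (OSError, ValueError):
            continue  # skip IDs that fail to load or do not return JSON
        live_url = find_key(data, "liveUrl")
        if live_url:
            urls.append(live_url)
    return urls
```

Calling collect_live_urls(346) should then give back a list of project page addresses, one per ID that responded. The fetch argument is only there so the loop can be tried with a stand-in function before hitting the real site.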

Andrew11

Dec 21, 2021, 5:13:58 PM
to Web Scraping
BTW, I noticed that the number of projects in the API is 346, but the search page, under Filters, gives 451. Not sure how to explain that.