Scraping details from pages that are not consistent


Paul Kelly

Jan 1, 2025, 10:42:05 AM
to Web Scraping
Hi.

I am reading data from pages where the data is not consistent.

That is, if a field has no value, the page does not show a blank; it leaves the field out altogether.

See https://vcgca.org/our-people/profile/2 and https://vcgca.org/our-people/profile/1670.

One has a field DOD and the other one does not.

I have tried selecting using labels and CONDITIONALS, but that does not help.
Instead of looking for the label, it seems to be looking at the page position.

I have attached examples of what I am seeing.

Any ideas how to deal with this?

Thank you



Profile 2.png
Profile 1670.png

Andrew11

Jan 2, 2025, 6:08:07 PM
to Web Scraping
I usually use Select with CSS for things like this. Try one with this to select the personal info box overall:

div:has(> h3:contains("Personal info"))

Make sure Wait for load and Root selection are checked. Then nest (indent) another Select inside that, saying for example:

li:has(> strong:contains("Name:"))

Make sure Root selection is UNCHECKED. If it's something that isn't necessarily on every page, you can uncheck Wait for load in order to save time, or reduce the value to 10 seconds or something like that. And if you want to just extract the text without the label "Name:", nest another Select in the 2nd one, saying:

.text-right

And Root selection unchecked again.

A similar strategy requires less specificity about label names, but gives you less columnar control. Instead of the inner Selects used above, make the 2nd one

> ul li

and then inside that one, nest two Selects (at the same indent level as each other) for

> strong {{add a nested Extract "Label"}}

> .text-right  {{add an Extract "Value"}}
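
For anyone doing this in code rather than in the tool, the same label/value idea can be sketched in Python roughly as follows. It assumes each field is an `<li>` holding a `<strong>` label and a `.text-right` value, which is inferred from the selectors above and not checked against the live pages. The sample HTML, the names in it, and the `COLUMNS` list are made up for illustration.

```python
from html.parser import HTMLParser


class LabelValueParser(HTMLParser):
    """Collect {label: value} from <li><strong>Label:</strong><span class="text-right">Value</span></li>."""

    def __init__(self):
        super().__init__()
        self.rows = {}
        self._in_li = False
        self._target = None  # "label" or "value" while inside the matching element
        self._label = ""
        self._value = ""

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "li":
            self._in_li, self._label, self._value = True, "", ""
        elif self._in_li and tag == "strong":
            self._target = "label"
        elif self._in_li and "text-right" in classes:
            self._target = "value"

    def handle_endtag(self, tag):
        if tag == "li" and self._in_li:
            label = self._label.strip().rstrip(":")
            if label:
                self.rows[label] = self._value.strip()
            self._in_li, self._target = False, None
        elif tag in ("strong", "span"):
            self._target = None

    def handle_data(self, data):
        if self._target == "label":
            self._label += data
        elif self._target == "value":
            self._value += data


def extract_fields(html):
    """Return the fields found on one page, keyed by label text."""
    parser = LabelValueParser()
    parser.feed(html)
    parser.close()
    return parser.rows


# Hypothetical column order for the output; missing labels become blanks.
COLUMNS = ["Name", "DOB", "DOD"]


def to_row(html):
    fields = extract_fields(html)
    return [fields.get(col, "") for col in COLUMNS]


# Made-up sample pages: one with a DOD field, one without.
PAGE_WITH_DOD = """
<div><h3>Personal info</h3><ul>
  <li><strong>Name:</strong> <span class="text-right">Jane Example</span></li>
  <li><strong>DOB:</strong> <span class="text-right">1 Jan 1900</span></li>
  <li><strong>DOD:</strong> <span class="text-right">2 Feb 1980</span></li>
</ul></div>
"""
PAGE_WITHOUT_DOD = """
<div><h3>Personal info</h3><ul>
  <li><strong>Name:</strong> <span class="text-right">John Sample</span></li>
  <li><strong>DOB:</strong> <span class="text-right">3 Mar 1910</span></li>
</ul></div>
"""

if __name__ == "__main__":
    for page in (PAGE_WITH_DOD, PAGE_WITHOUT_DOD):
        print(to_row(page))
```

Because values are looked up by label rather than by position, a page with no DOD entry gets a blank in that column and the other columns do not shift.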

Let me know if you run into any problems!

