Relative Selection Issue

SHEHJAD ALI TAUS

Nov 19, 2021, 1:09:21 AM
to Web Scraping
I can't select a parent element as a Relative Select target for a child element.
Whenever I try, the page gets stuck on an "adding" loading screen. I then have to close ParseHub and reopen it, but the selection is still not possible.

Why? Does anybody know about this?


Andrew11

Nov 19, 2021, 1:25:51 AM
to Web Scraping
I usually select the parent element first and then the child. The child doesn't have to be a Relative Select. Let me know the URL and target element if you need more help.

SHEHJAD ALI TAUS

Nov 19, 2021, 1:35:50 AM
to Web Scraping
URL: https://dbcf.ms.gov/press-releases/
Here is what I want the JSON format to be:
{
  "items": [
    {
      "title": "The title",
      "link": "The link of the pdf"
    },
    {
      "title": "The title",
      "link": "The link of the pdf"
    }
  ]
}

I can't select the title and link here.
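
For anyone who wants to check a finished export against the shape described above, here is a minimal Python sketch. It assumes the export is JSON text with a top-level "items" list; the sample string and the helper name check_items are made up for illustration and are not part of ParseHub.

```python
import json

# Hypothetical sample export; the titles and links are placeholders, not real data.
sample = """
{
  "items": [
    {"title": "The title", "link": "The link of the pdf"},
    {"title": "The title", "link": "The link of the pdf"}
  ]
}
"""


def check_items(raw):
    """Return the items list if every entry has string 'title' and 'link' fields."""
    data = json.loads(raw)
    items = data["items"]
    for index, item in enumerate(items):
        for key in ("title", "link"):
            if not isinstance(item.get(key), str):
                raise ValueError(f"item {index} is missing a string {key!r}")
    return items


items = check_items(sample)
print(len(items), "items look right")
```

An entry without a link raises ValueError, which makes a half-finished selection easy to spot.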

Andrew11

Nov 19, 2021, 1:44:50 AM
to Web Scraping
Attachment: DBCF_project.zip

SHEHJAD ALI TAUS

Nov 19, 2021, 2:11:34 AM
to Web Scraping
Man, your project runs perfectly,
but when I do it on my own it's not the same. Can you tell me the whole detailed process of how you did it?

Thanks for helping, man. I really appreciate it.

Andrew11

Nov 19, 2021, 2:41:14 AM
to Web Scraping
Sure! You have to know CSS and HTML, though. Press Ctrl+B to enter Browse mode; there you can right-click page elements and choose Inspect Element. I built my selectors from the elements' class attributes. To enter one, first make a non-empty ParseHub selection, then choose Edit under Selection Node in the left-hand pane and type the class name with a dot in front of it. Make sure Rooted Selection isn't checked in the sub-selects.
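
To make the class-selector idea concrete outside ParseHub, here is a rough Python sketch of what a dot-plus-class selector matches: every element whose class attribute contains that name. The markup, the class name "press-link", and the URLs are invented for illustration and are not taken from the real page or from the attached project.

```python
from html.parser import HTMLParser

# Made-up markup: the class name "press-link" and the URLs are assumptions
# for illustration, not copied from the real page.
SAMPLE_HTML = """
<ul>
  <li><a class="press-link" href="/docs/release-1.pdf">First release</a></li>
  <li><a class="other" href="/about">About us</a></li>
  <li><a class="press-link featured" href="/docs/release-2.pdf">Second release</a></li>
</ul>
"""


class ClassLinkParser(HTMLParser):
    """Collect title/link pairs from <a> tags carrying a given class."""

    def __init__(self, class_name):
        super().__init__()
        self.class_name = class_name
        self.items = []
        self._current = None  # item being filled while inside a matching <a>

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        # A class attribute may hold several space-separated names.
        classes = (attr_map.get("class") or "").split()
        if tag == "a" and self.class_name in classes:
            self._current = {"title": "", "link": attr_map.get("href", "")}

    def handle_data(self, data):
        if self._current is not None:
            self._current["title"] += data

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            self._current["title"] = self._current["title"].strip()
            self.items.append(self._current)
            self._current = None


parser = ClassLinkParser("press-link")
parser.feed(SAMPLE_HTML)
print(parser.items)
```

The second link is skipped because its class list does not include the target name, which is the same filtering a .press-link selector would do.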

SHEHJAD ALI TAUS

Nov 19, 2021, 4:14:30 AM
to Web Scraping
Brother, I am not getting it on my own. I know HTML and CSS but am new to ParseHub. Please give me a step-by-step process from start to finish.

SHEHJAD ALI TAUS

Nov 19, 2021, 4:33:47 AM
to Web Scraping
A step-by-step process of how you actually did it.

Andrew11

Nov 19, 2021, 5:35:36 AM
to Web Scraping
Not sure how else to describe it... maybe tell me what yours looks like and I'll describe how to change it? I never use Relative Select; I do everything with nested normal Selects and CSS.

Andrew11

Nov 19, 2021, 5:45:59 AM
to Web Scraping
Sometimes there are little circular icons in the Select statements that you have to click to expand them, so you can see exactly what's going on.

SHEHJAD ALI TAUS

Nov 20, 2021, 1:06:24 PM
to Web Scraping
Hey Andrew,
I now have a basic overview of ParseHub and can scrape websites with all the basic features.
Can you answer this: do I need XPath knowledge to master ParseHub? How can I master ParseHub to the core and learn all the advanced features?

Andrew11

Nov 20, 2021, 1:14:23 PM
to Web Scraping
You can totally do it without XPath, but it offers some unique features:
* You can select elements that contain certain text (case-sensitive): //a[contains(text(), "test")]
* You can select following siblings of the target element, or previous siblings, or all ancestors, more simply than with CSS (its + and ~ combinators only reach following siblings): //a[contains(text(), "test")]/following-sibling::p
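
To show what those two expressions pick out, here is a small Python sketch that emulates them by hand. It uses only the standard library, whose ElementTree XPath subset does not include contains() or the following-sibling axis, so the matching is written out as plain loops. The markup and the function name are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Made-up, well-formed markup for illustration only.
SAMPLE = """
<div>
  <p>intro</p>
  <a href="/x">a test link</a>
  <p>first following paragraph</p>
  <span>not a paragraph</span>
  <p>second following paragraph</p>
  <a href="/y">Test with a capital T</a>
</div>
"""


def following_p_after_matching_links(root, needle):
    """Emulate //a[contains(text(), needle)]/following-sibling::p."""
    results = []
    for parent in root.iter():
        children = list(parent)
        for position, child in enumerate(children):
            # Case-sensitive substring match on the link's leading text.
            if child.tag == "a" and needle in (child.text or ""):
                for sibling in children[position + 1:]:
                    # Keep only later <p> siblings, without duplicates.
                    if sibling.tag == "p" and sibling not in results:
                        results.append(sibling)
    return results


root = ET.fromstring(SAMPLE)
matches = following_p_after_matching_links(root, "test")
print([p.text for p in matches])
```

The link with a capital T is ignored because the match is case-sensitive, and the paragraph before the first link is left out because it is not a following sibling.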
I also like the ability to refer to already-scraped values in a list. Values in the current row can be accessed using the variable name. You can also refer to previous rows or the list itself using listname[0], or run a Loop command through the list using its variable name.

Andrew11

Nov 20, 2021, 1:18:25 PM
to Web Scraping
That list thing isn't XPath, just a cool advanced feature of PH.

Andrew11

Nov 20, 2021, 1:19:02 PM
to Web Scraping
...and you use it in a Conditional or Extract statement.

Andrew11

Nov 20, 2021, 1:23:00 PM
to Web Scraping
P.S. Some page elements (like iframes or sub-elements inside one) can only be selected using ParseHub's point-and-click selection. CSS and, I think, XPath don't work in there unless they're a sub-select of a ParseHub select with Rooted Selection unchecked.

Andrew11

Nov 20, 2021, 1:34:06 PM
to Web Scraping
Last idea: you can often use JavaScript functions in Conditionals or Extract statements, such as test.replace(/asdf/gi,"fdsa")
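
To see what that replace call does, here is the same substitution sketched in Python for comparison. This is not ParseHub syntax, and the sample string is a placeholder standing in for a scraped value.

```python
import re

test = "asdf, ASDF and AsDf"  # placeholder value standing in for a scraped field

# Same idea as test.replace(/asdf/gi, "fdsa"): re.sub replaces every match
# (like the g flag) and re.IGNORECASE ignores case (like the i flag).
cleaned = re.sub(r"asdf", "fdsa", test, flags=re.IGNORECASE)
print(cleaned)  # fdsa, fdsa and fdsa
```

Without the g flag, the JavaScript version would replace only the first match.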

SHEHJAD ALI TAUS

Nov 21, 2021, 1:14:25 AM
to Web Scraping
Thanks, brother, for all these suggestions.
You cleared up all of my confusion.

Andrew11

Nov 21, 2021, 1:24:43 AM
to Web Scraping
: )