depth property for Tag, NavigableString

stephen lukacs

Mar 10, 2021, 8:13:48 PM
to beautifulsoup

I'm working heavily with BeautifulSoup 4.9.3 under Python 3.9.2. Is there a property on each Tag or NavigableString that gives its depth, or level, in the tree? For instance, html would be level 0, head and body would be level 1, title or the first div would be level 2, and so forth. Having that as a property would really help ground my computations. The depth should also update automatically when the tree is modified, for example by wrap or unwrap.

If the depth property is already there and I'm not aware of it, please share. If not, please consider adding it in a future version. Thank you in advance and have a good day. Lucas

Vera Olsson

Mar 11, 2021, 12:54:08 PM
to beautifulsoup
Hey

I think what you are looking for is similar to an XPath generator for an arbitrary element, like the one in this gist (https://gist.github.com/ergoithz/6cf043e3fdedd1b94fcf). You could rework it to return only the "depth", and maybe even make it available on a Tag / NavigableString.
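
To illustrate the idea of reworking a path generator into a depth, here is a minimal sketch. It is not the code from the gist; `ancestor_path` and `depth_from_path` are hypothetical helper names, and the sketch assumes the tag belongs to a parsed document. It uses the convention from the original question, where html is level 0.

```python
from bs4 import BeautifulSoup


def ancestor_path(tag):
    """Build a simple /html/body/... style path (hypothetical helper)."""
    names = [tag.name]
    for parent in tag.parents:
        if parent.parent is None:
            break  # reached the BeautifulSoup object itself, which is not a real tag
        names.append(parent.name)
    return "/" + "/".join(reversed(names))


def depth_from_path(tag):
    """Depth with html at level 0: one less than the number of path segments."""
    return ancestor_path(tag).count("/") - 1


soup = BeautifulSoup("<html><body><div><p>hi</p></div></body></html>", "html.parser")
print(ancestor_path(soup.p))    # /html/body/div/p
print(depth_from_path(soup.p))  # 3
```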

stephen lukacs

Mar 11, 2021, 5:26:48 PM
to beautifulsoup

OK, so I've been thinking about this. If it's a read-only property on the object, it would be easy to compute the depth on the fly using the .parents property of the Tag or NavigableString. For instance, [e.name for e in tag.parents] would return ['body', 'html', '[document]'], and the len() - 1 of that would yield the depth.

The depth method would be something like:

def get_depth(self):
    # .parents is a generator, so build a list before taking len()
    return len([e.name for e in self.parents]) - 1

That would give html.depth = 1, head.depth = 2, body.depth = 2, title.depth = 3, div.depth = 3, and so forth. And since the depth is computed on the fly, it would automatically reflect any modification of the tree, e.g. wrap or unwrap.
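
A small sketch of the "computed on the fly" point, assuming a standalone helper named `get_depth` (not part of bs4) that counts every ancestor, including the top-level BeautifulSoup object, so that html comes out as 1. Because nothing is cached, the value follows the tree through wrap and unwrap.

```python
from bs4 import BeautifulSoup


def get_depth(el):
    # .parents is a generator, so materialise it before counting
    return len(list(el.parents))


soup = BeautifulSoup("<html><body><div><p>hi</p></div></body></html>", "html.parser")
p = soup.p

before = get_depth(p)             # [document] > html > body > div > p
p.wrap(soup.new_tag("section"))   # insert a <section> around <p>
wrapped = get_depth(p)
soup.section.unwrap()             # remove the <section> again
after = get_depth(p)

print(before, wrapped, after)     # 4 5 4
```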

What do y'all think? Lucas

stephen lukacs

Mar 11, 2021, 5:41:23 PM
to beautifulsoup
Sorry, I meant something more like:

def get_depth(self):
    return len([t.name for t in self.parents])

That should return the proposed depths.
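
As a quick check of those proposed depths, here is a minimal sketch that runs the function over a small document. The sample markup and the use of the built-in "html.parser" are assumptions made for illustration.

```python
from bs4 import BeautifulSoup


def get_depth(el):
    # Count every ancestor, including the top-level '[document]' object
    return len([t.name for t in el.parents])


html = "<html><head><title>t</title></head><body><div>x</div></body></html>"
soup = BeautifulSoup(html, "html.parser")

# find_all(True) matches every tag in the document
depths = {tag.name: get_depth(tag) for tag in soup.find_all(True)}
print(depths)  # {'html': 1, 'head': 2, 'title': 3, 'body': 2, 'div': 3}

# A NavigableString sits one level below the tag that contains it
print(get_depth(soup.div.string))  # 4
```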

Yo Okay

Mar 12, 2021, 2:38:51 PM
to beauti...@googlegroups.com
We know nothing about it.

facelessuser

Mar 13, 2021, 4:23:48 PM
to beautifulsoup

It’s probably quicker to just do:

len(list(el.parents))

It’s also trivial enough that I’m not sure it needs a property, but if one were added, I’d imagine the above would be quicker.
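
One way to check the speed claim on your own machine is a rough timeit comparison like the sketch below. The function names, the nesting depth of 50, and the call count are arbitrary choices for illustration; timings will vary, so no numbers are shown here.

```python
import timeit

from bs4 import BeautifulSoup


def depth_comprehension(el):
    # Variant from earlier in the thread: looks up .name on every ancestor
    return len([t.name for t in el.parents])


def depth_list(el):
    # Suggested variant: just collects the ancestors
    return len(list(el.parents))


# A string nested inside 50 <div> elements
markup = "<div>" * 50 + "x" + "</div>" * 50
soup = BeautifulSoup(markup, "html.parser")
el = soup.find(string="x")

for fn in (depth_comprehension, depth_list):
    seconds = timeit.timeit(lambda: fn(el), number=10_000)
    print(f"{fn.__name__}: {seconds:.4f}s for 10,000 calls")
```

Both functions return the same count; only the per-ancestor work differs.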

stephen lukacs

Mar 26, 2021, 5:49:17 PM
to beautifulsoup
Yes, agreed. I did create my own function to calculate it. It would be nice as a property in bs4. Thank you much.
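
For anyone who wants the property syntax today, one possible workaround is to attach it yourself. This is only a sketch: `depth` is not a bs4 attribute, and the code assumes that monkey-patching `bs4.element.PageElement` (the common base class of Tag and NavigableString) is acceptable in your project.

```python
from bs4 import BeautifulSoup
from bs4.element import PageElement


def _depth(self):
    """Number of ancestors, counting the top-level BeautifulSoup object."""
    return len(list(self.parents))


# Assumption: `depth` is added here by monkey-patching; bs4 does not provide it
PageElement.depth = property(_depth)

soup = BeautifulSoup("<html><body><div>text</div></body></html>", "html.parser")
print(soup.html.depth)        # 1
print(soup.div.depth)         # 3
print(soup.div.string.depth)  # 4
```

One caveat: with this patch, `tag.depth` no longer works as shorthand for finding a child element named depth, so a plain helper function may be the safer choice.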