Override functionality

Vijay Verma
Mar 4, 2023, 6:40:13 AM
to beautifulsoup
Hi, 

I need to override get_text so that it handles hyperlinks. Here is the function I have:

from bs4 import BeautifulSoup, CData, NavigableString, Tag

class MyBeautifulSoup(BeautifulSoup):
    def _all_strings(self, strip=False, types=(NavigableString, CData)):
        for descendant in self.descendants:
            # emit a string representation of each "a" tag we encounter
            if isinstance(descendant, Tag) and descendant.name == 'a':
                yield str('<{}> '.format(descendant.get('href', '')))

            # skip a text node directly inside "a"
            if isinstance(descendant, NavigableString) and descendant.parent.name == 'a':
                continue

            # default behavior
            if (
                (types is None and not isinstance(descendant, NavigableString))
                or
                (types is not None and type(descendant) not in types)):
                continue

            if strip:
                descendant = descendant.strip()
                if len(descendant) == 0:
                    continue
            yield descendant


Here is the error that I am getting:
File "convert.py", line 28, in _all_strings
    (types is not None and type(descendant) not in types)):
TypeError: argument of type 'object' is not iterable

Any leads will be much appreciated!
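
One possible lead, offered as a sketch rather than a confirmed fix: the message "argument of type 'object' is not iterable" suggests that get_text() is passing a bare object() placeholder as `types` (recent bs4 releases appear to use such a sentinel to mean "use the default string types"), so the `not in types` test fails. The version below normalizes `types` before the loop. `DEFAULT_TYPES` and `normalize_types` are hypothetical names introduced here, and the fallback to (NavigableString, CData) is an assumption taken from the signature in the original function. Note that `_all_strings` is a private method, so its behavior may differ between bs4 versions.

```python
from bs4 import BeautifulSoup, CData, NavigableString, Tag

# Assumption: these are the string types get_text() should collect by default,
# copied from the default argument in the original function.
DEFAULT_TYPES = (NavigableString, CData)


def normalize_types(types):
    """Hypothetical helper: return a tuple of classes, or None."""
    if types is None:
        return None
    if isinstance(types, type):
        # A single class was passed instead of a collection.
        return (types,)
    try:
        return tuple(types)
    except TypeError:
        # Not iterable: presumably the placeholder object that get_text()
        # passes when the caller did not specify any types.
        return DEFAULT_TYPES


class MyBeautifulSoup(BeautifulSoup):
    def _all_strings(self, strip=False, types=DEFAULT_TYPES):
        types = normalize_types(types)
        for descendant in self.descendants:
            # Emit the link target in place of each "a" tag.
            if isinstance(descendant, Tag) and descendant.name == "a":
                yield "<{}> ".format(descendant.get("href", ""))

            # Skip text nodes that sit directly inside an "a" tag.
            if isinstance(descendant, NavigableString) and descendant.parent.name == "a":
                continue

            # Default filtering, now safe because types is a tuple or None.
            if types is None and not isinstance(descendant, NavigableString):
                continue
            if types is not None and type(descendant) not in types:
                continue

            if strip:
                descendant = descendant.strip()
                if len(descendant) == 0:
                    continue
            yield descendant


html = '<p>See <a href="https://example.com">this page</a> now</p>'
soup = MyBeautifulSoup(html, "html.parser")
text = soup.get_text()
print(text)
```

With this change get_text() should no longer raise, and the link text is replaced by the href in angle brackets. Text nested deeper inside a link (for example inside a `<b>` within the `<a>`) is not skipped, because the check only looks at the immediate parent.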


