Yes, it turns out that they are the cause. If I uncomment the debug print() statement in the __getattr__() method and move it here:
    def __getattr__(self, tag):
        """Calling tag.subtag is the same as calling tag.find(name="subtag")"""
        if len(tag) > 3 and tag.endswith('Tag'):
            # BS3: soup.aTag -> "soup.find("a")
            tag_name = tag[:-3]
            warnings.warn(
                '.%(name)sTag is deprecated, use .find("%(name)s") instead. If you really were looking for a tag called %(name)sTag, use .find("%(name)sTag")' % dict(
                    name=tag_name
                ),
                DeprecationWarning, stacklevel=2
            )
            return self.find(tag_name)
        # We special case contents to avoid recursion.
        elif not tag.startswith("__") and not tag == "contents":
            print("Getattr %s.%s" % (self.__class__, tag))
            return self.find(tag)
        raise AttributeError(
            "'%s' object has no attribute '%s'" % (self.__class__, tag))
then I see output like this:
Getattr <class 'bs4.element.Tag'>.sourceline
Getattr <class 'bs4.element.Tag'>.sourcepos
Getattr <class 'bs4.element.Tag'>.sourceline
Getattr <class 'bs4.element.Tag'>.sourcepos
Getattr <class 'bs4.element.Tag'>.sourceline
Getattr <class 'bs4.element.Tag'>.sourcepos
Getattr <class 'bs4.element.Tag'>.sourceline
Getattr <class 'bs4.element.Tag'>.sourcepos
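so sourceline and sourcepos are not being found as ordinary attributes on these Tag objects, and each access falls through __getattr__() into find(). I think the extra runtime is coming from the SoupStrainer construction for that find().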
Now I will file the issue. :)
- Chris