I can't reproduce your output exactly without seeing your code and markup, but I'm guessing it was something like this:
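from bs4 import BeautifulSoup

# No parser is named here, so bs4 picks the best one installed.
# The output below is consistent with lxml.
soup = BeautifulSoup("<!doctype html><html>some content</html> ")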
There are three top-level things in this HTML document: a doctype declaration, an <html> tag, and a string at the end with some extra white space. When the BeautifulSoup object is created, an object is created for each of those top-level items, and those become the .contents of the BeautifulSoup object:
# <!DOCTYPE html>
# <html><body><p>some content</p></body></html>
print([type(x) for x in soup.children])
# [<class 'bs4.element.Doctype'>, <class 'bs4.element.Tag'>, <class 'bs4.element.NavigableString'>]
print([x.name for x in soup.children])
# [None, 'html', None]
The different types of things that might show up in an HTML document are given different Python classes because they play different roles in the document. The part of the documentation that covers this is Kinds of Objects. There are two main classes: Tag for tags and NavigableString for strings. The more obscure classes like Doctype are subclasses of NavigableString, and they're covered (briefly) in Comments and other special strings.
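If you want to check the class relationships yourself, a quick sketch like this should do it. It only uses classes that bs4.element exports; Comment is added here as one more example of a special string class.

```python
from bs4.element import Comment, Doctype, NavigableString, Tag

# The special string classes all derive from NavigableString.
for cls in (Doctype, Comment):
    print(cls.__name__, issubclass(cls, NavigableString))
# Doctype True
# Comment True

# Tag is a separate branch of the hierarchy.
print(issubclass(Tag, NavigableString))
# False
```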
Since there's only one top-level HTML tag in the document, there's only one Tag object in soup.contents. Tags are the only objects that can have a .name. The other classes like NavigableString do implement .name, to avoid crashes when iterating over a mixed list like soup.contents, but for anything other than a Tag, .name is always defined to be None. That's why only one thing showed up in your list of names.
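If what you actually want is just the tags, here is a minimal sketch of two ways to filter the mixed list. I'm passing "html.parser" explicitly as an assumption so the result doesn't depend on which parsers you have installed; other parsers may build a slightly different tree, but the top level should look the same for this markup.

```python
from bs4 import BeautifulSoup
from bs4.element import Tag

# "html.parser" is an assumption made for this sketch, not necessarily your parser.
soup = BeautifulSoup("<!doctype html><html>some content</html> ", "html.parser")

# Option 1: keep only the Tag objects among the top-level nodes.
tag_names = [node.name for node in soup.contents if isinstance(node, Tag)]

# Option 2: rely on .name being None for anything that is not a Tag.
named = [node.name for node in soup.contents if node.name is not None]

print(tag_names)
print(named)
```

Both lists should contain only 'html', since the doctype and the trailing string are skipped either way.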