Beautiful Soup 4.14.3


leonardr

Nov 30, 2025, 10:15:54 AM
to beautifulsoup
= 4.14.3 (20251130)

* When using one of the lxml tree builders, you can pass in
  huge_tree=True to disable lxml's security restrictions and process
  files that include huge text nodes. ("Huge" means more than
  10,000,000 bytes of text in a single node.) Without this, lxml may
  silently stop processing the file after encountering a huge text
  node. [bug=2072424]

* The html.parser tree builder processes numeric character entities
  using the algorithm described in the HTML spec. If this means
  replacing some other character with REPLACEMENT CHARACTER, it will
  set BeautifulSoup.contains_replacement_characters. [bug=2126753]

  The other tree builders rely on the underlying parser to do this
  sort of replacement. That means Beautiful Soup never sees the
  original character reference, so it can't tell whether a
  REPLACEMENT CHARACTER was in the original content or was
  substituted by the parser. As a result, the html.parser tree
  builder will set contains_replacement_characters in situations
  where the other tree builders won't.

* Added a general test of the html.parser tree builder's ability to
  turn any parsing exception from html.parser into a
  ParserRejectedMarkup exception. This makes it possible to remove
  version-dependent tests that depended on the existence of specific
  bugs in html.parser. [bug=2121335,2121335]
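
For readers who want to see what the numeric character entity handling
above amounts to, here is a minimal standalone sketch of the replacement
rules from the HTML spec. It is not Beautiful Soup's actual code: the
function name resolve_numeric_charref and the returned was_replaced flag
are hypothetical, chosen only to show when a flag like
contains_replacement_characters could reasonably be set. The sketch
leans on the fact that the spec's remapping table for 0x80-0x9F matches
the windows-1252 encoding, so it uses Python's cp1252 codec instead of
spelling the table out.

```python
from typing import Tuple

REPLACEMENT_CHARACTER = "\N{REPLACEMENT CHARACTER}"  # U+FFFD


def resolve_numeric_charref(code_point: int) -> Tuple[str, bool]:
    """Return (character, was_replaced) for a numeric character reference.

    Hypothetical helper for illustration; not part of Beautiful Soup.
    """
    # Null, values beyond the Unicode range, and surrogates are all
    # replaced with U+FFFD.
    if (
        code_point == 0
        or code_point > 0x10FFFF
        or 0xD800 <= code_point <= 0xDFFF
    ):
        return REPLACEMENT_CHARACTER, True

    # Code points 0x80-0x9F are remapped as if they were windows-1252
    # bytes (for example, 0x80 becomes the euro sign). This is a
    # remapping, not a substitution of U+FFFD.
    if 0x80 <= code_point <= 0x9F:
        try:
            return bytes([code_point]).decode("cp1252"), False
        except UnicodeDecodeError:
            # A few bytes (such as 0x81) have no windows-1252 mapping;
            # the code point is kept as-is.
            pass

    return chr(code_point), False


if __name__ == "__main__":
    for ref in (0x41, 0x80, 0x0, 0xD800, 0xFFFD):
        char, replaced = resolve_numeric_charref(ref)
        print(f"&#x{ref:X}; -> U+{ord(char):04X} replaced={replaced}")
```

Note the last case: a reference that spells out U+FFFD itself yields
the same character with was_replaced False. That distinction is only
available to code that sees the original reference, which is why the
other tree builders can't make it.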

Chris Papademetrious

Nov 30, 2025, 3:16:48 PM
to beautifulsoup
Thank you Leonard! It is great to see Beautiful Soup continue to chug along and crank out the improvements release after release.

 - Chris