some initial findings with the 4.13 branch

Chris Papademetrious

Jan 20, 2024, 4:45:07 AM
to beautifulsoup
Hi Leonard,

I tried the 4.13 branch out on one of our content processing pipelines at my day job. It processes about 27k HTML files that use an <h1>/<h2>/etc. heading hierarchy and extracts about 80k HTML fragments from them.

The runtime results are:

4.12.2
586 seconds (average of three runs)

4.13
574 seconds (average of three runs)

Impressive - even with the extra type checking, 4.13 is about 2% faster! I did follow-up testing to confirm.
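
For anyone who wants to repeat this kind of comparison, here is a minimal sketch of the timing arithmetic. It is not the actual pipeline: `average_runtime` and `relative_change` are hypothetical helper names, and the sketch assumes the workload can be wrapped in a plain Python callable and run once per installed Beautiful Soup version.

```python
import statistics
import time


def average_runtime(workload, runs=3):
    """Return the mean wall-clock time, in seconds, of calling workload() `runs` times."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        workload()  # hypothetical: one full pass of the pipeline
        durations.append(time.perf_counter() - start)
    return statistics.mean(durations)


def relative_change(baseline_seconds, candidate_seconds):
    """Fraction by which the candidate is faster (positive) or slower (negative)."""
    return (baseline_seconds - candidate_seconds) / baseline_seconds


# The averages reported above: 586 s for 4.12.2 and 574 s for 4.13.
print(f"{relative_change(586, 574):.1%}")  # prints 2.0%
```

A difference this small can sit within run-to-run noise, so averaging several runs per version, as done here, is what makes the comparison meaningful.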

The output HTML fragments generated by the pipeline are identical for both versions.

I hope this is helpful,

 - Chris

