Leonard Richardson
May 20, 2013, 3:41:14 PM
to beautifulsoup
I see two basic kinds of application built on top of Beautiful Soup:
* Read-type: Read a document and extract bits of data from it. (The
  classic screen-scrape.)
* Write-type: Read a document, modify it, and write it back out.
If you make write-type applications, I would like to hear from you
about why you chose Beautiful Soup. Beautiful Soup was originally
designed for read-type applications, and IMO is still geared towards
read-type applications. I understand why you might choose Beautiful
Soup over its "competitors" (lxml, scrapy, pyquery) for a read-type
application, and I understand why you might choose the "competitors"
instead.
But I don't understand the considerations that apply to write-type
applications. I personally don't care about those applications, and
pretty much the entire API for manipulating the tree was added by user
request, one API call at a time. So I'm baffled by the fact that over
the past couple of years the percentage of write-type applications
seems to have gone way up. I can imagine two explanations for the
phenomenon, not mutually exclusive:
1. The phenomenon is an illusion. Read-type applications are as common
as they used to be, but the Beautiful Soup documentation is good
enough, the API is complete enough, and the code works well enough
that people don't ask as many questions about it. They just write the
code. When it comes to write-type applications, there are lots of
outstanding issues and obvious questions. So write-type applications
are more visible.
2. For read-type applications, Beautiful Soup has lost a lot of market
share as "competitors" emerged. But for write-type applications,
Beautiful Soup still has a clear advantage.
If you feel #2 is true, I'd like to hear what that advantage is. That
way I can try to make decisions about the Beautiful Soup API based on
a coherent philosophy rather than in response to one-off requests.
OTOH, if there's some other reason why you chose Beautiful Soup for a
write-type application, I'd like to hear it.
("Competitors" in scare quotes because I don't consider this a competition.)
Leonard