Read-type vs write-type applications


Leonard Richardson

May 20, 2013, 3:41:14 PM
to beautifulsoup
I see two basic kinds of application built on top of Beautiful Soup:

Read-type: Read a document and extract bits of data from it. (The
classic screen-scrape.)
Write-type: Read a document, modify it, and write it back out.
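
To make the distinction concrete, here is a minimal sketch of both kinds of application, assuming Beautiful Soup 4 (the `bs4` package) with Python's built-in "html.parser". The sample markup and the function names `read_type` and `write_type` are made up for illustration and do not come from this thread.

```python
from bs4 import BeautifulSoup

# Hypothetical sample document, used only for this illustration.
HTML = '<html><body><p class="intro">Hello</p><a href="/old">link</a></body></html>'


def read_type(markup):
    """Read-type: parse the document, extract data, throw the tree away."""
    soup = BeautifulSoup(markup, "html.parser")
    # Collect the href of every link that has one.
    return [a["href"] for a in soup.find_all("a", href=True)]


def write_type(markup):
    """Write-type: parse the document, modify the tree, serialize it again."""
    soup = BeautifulSoup(markup, "html.parser")
    for a in soup.find_all("a", href=True):
        # Rewrite each link target in place.
        a["href"] = a["href"].replace("/old", "/new")
    # str() turns the modified tree back into markup.
    return str(soup)


if __name__ == "__main__":
    print(read_type(HTML))
    print(write_type(HTML))
```

The read-type function only needs the search API; the write-type function also depends on the tree-modification API and on how the tree is serialized, which is where details such as whitespace handling in the output start to matter.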

If you make write-type applications, I would like to hear from you
about why you chose Beautiful Soup. Beautiful Soup was originally
designed for read-type applications, and IMO is still geared towards
read-type applications. I understand why you might choose Beautiful
Soup over its "competitors" (lxml, scrapy, pyquery) for a read-type
application, and I understand why you might choose the "competitors"
instead.

But I don't understand the considerations that apply to write-type
applications. I personally don't care about those applications, and
pretty much the entire API for manipulating the tree was added by user
request, one API call at a time. So I'm baffled by the fact that over
the past couple of years the percentage of write-type applications
seems to have gone way up. I can imagine two explanations for the
phenomenon, not mutually exclusive:

1. The phenomenon is an illusion. Read-type applications are as common
as they used to be, but the Beautiful Soup documentation is good
enough, the API is complete enough, and the code works well enough
that people don't ask as many questions about it. They just write the
code. When it comes to write-type applications, there are lots of
outstanding issues and obvious questions. So write-type applications
are more visible.

2. For read-type applications, Beautiful Soup has lost a lot of market
share as "competitors" emerged. But for write-type applications,
Beautiful Soup still has a clear advantage.

If you feel #2 is true, I'd like to hear what that advantage is. That
way I can try to make decisions about the Beautiful Soup API based on
a coherent philosophy rather than in response to one-off requests.
OTOH, if there's some other reason why you chose Beautiful Soup for a
write-type application, I'd like to hear it.

("Competitors" in scare quotes because I don't consider this a competition.)

Leonard

Jochen Voß

May 21, 2013, 5:33:19 AM
to beauti...@googlegroups.com
Hi Leonard,

Not sure whether this helps, but here is the reason why I use Beautiful Soup for rewriting HTML files: I had heard somewhere that "Beautiful Soup is good for working with HTML". When I looked, I found that it had good documentation, and when I tried to use it for my project (packing HTML apps) it mostly worked, except for messing up whitespace in the output a bit.  This was good enough for me, so I stuck with Beautiful Soup.  I did not try any other HTML handling libraries.

I hope this helps,
Jochen

Aron Griffis

May 21, 2013, 10:05:40 AM
to beauti...@googlegroups.com
Hi Leonard,

I mostly use Beautiful Soup for scraping and reading.  Sometimes the next logical step--unforeseen from the outset--is manipulating the content.  I suspect this happens to a lot of people, but maybe it's just me. :-)

I also find myself interested in Beautiful Soup because it's used by other libraries, for example WebTest.  That's mostly a read capacity, but it ends up doing some manipulation as well, particularly for forms.

I think you have a good point in explanation #1. The documentation is quite good, and the API/library is mature enough for read-type applications that you might not be seeing as many questions about them.  The questions are happening in the area where Beautiful Soup is weaker.  Have you considered replying "Beautiful Soup wasn't designed for that... Why don't you use X?"

In any case, thanks for Beautiful Soup. I've made a lot of use of it in various applications, and I especially appreciate your responsiveness to questions and bugs.

Regards,
Aron