Trying to understand an example to dig into the code


willhelm

Feb 8, 2012, 1:34:32 PM
to beautifulsoup
I'm trying to dig into some relevant code.

So I've started by running through this tutorial:
http://blog.dispatched.ch/2009/03/15/webscraping-with-python-and-beautifulsoup/

First of all, I'm not sure it runs nicely with beautifulsoup4
(fingers crossed). If not, I'll just have to run it in Python 2.7
(single tear).

Either way I'm attempting to get the example below to work.

I've only edited the import statement, as described in the
beautifulsoup4 documentation.

Help is always appreciated. I'm hoping to learn as much as I can
before I ask questions, so apologies if I'm not yet asking the right
ones.

Thank you! See below.

This is my error:

  File "webscraping_demo.py", line 36
    print "%s => %s" % (translation[0], translation[1])
                   ^

This is the code:

########################################################
# Alain M. Lafon, 15.03.2009
# pree...@gmail.com
#
# HTML scraping with Python and BeautifulSoup
# http://dispatched.ch
########################################################


import urllib
import urllib2
import string
import sys
from bs4 import BeautifulSoup

user_agent = 'Mozilla/5 (Solaris 10) Gecko'
headers = { 'User-Agent' : user_agent }
values = {'s' : sys.argv[1] }
data = urllib.urlencode(values)
request = urllib2.Request("http://www.dict.cc/", data, headers)
response = urllib2.urlopen(request)
the_page = response.read()
pool = BeautifulSoup(the_page)
results = pool.findAll('td', attrs={'class' : 'td7nl'})
source = ''
translations = []

for result in results:
    word = ''
    for tmp in result.findAll(text=True):
        word = word + " " + unicode(tmp).encode("utf-8")
    if source == '':
        source = word
    else:
        translations.append((source, word))
for translation in translations:
    print "%s => %s" % (translation[0], translation[1])
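
For anyone who would rather stay on Python 3 than fall back to 2.7, here is a minimal sketch of the parts of the script that differ there. It is not the tutorial author's code. It assumes only the standard library: urllib and urllib2 are replaced by urllib.parse and urllib.request, the POST body has to be bytes, and print is a function (the Python 2 print statement is a syntax error under Python 3). The helper names build_request and format_pair are made up for illustration, the request is built but never sent, and "left"/"right" are placeholder values rather than real translations.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_request(term):
    """Build (but do not send) the POST request the script makes."""
    headers = {'User-Agent': 'Mozilla/5 (Solaris 10) Gecko'}
    # Python 3 requires the POST body as bytes, so encode the form data.
    data = urlencode({'s': term}).encode('utf-8')
    return Request("http://www.dict.cc/", data, headers)


def format_pair(source, word):
    """Return the 'source => word' line that the script prints."""
    return "%s => %s" % (source, word)


request = build_request("candy")
# print is a function in Python 3, so the arguments go in parentheses.
print(request.get_method(), request.full_url)
print(format_pair("left", "right"))  # placeholder strings, not real results
```

The parsing part of the script is left out here on purpose, since it depends on the page layout and on the installed BeautifulSoup version.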

Leonard Richardson

Feb 8, 2012, 1:44:46 PM
to beauti...@googlegroups.com
What error are you getting? This script gives results when I run it.

Leonard


willhelm

Feb 8, 2012, 11:37:56 PM
to beautifulsoup
  File "webscraping_demo.py", line 36
    print "%s => %s" % (translation[0], translation[1])
                   ^



willhelm

Feb 9, 2012, 12:08:21 AM
to beautifulsoup
OK, I got it working.

I'm a newbie to the max and forgot the simple: python demo.py candy
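
The command above passes a search word, which the script reads from sys.argv[1]; run without one, that lookup raises an IndexError. A small guard like the sketch below would print a usage hint instead. This is only an illustration, not part of the tutorial: get_search_term and main are hypothetical names, and the usage text is made up.

```python
def get_search_term(argv):
    """Return the search word from argv, or None if none was given."""
    # argv[0] is the script name; the word is expected at argv[1].
    if len(argv) < 2:
        return None
    return argv[1]


def main(argv):
    """Return 0 if a search word was supplied, 1 otherwise."""
    term = get_search_term(argv)
    if term is None:
        print("usage: python demo.py <word>")
        return 1
    print("searching for %s" % term)
    return 0


# Simulated argument lists, so the sketch runs without a real command line.
main(["demo.py", "candy"])
main(["demo.py"])
```

In a real script the last two lines would be replaced by something like sys.exit(main(sys.argv)) after importing sys.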

willhelm

Feb 9, 2012, 12:09:07 AM
to beautifulsoup
In other news, am I in the right place if I'm trying to scrape
customer review data from Amazon for an art project?

Thank you for making a kick ass library!!

w
