I'm trying to dig up some relevant code, so I've started by running
through this tutorial:
http://blog.dispatched.ch/2009/03/15/webscraping-with-python-and-beautifulsoup/
First of all, though, I'm not sure it runs nicely with beautifulsoup4
(fingers crossed). If not, I'll just have to run it in Python 2.7
*single tear*.
Either way, I'm attempting to get the example below to work.
I've only edited the import statement, as described in the
beautifulsoup4 documentation.
Help is always appreciated. I'm hoping to learn as much as I can
before I ask questions, so sorry if I'm not yet asking the right
ones.
Thank you! See below.
This is my error:
File "webscraping_demo.py", line 36
print "%s => %s" % (translation[0], translation[1])
^
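My best guess at the cause, assuming the script is being run under Python 3 (the traceback above doesn't say): line 36 is a Python 2 print statement, and in Python 3 print is a function that needs parentheses. A minimal sketch, with a made-up translation pair standing in for the real data:

```python
# Hypothetical sample data standing in for one (source, translation) pair.
translation = ("house", "Haus")

# Python 2 statement form, which Python 3 rejects with a SyntaxError:
#     print "%s => %s" % (translation[0], translation[1])

# Python 3 form: print is a function, so the formatted string is its argument.
line = "%s => %s" % (translation[0], translation[1])
print(line)
```

If that is the problem, only the call syntax changes; the % formatting itself works the same way in both versions.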
This is the code:
########################################################
# Alain M. Lafon, 15.03.2009
# pree...@gmail.com
#
# HTML scraping with Python and BeautifulSoup
# http://dispatched.ch
########################################################
import urllib
import urllib2
import string
import sys
from bs4 import BeautifulSoup

user_agent = 'Mozilla/5 (Solaris 10) Gecko'
headers = { 'User-Agent' : user_agent }

values = {'s' : sys.argv[1] }
data = urllib.urlencode(values)
request = urllib2.Request("http://www.dict.cc/", data, headers)
response = urllib2.urlopen(request)

the_page = response.read()
pool = BeautifulSoup(the_page)

results = pool.findAll('td', attrs={'class' : 'td7nl'})
source = ''
translations = []

for result in results:
    word = ''
    for tmp in result.findAll(text=True):
        word = word + " " + unicode(tmp).encode("utf-8")
    if source == '':
        source = word
    else:
        translations.append((source, word))

for translation in translations:
    print "%s => %s" % (translation[0], translation[1])
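
For comparison, here is a rough sketch of how the script might look on Python 3. It is not the tutorial author's code. It assumes only the standard-library moves (urllib.urlencode to urllib.parse, urllib2 to urllib.request) and the print function, and the names build_request, extract_translations and main are made up for illustration. Whether dict.cc still accepts this request or still uses the td7nl class is not something this checks.

```python
import sys
import urllib.parse
import urllib.request


def build_request(term):
    """Build the POST request the script sends to dict.cc for one search term."""
    headers = {"User-Agent": "Mozilla/5 (Solaris 10) Gecko"}
    # urllib.urlencode moved to urllib.parse; POST data must be bytes on Python 3.
    data = urllib.parse.urlencode({"s": term}).encode("utf-8")
    # urllib2.Request moved to urllib.request.
    return urllib.request.Request("http://www.dict.cc/", data, headers)


def extract_translations(html):
    """Pair the first 'td7nl' cell's text with each later cell, as the script does."""
    # Imported here so build_request can be tried without bs4 installed.
    from bs4 import BeautifulSoup

    pool = BeautifulSoup(html, "html.parser")
    source = ""
    translations = []
    # find_all is the bs4 spelling of findAll.
    for result in pool.find_all("td", attrs={"class": "td7nl"}):
        word = ""
        for tmp in result.find_all(string=True):
            # Python 3 str is already Unicode, so unicode(...).encode(...) is dropped.
            word = word + " " + str(tmp)
        if source == "":
            source = word
        else:
            translations.append((source, word))
    return translations


def main(term):
    """Fetch the result page for one term and print each pair."""
    with urllib.request.urlopen(build_request(term)) as response:
        the_page = response.read()
    for source, word in extract_translations(the_page):
        print("%s => %s" % (source, word))


# Only hit the network when run as a script with exactly one search term.
if __name__ == "__main__" and len(sys.argv) == 2:
    main(sys.argv[1])
```

It would be run the same way as the original, with the search term as the only argument, e.g. python3 webscraping_demo.py house.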