Hi Jonathan,
The easiest way to save data to SQLite with scraperwiki-python is to use
scraperwiki.sql.save()
If you look at the documentation here:
https://github.com/scraperwiki/scraperwiki-python - you can see that it
works on Python dictionaries.
So, you need to construct a dictionary for each data entry you wish to
create from your scraped data.
A single dictionary represents a row in the database: the data for one
particular item. Each dictionary key is a database column name, and the
value for that key is the data that will be stored in that column of the
row.
Example:
import scraperwiki
example_data = {'product_url': 'http://some.product.url',
                'product_name': 'Reticulated Fizztron',
                'price': 199.99,
                'manufacturer': 'Fizztron Industries'}
scraperwiki.sql.save(['product_url'], example_data)
scraperwiki.sql.save() takes a list of unique keys: the column(s) that
will be unique for each entry. In this example, my assumption is that
each product URL represents one *and only one* product (we don't have
multiple products on one page, for instance). Depending on your data,
you may need to specify a combination of unique keys, as sketched below.
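For instance, if each product page listed several variants, the
combination of URL and variant might be what's unique. (The 'variant'
column and the values here are made up, just to show the shape of the
call.)

import scraperwiki

# Hypothetical data: the 'variant' column is invented here, purely to
# show passing more than one unique key.
variant_data = {'product_url': 'http://some.product.url',
                'variant': 'blue',
                'product_name': 'Reticulated Fizztron',
                'price': 199.99}

# Rows are now unique on the (product_url, variant) combination.
scraperwiki.sql.save(['product_url', 'variant'], variant_data)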
In the examples above, each call saves a single dictionary, but you can
also pass a list of dictionaries to save them all in one
scraperwiki.sql.save() call:
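For example, with made-up products:

import scraperwiki

# A list of row dictionaries, saved in a single call.
all_products = [
    {'product_url': 'http://some.product.url', 'price': 199.99},
    {'product_url': 'http://another.product.url', 'price': 49.50},
]

scraperwiki.sql.save(['product_url'], all_products)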
Another thing worth mentioning, as it's not immediately apparent:
standard Python dictionaries don't keep their keys in any particular
order, so the database columns created by scraperwiki.sql.save() end up
in an arbitrary order.
If you prefer to have a specific column order, you can use a Python
OrderedDict instead. This behaves a lot like a standard dictionary, but
the order in which you add keys is retained:
from collections import OrderedDict
import scraperwiki
example_data = OrderedDict()
example_data['product_url'] = 'http://some.product.url'
example_data['product_name'] = 'Reticulated Fizztron'
example_data['price'] = 199.99
example_data['manufacturer'] = 'Fizztron Industries'
scraperwiki.sql.save(['product_url'], example_data)
This doesn't matter too much, but can make the data easier to read if
you're looking at the tables by eye.
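If you want to check what's ended up in the database from within Python
rather than in the browser, scraperwiki.sql.select() lets you run a
query against it. This assumes you saved to the default table, which
scraperwiki-python calls 'swdata':

import scraperwiki

# Read back everything from the default 'swdata' table as a list of
# dictionaries, one per row.
rows = scraperwiki.sql.select("* from swdata")
print(rows)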
Hope that helps,
Steve