Collecting Rich Data Structures for students

kirby...@gmail.com

Jan 8, 2008, 9:19:11 PM

Greetings Pythoneers --

Some of us over on edu-sig, one of the community actives,
have been brainstorming around this Rich Data Structures
idea, by which we mean Python data structures already
populated with non-trivial data about various topics such
as: periodic table (proton, neutron counts); Monty Python
skit titles; some set of cities (lat, long coordinates); types
of sushi.

Obviously some of these require levels of nesting, say
lists within dictionaries, more depth if required.
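
As a rough illustration of the nesting meant here, a data module might look something like the sketch below. The variable names, the layout, and the helper function are assumptions made for this example, not anything taken from an existing module; the city coordinates are the ones quoted later in this thread.

```python
# Hypothetical "rich data" module contents; names and layout are illustrative only.

# Periodic table: symbol -> name plus proton and neutron counts
# (neutron counts are for the most common isotope).
elements = {
    "H": {"name": "Hydrogen", "protons": 1, "neutrons": 0},
    "He": {"name": "Helium", "protons": 2, "neutrons": 2},
    "C": {"name": "Carbon", "protons": 6, "neutrons": 6},
}

# Cities: name -> [latitude, longitude], each as (degrees, minutes, hemisphere).
cities = {
    "Albany, N.Y.": [(42, 40, "N"), (73, 45, "W")],
    "Albuquerque, N.M.": [(35, 5, "N"), (106, 39, "W")],
}


def mass_number(symbol):
    """Protons plus neutrons for the listed isotope."""
    entry = elements[symbol]
    return entry["protons"] + entry["neutrons"]


print(mass_number("C"))
print(cities["Albany, N.Y."][0])
```

A student can import such a file and start indexing into it right away, with no parsing step in between.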

Our motivation in collecting these repositories is to give
students of Python more immediate access to meaningful
data, not just meaningful programs. Sometimes all it takes
to win converts, to computers in general, is to demonstrate
their capacity to handle gobs of data adroitly. Too often,
a textbook will only provide trivial examples, which in the
print medium is all that makes sense.

Some have offered XML repositories, which I can well
understand, but in this case we're looking specifically for
legal Python modules (py files), although they don't have
to be Latin-1 (e.g. the sushi types file might not have a
lot of romaji).

If you have any examples you'd like to email me about,
kirby...@gmail.com is a good address.

Here's my little contribution to the mix:
http://www.4dsolutions.net/ocn/python/gis.py

Kirby Urner
4D Solutions
Silicon Forest
Oregon

Paddy

Jan 9, 2008, 1:52:22 AM

On Jan 9, 2:19 am, "kirby.ur...@gmail.com" <kirby.ur...@gmail.com>
wrote:

> Greetings Pythoneers --
>
> Some of us over on edu-sig, one of the community actives,
> have been brainstorming around this Rich Data Structures
> idea, by which we mean Python data structures already
> populated with non-trivial data about various topics such
> as: periodic table (proton, neutron counts); Monty Python
> skit titles; some set of cities (lat, long coordinates); types
> of sushi.
>
> Obviously some of these require levels of nesting, say
> lists within dictionaries, more depth if required.
>
> Our motivation in collecting these repositories is to give
> students of Python more immediate access to meaningful
> data, not just meaningful programs. Sometimes all it takes
> to win converts, to computers in general, is to demonstrate
> their capacity to handle gobs of data adroitly. Too often,
> a textbook will only provide trivial examples, which in the
> print medium is all that makes sense.
>
> Some have offered XML repositories, which I can well
> understand, but in this case we're looking specifically for
> legal Python modules (py files), although they don't have
> to be Latin-1 (e.g. the sushi types file might not have a
> lot of romaji).
>
> If you have any examples you'd like to email me about,
> kirby.ur...@gmail.com is a good address.
>
> Here's my little contribution to the mix:
> http://www.4dsolutions.net/ocn/python/gis.py
>
> Kirby Urner
> 4D Solutions
> Silicon Forest
> Oregon

I would think there is more data out there formatted as Lisp
S-expressions than as Python data structures.
Wouldn't it be better to concentrate on 'wrapping' XML and CSV
data sources?
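
One possible reading of 'wrapping' a CSV source is sketched below with the standard csv module. The sample text, its column names, and the load_cities helper are all assumptions made up for this example; a real source would be a file on disk.

```python
import csv
import io

# Made-up sample standing in for a real CSV file; header names are assumptions.
CSV_TEXT = """\
city,lat_deg,lat_min,lon_deg,lon_min
"Albany, N.Y.",42,40,73,45
"Amarillo, Tex.",35,11,101,50
"""


def load_cities(text):
    """Return {city: ((lat_deg, lat_min), (lon_deg, lon_min))} from CSV text."""
    cities = {}
    for row in csv.DictReader(io.StringIO(text)):
        cities[row["city"]] = (
            (int(row["lat_deg"]), int(row["lat_min"])),
            (int(row["lon_deg"]), int(row["lon_min"])),
        )
    return cities


cities = load_cities(CSV_TEXT)
print(cities["Albany, N.Y."])
```

The CSV stays usable from any other language, and the Python side is only a thin loader.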

- Paddy.

Martin Marcher

Jan 9, 2008, 4:02:57 AM

to pytho...@python.org

Paddy wrote:

> On Jan 9, 2:19 am, "kirby.ur...@gmail.com" <kirby.ur...@gmail.com>
> wrote:
>> Some have offered XML repositories, which I can well
>> understand, but in this case we're looking specifically for
>> legal Python modules (py files), although they don't have
>> to be Latin-1 (e.g. the sushi types file might not have a
>> lot of romanji).

Are you asking for

class SushiList(object):
    types = [sushi1, sushi2, sushi3, ...]

I don't quite get that, any reference to the original discussion?

/martin

--
http://noneisyours.marcher.name
http://feeds.feedburner.com/NoneIsYours

You are not free to read this message,
by doing so, you have violated my licence
and are required to urinate publicly. Thank you.

bearoph...@lycos.com

Jan 9, 2008, 7:39:17 AM

It may be better to keep the data in a simpler form:

# Space-separated fields: lat deg, lat min, long deg, long min, city name.
data = """\
42 40 73 45 Albany, N.Y.
35 5 106 39 Albuquerque, N.M.
35 11 101 50 Amarillo, Tex.
34 14 77 57 Wilmington, N.C.
49 54 97 7 Winnipeg, Man., Can."""

cities = {}
for line in data.splitlines():
    # Split off the four numbers; the rest of the line is the city name.
    a1, a2, a3, a4, n = line.split(" ", 4)
    cities[n] = [(int(a1), int(a2), "N"), (int(a3), int(a4), "W")]
print(cities)
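
As a possible follow-on exercise, entries shaped like the ones built above can be turned into signed decimal degrees. This is only a sketch: the to_decimal name and the sign convention (south and west negative) are assumptions of this example, not part of the data.

```python
def to_decimal(coord):
    """Convert (degrees, minutes, hemisphere) to signed decimal degrees."""
    degrees, minutes, hemisphere = coord
    value = degrees + minutes / 60.0
    # South and West are negative by the usual convention.
    return -value if hemisphere in ("S", "W") else value


# Same shape as the dictionary values above: [latitude, longitude].
albany = [(42, 40, "N"), (73, 45, "W")]
lat, lon = (to_decimal(c) for c in albany)
print(round(lat, 4), round(lon, 4))
```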

Bye,
bearophile

Fredrik Lundh

Jan 9, 2008, 7:47:41 AM

to pytho...@python.org

kirby...@gmail.com wrote:

> Some have offered XML repositories, which I can well
> understand, but in this case we're looking specifically for
> legal Python modules (py files), although they don't have
> to be Latin-1 (e.g. the sushi types file might not have a
> lot of romanji).

you can of course convert any XML file to legal Python code simply by
prepending

from xml.etree.ElementTree import XML
data = XML("""

and appending

""")

and then using the ET API to navigate the data, but I guess that's not
what you had in mind.
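
For illustration, a complete wrapped file might look roughly like this. The XML content, with its element and attribute names, is a made-up sample; only the XML(), findall(), get() and findtext() calls are the actual ElementTree API.

```python
from xml.etree.ElementTree import XML

# The XML below is a made-up sample; element and attribute names are assumptions.
data = XML("""
<cities>
  <city name="Albany, N.Y."><lat>42 40 N</lat><lon>73 45 W</lon></city>
  <city name="Winnipeg, Man., Can."><lat>49 54 N</lat><lon>97 7 W</lon></city>
</cities>
""")

# Navigate with the ElementTree API: findall() for child elements, get() for
# attributes, findtext() for the text of a child element.
cities = {}
for city in data.findall("city"):
    cities[city.get("name")] = (city.findtext("lat"), city.findtext("lon"))

print(cities["Albany, N.Y."])
```

Stripping the first and last lines of such a file gives back the original XML unchanged.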

</F>

Paddy

Jan 9, 2008, 11:15:59 AM

The more I think on it, the more I am against this: data should be
stored in programming-language-agnostic forms that are easily
made available to a wide range of programming languages.
If a format is easily parsed by AWK, then it is usually easy to parse
in a range of programming languages.

- Paddy.

kirby...@gmail.com

Jan 9, 2008, 6:05:25 PM

It's OK to be against it, but as many have pointed out, it's often
just one value-adding step to go from plaintext or XML to something
specifically Python.
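
That step might be sketched as follows, assuming the data has already been parsed into a plain dictionary. The gis_data.py file name mentioned in the comments is hypothetical, and pprint is just one convenient way to render the structure as source text.

```python
import pprint

# Structure as it might come out of a plaintext or XML parsing step.
cities = {
    "Albany, N.Y.": [(42, 40, "N"), (73, 45, "W")],
    "Wilmington, N.C.": [(34, 14, "N"), (77, 57, "W")],
}

# Render the structure as Python source text; pformat() output for plain
# dicts, lists, tuples, ints and strings is itself a valid Python literal.
module_source = "cities = " + pprint.pformat(cities) + "\n"
print(module_source)

# Check the round trip by executing the generated source in a fresh namespace.
# In practice the text would be written to a file such as gis_data.py
# (a hypothetical name) and imported by the students.
namespace = {}
exec(module_source, namespace)
```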

Sometimes we spare the students (whoever they may be) this added
step and just hand them a dictionary of lists or whatever. We
may not be teaching parsing in this class, but chemistry, and
having the info in the Periodic Table in a Python data structure
may simply be the most relevant place to start.

Many lesson plans I've seen or am working on will use these .py
data modules.

Kirby

Scott David Daniels

Jan 10, 2008, 12:55:46 AM

kirby...@gmail.com wrote:
> Some of us over on edu-sig, one of the community actives,
> have been brainstorming around this Rich Data Structures
> idea, by which we mean Python data structures already
> populated with non-trivial data about various topics such
> as: periodic table (proton, neutron counts); Monty Python
> skit titles; some set of cities (lat, long coordinates); types
> of sushi.

Look into the "Stanford GraphBase" at:
http://www-cs-faculty.stanford.edu/~knuth/sgb.html
A great source of some data with some interesting related
exercises.

Also, a few screen-scraping programs that suck _current_
information from some sources should also delight; the students
have a shot at getting ahead of the teacher.

--Scott David Daniels
Scott....@Acm.Org

Paddy

Jan 10, 2008, 1:03:42 AM

On Jan 9, 11:05 pm, "kirby.ur...@gmail.com" <kirby.ur...@gmail.com>
wrote:

Then I'd favour the simple wrappings in bearophile's and Fredrik Lundh's
replies, where it is easy to extract the original data, maybe for
updating or for use in another language.

- Paddy.

kirby...@gmail.com

Jan 10, 2008, 1:29:30 PM

On Jan 10, 1:01 am, Dennis Lee Bieber <wlfr...@ix.netcom.com> wrote:
> On Wed, 9 Jan 2008 15:05:25 -0800 (PST), "kirby.ur...@gmail.com"
> <kirby.ur...@gmail.com> declaimed the following in comp.lang.python:
>
> > Sometimes we spare the students (whoever they may be) this added
> > step and just hand them a dictionary of lists or whatever. We
> > may not be teaching parsing in this class, but chemistry, and
> > having the info in the Periodic Table in a Python data structure
> > may simply be the most relevant place to start.
>
> In this particular example, I'd probably suggest stuffing the data
> into an SQLite3 database file... Searching on name, symbol, weight, etc.
> would be much easier than trying to dig through a nested dictionary.
>
> --
> Wulfraed Dennis Lee Bieber KD6MOG
> wlfr...@ix.netcom.com wulfr...@bestiaria.com
> HTTP://wlfraed.home.netcom.com/
> (Bestiaria Support Staff: web-a...@bestiaria.com)
> HTTP://www.bestiaria.com/

That's not a bad idea. We might see people passing ZODBs around
more too, as 'import zodb' in IDLE or whatever is increasingly
the style, vs. some megabundle you have to install. Think of
Zope as another site-package.
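
For comparison, the SQLite suggestion quoted above might look something like this sketch using the standard sqlite3 module. The table layout, the three sample rows, and the two helper functions are assumptions made for illustration, and an in-memory database stands in for a file that would be passed around.

```python
import sqlite3

# Column names and the three sample rows are illustrative assumptions.
rows = [
    ("H", "Hydrogen", 1, 1.008),
    ("He", "Helium", 2, 4.0026),
    ("C", "Carbon", 6, 12.011),
]

conn = sqlite3.connect(":memory:")  # a file path would make it shareable
conn.execute(
    "CREATE TABLE elements (symbol TEXT PRIMARY KEY, name TEXT, "
    "protons INTEGER, weight REAL)"
)
conn.executemany("INSERT INTO elements VALUES (?, ?, ?, ?)", rows)


def lookup(symbol):
    """Return (name, protons, weight) for a symbol, or None if absent."""
    cur = conn.execute(
        "SELECT name, protons, weight FROM elements WHERE symbol = ?", (symbol,)
    )
    return cur.fetchone()


def heavier_than(weight):
    """Return symbols of elements above the given atomic weight, lightest first."""
    cur = conn.execute(
        "SELECT symbol FROM elements WHERE weight > ? ORDER BY weight", (weight,)
    )
    return [row[0] for row in cur]


print(lookup("He"))
print(heavier_than(2.0))
```

Searching by any column is a one-line query, at the cost of the data no longer being readable as plain text.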

The advantage of just passing .py files around, among XO users
for example, is that periodicTable.py's contents are directly
eyeballable as ASCII/Unicode text, vs. stuffed into a wrapper.

I think what I'm getting from this fruitful discussion is the
different roles of amalgamator-distributors on the one hand, and of
Sayid or Kate as classroom teachers on the other, just trying to get
on with the lesson and having no time for computer science topics.

XML or YAML also make plenty of sense, for the more generic
distributor type operations.

Speaking only for myself, I appreciated some of the pointers
to APIs. Over on edu-sig, we've been talking a lot about
the 3rd party module for accessing imdb information -- not
a screen scraper.

Given XML-RPC, there's really no limit on the number of
lightweight APIs we might see. How about CIA World Factbook?
Too boring maybe, but it's already going out on the XOs, or
some of them, just because it's relatively up to date.
Could be imported as Python module too -- maybe that work
has already been done?

Kirby