
Howto: extract a 'column' from a list of lists into a new list?


Greg Brunet

Jun 30, 2003, 8:44:20 PM
I'm writing some routines for handling dBASE files. I've got a table
(DBF file) object & field object already defined, and after opening the
file, I can get the field info like this:

>>> tbl.Fields()
[('STOCKNO', 'C', 8, 0), ('DACC', 'C', 5, 0), ('DEALERACCE', 'C', 30,
0), ('D-ACCRTL', 'C', 9, 0), ('D-ACCCST', 'C', 9, 0)]

What I would like to do is be able to extract the field names into a
single, separate list. It should look like:

['STOCKNO', 'DACC', 'DEALERACCE', 'D-ACCRTL', 'D-ACCCST']

but I'm not sure about how to do that. I can do this:
>>> for g in tbl.Fields(): print g[0]
...
STOCKNO
DACC
DEALERACCE
D-ACCRTL
D-ACCCST

but I expect that one of those fancy map/lambda/list comprehension
functions can turn this into a list for me, but, to be honest, they
still make my head spin trying to figure them out. Any ideas on how to
do this simply?


Even better yet... the reason I'm trying to do this is to make it easy
to refer to a field by the field name as well as the field number. I
expect to be reading all of the data into a list (1 per row/record) of
row objects. If someone wants to be able to refer to FldA in record 53,
then I'd like them to be able to use: "row[52].FldA" instead of having
to use "row[52][4]" (if it's the 5th field in the row). I was planning
on using the __getattr__ method in my row object like the following:

#----------------------------------------
    def __getattr__(self, key):
        """ Return by item name """
        ukey = key.upper()
        return self._data[tbl.FldNames.index(ukey)]

...where "tbl.FldNames" is the list of fieldnames that I'm trying to
build up above (and tbl is a property in the row pointing back to the
file object, since I don't want to make a copy of the fieldnames in
every row record). Is there a better (more efficient) way to go about
this? Thanks,

--
Greg
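
One possible answer to the efficiency question, sketched in Python 3 rather than the Python 2 of the thread: `list.index` scans the field names on every attribute access, whereas a dictionary built once per table gives a direct name-to-position lookup. The `Table` and `Row` classes and the `index_of` attribute below are hypothetical stand-ins, not the poster's actual classes.

```python
class Table:
    """Hypothetical stand-in for the DBF table object."""

    def __init__(self, field_names):
        self.field_names = [name.upper() for name in field_names]
        # Built once per table: upper-cased field name -> column position.
        self.index_of = {name: i for i, name in enumerate(self.field_names)}


class Row:
    """Hypothetical row object that shares its table's name index."""

    def __init__(self, table, data):
        self._table = table   # reference only, no per-row copy of the names
        self._data = data

    def __getitem__(self, position):
        return self._data[position]

    def __getattr__(self, key):
        # Only called when normal attribute lookup fails.
        if key.startswith('_'):
            raise AttributeError(key)
        try:
            return self._data[self._table.index_of[key.upper()]]
        except KeyError:
            raise AttributeError(key) from None


table = Table(['STOCKNO', 'DACC', 'DEALERACCE'])
row = Row(table, ('A1001', 'X5', 'Floor mats'))
print(row.DACC, row.dacc, row[1])
```

Every row holds only a reference to the table, so the name index exists once no matter how many records are loaded. Field names that are not valid identifiers (such as 'D-ACCRTL') would still need `getattr(row, 'D-ACCRTL')` or positional access.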

Egor Bolonev

Jun 30, 2003, 9:47:58 PM
Hello, Greg!
You wrote on Mon, 30 Jun 2003 19:44:20 -0500:

GB> >>> tbl.Fields()
GB> [('STOCKNO', 'C', 8, 0), ('DACC', 'C', 5, 0), ('DEALERACCE', 'C', 30,

[Sorry, skipped]

GB> but I expect that one of those fancy map/lambda/list comprehension
GB> functions can turn this into a list for me, but, to be honest, they
GB> still make my head spin trying to figure them out. Any ideas on how to
GB> do this simply?

=============================================
a=[('STOCKNO', 'C', 8, 0), ('DACC', 'C', 5, 0), ('DEALERACCE', 'C', 30, 0),
   ('D-ACCRTL', 'C', 9, 0), ('D-ACCCST', 'C', 9, 0)]

b=[x[0] for x in a] # :-) Python is cool!

print b
=============================================

As far as I know, map/lambda/list comprehensions work very slowly, and you
should use them 'only' in SCRIPTS.

[Sorry, skipped]

With best regards, Egor Bolonev. E-mail: ebol...@rol.ru
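
The speed remark above can be measured rather than assumed. The following is a minimal sketch in Python 3 (not the Python 2 of this thread) that times three ways of pulling out the first column with the standard `timeit` module. The function names are made up for illustration, and the timings will vary by machine and interpreter, so no numbers are claimed here.

```python
import timeit
from operator import itemgetter

fields = [('STOCKNO', 'C', 8, 0), ('DACC', 'C', 5, 0),
          ('DEALERACCE', 'C', 30, 0), ('D-ACCRTL', 'C', 9, 0),
          ('D-ACCCST', 'C', 9, 0)]


def with_comprehension():
    return [field[0] for field in fields]


def with_map():
    # In Python 3, map returns an iterator, so wrap it in list().
    return list(map(itemgetter(0), fields))


def with_loop():
    names = []
    for field in fields:
        names.append(field[0])
    return names


for func in (with_comprehension, with_map, with_loop):
    seconds = timeit.timeit(func, number=100_000)
    print(f"{func.__name__}: {seconds:.3f}s")
```

All three functions return the same list, so the choice between them can be made on readability unless the measurement shows a difference that matters.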

Greg Brunet

Jun 30, 2003, 10:22:05 PM
"Egor Bolonev" <ebol...@rol.ru> wrote in message
news:bdqp9e$7r$1...@news.rol.ru...

> =============================================
> a=[('STOCKNO', 'C', 8, 0), ('DACC', 'C', 5, 0), ('DEALERACCE', 'C', 30, 0),
>    ('D-ACCRTL', 'C', 9, 0), ('D-ACCCST', 'C', 9, 0)]
>
> b=[x[0] for x in a] # :-) Python is cool!
>
> print b
> =============================================
>
> As far as I know, map/lambda/list comprehensions work very slowly, and you
> should use them 'only' in SCRIPTS.

Thanks Egor!

It looks like I was closer than I thought - but still would have been
unlikely to figure it out!

--
Greg

Max M

Jul 1, 2003, 4:03:11 AM
Greg Brunet wrote:

> but I'm not sure about how to do that. I can do this:
>
> >>> for g in tbl.Fields(): print g[0]
> ...
> STOCKNO
> DACC
> DEALERACCE
> D-ACCRTL
> D-ACCCST
>
> but I expect that one of those fancy map/lambda/list comprehension
> functions can turn this into a list for me, but, to be honest, they
> still make my head spin trying to figure them out. Any ideas on how to
> do this simply?

fields = [
    ('STOCKNO', 'C', 8, 0),
    ('DACC', 'C', 5, 0),
    ('DEALERACCE', 'C', 30, 0),
    ('D-ACCRTL', 'C', 9, 0),
    ('D-ACCCST', 'C', 9, 0)
]

# The "old" way to do it would be:
NAME_COLUMN = 0
results = []
for field in fields:
    results.append(field[NAME_COLUMN])
print results


# But list comprehensions are made for exactly this purpose
NAME_COLUMN = 0
results = [field[NAME_COLUMN] for field in fields]
print results


regards Max M

Max M

Jul 1, 2003, 4:10:38 AM, to Greg Brunet
Greg Brunet wrote:


> Even better yet... the reason I'm trying to do this is to make it easy
> to refer to a field by the field name as well as the field number. I
> expect to be reading all of the data into a list (1 per row/record) of
> row objects. If someone wants to be able to refer to FldA in record 53,
> then I'd like them to be able to use: "row[52].FldA" instead of having
> to use "row[52][4]" (if it's the 5th field in the row). I was planning
> on using the __getattr__ method in my row object like the following:


If that is all you want to do, this might be the simplest approach:

fields = [
    ('STOCKNO', 'C', 8, 0),
    ('DACC', 'C', 5, 0),
    ('DEALERACCE', 'C', 30, 0),
    ('D-ACCRTL', 'C', 9, 0),
    ('D-ACCCST', 'C', 9, 0)
]


def byName(field, name):
    fieldNames = {
        'name':0,
        'letter':1,
        'val1':2,
        'val2':3,
    }
    return field[fieldNames[name]]


for field in fields:
    print byName(field, 'name')


regards Max M
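
In later Python versions (not available when this thread was written), `collections.namedtuple` gives the same access by name and by position without a hand-written lookup table. A minimal Python 3 sketch follows; the `Field` type and its attribute names (`name`, `type`, `length`, `dec`) are chosen here for illustration.

```python
from collections import namedtuple

# Hypothetical record type for one field definition.
Field = namedtuple('Field', ['name', 'type', 'length', 'dec'])

raw_fields = [
    ('STOCKNO', 'C', 8, 0),
    ('DACC', 'C', 5, 0),
    ('DEALERACCE', 'C', 30, 0),
    ('D-ACCRTL', 'C', 9, 0),
    ('D-ACCCST', 'C', 9, 0),
]

fields = [Field(*raw) for raw in raw_fields]

# Access works both by attribute name and by position.
print(fields[0].name, fields[0][0])
print([field.name for field in fields])
```

A namedtuple is still a tuple, so it stays immutable and continues to work with zip and with list comprehensions.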

Bengt Richter

Jul 1, 2003, 4:07:43 PM

Or you can take advantage of zip:

>>> fields = [
... ('STOCKNO', 'C', 8, 0),
... ('DACC', 'C', 5, 0),
... ('DEALERACCE', 'C', 30, 0),
... ('D-ACCRTL', 'C', 9, 0),
... ('D-ACCCST', 'C', 9, 0)
... ]
>>> zip(*fields)[0]
('STOCKNO', 'DACC', 'DEALERACCE', 'D-ACCRTL', 'D-ACCCST')

Or a list of all the columns of which only the first was selected above:
>>> zip(*fields)
[('STOCKNO', 'DACC', 'DEALERACCE', 'D-ACCRTL', 'D-ACCCST'),
 ('C', 'C', 'C', 'C', 'C'), (8, 5, 30, 9, 9), (0, 0, 0, 0, 0)]

Since zip gives you a list of tuples, you'll have to convert if you really need a list version
of one of them:

>>> list(zip(*fields)[0])
['STOCKNO', 'DACC', 'DEALERACCE', 'D-ACCRTL', 'D-ACCCST']

Regards,
Bengt Richter
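
For readers on Python 3: there zip returns an iterator instead of a list, so indexing it directly as in the session above raises a TypeError. A minimal sketch of the same transpose, assuming Python 3:

```python
fields = [
    ('STOCKNO', 'C', 8, 0),
    ('DACC', 'C', 5, 0),
    ('DEALERACCE', 'C', 30, 0),
    ('D-ACCRTL', 'C', 9, 0),
    ('D-ACCCST', 'C', 9, 0),
]

# Materialize all columns when more than one is needed.
columns = list(zip(*fields))
names = list(columns[0])

# Or take only the first column without building the rest.
first_only = list(next(zip(*fields)))

print(names)
print(columns[2])
```

The `next(zip(*fields))` form assumes that `fields` is not empty; with an empty list it raises StopIteration.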

John Hunter

Jul 1, 2003, 9:07:58 PM
>>>>> "Bengt" == Bengt Richter <bo...@oz.net> writes:

>>>> zip(*fields)

That is amazingly helpful. I've often needed to transpose a 2d list,
(mainly to get a "column" from an SQL query of a list of rows) and
this is just the trick.

I recently wrote a function to "deal" out a list of MySQLdb results,
where each field was a numeric type, and wanted to fill numeric arrays
with each column of results.

With your trick, eg, for a list of results from three numeric fields,
I just have to do:

a1, a2, a3 = map(array, zip(*results))

John Hunter
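
The one-liner above does not say where `array` comes from; it is presumably a numeric array constructor from an extension package. As a self-contained illustration of the same idea, here is a Python 3 sketch that uses the standard library's `array.array` instead, with made-up sample rows standing in for the query results.

```python
from array import array

# Made-up stand-in for rows returned by a query with three numeric fields.
results = [
    (1.0, 10.0, 100.0),
    (2.0, 20.0, 200.0),
    (3.0, 30.0, 300.0),
]

# zip(*results) yields one tuple per column; each becomes a typed array
# of C doubles (type code 'd').
a1, a2, a3 = (array('d', column) for column in zip(*results))

print(a1.tolist(), a2.tolist(), a3.tolist())
```

The tuple unpacking on the left requires the number of columns to be known in advance, exactly as in the original one-liner.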

Greg Brunet

Jul 1, 2003, 10:09:58 PM
"Max M" <ma...@mxm.dk> wrote in message
news:3f013fb1$0$97259$edfa...@dread12.news.tele.dk...

Max:

Thanks for that clarification of list comprehensions. Now there's a
better "Greg comprehension" of the subject. [groan - I know that was
bad, but it's been a rough couple of days of programming, & it usually
brings that type of comment out, sorry ;) ]

Thanks for your help,

--
Greg

Greg Brunet

Jul 1, 2003, 11:20:06 PM
"Bengt Richter" <bo...@oz.net> wrote in message
news:bdspmf$leq$0...@216.39.172.122...

> Or you can take advantage of zip:
>
> >>> fields = [
> ... ('STOCKNO', 'C', 8, 0),
> ... ('DACC', 'C', 5, 0),
> ... ('DEALERACCE', 'C', 30, 0),
> ... ('D-ACCRTL', 'C', 9, 0),
> ... ('D-ACCCST', 'C', 9, 0)
> ... ]
> >>> zip(*fields)[0]
> ('STOCKNO', 'DACC', 'DEALERACCE', 'D-ACCRTL', 'D-ACCCST')
>
> Or a list of all the columns of which only the first was selected above:
> >>> zip(*fields)
> [('STOCKNO', 'DACC', 'DEALERACCE', 'D-ACCRTL', 'D-ACCCST'),
>  ('C', 'C', 'C', 'C', 'C'), (8, 5, 30, 9, 9), (0, 0, 0, 0, 0)]
>
> Since zip gives you a list of tuples, you'll have to convert if you
> really need a list version of one of them:
>
> >>> list(zip(*fields)[0])
> ['STOCKNO', 'DACC', 'DEALERACCE', 'D-ACCRTL', 'D-ACCCST']

Bengt:

This looks great - but something isn't quite working for me. If I type
in the stuff as you show, the zip function works, but if I use the
values that I get from my code, it doesn't. Here's what I get in a
sample session:

#------------------------------------
>>> ff=dbf('nd.dbf')
>>> ff.Fields()
[('STOCKNO', 'C', 8, 0), ('DACC', 'C', 5, 0), ('DEALERACCE', 'C', 30,
0), ('D_ACCRTL', 'C', 9, 0), ('D_ACCCST', 'C', 9, 0), ('DEC', 'N', 10,
2)]
>>> zip(*ff.Fields())
Traceback (most recent call last):
File "<interactive input>", line 1, in ?
TypeError: zip argument #1 must support iteration
#------------------------------------

Where the "dbf" call invokes the __init__ for the dbf class, which opens the
file & reads the header. As part of that, the fields are placed in a
class variable, and accessed using the Fields() method. At first I
wasn't sure of what the '*' did, but finally figured that out (section
5.3.4-Calls of the Language Reference for anyone else who's confused).

After puzzling it through a bit, I believe that Fields() is causing the
problem because it's not really a list of tuples as it appears. Rather
it's a list of dbfField objects which have (among others) the following
methods:

from types import IntType, StringType   # Python 2 type objects used below

class dbfField:
    #----------------------------------------
    def __init__(self):
        pass

    #----------------------------------------
    def create(self, fldName, fldType='C', fldLength=10, fldDec=0):
        # (lots of error-checking omitted)
        self._fld = (fldName, fldType, fldLength, fldDec)

    #----------------------------------------
    def __repr__(self):
        return repr(self._fld)

    #----------------------------------------
    def __getitem__(self, key):
        """ Return by position or item name """
        if type(key) is IntType:
            return self._fld[key]
        elif type(key) is StringType:
            ukey = key.upper()
            if ukey=="NAME": return self._fld[0]
            elif ukey=="TYPE": return self._fld[1]
            elif ukey=="LENGTH": return self._fld[2]
            elif ukey=="DEC": return self._fld[3]


What I was trying to do, was to use the _fld tuple as the main object,
but wrap it with various methods & properties to 'safeguard' it. Given
that can I still use zip to do what I want? (Egor & Max's list
comprehension solution works fine for me, but the zip function seems
especially elegant) I read in the library reference about iterator
types (sec 2.2.5 from release 2.2.2), and it looks like I could get it
to work by implementing the iterator protocol, but I couldn't find any
sample code to help in this. Any idea if there's some available, or if
this is even worth it.

Better yet, is there a way for me to accomplish the same thing in a
simpler way? It's likely that I'm 'brute-forcing' a solution that has
gotten to be a lot more complex than it needs to be. Certainly if the
field definitions were in a simple tuple (which it is internally), zip
would work, but then it seems that I would lose the encapsulation
benefits. Is there a way to achieve both?

Thanks again,

--
Greg
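
Regarding the request for sample code that implements the iterator protocol on a wrapper like this: one way, sketched here in Python 3 rather than the Python 2.2 of the thread, is to give the wrapper an `__iter__` method that delegates to the wrapped tuple. The class below is a simplified, hypothetical version of the field wrapper (the constructor takes the values directly, and `_KEYS` is an illustrative name). It makes no claim about why the original session raised the TypeError.

```python
class DbfField:
    """Simplified, hypothetical field wrapper around an internal tuple."""

    _KEYS = {'NAME': 0, 'TYPE': 1, 'LENGTH': 2, 'DEC': 3}

    def __init__(self, name, ftype='C', length=10, dec=0):
        self._fld = (name.upper(), ftype, length, dec)

    def __repr__(self):
        return repr(self._fld)

    def __len__(self):
        return len(self._fld)

    def __iter__(self):
        # Delegating to the tuple makes the wrapper usable with zip,
        # list(), tuple unpacking, and for loops.
        return iter(self._fld)

    def __getitem__(self, key):
        # Accept a position or a case-insensitive item name.
        if isinstance(key, str):
            key = self._KEYS[key.upper()]
        return self._fld[key]


fields = [DbfField('STOCKNO', 'C', 8), DbfField('DEC', 'N', 10, 2)]

names = [field['name'] for field in fields]
columns = list(zip(*fields))   # works because each field is iterable

print(names)
print(columns)
```

The tuple stays private behind the wrapper, so the encapsulation is kept while zip and list comprehensions both work on the field list.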

Bengt Richter

Jul 2, 2003, 5:04:13 AM

That looks neat. But it isn't really "my" trick (whatever that means ;-).
Anyway, IIRC, I learned it from a Martellibot post some time back ;-)

Regards,
Bengt Richter
