pandas dataframe to namedtuple


zach cp

Oct 27, 2012, 5:15:50 PM
to pyd...@googlegroups.com
Is there an easy way to convert a DataFrame to a list of namedtuples? I am using pandas for some data munging and would then like to pass the data into a Jinja template for a web-based presentation. I have been converting the pandas DataFrame to a list of tuples and then selecting values by position, something like this:

#pandas
import numpy as np
from pandas import DataFrame

df = DataFrame(np.random.randn(100, 4), columns=['Sample','Average','StandardDev','Variance'])
list_of_tuples = list(df.itertuples())

#jinja
{% for item in list_of_tuples %}
    {{ item[1] }}  {# Sample; item[0] is the index #}
    {{ item[2] }}  {# Average #}
{% endfor %}


I would like to be able to return a named tuple where the names come from the column names. I could then use this formulation instead:

#pandas
list_of_namedtuples = list(some_function(df))
#jinja
{% for item in list_of_namedtuples %}
    {{ item.Sample }}
    {{ item.Average }}
{% endfor %}


thanks,
zach cp
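
To make the goal concrete, here is a minimal sketch of attribute access on namedtuple rows inside a Jinja template. The `Row` type, the two sample rows, and the inline template string are made up for illustration and are not taken from the original post; it assumes the `jinja2` package is installed.

```python
from collections import namedtuple

from jinja2 import Template

# Hypothetical row type and data, standing in for rows built from a DataFrame.
Row = namedtuple('Row', ['Sample', 'Average'])
rows = [Row(1.0, 2.5), Row(3.0, 4.5)]

# Jinja resolves item.Sample as attribute access on the namedtuple.
template = Template(
    "{% for item in rows %}{{ item.Sample }} {{ item.Average }}\n{% endfor %}"
)
output = template.render(rows=rows)
print(output)
```

Because namedtuples are still tuples, positional access such as `item[0]` keeps working in the same template.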

Dan Davison

Oct 27, 2012, 5:49:20 PM
to pyd...@googlegroups.com

Hi Zach,

I agree that could be attractive. The straightforward implementation would be something like

from collections import namedtuple

def iternamedtuples(df):
    Row = namedtuple('Row', df.columns)
    for row in df.itertuples():
        yield Row(*row[1:])


Did you have something more than this in mind?
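
For readers on a more recent pandas than the one discussed in this 2012 thread: `DataFrame.itertuples` accepts `index` and `name` arguments and yields namedtuples itself, so a helper may not be needed. A small sketch, with a made-up two-column frame:

```python
import numpy as np
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame(np.random.randn(3, 2), columns=['Sample', 'Average'])

# index=False leaves the index out of each row;
# name sets the class name of the generated namedtuple.
rows = list(df.itertuples(index=False, name='Row'))
first = rows[0]

print(first._fields)
print(first.Sample, first.Average)
```

This relies on the column labels being usable as attribute names, as they are in the example from the question.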


Zach Charlop-Powers

Oct 27, 2012, 5:55:12 PM
to pyd...@googlegroups.com
Thanks Dan,

that is basically exactly what I had in mind although your function is way cleaner than what I had put together. Thanks a lot!


zach cp




zach cp

Oct 27, 2012, 6:59:42 PM
to pyd...@googlegroups.com
Dan,

I have a small follow-up for you if you are still there. If I use your function interactively I am fine, but if I include it as part of a script, it raises an error because Python thinks 'df' is a string:

<ipython-input-487-b0961288d5e6> in iternamedtuples(df)
      1 def iternamedtuples(df=DataFrame()):
----> 2     Row = namedtuple('Row', df.columns)
      3     for row in df.itertuples():
      4         yield Row(*row[1:])

AttributeError: 'str' object has no attribute 'columns'


Can you declare a variable in a function as a pandas object?  

thanks
zach cp

Jonathan Rocher

Oct 28, 2012, 10:29:40 AM
to pyd...@googlegroups.com
Zach,

this error is probably due to what you are calling your iternamedtuples function with. What is the code calling your function?

Jonathan

P.S. In case you are new to Python, let me remind you that dynamic typing allows you to call the function with any argument; whether execution works or breaks depends on what you do with that argument inside the function.
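
As an illustration of that point, the sketch below reproduces the same AttributeError by calling the function with a string. The call `iternamedtuples('df')` is a guess at the kind of mistake that could trigger it, not the actual cause, which was never identified in this thread.

```python
from collections import namedtuple

def iternamedtuples(df):
    Row = namedtuple('Row', df.columns)
    for row in df.itertuples():
        yield Row(*row[1:])

# Hypothetical mistake: passing a string instead of a DataFrame.
# Python accepts the call; nothing checks the argument's type.
gen = iternamedtuples('df')  # no error yet, a generator body runs lazily

try:
    next(gen)  # the body starts running here and reaches df.columns
except AttributeError as exc:
    message = str(exc)

print(message)
```

Because the function is a generator, the error surfaces when iteration starts rather than at the call itself, which can make the traceback point somewhere unexpected.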


--
Jonathan Rocher, PhD
Scientific software developer
Enthought, Inc.
jrocher@enthought.com
1-512-536-1057
http://www.enthought.com


zach cp

Nov 1, 2012, 1:30:55 PM
to pyd...@googlegroups.com
Jonathan,


 I tried the simple script below with no issues and then revisited my original problem script and also had no problem. So I don't know what the original issue was but if I encounter it again I'll post in this thread. Thanks for your time.

zach cp


from collections import namedtuple

import numpy as np
from pandas import DataFrame

randn = np.random.randn

df = DataFrame(randn(8, 4), columns=['A', 'B', 'C', 'D'])

def iternamedtuples(df):
    # Build the row type once; field names come from the column labels.
    Row = namedtuple('Row', df.columns)
    for row in df.itertuples():
        # row[0] is the index, so skip it.
        yield Row(*row[1:])

df_list = list(iternamedtuples(df))
for x in df_list:
    print(x)