slicing dataframes MultiIndex lexsort depth error


Martin De Kauwe

Sep 23, 2013, 3:53:40 AM
to pyd...@googlegroups.com
Hi,

(Apologies if this is cross-posted; I think I sent it to the wrong mailing list: https://groups.google.com/forum/#!topic/pystatsmodels/LL486HdjLfs)

I am reading multiple fairly large CSV files (4383 rows x 80 columns each) and merging them into one large dataframe that I use later on. With a few of these CSV files everything works fine, but as I increase the number of files I run into an error I don't understand:

KeyError: 'MultiIndex lexsort depth 0, key was length 3'

Here is a test example that reproduces it:

import pandas as pd
import numpy as np

model_list = ["GDAY","SDVM","LPJX"]
#model_list = ["GDAY","SDVM"]

df_list = []
key_list = []
treatment = "AMB"
exp = "AVG"
for model in model_list:
    
    df = pd.DataFrame(np.random.randn(4383, 80), 
              index=pd.date_range('20010101', periods=4383),
              columns=['YEAR','DOY','CO2','PPT','PAR','AT','ST','VPD',\
                   'SW','NDEP','NEP','GPP','NPP','CEX','CVOC','RECO',\
                   'RAUTO','RLEAF','RWOOD','RROOT','RGROW','RHET',\
                   'RSOIL','ET','T','ES','EC','RO','DRAIN','LE',\
                   'SH','CL','CW','CCR','CFR','TNC','CFLIT',\
                   'CFLITA','CFLITB','CCLITB','CSOIL','GL',\
                   'GW','GCR','GR','CLLFALL','CRLIN','CWIN','LAI',\
                   'LMA','NCON','NCAN','NWOOD','NCR','NFR',\
                   'NSTOR','NLIT','NRLIT','NDW','NSOIL','NPOOLM',\
                   'NPOOLO','NFIX','NLITIN','NWLIN','NRLIN','NUP',\
                   'NGMIN','NMIN','NVOL','NLEACH','NGL','NGW',\
                   'NGCR','NGR','APARd','GCd','GAd','GBd','Betad'])
           
    df_list.append(df)

    # the key tuple lets us select by model, treatment or exp
    key_list.append((model,treatment,exp))  
dfs = pd.concat(df_list, axis=1, keys=key_list, 
                   names=["model","treatment","exp"])
dfs.to_pickle("models_output.pkl")


dfs = pd.read_pickle("models_output.pkl")
print(dfs["GDAY","AMB","AVG"])

This produces the error, but only with three models; when model_list has just two elements (the commented-out line) it works. How can I fix this?

thanks

Jeff Reback

Sep 23, 2013, 6:20:23 AM
to pyd...@googlegroups.com

Martin De Kauwe

Sep 23, 2013, 6:42:02 AM
to pyd...@googlegroups.com
I saw that, thanks, but to be honest I am having trouble following it and applying it to my test case. Suggestions welcome...

Jeff

Sep 23, 2013, 8:21:24 AM
to pyd...@googlegroups.com
dfs.columns.lexsort_depth
0

The column MultiIndex is not lexsorted (depth 0), so looking up a length-3 key fails. Sort the columns first:

dfs = dfs.sortlevel(0,axis=1)

dfs.columns.lexsort_depth
4

dfs[("GDAY","AMB","AVG")]
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 4383 entries, 2001-01-01 00:00:00 to 2012-12-31 00:00:00
Freq: D
Data columns (total 80 columns):
YEAR       4383  non-null values
DOY        4383  non-null values
CO2        4383  non-null values
PPT        4383  non-null values
PAR        4383  non-null values
AT         4383  non-null values
ST         4383  non-null values
VPD        4383  non-null values
SW         4383  non-null values
NDEP       4383  non-null values
NEP        4383  non-null values
GPP        4383  non-null values
NPP        4383  non-null values
CEX        4383  non-null values
CVOC       4383  non-null values
RECO       4383  non-null values
RAUTO      4383  non-null values
RLEAF      4383  non-null values
RWOOD      4383  non-null values
RROOT      4383  non-null values
RGROW      4383  non-null values
RHET       4383  non-null values
RSOIL      4383  non-null values
ET         4383  non-null values
T          4383  non-null values
ES         4383  non-null values
EC         4383  non-null values
RO         4383  non-null values
DRAIN      4383  non-null values
LE         4383  non-null values
SH         4383  non-null values
CL         4383  non-null values
CW         4383  non-null values
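
For anyone reading this on a recent pandas release: sortlevel was deprecated and later removed, and sort_index(axis=1) is the replacement for sorting the column MultiIndex. Below is a minimal sketch of the same fix, assuming a recent pandas and using a cut-down frame (5 rows, 3 columns per model) instead of the full test case. The names frames, keys, dfs_sorted and gday are only for illustration. Newer versions may also handle the unsorted lookup differently (for example with a performance warning rather than a KeyError), so treat this as one way to get a sorted index, not a statement about what every version raises.

```python
import numpy as np
import pandas as pd

# Same model order as the test case: deliberately not alphabetical.
models = ["GDAY", "SDVM", "LPJX"]

# Cut-down stand-in for the 4383 x 80 frames (assumption: 5 rows, 3 columns).
frames = [
    pd.DataFrame(
        np.random.randn(5, 3),
        index=pd.date_range("2001-01-01", periods=5),
        columns=["YEAR", "DOY", "CO2"],
    )
    for _ in models
]
keys = [(model, "AMB", "AVG") for model in models]

dfs = pd.concat(frames, axis=1, keys=keys,
                names=["model", "treatment", "exp"])

# The concatenated columns follow insertion order, so they are not sorted.
print(dfs.columns.is_monotonic_increasing)

# Sort the column MultiIndex (the modern counterpart of sortlevel(0, axis=1)).
dfs_sorted = dfs.sort_index(axis=1)
print(dfs_sorted.columns.is_monotonic_increasing)

# Selecting with the three-level key now returns that model's sub-frame.
gday = dfs_sorted[("GDAY", "AMB", "AVG")]
print(gday.shape)
```

Sorting once right after the concat (before pickling) means every later selection works against a sorted index.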