with openFile(pathToH5, 'r') as f:
    tab = f.getNode("/pageTable")
    # Dicts used only for fast membership tests
    dversions = dict((i, None) for i in versions)
    dsheets = dict((i, None) for i in sheets)
    dpages = dict((i, None) for i in pages)
    # Rows with firstVersion == 0 that pass the version/sheet/page filters
    df = pd.DataFrame(
        [[row['page'], row['index0'], row['value0']]
         for row in tab.where('(firstVersion == 0) & (ok == 1)')
         if row['version'] in dversions and row['sheetNum'] in dsheets and row['page'] in dpages],
        columns=['page', 'index0', 'value0'])
    # Same selection for firstVersion == 1
    df2 = pd.DataFrame(
        [[row['page'], row['index1'], row['value1']]
         for row in tab.where('(firstVersion == 1) & (ok == 1)')
         if row['version'] in dversions and row['sheetNum'] in dsheets and row['page'] in dpages],
        columns=['page', 'index1', 'value1'])
    # Per-page mean and standard deviation of each column
    for i in dpages:
        m10 = df.loc[df['page'] == i]['index0'].mean()
        s10 = df.loc[df['page'] == i]['index0'].std()
        m20 = df.loc[df['page'] == i]['value0'].mean()
        s20 = df.loc[df['page'] == i]['value0'].std()
        m11 = df2.loc[df2['page'] == i]['index1'].mean()
        s11 = df2.loc[df2['page'] == i]['index1'].std()
        m21 = df2.loc[df2['page'] == i]['value1'].mean()
        s21 = df2.loc[df2['page'] == i]['value1'].std()
        yield (i, m10, s10), (i, m11, s11), (i, m20, s20), (i, m21, s21)
#####################################################
I have been reading the pandas documentation and the cookbook, but I still do not understand how I should work when the data is stored in a big file, such as a PyTables file, and needs to be processed.
I also know about the memory limitations of 32-bit Python, but I still think this should be able to work on a 32-bit machine.
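As a rough illustration of why memory need not be the limit, here is a minimal sketch (not the code above) that computes the same per-page mean and sample standard deviation in a single pass, keeping only three numbers per page instead of building a DataFrame. It uses Welford's running-variance update. The function name running_stats and the sample rows are made up for the example; the only assumption about the real data is that each row can be indexed by column name, as the rows from tab.where(...) are in the code above.

```python
import math
from collections import defaultdict


def running_stats(rows, key, field):
    """Single-pass mean and sample std (ddof=1) of `field`, grouped by `key`."""
    # acc[k] = [count, mean, M2]; M2 is the running sum of squared deviations
    acc = defaultdict(lambda: [0, 0.0, 0.0])
    for row in rows:
        a = acc[row[key]]
        x = float(row[field])
        a[0] += 1
        delta = x - a[1]
        a[1] += delta / a[0]
        a[2] += delta * (x - a[1])
    out = {}
    for k, (n, mean, m2) in acc.items():
        # With fewer than two values the sample std is undefined
        std = math.sqrt(m2 / (n - 1)) if n > 1 else float("nan")
        out[k] = (mean, std)
    return out


# Hypothetical rows standing in for the iterator returned by tab.where(...)
rows = [
    {"page": 1, "value0": 2.0},
    {"page": 1, "value0": 4.0},
    {"page": 2, "value0": 10.0},
]
stats = running_stats(rows, "page", "value0")
```

Here stats maps each page to a (mean, std) pair. The version, sheet, and page filters would go inside the loop as a plain if test before the update, so memory use grows with the number of distinct pages rather than with the number of rows.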
Any help would be appreciated.
Thanks.
--
You received this message because you are subscribed to the Google Groups "PANDA Project Users" group.