I have created an empty Table and used the Table.append() method to add multiple rows of data to it. Now I am trying to write a procedure that adds data by column (field name) instead of by row. I have done this with h5py, but can't figure out how to do the same with PyTables. I can create the Table with the description= parameter, but I can't figure out how to allocate the number of rows. Is this possible?
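For reference, the column-wise pattern I have in mind looks roughly like the following in h5py. This is only a minimal sketch: the file path, the dataset name 'data', and the sample arrays are placeholders chosen for illustration, not my real data.

```python
import os
import tempfile

import h5py
import numpy as np

# Illustrative data: one integer column and one float column, 10 rows each.
col_int = np.arange(10)
col_fl = np.arange(10, dtype=float)
dt = np.dtype([('col_int', int), ('col_fl', float)])

# Placeholder path for this sketch; any writable location would do.
h5_path = os.path.join(tempfile.mkdtemp(), 'file_h5py.h5')

with h5py.File(h5_path, 'w') as h5f:
    # Allocate all 10 rows up front by giving the dataset a shape and a compound dtype.
    ds = h5f.create_dataset('data', shape=(10,), dtype=dt)
    # Fill one field (column) at a time by indexing with the field name.
    ds['col_int'] = col_int
    ds['col_fl'] = col_fl

# Read the whole dataset back as a NumPy structured array.
with h5py.File(h5_path, 'r') as h5f:
    result = h5f['data'][:]
```

The key point is that the row count is fixed when the dataset is created, so each column can be written independently afterwards. That is the step I can't reproduce in PyTables.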
Why am I doing this? Some very large datasets are easier to copy column by column than row by row.
To demonstrate, here is a very simple example.
import numpy as np
col_int = [ i for i in range(10) ]
col_fl = [ float(i) for i in range(10) ]
recarr = np.empty(dtype=[('col_int',int),('col_fl',float)], shape=(10,))
recarr['col_int'] = np.array(col_int)
recarr['col_fl'] = np.array(col_fl)
import tables as tb
with tb.File('file_tb.h5', 'w') as h5f:
    # Create table with data using obj parameter:
    NXgrp_tbl1 = h5f.create_table('/', 'data1', obj=recarr)
    # Create empty table using description parameter:
    NXgrp_tbl2 = h5f.create_table('/', 'data2', description=recarr.dtype)
    for colName in recarr.dtype.fields:
        # this doesn't work -- how do I do it??
        NXgrp_tbl2[colName] = recarr[colName]
All help is appreciated. Thanks,
Ken Walker