Bcolz append and modifying data


Michael Schatzow

Apr 27, 2015, 11:15:24 AM
to bc...@googlegroups.com
Is there a preferred way to handle modification of data in bcolz? I have a data set in which the last 15 days or so are subject to minor modifications and the rest is static. Should I create a static ctable for each day and recreate the ones that could be modified?

Francesc Alted

Apr 27, 2015, 11:25:40 AM
to Bcolz
Hi Michael,

Due to the chunked nature of bcolz, I don't think you need a different table per day.  So you can use a single ctable for all the days, and when you update the info, then only the corresponding chunk will be updated.

Hope this helps,
Francesc

Michael Schatzow

Apr 27, 2015, 6:11:30 PM
to bc...@googlegroups.com, fal...@gmail.com
Francesc,
  Thank you for the super fast reply.  Sorry, I am a little dense with this.  So how would you actually update the days that need to be updated?  Would you delete the rows where the np.datetime64 value falls within some date range and then append?

I have giant pandas DataFrames that I am pushing into ctables, and I want to use the new pandas DataFrame to recreate the days that need to be rewritten.

Francesc Alted

Apr 28, 2015, 2:19:12 AM
to Bcolz
Deleting rows is not actually supported in bcolz, but updating rows in place is reasonably fast and can be combined with conditions, which seems to be your case.
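As a rough illustration of updating rows by condition instead of deleting and re-appending, here is a minimal sketch. It uses a NumPy structured array as a stand-in for the ctable, so it shows the idea rather than the bcolz API itself; the column names `date` and `value` and the dates are made up for the example. Whether the same mask-style assignment works unchanged on a bcolz column, especially with datetime columns, is an assumption to check against the bcolz documentation.

```python
import numpy as np

# Hypothetical table: one row per day, with a value that may be revised later.
# A NumPy structured array stands in for the bcolz ctable here.
table = np.zeros(30, dtype=[("date", "datetime64[D]"), ("value", "f8")])
table["date"] = np.arange("2015-04-01", "2015-05-01", dtype="datetime64[D]")
table["value"] = 1.0

# Condition selecting the days that are still subject to modification.
start = np.datetime64("2015-04-16")
mask = table["date"] >= start

# Overwrite the matching rows in place; nothing is deleted or appended.
table["value"][mask] = 2.0
```

Only the rows matching the condition change; the static part of the table is left untouched.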

 

--
Francesc Alted

Michael WS

Apr 29, 2015, 9:53:33 AM
to bc...@googlegroups.com
OK, so the best way would be to find the index where I want to start modifying and then overwrite all of the rows from that index onward.
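The approach described above could be sketched roughly as follows, assuming the rows are stored sorted by date. Plain NumPy arrays stand in for the bcolz columns, and the names `dates`, `values`, `new_start` and `new_values` are hypothetical. The assumption is that a bcolz carray accepts the same slice assignment, in which case only the chunks covering the slice would be rewritten.

```python
import numpy as np

# Existing data, sorted by date (stand-ins for two bcolz columns).
dates = np.arange("2015-04-01", "2015-05-01", dtype="datetime64[D]")
values = np.ones(len(dates))

# Replacement data for the tail; in practice this would come from the
# new pandas DataFrame (hypothetical values here).
new_start = np.datetime64("2015-04-16")
new_values = np.full(15, 2.0)

# Because the dates are sorted, a binary search finds the first row
# that has to be overwritten.
start_idx = int(np.searchsorted(dates, new_start))

# Overwrite everything from that row onward with the new data.
values[start_idx:start_idx + len(new_values)] = new_values
```

If the new data extends past the end of the existing rows, the extra rows would have to be appended separately; this sketch only covers the overwrite.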

Su Kai

Jun 12, 2018, 4:27:21 AM
to bcolz
The original bcolz table is modified only when the assignment uses a single level of indexing with the query condition.
For example:
table["(f0>0) & (f1<10)"] = (1, 2)
modifies the table,

but
table["(f0>0) & (f1<10)"][0] = 1

does not modify the table.
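A likely explanation is that the first indexing step returns an in-memory copy of the selected rows, so the second assignment lands on that copy. NumPy behaves the same way with boolean masks, which makes for an easy demonstration. The sketch below uses a NumPy structured array, not bcolz, and the data is made up; that bcolz returns a copy for the same reason is an assumption.

```python
import numpy as np

# Stand-in for the table, with the same column names as in the example above.
table = np.zeros(5, dtype=[("f0", "i8"), ("f1", "i8")])
table["f0"] = [-1, 2, 3, -4, 5]
mask = (table["f0"] > 0) & (table["f1"] < 10)

# Two levels of indexing: table[mask] builds a temporary copy of the
# selected rows, and the assignment changes only that copy.
table[mask][0] = (100, 200)
row_after_chained = table[1].item()  # first selected row in the original

# One level of indexing: the assignment goes straight to the original.
# An explicit structured scalar is used here in place of the plain tuple.
table[mask] = np.array((1, 2), dtype=table.dtype)
```

After the chained assignment the original row is unchanged; after the single-level assignment every selected row holds the new values.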

