I have recently started using bcolz and am learning to apply it to several use cases.
I created a ctable with nbytes ~ 155 GB and cbytes ~ 5 GB (totally awesome, the way bcolz compresses such huge data). The table was built from thousands of .blp files.
Now that I have the data, my objective is to search through it efficiently.
These are the steps I tried:
>>>import bcolz
>>>import bquery
>>>a = bquery.open('<path to the table>', mode='r')
>>>print(a)
ctable((41603242,), [('lines', '<U1000')])
nbytes: 154.98 GB; cbytes: 5.21 GB; ratio: 29.76
cparams := cparams(clevel=5, shuffle=1, cname='lz4', quantize=0)
rootdir := '<path_to_table>'
Note: the table is essentially a log, where each record is `<date> <timestamp> <levelname> <text>`. I stored each entire line as a single string to make searching through it easier.
>>>a.where_terms([('lines','in',['INFO'])])
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/root/venv/lib64/python3.6/site-packages/bquery/ctable.py", line 739, in where_terms
ctable_ext.apply_where_terms(ctable_iter, op_list, value_list, boolarr)
File "bquery/ctable_ext.pyx", line 1283, in bquery.ctable_ext.apply_where_terms (bquery/ctable_ext.c:43492)
File "bquery/ctable_ext.pyx", line 1313, in bquery.ctable_ext.apply_where_terms (bquery/ctable_ext.c:42740)
TypeError: an integer is required
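As a fallback I am considering a plain linear scan over the column. A minimal stand-in sketch of that idea in plain Python (using an ordinary list in place of the real ctable, so the names and sample lines here are hypothetical):

```python
# Stand-in for the 'lines' column of the ctable; each element mimics one log row.
lines = [
    "2020-01-01 12:00:00 INFO started",
    "2020-01-01 12:00:01 DEBUG tick",
    "2020-01-01 12:00:02 INFO done",
]

# Linear substring scan -- O(n) over all rows.
hits = [ln for ln in lines if "INFO" in ln]

# Against the real table I would expect something along the lines of
# [r.lines for r in a.iter(outcols=['lines']) if 'INFO' in r.lines],
# but I am unsure whether this is the idiomatic bcolz way.
```

Is this kind of full scan really the best bcolz can do for substring matches, or does it offer an indexed/compressed-block-aware path?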
How should I approach this? Would bquery be a good choice for searching through such huge data quickly, or would list comprehensions be faster?
I also do not understand how bcolz/bquery handle search operations internally, or what their time complexity is. Could someone shed some light on this and guide me?