Best practice to store bcolz data to a db


guil...@stratifi.com

Feb 14, 2017, 7:21:54 PM
to bcolz
Hi, 

What would be the best practice for storing bcolz data in a db, as opposed to storing it on disk?

We are working with bcolz files and would like to store them in a db-like fashion: we would connect to the db and retrieve / rebuild the bcolz files by running some sort of query, rather than keeping the bcolz files on an external disk.

We do need to preserve the bcolz format though, since our program expects bcolz files as input, but we want to avoid simply storing the bcolz files on disk.
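
To make the question concrete, the kind of round trip we have in mind could look roughly like the sketch below. It relies only on the fact that an on-disk bcolz container is a directory (its rootdir): the directory is packed into a tar archive, stored as a BLOB, and unpacked again on retrieval. SQLite, the table name bcolz_store, and the helpers store_rootdir / restore_rootdir are all made up for illustration and are not part of bcolz.

```python
import io
import sqlite3
import tarfile
from pathlib import Path


def store_rootdir(conn, name, rootdir):
    """Pack the directory `rootdir` into a tar archive and store it as a BLOB."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        # arcname="." keeps member paths relative, so the archive
        # can later be unpacked into any target directory.
        tar.add(str(rootdir), arcname=".")
    # Hypothetical schema: one row per stored container.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS bcolz_store "
        "(name TEXT PRIMARY KEY, payload BLOB)"
    )
    conn.execute(
        "INSERT OR REPLACE INTO bcolz_store VALUES (?, ?)",
        (name, buf.getvalue()),
    )
    conn.commit()


def restore_rootdir(conn, name, target):
    """Look up `name` and rebuild the stored directory under `target`."""
    row = conn.execute(
        "SELECT payload FROM bcolz_store WHERE name = ?", (name,)
    ).fetchone()
    if row is None:
        raise KeyError(name)
    target = Path(target)
    target.mkdir(parents=True, exist_ok=True)
    with tarfile.open(fileobj=io.BytesIO(row[0]), mode="r") as tar:
        tar.extractall(str(target))
    return target
```

The restored directory would then presumably be opened the usual way, for example with bcolz.open(rootdir=...). The obvious drawback is that the whole container has to be materialised on local disk before it can be used.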

Thank you!

Guillaume

Francesc Alted

Feb 15, 2017, 3:02:35 AM
to Bcolz
Hi Guillaume,

Why not add an RPC layer on top of bcolz so that you can use it from other machines and languages?  Any REST protocol would fit the bill, but I find Google Protocol Buffers really easy to use and fast (I am reaching tens of thousands of messages per second quite easily).  Besides, if you make the PB messages work on top of gRPC, then performance really shines.
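
To give a flavour of the idea, here is a minimal sketch of such a layer. Note that it uses the XML-RPC modules from the Python standard library rather than Protocol Buffers or gRPC, only so that it runs without extra dependencies; it will be far slower than a PB/gRPC setup. The names serve_columns, fetch and length are made up for illustration, and the columns can be anything that supports slicing (a bcolz carray, or a plain list as in the demo).

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def serve_columns(columns, host="127.0.0.1", port=0):
    """Expose read-only slices of `columns` over XML-RPC in a background thread.

    `columns` maps a column name to any object that supports len() and
    slicing, for example a bcolz carray or a plain list.
    port=0 asks the OS for a free port; read it from server.server_address.
    """
    server = SimpleXMLRPCServer((host, port), logRequests=False)

    def fetch(name, start, stop):
        chunk = columns[name][start:stop]
        # Slicing a carray yields a NumPy array, which XML-RPC cannot
        # marshal directly, so convert to plain Python values first.
        return chunk.tolist() if hasattr(chunk, "tolist") else list(chunk)

    def length(name):
        return len(columns[name])

    server.register_function(fetch, "fetch")
    server.register_function(length, "length")
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


if __name__ == "__main__":
    # A plain list stands in for a bcolz column in this demo.
    server = serve_columns({"price": [10.5, 11.0, 11.25, 12.0]})
    host, port = server.server_address
    client = ServerProxy(f"http://{host}:{port}")
    print(client.fetch("price", 1, 3))
    server.shutdown()
    server.server_close()
```

The point is that the bcolz data stays on the machine that runs the server, and clients only ask for the row ranges they need.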

 
Hope this helps,


--
Francesc Alted

Carst Vaartjes

Feb 16, 2017, 3:54:47 AM
to bcolz, fal...@gmail.com
Hi,

This might be interesting: https://github.com/visualfabriq/bqueryd
It's a ZeroMQ-based, distributed query and file distribution system for bquery/bcolz. It can break large bcolz files into smaller "shards" and use them to massively parallelize data analysis.
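
Just to illustrate the sharding idea in general terms (this is not bqueryd's code or API, and it uses local threads instead of ZeroMQ workers on several machines), a toy version could look like this. The names make_shards and sharded_sum are made up for the example:

```python
from concurrent.futures import ThreadPoolExecutor


def make_shards(n_rows, shard_size):
    """Return (start, stop) row ranges covering n_rows in pieces of shard_size."""
    return [
        (start, min(start + shard_size, n_rows))
        for start in range(0, n_rows, shard_size)
    ]


def sharded_sum(column, shard_size, max_workers=4):
    """Sum `column` by summing each shard separately, then combining the partials."""
    shards = make_shards(len(column), shard_size)

    def partial_sum(bounds):
        start, stop = bounds
        return sum(column[start:stop])

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Each shard is aggregated independently; only the small
        # partial results need to be brought together at the end.
        return sum(pool.map(partial_sum, shards))
```

This only works so directly for aggregations that can be combined from partial results (sum, count, min, max); something like a median needs more care.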

BR

Carst