SpaceBase/Galaxy disk backing

Elliot Crosby-McCullough

Jul 26, 2014, 1:19:57 PM
to spaceba...@googlegroups.com
Hi,

I'm looking at the puniverse stack for an upcoming project (everything except the HTTP layer) and I've a question about SpaceBase & Galaxy which often comes up when I encounter in-memory stores.

Galaxy supports disk persistence, which is great, and provides clustering so that SpaceBase can expand into a large RAM pool, which is also great. But there are frequently times when certain parts of a dataset are not worth the cost of keeping hot in RAM. In these situations even a basic least-recently-used algorithm for retiring elements to disk would be of great benefit, despite the performance cost of pulling them back into RAM when (if?) they are accessed again.
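
A minimal sketch of that kind of LRU retirement, written in Python purely for illustration. Nothing here is Galaxy's or SpaceBase's API: the class `LruSpillStore`, its methods, and the pickle-file-per-item layout are all assumptions made to show the idea, namely a bounded in-RAM map whose least-recently-used entries are written to disk and reloaded on access.

```python
import os
import pickle
import tempfile
from collections import OrderedDict


class LruSpillStore:
    """Hypothetical store: at most `capacity` items stay in RAM, the rest go to disk."""

    def __init__(self, capacity, spill_dir=None):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.spill_dir = spill_dir or tempfile.mkdtemp(prefix="lru_spill_")
        self.hot = OrderedDict()  # key -> value, least recently used first
        self.cold = {}            # key -> path of the pickled value on disk
        self._next_file = 0

    def put(self, key, value):
        # A fresh value supersedes any copy that was previously spilled.
        self._drop_cold(key)
        self.hot[key] = value
        self.hot.move_to_end(key)  # mark as most recently used
        self._evict()

    def get(self, key):
        if key in self.hot:
            self.hot.move_to_end(key)
            return self.hot[key]
        if key in self.cold:
            # Pay the latency cost: read from disk and promote back to RAM.
            with open(self.cold[key], "rb") as f:
                value = pickle.load(f)
            self.put(key, value)
            return value
        raise KeyError(key)

    def _drop_cold(self, key):
        path = self.cold.pop(key, None)
        if path is not None:
            os.remove(path)

    def _evict(self):
        # Retire least-recently-used entries until RAM is within capacity.
        while len(self.hot) > self.capacity:
            key, value = self.hot.popitem(last=False)
            path = os.path.join(self.spill_dir, f"{self._next_file}.bin")
            self._next_file += 1
            with open(path, "wb") as f:
                pickle.dump(value, f)
            self.cold[key] = path
```

With `capacity=2`, putting keys `a`, `b`, `c` in that order spills `a` to disk; a later `get("a")` reloads it and spills `b` in turn. A real version inside a clustered grid would also have to deal with ownership, consistency, and concurrent access, none of which this sketch attempts.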

Is this something coming for the future of SpaceBase?  I note SB isn't open source but Galaxy is, so perhaps this is something I could look at adding to Galaxy.

As an aside, is the "2 or 3 dimensions" restriction a hard limit in SB?  I've done work in the past which needed a lot of the calculations SB has but on data sets with 10k+ dimensions for object similarity work, plus I'd love to sim 4 spatial dimensions for a project I've got coming up.

Regards,
Elliot

pron

Aug 5, 2014, 6:40:05 AM
to spaceba...@googlegroups.com
Hi Elliot.
Offloading to disk is something that might find its way into Galaxy, but we're not sure the use case is there. Our stack is primarily intended for the hot working set, and anything else can be stored in other DBs with higher latencies. If anything, we see RAM being underutilized rather than overutilized.

As to the 2-3 dimensions: yes, this is a hard limit. One more dimension can easily be added, but no more: data with high dimensionality requires different data structures from the one used by SpaceBase.

Ron