Is bandicoot advisable for a high traffic ecommerce site?

cyx

Aug 8, 2011, 3:04:46 PM

to bandicoot

Important points:

- fluctuating traffic patterns (peaking at 1K transactions / second, a
transaction spanning multiple relations)
- high storage for certain relations
- strict durability / SLAs required?

Thanks!
Cyril

Ostap Cherkashin

Aug 11, 2011, 1:08:56 PM

to band...@googlegroups.com

Thanks for the good question. In brief, even though this is the direction bandicoot is heading, unfortunately we are not there yet. There are still some inefficiencies left, but fixing them should be just a matter of time. Please see my answers below.

On Aug 8, 2011, at 9:04 PM, cyx wrote:
> Important points:
>
> - fluctuating traffic patterns (peaking at 1K transactions / second, a
> transaction spanning multiple relations)

With regard to the number of transactions per time interval, there is considerable inefficiency in the current stable v3. For stability purposes, each transaction is handled in a separate process, but v3 implements this in a suboptimal way:
* a new process is forked for each transaction
* several new sockets are interconnected for each tx

Both of the points above result in an overhead of several milliseconds and potentially large numbers of sockets in the TIME_WAIT state (the latter can cause processor hangs when no sockets are left). Julius is working on a patch to address both of these points, and we plan to include it in v4 (a couple of weeks from now). Also, it is worth mentioning that the source code comes with some performance tests (ctl perf) and very basic stress tests (ctl stress_read and ctl stress_write). These can help you run experiments and check the feasibility of using bandicoot.
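
To get a feeling for why these two points matter at 1K transactions per second, here is a rough back-of-the-envelope sketch in Python. All figures in it are assumptions for illustration, not measurements of bandicoot: 3 connections per transaction stands in for "several", 60 seconds is a typical TIME_WAIT duration on Linux, 28232 corresponds to an ephemeral port range of 32768-60999, and 4 ms stands in for "several milliseconds".

```python
def time_wait_sockets(tx_per_sec, connections_per_tx, time_wait_sec=60):
    """Steady-state number of sockets lingering in TIME_WAIT.

    Every closed connection leaves one socket in TIME_WAIT for
    time_wait_sec seconds, so the backlog is rate * duration.
    """
    return tx_per_sec * connections_per_tx * time_wait_sec


def max_sequential_tps(overhead_ms):
    """Throughput ceiling of one sequential worker if the fixed
    per-transaction overhead were the only cost."""
    return 1000.0 / overhead_ms


if __name__ == "__main__":
    # Assumed figures (see the text above), not bandicoot measurements.
    ephemeral_ports = 28232
    lingering = time_wait_sockets(tx_per_sec=1000, connections_per_tx=3)
    print("sockets in TIME_WAIT:", lingering)
    print("exceeds assumed port range:", lingering > ephemeral_ports)
    print("tx/s ceiling at 4 ms overhead:", max_sequential_tps(4.0))
```

Under these assumptions the TIME_WAIT backlog is several times larger than the available port range, which is the kind of exhaustion described above. The real numbers depend on your OS settings and on which side closes each connection, so the bundled tests are the better guide.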

WRT transactions spanning multiple relations, it is difficult to give a clear answer without knowing the actual program functions and relational variables. In brief, bandicoot ensures data consistency across transactions, and thus it serializes conflicting reads with writes (but not reads with reads). There is much more on this subject at http://bandilab.org/blog.html.
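
The serialization rule above can be sketched as a small conflict check. This is a minimal illustration in Python, not bandicoot's actual scheduler: the `Tx` class, the `conflicts` function, and the variable names are hypothetical, and treating two writes to the same variable as a conflict is an assumption (the text above only mentions reads versus writes).

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Tx:
    """A transaction described by the variables it reads and writes."""
    name: str
    reads: frozenset = field(default_factory=frozenset)
    writes: frozenset = field(default_factory=frozenset)


def conflicts(a, b):
    """True if a and b share a variable that at least one of them writes.

    Two transactions that only read a shared variable never conflict,
    so they can run concurrently.
    """
    return bool(a.writes & (b.reads | b.writes)) or bool(b.writes & a.reads)


if __name__ == "__main__":
    # Hypothetical ecommerce functions over hypothetical variables.
    browse = Tx("browse", reads=frozenset({"products"}))
    search = Tx("search", reads=frozenset({"products"}))
    checkout = Tx("checkout",
                  reads=frozenset({"products"}),
                  writes=frozenset({"orders", "stock"}))
    restock = Tx("restock", writes=frozenset({"stock", "products"}))

    print(conflicts(browse, search))     # read/read on products
    print(conflicts(browse, checkout))   # checkout only reads products
    print(conflicts(browse, restock))    # restock writes products
    print(conflicts(checkout, restock))  # both write stock
```

For a transaction spanning multiple relations, the practical question is therefore how many of the variables it touches are also written by other frequently called functions.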

> - high storage for certain relations

Depending on the frequency of writes, the amount of storage required for transaction processing can be disproportionately large. Each time a variable is changed, bandicoot creates a new version (copy + change) of the variable (it is all on a per-variable basis). It is a big subject, and I am actually working on a blog post that describes the current state and suggests several improvements. I will let you know once it is published.
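
To illustrate why copy + change can add up quickly, here is a toy model in Python. It is a sketch under stated assumptions, not bandicoot's storage code: the `VersionedVar` class is hypothetical, it counts rows rather than bytes, and it never discards old versions, which is the worst case.

```python
class VersionedVar:
    """Toy variable that stores a full copy of its rows per change."""

    def __init__(self, rows):
        # Version 0 is the initial content.
        self.versions = [list(rows)]

    def write(self, new_rows):
        # copy + change: a complete copy of the latest version plus new rows.
        self.versions.append(self.versions[-1] + list(new_rows))

    def current(self):
        return self.versions[-1]

    def stored_rows(self):
        # Total rows held across all versions (assumes none are removed).
        return sum(len(v) for v in self.versions)


if __name__ == "__main__":
    orders = VersionedVar(range(1000))
    for i in range(10):
        orders.write([1000 + i])  # ten single-row inserts
    print("rows in current version:", len(orders.current()))
    print("rows stored overall:", orders.stored_rows())
```

In this model ten single-row inserts into a 1000-row variable leave about eleven times the data on disk, so write-heavy variables are the ones to watch.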

> - strict durability / SLAs required?

Before a transaction changes its state to committed, all of the variables involved are written to the volume directory (-d startup parameter), and the new state is written to the state file (-s flag). Once a transaction is committed, this is communicated to the user (HTTP 200). In case of a failure, bandicoot should be able to recover to the last committed transaction, provided the same volume and state are used. It is important to note, though, that bandicoot does not perform fdatasync calls. If you want to ensure that all the data is physically written to disk before anything is communicated to the user, you can mount the storage with the sync mount option (available on most unixes).
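
For comparison, this is roughly what "synced before acknowledged" looks like when an application does it itself. It is a generic Python sketch of the write, fsync, rename pattern on a POSIX system, not something bandicoot does (as noted above, it does not sync); the `durable_write` helper and the file name are hypothetical.

```python
import os
import tempfile


def durable_write(path, data):
    """Replace path with data, asking the OS to flush it to disk first."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()              # drain Python's buffer to the OS
            os.fsync(f.fileno())   # ask the OS to push the data to disk
        os.replace(tmp, path)      # atomically swap in the new file
    except BaseException:
        # Clean up the temporary file if anything failed before the rename.
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
    # Sync the directory so the rename itself is recorded on disk.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        target = os.path.join(d, "state")
        durable_write(target, b"committed tx 42\n")
        with open(target, "rb") as f:
            print(f.read())
```

The sync mount option gives a similar guarantee for every write on that file system without changing any code, at the cost of slower writes.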

cyx

Aug 23, 2011, 1:23:01 PM

to bandicoot

Hi,

Thanks so much for the very lengthy and comprehensive answer!

We'll still push through with it and hope for the best just because we
want to explore Bandicoot further.

cyx

