Third meeting (14th July) reminder

Makoto

Jul 10, 2010, 6:49:51 AM

to nosql-summer-london

Hi,

This is a quick reminder for our next nosql meetup. Our next meeting
is NOT at skillsmatter, but at the Acunu office
(http://www.acunu.com/contact.html).

I only got 2 replies about attendance. If you are thinking about
attending, please reply to the email I sent on 1st July, or reply to
this thread.

Date & Time: 14th July, 6:30pm
Place: http://www.acunu.com/contact.html
Papers:
http://nosqlsummer.org/paper/cassandra
http://nosqlsummer.org/paper/amazon-dynamo

Other Notes:
* Acunu will try to provide a limited bbq, but please feel free to
bring your own meat/bbq items
* Since Cassandra is an actual open source product, you might want to try
installing it and playing around with it. I found the following blog post
useful (written by one of Twitter's engineers).

http://blog.evanweaver.com/articles/2009/07/06/up-and-running-with-cassandra/

If you know of any other good online resources, please post them, too.

Makoto

Finn Neuik

Jul 11, 2010, 5:39:12 AM

to nosql-sum...@googlegroups.com

Hi there - yep, I'll be along.

Have fun
Finn

Michael Lenahan

Jul 12, 2010, 3:06:58 AM

to nosql-sum...@googlegroups.com

I'll be there!

Michael

On 10 July 2010 11:49, Makoto <inou...@googlemail.com> wrote:

tav

Jul 12, 2010, 11:07:47 AM

to nosql-sum...@googlegroups.com

Hey all,

Unfortunately I won't be able to make it to the next meeting — I've
got some financial troubles to deal with and sadly that takes priority
=(

However, I did thoroughly enjoy meeting many of you at the last meetup
and would like to contribute in any way that I can.

So, in the hopes of being helpful, here are some comments:

* I would group Dynamo and Cassandra as "eventually consistent
datastores" as opposed to "systems which use consistent hashing".
Cassandra thankfully supports partitioning schemes other than
consistent hashing...

* I would recommend adding Riak to the set. It's the best open source
implementation of Dynamo that I've seen:

http://basho.com

* The novel contributions that Dynamo made to distributed systems
research were two-fold: (a) the specific manner in which
aoo-level-involved conflict resolution is handled and (b) tunable
parameters to control the desired levels of performance, availability
and durability. Everything else, e.g. vector clocks, had already been
worked to death before.

* The last time I dived into Cassandra they hadn't implemented vector
clocks yet — so you'd have lots of opportunity for data loss if your
machine clocks were to be out of sync. Fun! This is planned to be
fixed for 0.7 afaik — see issue 580 for more info.

* Cassandra's data model is nice, insofar as it effectively follows
something similar to BigTable's structure if you use the
order-preserving partitioner. The biggest issue is their convoluted
terminology.

* Riak is relatively easy to administer, whereas it's something of a
dark art to administer Cassandra clusters. I don't know if they've yet
fixed the requirement to restart the entire cluster every time you
want to change the data model, i.e. modify Column Families and
Keyspaces.

* As you know I'm a big fan of secondary indexes. As with most NoSQL
datastores, you have to create and manage these yourself here. But
thanks to its support for range queries, Cassandra fares much better
at this than Riak does with its MapReduce functionality.

* Eventually consistent datastores don't provide native support for
transactions. If this is important for your applications, you can use
an external means of synchronising your changes. Cages is a Java
library which provides support for coarse locks on top of ZooKeeper:

http://code.google.com/p/cages/

* Personally, I'm not a big fan of ZooKeeper, so would strongly
recommend building something on top of Keyspace instead for this
purpose:

http://scalien.com/keyspace/

* Finally, the biggest impact of using eventually consistent
datastores is the massive change you have to make in how you design
your applications. They now have to deal with conflict resolution and
new sets of edge cases which require careful attention to detail.
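
To make the clock-skew and conflict-resolution points above a bit more concrete, here is a rough Python sketch. It is a toy model under my own assumptions, not code from Cassandra, Dynamo or Riak: the function names (lww_merge, vc_compare and friends), the node names and the sample values are all made up for illustration. It contrasts timestamp-based "last write wins" with a vector clock comparison that flags concurrent writes for the application to resolve.

```python
# Toy model for illustration only; not Cassandra's or Dynamo's actual code.

def lww_merge(a, b):
    """Last-write-wins: keep the (timestamp, value) pair with the larger timestamp."""
    return a if a[0] >= b[0] else b


def vc_increment(clock, node):
    """Return a copy of a vector clock with this node's counter bumped by one."""
    bumped = dict(clock)
    bumped[node] = bumped.get(node, 0) + 1
    return bumped


def vc_descends(a, b):
    """True if clock a has seen every event that clock b has seen."""
    return all(a.get(node, 0) >= count for node, count in b.items())


def vc_compare(a, b):
    """Classify two versions as 'equal', 'a_newer', 'b_newer' or 'conflict'."""
    a_ge, b_ge = vc_descends(a, b), vc_descends(b, a)
    if a_ge and b_ge:
        return "equal"
    if a_ge:
        return "a_newer"
    if b_ge:
        return "b_newer"
    return "conflict"  # concurrent writes: keep both and let the app resolve


# Hypothetical scenario: node A's wall clock runs 60s fast. Its write happens
# first (real time 100) but is stamped 160; node B's later write (real time
# 130) is stamped 130.
first_write = (160, "cart=[book]")
later_write = (130, "cart=[book, pen]")
lww_winner = lww_merge(first_write, later_write)  # the later write is dropped

# The same two writes tracked with vector clocks instead of wall clocks.
clock_a = vc_increment({}, "A")             # write accepted by node A
clock_b_blind = vc_increment({}, "B")       # B wrote without seeing A's write
clock_b_aware = vc_increment(clock_a, "B")  # B wrote after reading A's write

print(lww_winner)
print(vc_compare(clock_a, clock_b_blind))
print(vc_compare(clock_a, clock_b_aware))
```

With timestamps alone the skewed clock wins and the later write silently disappears. With vector clocks the "blind" case comes out as a conflict, so both versions can be kept and handed to the application, while the "aware" case is recognised as a plain newer version.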

Anyways, I hope the above proves useful in some way.

Hope you're all having a great day!

--
love, tav

plex:espians/tav | t...@espians.com | +44 (0) 7809 569 369
http://tav.espians.com | http://twitter.com/tav | skype:tavespian

tav

Jul 12, 2010, 11:22:00 AM

to nosql-sum...@googlegroups.com

Ooops, "aoo-level" should have read "app-level" and I forgot to
mention the big news which many of you probably already know about:
Twitter sticking to MySQL instead of switching to Cassandra:

* http://engineering.twitter.com/2010/07/cassandra-at-twitter-today.html

Dan

Jul 13, 2010, 5:49:20 PM

to nosql-sum...@googlegroups.com

Hey, thanks for the reply, I'll try and take as much of this as I can into the discussions. I've put a few comments in below too :-

On Mon, Jul 12, 2010 at 4:07 PM, tav <t...@espians.com> wrote:

* I would group Dynamo and Cassandra as "eventually consistent
datastores" as opposed to "systems which use consistent hashing".
Cassandra thankfully supports partitioning schemes other than
consistent hashing...

I'd be interested to know how the different partitioning schemes affect Cassandra's performance. I guess most people would benchmark the default settings; is the order-preserving hash function the default?
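
As a rough way to picture the difference between the two partitioning schemes, here is a small Python sketch. It is only a toy under stated assumptions, not Cassandra's implementation: the Ring class, the owners helper, the node names and the split points are all hypothetical. As far as I know, the default in Cassandra at the time was the MD5-based RandomPartitioner rather than the order-preserving one, which the hash_token function loosely imitates.

```python
import bisect
import hashlib


def hash_token(key):
    """Hash-based token: spreads keys evenly but scrambles their order."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


def order_token(key):
    """Order-preserving token: the key itself, so token order is key order."""
    return key


class Ring:
    """Toy ring where each node owns the tokens up to and including its own."""

    def __init__(self, node_tokens):
        self.entries = sorted(node_tokens)  # list of (token, node_name)
        self.tokens = [token for token, _ in self.entries]

    def node_for(self, token):
        # First node whose token is >= the key's token, wrapping past the end.
        index = bisect.bisect_left(self.tokens, token) % len(self.entries)
        return self.entries[index][1]


def owners(ring, token_fn, keys):
    """Owning node for each key, visited in sorted key order."""
    return [ring.node_for(token_fn(key)) for key in sorted(keys)]


keys = ["user:alice", "user:bob", "user:carol", "user:dave"]

# Order-preserving ring: node tokens are hand-picked split points in key space.
ordered_ring = Ring([("user:b", "n1"), ("user:d", "n2"), ("~", "n3")])

# Hash ring: node tokens split the 128-bit MD5 space into three equal parts.
third = 2 ** 128 // 3
hashed_ring = Ring([(third, "n1"), (2 * third, "n2"), (2 ** 128 - 1, "n3")])

print(owners(ordered_ring, order_token, keys))  # owners follow key order
print(owners(hashed_ring, hash_token, keys))    # placement depends only on the hash
```

With the order-preserving tokens a scan over a key range walks a contiguous stretch of the ring, which is what makes range queries cheap, but the split points have to be chosen by hand and a skewed key distribution can overload one node. With hashed tokens the load spreads out without any tuning, at the cost of adjacent keys landing on unrelated nodes.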

What parts of ZooKeeper don't you like? I guess in some ways it and related technologies could be another evening's discussion.
