> - Suggestions for the next paper and/or blog articles to read? The
> few comments I captured last night were:
> * The Dynamo paper is a must read at some point in the summer
> * It would be good to establish a baseline understanding of all the
> terms and different types of technologies
> * A survey of the NoSQL landscape would be great and there seemed to be
> particular interest in Cassandra.
> Take a look at the list of papers on the NoSQL Summer website:
> http://nosqlsummer.org/papers
There are a few things that didn't make the official list that I thought I'd throw out there:
Relational Databases Considered Harmful, H. Baker (1991 letter to the ACM Forum)
http://home.pipeline.com/~hbaker1/letters/CACM-RelationalDatabases.html
Some good documents on tuplespaces came out of IBM's TSpaces project, but the best one, "A Universal Information Appliance", has disappeared behind a paywall. This one is shorter and more abstract, but it introduces the topic, which I think predates and anticipates many features of the current crop of NoSQL systems.
Shirt Pocket Transactions
http://www.almaden.ibm.com/cs/TSpaces/papers/shirt.html
Object databases have been around for a long time as an alternative to relational databases, and some are still in use (GemStone, ZODB). This paper argues their cause:
The Object-Oriented Database System Manifesto
http://www-2.cs.cmu.edu/afs/cs.cmu.edu/user/clamen/OODBMS/Manifesto/htManifesto/Manifesto.html
Prevayler's prevalent persistence is not a database as such, but it covers some of the same ground as the NoSQL databases. Again, it's hard to find one good article, but their quick intro provides a summary:
http://www.prevayler.org/wiki/
These are not about the current (rapidly changing) state of the art, but more about how we got here. I don't know how much others are interested in the historical context behind the current popularization of NoSQL, but I thought I'd put these out for consideration.
--Dethe
Oh ya, this is a great rant.
+1 for Dynamo.
I personally would want to discuss a few context-setting papers about how
people are using NoSQL in the real world.
What are the situations in which NoSQL solutions become attractive and
sensible? And what are the extreme conditions where they become
necessary? Many of these are based on relational databases used
non-relationally. This is the space I find most fascinating:
Flickr, on sharding and using a ticket server
http://code.flickr.com/blog/2010/02/08/ticket-servers-distributed-unique-primary-keys-on-the-cheap/
FriendFeed on schema-less use of MySQL
http://bret.appspot.com/entry/how-friendfeed-uses-mysql
Reddit on scaling Postgres (cache everything, denormalize everything)
http://carsonified.com/blog/dev/steve-huffman-on-lessons-learned-at-reddit/
Or,
Digg on using Cassandra
http://about.digg.com/blog/looking-future-cassandra
"A Dismal Guide to Concurrency"
http://www.facebook.com/notes/facebook-engineering/a-dismal-guide-to-concurrency/379717628919