Late again but these meetings have been so good I think we should
still get together.
There isn't a paper listed on nosqlsummer.org specifically for Hadoop
but there's been some interest so let's focus on that tonight.
The problem with Hadoop is it's not one technology but a set of
disparate tools. Since ZooKeeper has come up a few times in past
discussions, I'd love to hear a little more about it tonight.
The two papers credited with inspiring Hadoop are the Google File System
and Google MapReduce papers. We've already read the MapReduce paper, so
that one should just be a refresher!
http://nosqlsummer.org/paper/google-mapreduce
http://labs.google.com/papers/gfs-sosp2003.pdf
And, to whet your appetite, here's an awesome story of Hadoop in action:
"As an example The New York Times used 100 Amazon EC2 instances and a
Hadoop application to process 4TB of raw image TIFF data (stored in
S3) into 11 million finished PDFs in the space of 24 hours at a
computation cost of about $240 (not including bandwidth)"
Dan
(couldn't resist)
Ted