http://www.amazon.com/Programming-Collective-Intelligence-Building-Applications/dp/0596529325/ref=pd_bbs_sr_1?ie=UTF8&s=books&qid=1200760781&sr=8-1
Programming Collective Intelligence: Building Smart Web 2.0
Applications [ILLUSTRATED] (Paperback)
by Toby Segaran (Author)
I think this book epitomizes everything that process.theinfo is about:
it first covers scraping the web and collecting information from large
datasets, and then moves on to ranking, data mining, decision trees,
and clustering. There are practical examples, such as how to cluster
similar blogs based on their content.
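To give a rough idea of what the blog example involves, here is a
minimal sketch. It is not the book's code: the blog names, the sample
text, and the helper functions (word_counts, pearson,
most_similar_pair) are all made up for illustration. It assumes each
blog is reduced to word counts and that similarity is measured with a
Pearson correlation; finding the closest pair is only the first step
of a real clustering run.

```python
from collections import Counter
from itertools import combinations
from math import sqrt
import re


def word_counts(text):
    """Count lowercase alphabetic words in a piece of text."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def pearson(a, b, vocab):
    """Pearson correlation of two word-count vectors over a shared vocabulary."""
    x = [a[w] for w in vocab]
    y = [b[w] for w in vocab]
    n = len(vocab)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    dev_x = sqrt(sum((xi - mean_x) ** 2 for xi in x))
    dev_y = sqrt(sum((yi - mean_y) ** 2 for yi in y))
    if dev_x == 0 or dev_y == 0:
        return 0.0  # a constant vector has no defined correlation
    return cov / (dev_x * dev_y)


def most_similar_pair(blogs):
    """Return the two blog names whose word usage correlates most, plus the score."""
    counts = {name: word_counts(text) for name, text in blogs.items()}
    vocab = sorted(set().union(*counts.values()))
    return max(
        ((pair, pearson(counts[pair[0]], counts[pair[1]], vocab))
         for pair in combinations(sorted(counts), 2)),
        key=lambda item: item[1],
    )


# Hypothetical sample data, standing in for scraped blog content.
blogs = {
    "python_tips": "python code python scripts data code",
    "data_hacks": "data python code scripts data",
    "cooking": "recipes pasta sauce recipes kitchen",
}

pair, score = most_similar_pair(blogs)
print(pair, round(score, 2))
```

A hierarchical clusterer would merge the closest pair into one group
and repeat until everything is joined into a single tree.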
The subject matter is definitely something from a computer science
textbook, but the approach is not. It is an O'Reilly book, so the
examples are practical and don't drown you in formulas, though the
formulas are there if you need them. The code is in Python, so the
examples are pretty readable.
So, I really suggest others pick this up, especially if you don't have
any experience with analyzing text data. I say that because I don't,
and this was the right level for me. A PhD student may not find the
content very interesting.
The response has also been favorable; so far the book has 17 reviews
on Amazon, and most of them are perfect.