gem install classifier
-or-
http://rubyforge.org/projects/classifier/
Brand new in this release of Classifier is a String method called
#summary, which uses LSI (latent semantic indexing) to pick the most
important sentences or paragraphs out of a block of text. Here is an
example:
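require 'classifier'
require 'open-uri'
# Ruby 3.0+ needs URI.open; older versions also accepted plain open() here.
text = URI.open('http://rufy.com/pickaxe-intro.txt').read
text.gsub(/<[^>]*>/, "").summary   # strip HTML tags, then summarize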
This produces the following summary of
http://rubycentral.com/book/foreword.html:
"If you don't believe me, read this book and try Ruby [...] But I was
still hoping to design a language that would work for most of the jobs
I did everyday [...] Ruby has never been a well-documented language
[...] While they were writing it, I was modifying the language itself
[...] Shortly after I was introduced to computers, I became interested
in programming languages [...] As an object-oriented fan for more than
fifteen years, it seemed to me that OO programming was very suitable
for scripting too [...] I wanted a language more powerful than Perl,
and more object-oriented than Python [...] I believed that an ideal
programming language must be attainable, and I wanted to be the
designer of it [...] Because I have always preferred writing programs
over writing documents, the Ruby manuals tend to be less thorough than
they should be [...] It is my hope that both Ruby and this book will
serve to make your programming easy and enjoyable"
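For a rough feel of what sentence extraction involves, here is a minimal
sketch. It is not how Classifier works internally: it swaps the LSI
ranking for plain word-frequency scoring, and naive_summary is a
hypothetical helper name, not part of the gem. It only assumes that a
summary is the top-ranked sentences, kept in their original order and
joined with " [...] " as in the output above.

```ruby
# Hypothetical helper, NOT part of the Classifier gem: a crude
# word-frequency stand-in for the LSI ranking behind String#summary.
def naive_summary(text, count = 10, separator = " [...] ")
  # Split on sentence-ending punctuation, normalize whitespace,
  # and drop empty pieces.
  sentences = text.split(/[.!?]+/)
                  .map { |s| s.gsub(/\s+/, " ").strip }
                  .reject(&:empty?)

  # Count how often each lowercased word occurs in the whole text.
  freq = Hash.new(0)
  sentences.each { |s| s.downcase.scan(/\w+/) { |w| freq[w] += 1 } }

  # Score each sentence by the summed frequency of its words.
  scored = sentences.each_with_index.map do |s, i|
    [s.downcase.scan(/\w+/).sum { |w| freq[w] }, i, s]
  end

  # Keep the `count` best sentences (earlier sentence wins a tie),
  # then put them back in their original order.
  top = scored.max_by(count) { |score, i, _| [score, -i] }
  top.sort_by { |_, i, _| i }.map { |_, _, s| s }.join(separator)
end
```

Calling naive_summary(some_text, 2) returns the two highest-scoring
sentences of some_text joined by " [...] ". Frequency counting tends to
favor long sentences full of common words, which is one reason a real
summarizer would use something like LSI instead.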
I hope you enjoy Classifier!
-Lucas Carlson
http://tech.rufy.com/
Instead, try giving it a dozen or so sentences. You can limit how many
sentences end up in the summary with an optional parameter, like this:
x = "This text deals with dogs. Dogs. This text involves dogs too.
Dogs! This text revolves around cats. Cats. This text also involves
cats. Cats! This text involves birds. Birds."
x.summary 2
Outputs:
"This text involves dogs too [...] This text also involves cats"
George