I was looking for Ruby PDF tools and came across this on GitHub:
https://github.com/CrossRef/pdfextract.git
It uses pdf-reader to analyze the spatial information of text, along with heuristics, to
extract --regions, --columns, --headers, etc. from a PDF file.
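To give a rough idea of what a spatial heuristic can look like, here is a minimal sketch in plain Ruby. It is not pdfextract's actual algorithm and it does not call pdf-reader. The `TextRun` struct, the `group_into_columns` method, the 20-point tolerance, and the sample coordinates are all made up for illustration. It just groups text runs into columns by how close their left edges are.

```ruby
# Hypothetical text run: a string plus the x/y position of its left edge,
# in PDF points. Real extractors would get these from a PDF parser.
TextRun = Struct.new(:text, :x, :y)

# Group runs into columns: a run joins the first column whose first run
# has a left edge within `tolerance` points, otherwise it starts a new one.
# (Illustrative heuristic only; the tolerance value is an assumption.)
def group_into_columns(runs, tolerance: 20.0)
  columns = []
  runs.sort_by(&:x).each do |run|
    column = columns.find { |c| (c.first.x - run.x).abs <= tolerance }
    if column
      column << run
    else
      columns << [run]
    end
  end
  # Within each column, order top to bottom (default PDF y grows upward).
  columns.map { |c| c.sort_by { |r| -r.y } }
end

# Made-up sample data: two columns, two lines each, in scrambled order.
runs = [
  TextRun.new("Left, line 2", 72.0, 680.0),
  TextRun.new("Right, line 1", 320.0, 700.0),
  TextRun.new("Left, line 1", 72.0, 700.0),
  TextRun.new("Right, line 2", 322.5, 680.0)
]

columns = group_into_columns(runs)
columns.each_with_index do |col, i|
  puts "Column #{i + 1}: #{col.map(&:text).join(' / ')}"
end
```

A real tool would need more than this (line heights, fonts, gaps between words), but it shows why position data matters for pulling columns out of a page.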
Another cool Ruby PDF library is origami, which uses its own library to parse PDF files. This one allows you to modify the PDF file.