pdfextract: a cool Ruby tool that uses pdf-reader

Dominic Sisneros

Oct 2, 2012, 1:42:16 PM
to prawn...@googlegroups.com
I was looking for Ruby PDF tools and came across this on GitHub:

https://github.com/CrossRef/pdfextract.git

It uses pdf-reader to analyze the spatial information of text, along with heuristics, to
extract regions, columns, headers, etc. (--regions, --columns, --headers) from a PDF file.
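
To give a feel for what a spatial heuristic like that might look like, here is a minimal sketch in plain Ruby. It is not pdfextract's actual algorithm and does not call pdf-reader; the TextRun struct, the group_into_columns helper, the min_gap threshold, and the sample coordinates are all made up for illustration. The idea is just that text runs whose left edge sits well past the right edge of the current group probably belong to a new column.

```ruby
# Hypothetical text run: a string plus the left edge and width of its box,
# in PDF points. These are NOT pdf-reader or pdfextract types.
TextRun = Struct.new(:text, :x, :width)

# Group runs into columns: sort by left edge, and start a new column
# whenever a run begins more than `min_gap` points past the right edge
# of everything already placed in the current column.
def group_into_columns(runs, min_gap: 20.0)
  columns = []
  right_edge = nil
  runs.sort_by(&:x).each do |run|
    if right_edge.nil? || run.x - right_edge > min_gap
      columns << []                       # horizontal gap found: new column
      right_edge = run.x + run.width
    else
      right_edge = [right_edge, run.x + run.width].max
    end
    columns.last << run
  end
  columns
end

# Made-up sample data: two runs near the left margin, two further right.
runs = [
  TextRun.new("Abstract", 72.0, 200.0),
  TextRun.new("We study", 74.0, 180.0),
  TextRun.new("Results", 320.0, 200.0),
  TextRun.new("Table 1", 330.0, 150.0)
]

group_into_columns(runs).each_with_index do |column, i|
  puts "column #{i + 1}: #{column.map(&:text).join(' | ')}"
end
```

With the sample data above this groups the first two runs into one column and the last two into another. A real tool would also have to look at vertical position, font size, and so on; this only shows the horizontal-gap idea.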


Another cool Ruby PDF library is origami, which uses its own parser to read PDF files. This one also allows you to modify the PDF file.




