Try the Xpdf project. Run the pdftotext command in the shell to produce
the text.
http://www.foolabs.com/xpdf/download.html
There are more tips at php.net/pdf.
I really appreciate this lead, thanks, but can I do this all
programmatically without having to manually use a command line? I need
to process hundreds of pdf files to text and then extract what I need
from them.
> I really appreciate this lead, thanks, but can I do this all
> programmatically without having to manually use a command line? I need
> to process hundreds of pdf files to text and then extract what I need
> from them.
Use PHP's system() function to run pdftotext from your script.
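
A minimal sketch of calling pdftotext through system() for a whole directory of files. It assumes the Xpdf pdftotext binary is installed and on the PATH of the user PHP runs as. The helper names pdftotext_command() and convert_pdfs() are made up for illustration; they are not part of PHP or Xpdf.

```php
<?php
// Sketch only: assumes the Xpdf "pdftotext" binary is installed and on the PATH.

// Hypothetical helper: build the shell command that converts one PDF.
function pdftotext_command($pdfPath, $txtPath)
{
    // escapeshellarg() protects against spaces and shell metacharacters in file names.
    return 'pdftotext ' . escapeshellarg($pdfPath) . ' ' . escapeshellarg($txtPath);
}

// Hypothetical helper: convert every *.pdf in $dir to a .txt file next to it.
// Returns the list of PDFs for which pdftotext reported an error.
function convert_pdfs($dir)
{
    $failed = array();
    $pdfs = glob(rtrim($dir, '/') . '/*.pdf');
    if (!is_array($pdfs)) {
        // glob() returns false on error; treat that as "nothing to convert".
        return $failed;
    }
    foreach ($pdfs as $pdf) {
        $txt = preg_replace('/\.pdf$/i', '.txt', $pdf);
        // system() runs the command and stores its exit status in $status;
        // pdftotext exits with 0 when the conversion succeeded.
        system(pdftotext_command($pdf, $txt), $status);
        if ($status !== 0) {
            $failed[] = $pdf;
        }
    }
    return $failed;
}
```

Calling convert_pdfs('/path/to/pdfs') (the path is a placeholder) leaves one .txt file beside each PDF, which you can then read with file_get_contents() and search for whatever you need.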
--
Toby A Inkster BSc (Hons) ARCS
Contact Me ~ http://tobyinkster.co.uk/contact
> Can anyone help me with how to extract text from pdf files using PHP or
> ColdFusion? Thanks for any help.
Our TET product extracts the text from PDF. It contains a programming
interface for PHP (and other languages); you can directly
fetch the text (and coordinates, font, etc.) from your PHP
script. A free evaluation version is available on our Web site.
Thomas
_______________________________________________________________
Thomas Merz t...@pdflib.com http://www.pdflib.com
PDFlib 7: Create PDF/A for archiving, format tables, and more!
_______PDFlib - a library for generating PDF on the fly________