extracting text from pdf files

run...@fastmail.fm

Oct 31, 2006, 5:38:14 PM
Can anyone help me extract text from PDF files using PHP or
ColdFusion? Thanks for any help.

pete...@gmail.com

Oct 31, 2006, 7:52:11 PM
Hi,

Try the Xpdf project. Run the pdftotext command in the shell to produce
the text.

http://www.foolabs.com/xpdf/download.html

There are more tips at php.net/pdf.

runner7

Oct 31, 2006, 11:18:44 PM

I really appreciate this lead, thanks, but can I do this all
programmatically without having to manually use a command line? I need
to process hundreds of pdf files to text and then extract what I need
from them.

Toby Inkster

Nov 1, 2006, 6:47:12 AM
runner7 wrote:

> I really appreciate this lead, thanks, but can I do this all
> programmatically without having to manually use a command line? I need
> to process hundreds of pdf files to text and then extract what I need
> from them.

Use the system() function to run pdftotext from your PHP script.
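
As a rough sketch of how that could look for a whole directory of files: the
example below assumes Xpdf's pdftotext binary is installed and on the PATH, and
it uses exec() rather than system(), because system() prints the command's
output directly and returns only the last line, whereas exec() can collect
every line into an array. The function names (pdftotext_command,
extract_pdf_text, extract_all) are made up for illustration and are not part of
PHP or Xpdf.

```php
<?php
// Sketch only: assumes the `pdftotext` binary from Xpdf is on the PATH.
// All function names here are hypothetical helpers, not a library API.

/** Build a shell command that writes the PDF's text to stdout ("-"). */
function pdftotext_command(string $pdfPath): string
{
    // escapeshellarg() guards against spaces and shell metacharacters.
    return 'pdftotext ' . escapeshellarg($pdfPath) . ' -';
}

/** Return the extracted text, or null if pdftotext exits with an error. */
function extract_pdf_text(string $pdfPath): ?string
{
    $lines = [];
    $status = 1;
    // exec() appends each output line to $lines and sets the exit status.
    exec(pdftotext_command($pdfPath), $lines, $status);
    return $status === 0 ? implode("\n", $lines) : null;
}

/** Convert every *.pdf in a directory; returns [path => text or null]. */
function extract_all(string $dir): array
{
    $results = [];
    // glob() may return false on error, so fall back to an empty list.
    foreach (glob(rtrim($dir, '/') . '/*.pdf') ?: [] as $pdf) {
        $results[$pdf] = extract_pdf_text($pdf);
    }
    return $results;
}
```

Calling extract_all() on the directory that holds the PDFs gives one string per
file, which can then be searched with the usual string or regex functions. A
null entry means pdftotext failed on that file and it needs a closer look.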

--
Toby A Inkster BSc (Hons) ARCS
Contact Me ~ http://tobyinkster.co.uk/contact

Thomas Merz

Nov 1, 2006, 5:55:02 PM
run...@fastmail.fm wrote:

> Can anyone help me with how to extract text from pdf files using PHP or
> ColdFusion? Thanks for any help.

Our TET product extracts the text from PDF files. It includes a programming
interface for PHP (and other languages), so you can fetch the text (along
with coordinates, font, etc.) directly from your PHP script. A free
evaluation version is available on our Web site.

Thomas

_______________________________________________________________
Thomas Merz t...@pdflib.com http://www.pdflib.com
PDFlib 7: Create PDF/A for archiving, format tables, and more!
_______PDFlib - a library for generating PDF on the fly________
