No Extracted Text from PDF


Mike Perry

May 8, 2018, 1:58:25 PM
to ResourceSpace
I have the following in config.php:

$pdftotext_path='/usr/bin';

pdftotext is in /usr/bin ...

pdftotext does not show in the Installation Check, and there is no extracted text from uploaded PDFs (which have selectable text). What am I missing??

Mike
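
One quick way to rule out a path problem is to confirm that an executable named pdftotext really sits in the directory given in $pdftotext_path. The snippet below is a minimal sketch in Python, not part of ResourceSpace; the function name find_pdftotext is made up for illustration, and it checks only the directory you pass in, not the system PATH.

```python
import shutil


def find_pdftotext(configured_dir):
    """Return the full path of an executable pdftotext in configured_dir, or None."""
    # Passing path= restricts the search to that directory instead of the system PATH.
    return shutil.which("pdftotext", path=configured_dir)


if __name__ == "__main__":
    configured_dir = "/usr/bin"  # the value of $pdftotext_path from the post above
    found = find_pdftotext(configured_dir)
    if found:
        print(f"pdftotext found and executable: {found}")
    else:
        print(f"No executable pdftotext in {configured_dir}")
```

Running it under the same account as the web server gives the most useful answer, since a file that is executable for your login may not be executable for that account.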

Yair Krauze

May 8, 2018, 5:48:18 PM
to ResourceSpace
Check your log files. My installation does not show pdftotext in the Installation Check either, but everything works fine.

Mike Perry

May 8, 2018, 7:37:53 PM
to ResourceSpace
Thanks Yair -- There's this entry in the log --

Extract

    resource    ref    68   
2018-05-08 19:28:13    admin    Transformed file

But no extract. Also no mention of calling pdftotext. I don't know what the log should say when it does work. . .
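
If the log is long, scanning it for the word pdftotext is quicker than reading it by eye. The following is a rough sketch in Python; the helper name lines_mentioning and the log path in the usage note are hypothetical. Whether ResourceSpace writes the pdftotext command to its log at all is an assumption here, so an empty result does not prove the tool was never called.

```python
def lines_mentioning(log_path, needle="pdftotext"):
    """Return (line_number, text) pairs for log lines that contain needle, ignoring case."""
    hits = []
    wanted = needle.lower()
    # errors="replace" keeps the scan going even if the log holds odd bytes.
    with open(log_path, "r", encoding="utf-8", errors="replace") as log:
        for number, line in enumerate(log, start=1):
            if wanted in line.lower():
                hits.append((number, line.rstrip("\n")))
    return hits
```

For example, lines_mentioning("/path/to/debug.txt") returns a list you can print; substitute the path of your own debug log.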

afatac

Jun 13, 2019, 5:28:17 AM
to ResourceSpace
I am encountering the same issue. No text is extracted. Have you solved yours?

New installation. RS 9.0 running on Windows Server 2016 with IIS.

This field is untouched in config.default.php:
$extracted_text_field=72;

In config.php
$pdftotext_path='E:\pdftotext';

PDF previews are created successfully. Not sure if this info is related - unoconv is used and Office document previews are created successfully.

Debug log is enabled, but I didn't see any call to pdftotext.exe when the PDF files are uploaded and previews are created.
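
To separate a ResourceSpace configuration problem from a problem with the binary itself, it can help to run pdftotext directly on one of the uploaded PDFs. The sketch below does that from Python. The function names build_command and extract_text are invented for illustration, and the 60 second timeout is an arbitrary choice. It relies on pdftotext accepting "-" as the output file to write the text to standard output.

```python
import os
import subprocess


def build_command(pdftotext_dir, pdf_path, exe_name="pdftotext"):
    """Build the argument list that dumps the text of pdf_path to standard output."""
    # "-" in place of an output file name tells pdftotext to write to stdout.
    return [os.path.join(pdftotext_dir, exe_name), pdf_path, "-"]


def extract_text(pdftotext_dir, pdf_path, exe_name="pdftotext"):
    """Run pdftotext and return the extracted text.

    Raises FileNotFoundError or PermissionError if the binary cannot be started,
    and subprocess.CalledProcessError if it exits with a non-zero status.
    """
    result = subprocess.run(
        build_command(pdftotext_dir, pdf_path, exe_name),
        capture_output=True,
        check=True,
        timeout=60,  # arbitrary limit so a stuck process does not hang the check
    )
    return result.stdout.decode("utf-8", errors="replace")
```

On the Windows setup described above, the call would look something like extract_text(r"E:\pdftotext", r"C:\some\file.pdf", exe_name="pdftotext.exe"), where the PDF path is a placeholder. If this returns text when run from your own login but ResourceSpace still extracts nothing, the difference is likely in the account or configuration the web server runs with.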

Mike Perry

Jul 24, 2019, 8:22:33 PM
to ResourceSpace
I did solve it -- but I don't recall how.

Version 9.0 SVN.

Let me think on it. . .

Mike Perry

Jul 24, 2019, 8:39:39 PM
to ResourceSpace
Make sure pdftotext exists in your /usr/bin folder. . .

It appears I may have asked my hosting company to add it, based on the date of the file. . .



afatac

Jul 24, 2019, 8:52:25 PM
to ResourceSpace
Thanks for the reply. I have solved the issue, though I don't know the exact reason :-)

It is set up on a Windows Server. It probably has something to do with the permissions of the application pool identity and anonymous authentication in IIS.

Duane Mitchell

May 19, 2020, 9:32:53 AM
to ResourceSpace
Mike, are you on a shared server or a dedicated server? I'd like to ask my hosting company to add this, but I'm on a shared server and they may not want to do that.

Mike Perry

May 19, 2020, 1:28:11 PM
to ResourceSpace

Shared right now. . .