Hi there,
Going forward: yes. If you remove pdftotext, the block of code that indexes the OCR layer should not activate, so the text layer of future PDF uploads will not be indexed.
For PDFs already indexed in your system:
You could try using SQL to delete the OCR text from the database. As always, please proceed at your own risk and make a backup first!
If you want to delete ALL existing OCR transcripts from your AtoM database, try the following query:
- DELETE FROM property WHERE name='transcript' AND scope='Text extracted from source PDF file\'s text layer using pdftotext';
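Before running the DELETE, it may be worth checking how many rows it would affect. The sketch below simply reuses the same WHERE clause in a read-only SELECT; it assumes the same MySQL database and the same property table, and I have not tested it against your data:

```sql
-- Read-only preview: counts the rows the DELETE above would remove.
-- Uses the same table, columns, and scope string as the DELETE query.
SELECT COUNT(*)
FROM property
WHERE name = 'transcript'
  AND scope = 'Text extracted from source PDF file\'s text layer using pdftotext';
```

If the count looks wrong (for example 0 when you know PDFs have been indexed), the scope string probably does not match exactly, and the DELETE would not remove anything either.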
Hope that helps!