tesseract-ocr: Installed: 4.1.1-2.1
convert, provided by imagemagick: Installed: 8:6.9.11.60+dfsg-1.3 (The problem could also be in convert, but I get the same results when I convert the PDF with GIMP instead.)
My OS is Linux, Debian Bullseye (stable)
I run the script with:
$ ./PDF2SearchablePDF.sh ShockDataMeasurementsLessonsLearned.pdf
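The script reads only its first argument as the source PDF. A minimal guard that could sit at the top of the script, assuming exactly one readable file should be passed (the `check_args` name is made up for illustration and is not in my script):

```shell
#!/bin/bash
# Hypothetical helper, not part of the original script:
# succeeds only when exactly one argument is given and it names a readable file.
check_args() {
    if [ "$#" -ne 1 ]; then
        echo "usage: PDF2SearchablePDF.sh Source.pdf" >&2
        return 2
    fi
    if [ ! -r "$1" ]; then
        echo "cannot read: $1" >&2
        return 1
    fi
    return 0
}
```

Calling `check_args "$@" || exit` before `pdfseparate` would stop the run early on a bad command line.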
The source PDF
ShockDataMeasurementsLessonsLearned.pdf
Page 1 of the split PDF
PDFIn001.pdf
Page 1 of the split PDF, converted to .tiff with convert (imagemagick)
PDFIn001.tiff
Page 1 after processing with tesseract
PDFIn001Searchable.pdf
Bash script:
###
#!/bin/bash
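# The source PDF is the first argument
SourcePDF=$1
# -p: do not fail if the directories are left over from an earlier run
mkdir -p PDFIn PDFOut TIFFIn
# Split the source PDF into one PDF per page
pdfseparate "$SourcePDF" PDFIn/PDFIn%03d.pdf
#pdfseparate InputDoc02.pdf PDFIn/PDFIn%03d.pdf
echo "$1"
# List the page files by name only, without the directory
cd PDFIn
ls PDFIn*.pdf >../list.txt
cd ..
for FIL in $(<list.txt)
do
  # Render the page as a 300 dpi TIFF
  convert -density 300 "PDFIn/${FIL}" "TIFFIn/${FIL/.pdf/}.tiff"
  #gs -q -dNOPAUSE -r300x300 -sDEVICE=tiff32nc -sOutputFile=TIFFIn/${FIL/.pdf/}.tiff PDFIn/${FIL} -c quit
  # OCR the TIFF; tesseract adds the .pdf extension to the output base name
  tesseract "TIFFIn/${FIL/.pdf/}.tiff" "PDFOut/${FIL/.pdf/}" -l eng pdf
done
# Merge the per-page searchable PDFs back into one document
pdfunite PDFOut/PDFIn*.pdf OutputPDF.pdf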
###
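One detail of the loop worth spelling out: `${FIL/.pdf/}`, the form used in the script, deletes the first `.pdf` found anywhere in the name, while `${FIL%.pdf}` strips it only from the end. For the `PDFIn%03d.pdf` names produced here both give the same result. A small sketch of the difference (the function names are hypothetical, for illustration only):

```shell
#!/bin/bash
# Hypothetical helpers for illustration; not part of the script above.

# Pattern substitution: removes the first ".pdf" wherever it occurs.
strip_first_pdf() {
    local name=$1
    printf '%s\n' "${name/.pdf/}"
}

# Suffix removal: removes ".pdf" only when it ends the name.
strip_pdf_suffix() {
    local name=$1
    printf '%s\n' "${name%.pdf}"
}

strip_first_pdf  PDFIn001.pdf       # PDFIn001
strip_pdf_suffix PDFIn001.pdf       # PDFIn001
strip_first_pdf  scan.pdf.copy.pdf  # scan.copy.pdf
strip_pdf_suffix scan.pdf.copy.pdf  # scan.pdf.copy
```

The two forms only diverge when `.pdf` also appears in the middle of a name, so this does not affect the page files generated by `pdfseparate` in my run.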