How to enable batched inference in tesseract-ocr


Vishnu Madhu

Apr 27, 2023, 4:59:59 AM
to tesseract-ocr
I am exploring ways to improve the performance (single-threaded and multi-threaded) of tesseract-ocr inference. Profiling tesseract inference (CLI) with Intel VTune showed very little use of the CPU's vector registers (AVX2 and AVX-512).
By default, tesseract inference works on a single image (batch_size = 1). I think batched inference would drive better utilization of the vector registers and thus improve the inference throughput of tesseract.
Is there a way to enable batched inference in tesseract?
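For reference, the closest workaround I know of is process-level parallelism, i.e. running one tesseract process per image rather than batching within one process. A minimal sketch (my own assumption, not a tesseract feature; it assumes the `tesseract` CLI is on PATH, and the file names are hypothetical):

```python
# Sketch: tesseract has no built-in batch mode, so parallelize across
# images by spawning one tesseract process per image.
import subprocess
from concurrent.futures import ProcessPoolExecutor

def ocr_one(image_path, cmd="tesseract"):
    # Runs `tesseract <image> stdout` and returns the recognized text.
    result = subprocess.run(
        [cmd, image_path, "stdout"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout

def ocr_batch(image_paths, workers=8):
    # One worker process per concurrent tesseract invocation.
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(ocr_one, image_paths))
```

This improves throughput (images/sec) rather than per-image latency, so it would not change the vector-register utilization VTune reports for a single invocation.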

Thanks in advance

Regards
Vishnu


other details:
- system : Intel Xeon 8380 (Ice Lake), Ubuntu 22.04 (kernel 5.15), GCC 11.3
- tesseract : built from source (5.3.0), tessdata
- configure flags : --enable-float32 --disable-opencl --disable-graphics 'CXXFLAGS=-O3 -mavx512f -mfma'

other performance observations:
- considerable OpenMP pause times observed for the multi-threaded runs
- LSTM --> Tanh lookup times are considerably high (~50 ms per lookup)
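On the OpenMP pauses: one experiment I'm considering (an assumption on my part, not verified on this workload) is capping tesseract's internal OpenMP threading via `OMP_THREAD_LIMIT` and spending the cores on image-level parallelism instead, e.g.:

```shell
# Assumption: limit tesseract's internal OpenMP threads to 1, then run
# one process per image; tune -P to the available core count.
export OMP_THREAD_LIMIT=1
ls *.png | xargs -P 16 -I{} tesseract {} {}.out
```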
