LibTesseract memory usage limit

Max Zheltv

Oct 10, 2023, 5:23:43 AM
to tesseract-ocr
Hi!

Is it possible to limit the amount of RAM that libtesseract uses for OCR? Or, failing that, how can I find out the maximum amount of RAM it can allocate during recognition?

I'm using the libtesseract C++ API via the bytedeco JavaCPP wrapper in my Java app, on the linux-x86_64 platform, in a Docker container.
The usage is quite simple:
- init TessBaseAPI with the eng language
- read the image using leptonica.pixReadMem
- set the image on the TessBaseAPI
- get the text via GetUTF8Text
- destroy the image object and End the TessBaseAPI
(that's almost like the example here)

During testing I run it in a loop over a list of images.
I noticed that tesseract allocates 500-600 MB of RAM right at the start, and then usage grows up to 2 GB. It releases some of the memory at seemingly random times, but never releases all of it while the process is running (even without any load).
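
For anyone who wants to reproduce these numbers from inside the Java process: below is a minimal sketch, assuming a Linux container where /proc/self/status is readable. It reports the current resident set size (VmRSS) and the peak resident set size (VmHWM) of the whole process, so it includes native allocations made by libtesseract as well as the JVM itself. The class name ProcMem and its methods are hypothetical helpers written for illustration; they are not part of Tesseract or JavaCPP.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class ProcMem {
    // Returns the value (in kB) of a field such as "VmRSS" from the text of
    // /proc/<pid>/status, or -1 if the field is not present.
    static long parseKb(String statusText, String key) {
        for (String line : statusText.split("\n")) {
            if (line.startsWith(key + ":")) {
                // A line looks like "VmRSS:\t   1234 kB"; keep the number only.
                String[] parts = line.substring(key.length() + 1).trim().split("\\s+");
                return Long.parseLong(parts[0]);
            }
        }
        return -1;
    }

    // Reads the status file of the current process (Linux only).
    static String readStatus() throws IOException {
        return new String(Files.readAllBytes(Paths.get("/proc/self/status")),
                StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        String status = readStatus();
        System.out.println("VmRSS (current resident, kB): " + parseKb(status, "VmRSS"));
        System.out.println("VmHWM (peak resident, kB):    " + parseKb(status, "VmHWM"));
    }
}
```

Calling readStatus() and parseKb() after each image in the loop shows whether resident memory keeps climbing or levels off. It only observes usage and does not limit it.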

Could anyone explain how tesseract uses RAM and whether it is possible to control it?