OpenMP cannot be disabled


Kassim Papa

May 25, 2024, 5:03:10 AM
to tesseract-ocr
Current Behavior:

Despite setting OMP_THREAD_LIMIT=1, Tesseract still uses all cores on my machine (7th-gen i7, Windows 10).

We used this (at the beginning of the code):
Environment.SetEnvironmentVariable("OMP_THREAD_LIMIT", "1");

And this (a batch file):
@echo off
set OMP_THREAD_LIMIT=1
start "" "path_to_your_application.exe"
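
One way to check that the variable actually reaches a child process is to launch the child with an explicit environment and have it echo the value back. The sketch below does this in Python rather than batch. The helper names `env_with_thread_limit` and `child_sees_limit` are hypothetical, and the child is just a Python one-liner standing in for the real application.

```python
import os
import subprocess
import sys


def env_with_thread_limit(limit=1, base=None):
    """Return a copy of the environment with OMP_THREAD_LIMIT set."""
    env = dict(os.environ if base is None else base)
    env["OMP_THREAD_LIMIT"] = str(limit)
    return env


def child_sees_limit(limit=1):
    """Start a child process and report the OMP_THREAD_LIMIT it observes."""
    # Stand-in child: prints the variable as seen from inside the new process.
    probe = "import os; print(os.environ.get('OMP_THREAD_LIMIT', 'unset'))"
    result = subprocess.run(
        [sys.executable, "-c", probe],
        env=env_with_thread_limit(limit),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

This only shows that the variable is inherited by the child. Whether the OpenMP runtime inside a given Tesseract build reads it is a separate question.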

We have one big image. Tesseract takes 7 seconds when we process it in a single pass.

When we divide the image into 4 parts and run 4 instances of Tesseract in parallel, it also takes 7 seconds: no change at all.

Expected Behavior:

We should see in Task Manager that Tesseract uses only 1 core.

There should be a significant improvement when running 4 images in parallel. Multiple people have had success with this method.

Zdenko Podobny

May 25, 2024, 5:14:36 AM
to tesser...@googlegroups.com
How did you install tesseract?

What is the output of `tesseract -v`?


Zdenko



Kassim Papa

May 25, 2024, 6:05:41 AM
to tesseract-ocr
We use a C# wrapper.
It is this .NET library, found on NuGet: https://github.com/charlesw/tesseract

Zdenko Podobny

May 25, 2024, 6:12:21 AM
to tesser...@googlegroups.com
You need to replicate it with the tesseract executable if you want to claim it is a Tesseract problem...
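
A reproduction with the executable could look roughly like the following Python sketch. It times a list of commands run one after another and then all at once, with `OMP_THREAD_LIMIT` set in the child environment. It assumes the `tesseract` executable is on `PATH` and uses the `tesseract <image> stdout` form of the command line. The helper names (`time_commands`, `compare`) are hypothetical and the image paths are whatever test files you supply.

```python
import os
import subprocess
import time
from concurrent.futures import ThreadPoolExecutor


def run_quiet(cmd, env):
    """Run one command, discard its output, and raise if it fails."""
    subprocess.run(cmd, env=env, stdout=subprocess.DEVNULL,
                   stderr=subprocess.DEVNULL, check=True)


def time_commands(commands, env, parallel):
    """Return wall-clock seconds to run all commands, serially or in parallel."""
    start = time.perf_counter()
    if parallel:
        with ThreadPoolExecutor(max_workers=len(commands)) as pool:
            # list() waits for every command and re-raises worker errors.
            list(pool.map(lambda c: run_quiet(c, env), commands))
    else:
        for cmd in commands:
            run_quiet(cmd, env)
    return time.perf_counter() - start


def tesseract_command(image_path):
    """Build a CLI call that OCRs one image and writes the text to stdout."""
    return ["tesseract", image_path, "stdout"]


def compare(whole_image, tile_images, limit=1):
    """Time the whole image against its tiles, serially and in parallel."""
    env = dict(os.environ, OMP_THREAD_LIMIT=str(limit))
    whole = [tesseract_command(whole_image)]
    tiles = [tesseract_command(p) for p in tile_images]
    return {
        "whole": time_commands(whole, env, parallel=False),
        "tiles_serial": time_commands(tiles, env, parallel=False),
        "tiles_parallel": time_commands(tiles, env, parallel=True),
    }
```

Calling `compare` with the full image and its four tiles gives three timings. If `tiles_parallel` is no better than `whole` here too, the behavior is reproducible without the wrapper; if it improves, the wrapper or the way the variable is set from C# becomes the more likely suspect.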

Zdenko



Kassim Papa

May 25, 2024, 6:27:59 AM
to tesseract-ocr
I do not claim anything.

Thank you for your suggestion. We will test that and post on their GitHub (charlesw, the guy who made the wrapper).

Stefan Weil closed the issue on Tesseract's GitHub, saying:

"The Tesseract for Windows which is provided by UB Mannheim does not have this issue: it runs always single-threaded because it was built with OpenMP disabled. You did not say what Tesseract binary and which version you used."

So I guess you must be right, this is where our effort should go.

But I didn't even know that, and I don't understand all the OpenMP changes that have been made. Could you explain them to me? Since the issue is closed, I cannot talk to Stefan Weil anymore.

Zdenko Podobny

May 25, 2024, 6:50:23 AM
to tesser...@googlegroups.com
Well, I would suggest making a reproducible case that proves the problem if you want help.
Based on the description you provided, nobody can help you (not even charlesw).
The problem could be somewhere in your code, in C#, in Tesseract, or even in your environment/OS...
You observed the problem => you need to narrow down where the source of the problem is.


Zdenko



Kassim Papa

May 25, 2024, 6:54:29 AM
to tesseract-ocr
Yeah sure,

We were just never able to get it to work, no matter what. But now we have a lead. We'll try replicating it with the tesseract executable.
But, for example, just knowing that OpenMP is disabled in the UB Mannheim build of Tesseract was huge. And I'd like charlesw to answer us on that front.

Ger Hobbelt

May 27, 2024, 6:08:00 AM
to tesseract-ocr
For what it's worth, I ran into the same issue on the same platform (MS Windows) about 2 years ago. Do note, however, that this was with my own Tesseract build!

My investigation at the time showed that OpenMP, once triggered to start, would run my 16 cores at 100% forever, irrespective of any work done. Using a sampling profiler, I found OpenMP was simply running 16 threads, where every thread that wasn't given any work at that moment was idling by spinning, checking the work queue, thus running the CPU to max temp without being very useful.

This experience with OpenMP at the time was in line with what I observed OpenMP doing with other test applications on my machine (not built by me): CPU nice & quiet until an OpenMP pragma is hit, then *BAM!* CPU maxing out all cores until end of application. The stuff that can run multithreaded does, but that's only ever part of the code / run-time, and OpenMP kept my cores at max throttle by spinning during the intermissions, until the application terminated... so the preliminary conclusion was that it was an issue inside OpenMP (or me missing non-obvious setting XYZ for OpenMP). Anyway, I booted OpenMP off my system and went back to doing multithreading old skool, which is sometimes hard but always felt more comfortable to me.
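
On the "missing setting" point: the OpenMP specification (3.0 and later) defines an `OMP_WAIT_POLICY` environment variable, whose `PASSIVE` value asks waiting threads not to consume CPU while idle. Whether the OpenMP runtime linked into a particular Windows build honors it is an assumption that would need checking, and it was not tested in this thread. The sketch below only shows how one might set it for a child process; `env_with_wait_policy` is a hypothetical helper name.

```python
import os

VALID_WAIT_POLICIES = ("ACTIVE", "PASSIVE")


def env_with_wait_policy(policy="PASSIVE", base=None):
    """Return a copy of the environment with OMP_WAIT_POLICY set.

    ACTIVE hints that idle OpenMP threads may spin; PASSIVE hints that they
    should yield or sleep. The runtime is free to ignore the hint.
    """
    policy = policy.upper()
    if policy not in VALID_WAIT_POLICIES:
        raise ValueError(f"unknown wait policy: {policy!r}")
    env = dict(os.environ if base is None else base)
    env["OMP_WAIT_POLICY"] = policy
    return env
```

The returned dictionary can be passed as the `env=` argument of `subprocess.run` when launching the program under test, so that runs with `ACTIVE` and `PASSIVE` can be compared in a profiler.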

Take-away: if you want to investigate what happens on your end, grab a sampling profiler (I used a commercial one from Intel at the time, IIRC) and build from C/C++ source (or use other means to get legible function names from debug info in the profiler reports), e.g. using Visual Studio. It's work, it's effort, but nobody else can look into your box(es), so otherwise you'll always depend on others' guesswork.


