Hi,
On 14/02/2023 19:10, Flávio. wrote:
> Sorry, how can I do that? I'm trying to send image binary data, not a
> path. The goal is to not write a file to disk and use only memory. Could
> you please write a code that sends the data (binary) to the stdin of
> tesseract? it can be in Python, Dart or Java :( I've tried ChatGPT but
> it is wrong and gets lost
Normally I'd say 'left as an exercise to the reader', but I happen to
have a snippet around that ought to give you a general idea.
This uses io.BytesIO in Python 3 as an in-memory stream to save the image
to. The stream holds an uncompressed PNG (compression would just slow
things down). It assumes that the variable "pil_image" contains a
PIL.Image object. Limiting Tesseract to one core is of course entirely
optional. I haven't tested this exact version (I modified it a bit; the
original works in another setting), but it should work in theory:
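> import io
> import logging
> import os
> import subprocess
>
> logger = logging.getLogger(__name__)
>
> with io.BytesIO() as output:
>     # compress_level=0 writes the PNG without compression
>     pil_image.save(output, format='PNG', compress_level=0)
>     output.seek(0)
>
>     # Let's just use one core in tesseract
>     env = os.environ.copy()
>     env['OMP_THREAD_LIMIT'] = '1'
>
>     # '-' '-': read the image from stdin, write the text to stdout
>     p = subprocess.Popen(['tesseract', '-', '-'],
>                          stdin=subprocess.PIPE,
>                          stdout=subprocess.PIPE,
>                          stderr=subprocess.PIPE,
>                          env=env)
>     # stdout holds the recognised text as bytes
>     stdout, stderr = p.communicate(output.read())
>     stderr = stderr.decode('utf-8')
>
>     if stderr:
>         logger.warning('tesseract_baselines stderr: %s', stderr)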
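If you want to check the piping part on its own, without Tesseract or PIL
installed, a rough sketch along these lines may help. The helper name
pipe_bytes and the ECHO_CMD stand-in are made up for illustration; they are
not part of Tesseract or of the snippet above. The stand-in child process
simply echoes stdin back to stdout, so you can confirm that binary data
survives the round trip through memory only.

```python
import os
import subprocess
import sys


def pipe_bytes(cmd, data, extra_env=None):
    """Run cmd, feed data to its stdin, return (stdout, stderr, returncode).

    Hypothetical helper for illustration: nothing is written to disk,
    the bytes go straight from memory into the child's stdin.
    """
    env = os.environ.copy()
    if extra_env:
        env.update(extra_env)
    proc = subprocess.run(cmd,
                          input=data,
                          stdout=subprocess.PIPE,
                          stderr=subprocess.PIPE,
                          env=env)
    stderr = proc.stderr.decode('utf-8', errors='replace')
    return proc.stdout, stderr, proc.returncode


# Stand-in for tesseract: a child Python process that copies stdin to stdout.
ECHO_CMD = [sys.executable, '-c',
            'import sys; sys.stdout.buffer.write(sys.stdin.buffer.read())']

if __name__ == '__main__':
    payload = b'\x89PNG\r\n\x1a\n' + bytes(range(256))
    out, err, rc = pipe_bytes(ECHO_CMD, payload)
    print(out == payload, rc)
```

With Tesseract installed, the same helper would presumably be called as
pipe_bytes(['tesseract', '-', '-'], png_bytes, {'OMP_THREAD_LIMIT': '1'}),
where png_bytes is whatever you saved into the BytesIO object.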
Regards,
Merlijn