--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tesseract-oc...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/ef745c36-3e31-42bc-b56e-c81bc5273f2e%40googlegroups.com.
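from tesserocr import PyTessBaseAPI, PSM, OEM
from PIL import Image

# Initialize the API once; the context manager releases it on exit.
with PyTessBaseAPI(psm=PSM.RAW_LINE, oem=OEM.LSTM_ONLY) as api:
    image = Image.open("test.jpg")
    api.SetImage(image)         # hand the in-memory PIL image to Tesseract
    text = api.GetUTF8Text()    # run recognition and fetch the result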
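As a possible extension of the snippet above (a sketch, not something posted in the thread): creating the API object loads the language model, so one object can be reused for several images instead of being re-created each time. The helper name `ocr_many` is hypothetical; `SetImage` and `GetUTF8Text` are the same tesserocr calls used above.

```python
def ocr_many(api, images):
    """Run OCR on each image with one already-initialized API object.

    `api` is expected to be an open tesserocr.PyTessBaseAPI (or any object
    exposing SetImage/GetUTF8Text); `images` is an iterable of PIL images.
    """
    results = []
    for image in images:
        api.SetImage(image)                # swap in the next image, keep the loaded model
        results.append(api.GetUTF8Text())  # recognize and collect the text
    return results
```

Inside the `with PyTessBaseAPI(...) as api:` block this could be called as `texts = ocr_many(api, [Image.open(p) for p in paths])`, where `paths` is an assumed list of image files.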
The Tesseract executable can read image data (not a NumPy array!) from stdin and write its output to stdout, so at least the disk IO can be avoided. I am not sure whether the first part (reading from stdin) can be implemented in pytesseract, but the second part should be no problem.

If somebody is serious about performance, using the tesseract executable is not a good approach: each time you start tesseract it needs to initialize the language model, e.g. read several MB from disk, which is a pure waste of resources, especially with small images.

Tesseract >= 4.1 also supports compressed language models, so this can help too if disk IO operations are the problem.

Zdenko
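The stdin/stdout route described above could look roughly like this. It is only a sketch: the function names are hypothetical, `tesseract` is assumed to be on the PATH, and the defaults (`eng`, page segmentation mode 13, i.e. raw line) are illustrative choices.

```python
import subprocess


def build_tesseract_command(lang="eng", psm=13):
    """Build a command that reads the image from stdin and prints text to stdout."""
    # "stdin" and "stdout" are special names understood by the tesseract executable.
    return ["tesseract", "stdin", "stdout", "-l", lang, "--psm", str(psm)]


def ocr_image_bytes(image_bytes, lang="eng", psm=13):
    """OCR an encoded image (e.g. the bytes of a PNG file) without touching the disk."""
    completed = subprocess.run(
        build_tesseract_command(lang, psm),
        input=image_bytes,        # encoded image file contents, not raw pixels
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        check=True,               # raise CalledProcessError on a non-zero exit
    )
    return completed.stdout.decode("utf-8")
```

A NumPy array would first have to be encoded in memory, for example with Pillow via `Image.fromarray(arr).save(buffer, format="PNG")` on an `io.BytesIO` buffer, and then passed as `ocr_image_bytes(buffer.getvalue())`. Every call still starts a new process and reloads the language model, so this only removes the temporary file, not the startup cost.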
On Wed, 30 Oct 2019 at 8:25, Juanjo Serrano Lloria <juanjo...@letsrebold.com> wrote:
Hi,

Perhaps a solution is to create a memory filesystem.
https://docs.pyfilesystem.org/en/latest/reference/memoryfs.html
On Wednesday, 30 October 2019, 6:28:57 (UTC+1), Ayush Pandey wrote:

Hi,

I want to run the trained Tesseract model through a Python script (using PyTesseract for this purpose right now). Is there a way to pass a NumPy array to Tesseract without saving it to disk (writing to disk is quite slow and time-consuming)?

Thanks and regards,
Ayush Pandey
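# raw_img: 2-D uint8 NumPy array; arguments are data, width, height, bytes per pixel, bytes per line
api.SetImageBytes(raw_img.tobytes(), raw_img.shape[1], raw_img.shape[0], 1, raw_img.shape[1])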
I recommend using this over pytesseract, even if the installation may sometimes be a little more complex.
Lorenzo
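The `SetImageBytes` call above hard-codes one byte per pixel. A small helper along these lines could derive the arguments from the array itself. This is a sketch under assumptions: the name `set_image_bytes_args` is hypothetical, the array is assumed to hold 8-bit data shaped (height, width) or (height, width, channels), and for colour input the channel order is assumed to be what Tesseract expects (RGB).

```python
def set_image_bytes_args(raw_img):
    """Return (data, width, height, bytes_per_pixel, bytes_per_line) for SetImageBytes.

    Assumes `raw_img` is a uint8 NumPy array shaped (height, width) for
    grayscale or (height, width, channels) for colour.
    """
    if raw_img.dtype.itemsize != 1:
        raise ValueError("expected 8-bit pixel data (dtype uint8)")
    height, width = raw_img.shape[:2]
    bytes_per_pixel = 1 if raw_img.ndim == 2 else raw_img.shape[2]
    # tobytes() emits rows back to back in C order, so there is no row padding.
    bytes_per_line = width * bytes_per_pixel
    return raw_img.tobytes(), width, height, bytes_per_pixel, bytes_per_line
```

With an open `PyTessBaseAPI` object this would be used as `api.SetImageBytes(*set_image_bytes_args(raw_img))`, followed by `api.GetUTF8Text()`.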