import cv2
import pytesseract
from pytesseract import Output
pytesseract.pytesseract.tesseract_cmd = 'C:\\Program Files\\Tesseract-OCR\\tesseract.exe'
img = cv2.imread('images/invoice-sample.jpg')
d = pytesseract.image_to_data(img, output_type=Output.DICT)
print(d.keys())
--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tesseract-oc...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/2d9f9f66-40a5-4ce9-9f14-cca48307e9f5%40googlegroups.com.
(base) PS C:\Users\Supharerk\ocr_server> & 'C:\Program Files\Tesseract-OCR\tesseract.exe' .\images\invoice-sample.jpg invoice-sample
Tesseract Open Source OCR Engine v5.0.0-alpha.20200223 with Leptonica
(base) PS C:\Users\Supharerk\ocr_server> pipenv run python .\app2.py
Traceback (most recent call last):
File ".\app2.py", line 10, in <module>
d=pytesseract.image_to_data(img,output_type=Output.DICT)
File "C:\Users\Supharerk\.virtualenvs\ocr_server-jUkFWk3u\lib\site-packages\pytesseract\pytesseract.py", line 426, in image_to_data
}[output_type]()
File "C:\Users\Supharerk\.virtualenvs\ocr_server-jUkFWk3u\lib\site-packages\pytesseract\pytesseract.py", line 424, in <lambda>
Output.DICT: lambda: file_to_dict(run_and_get_output(*args), '\t', -1),
File "C:\Users\Supharerk\.virtualenvs\ocr_server-jUkFWk3u\lib\site-packages\pytesseract\pytesseract.py", line 264, in run_and_get_output
return output_file.read().decode('utf-8').strip()
File "c:\users\supharerk\appdata\local\continuum\anaconda3\lib\contextlib.py", line 119, in __exit__
next(self.gen)
File "C:\Users\Supharerk\.virtualenvs\ocr_server-jUkFWk3u\lib\site-packages\pytesseract\pytesseract.py", line 176, in save
cleanup(f.name)
File "C:\Users\Supharerk\.virtualenvs\ocr_server-jUkFWk3u\lib\site-packages\pytesseract\pytesseract.py", line 136, in cleanup
raise e
File "C:\Users\Supharerk\.virtualenvs\ocr_server-jUkFWk3u\lib\site-packages\pytesseract\pytesseract.py", line 133, in cleanup
remove(filename)
PermissionError: [WinError 5] Access is denied: 'C:\\Users\\SUPHAR~1\\AppData\\Local\\Temp\\tess_y3d570lt'
Can you replicate the problem with command-line ("pure") tesseract? E.g.:
'C:\\Program Files\\Tesseract-OCR\\tesseract.exe' images/invoice-sample.jpg invoice-sample
Zdenko
On Fri, 28 Feb 2020 at 20:31, Supharerk Thawillarp <raynus...@gmail.com> wrote:
I'm new to Tesseract and am trying to follow a tutorial on Windows 10 using the code below:
import cv2
import pytesseract
from pytesseract import Output
pytesseract.pytesseract.tesseract_cmd = 'C:\\Program Files\\Tesseract-OCR\\tesseract.exe'
img = cv2.imread('images/invoice-sample.jpg')
d = pytesseract.image_to_data(img, output_type=Output.DICT)
print(d.keys())
The problem is that I keep getting the error PermissionError: [WinError 5] Access is denied when calling image_to_data and image_to_string on Windows 10.
The only advice I found on Stack Overflow is to set tesseract_cmd, PATH and TESSDATA_PREFIX, which did not work for me. Running from an administrator cmd did not work either.
After spending a couple of hours, I found that changing the permissions on tesseract.exe (right-click, select Properties, go to the Security tab) by checking Full control and Modify makes it work.
I hope this helps people struggling with the same problem.
>>> pytesseract.pytesseract.tesseract_cmd = 'C:\\Program Files\\Tesseract-OCR\\tesseract.exe'
>>> pytesseract.get_tesseract_version()
LooseVersion ('5.0.0-alpha.20200223')
This means there is a problem with pytesseract/Python permissions. Can you get the output of pytesseract.get_tesseract_version()?
Zdenko
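One way to narrow down a suspected Python-side permission problem is to check whether the Python process itself can create and then delete a file in the temp directory, independent of Tesseract. The snippet below is only a rough sketch using the standard library; the function name can_create_and_remove_temp_file is made up for illustration and is not part of pytesseract.

```python
import os
import tempfile


def can_create_and_remove_temp_file(prefix="tess_"):
    """Return (ok, detail): whether a temp file can be created and then deleted."""
    try:
        # delete=False so that removal is an explicit step we can observe.
        with tempfile.NamedTemporaryFile(prefix=prefix, delete=False) as f:
            f.write(b"probe")
            name = f.name
        # The handle is closed at this point, so only permissions can block removal.
        os.remove(name)
    except OSError as exc:
        return False, f"{type(exc).__name__}: {exc}"
    return True, name


if __name__ == "__main__":
    ok, detail = can_create_and_remove_temp_file()
    print("temp dir:", tempfile.gettempdir())
    print("ok:", ok, "|", detail)
```

If this check succeeds while image_to_data still fails, the temp directory itself is probably not the cause, and the problem more likely lies in how the temp file is handled while Tesseract runs.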
On Sat, 29 Feb 2020 at 12:10, Supharerk Thawillarp <raynus...@gmail.com> wrote:
import tempfile
import cv2
import pytesseract
from PIL import Image
from pytesseract import Output

pytesseract.pytesseract.tesseract_cmd = 'C:\\Program Files\\Tesseract-OCR\\tesseract.exe'
img = cv2.imread('images/invoice-sample.jpg')

# check temp file
temp_file = tempfile.NamedTemporaryFile(prefix='tess_')
print(temp_file.name)
image = Image.fromarray(img)
image.save(temp_file.name + '.png', format='png', **image.info)
temp_file.close()

if img.any():
    print("Image shape:", img.shape)
    data_dict = pytesseract.image_to_data(img, output_type=Output.DICT)
    n_boxes = len(data_dict['level'])
    for i in range(n_boxes):
        (x, y, w, h) = (data_dict['left'][i], data_dict['top'][i], data_dict['width'][i], data_dict['height'][i])
        cv2.rectangle(img, (x, y), (x + w, y + h), (255, 125, 125), 2)
    cv2.imshow('img', img)
    cv2.waitKey(0)
else:
    print("Can not open input file")
PS C:\Users\Supharerk\ocr_server> pipenv run python .\test_tess.py
C:\Users\SUPHAR~1\AppData\Local\Temp\tess_g9e7avw0
Image shape: (1150, 835, 3)
Traceback (most recent call last):
File ".\test_tess.py", line 19, in <module>
data_dict = pytesseract.image_to_data(img, output_type=Output.DICT)
File "C:\Users\Supharerk\.virtualenvs\ocr_server-jUkFWk3u\lib\site-packages\pytesseract\pytesseract.py", line 426, in image_to_data
}[output_type]()
File "C:\Users\Supharerk\.virtualenvs\ocr_server-jUkFWk3u\lib\site-packages\pytesseract\pytesseract.py", line 424, in <lambda>
Output.DICT: lambda: file_to_dict(run_and_get_output(*args), '\t', -1),
File "C:\Users\Supharerk\.virtualenvs\ocr_server-jUkFWk3u\lib\site-packages\pytesseract\pytesseract.py", line 264, in run_and_get_output
return output_file.read().decode('utf-8').strip()
File "c:\users\supharerk\appdata\local\continuum\anaconda3\lib\contextlib.py", line 119, in __exit__
next(self.gen)
File "C:\Users\Supharerk\.virtualenvs\ocr_server-jUkFWk3u\lib\site-packages\pytesseract\pytesseract.py", line 176, in save
cleanup(f.name)
File "C:\Users\Supharerk\.virtualenvs\ocr_server-jUkFWk3u\lib\site-packages\pytesseract\pytesseract.py", line 136, in cleanup
raise e
File "C:\Users\Supharerk\.virtualenvs\ocr_server-jUkFWk3u\lib\site-packages\pytesseract\pytesseract.py", line 133, in cleanup
remove(filename)
PermissionError: [WinError 5] Access is denied: 'C:\\Users\\SUPHAR~1\\AppData\\Local\\Temp\\tess_69cggzq3'
On Sat, 29 Feb 2020 at 19:04, Supharerk Thawillarp <raynus...@gmail.com> wrote:
@contextmanager
def save(image):
    try:
        with NamedTemporaryFile(prefix='tess_', delete=False) as f:
            if isinstance(image, str):
                yield f.name, realpath(normpath(normcase(image)))
                return
Hello,
I am not able to reproduce the error. The error comes from here [1], where pytesseract tries to clean up temporary files. You should report it to the pytesseract project, as there is no option to skip this code.
Maybe you can try to modify this part of the pytesseract code [2]:
    finally:
        cleanup(f.name)
to
    finally:
        f.close()
        cleanup(f.name)
Zdenko
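The idea behind the suggested change (release the file handle before deleting the file) can be tried in isolation. The following is a minimal sketch using only the standard library; save_bytes and this cleanup are hypothetical stand-ins written for illustration and are not pytesseract's actual functions. On Windows, removing a file that is still open in the same process typically fails, which is why the close comes first.

```python
import os
from contextlib import contextmanager
from tempfile import NamedTemporaryFile


def cleanup(filename):
    """Remove a temp file, ignoring the case where it is already gone."""
    try:
        os.remove(filename)
    except FileNotFoundError:
        pass


@contextmanager
def save_bytes(data):
    """Write data to a temp file, yield its path, then close and remove it."""
    f = NamedTemporaryFile(prefix="tess_", delete=False)
    try:
        f.write(data)
        f.flush()
        yield f.name
    finally:
        f.close()        # release the handle first
        cleanup(f.name)  # then delete the file


with save_bytes(b"example") as path:
    existed_inside = os.path.exists(path)
removed_after = not os.path.exists(path)
print(existed_inside, removed_after)
```

Whether this ordering resolves the PermissionError reported in this thread is not confirmed here; it only illustrates the close-then-remove pattern being proposed.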
On Sun, 1 Mar 2020 at 14:11, Supharerk Thawillarp <raynus...@gmail.com> wrote: