WinError 5 PermissionError on Windows 10


Supharerk Thawillarp

Feb 28, 2020, 2:31:07 PM
to tesseract-ocr

I'm new to Tesseract and am trying to follow a tutorial on Windows 10 using the code below:

import cv2
import pytesseract
from pytesseract import Output

pytesseract.pytesseract.tesseract_cmd = 'C:\\Program Files\\Tesseract-OCR\\tesseract.exe'

img = cv2.imread('images/invoice-sample.jpg')
d = pytesseract.image_to_data(img, output_type=Output.DICT)
print(d.keys())




The problem is that I keep getting PermissionError: [WinError 5] Access is denied when calling image_to_data and image_to_string on Windows 10.

The only advice I found on Stack Overflow was to set tesseract_cmd, PATH and TESSDATA_PREFIX, which did not work for me. Running from an administrator command prompt did not help either.
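
For anyone checking the same three settings, here is a minimal sketch that prints what Python actually sees. It uses only the standard library. The helper name describe_tesseract_setup is hypothetical, and the install path is simply the one from the script above. Note that os.access is only a rough hint on Windows and may not reflect the ACLs shown in the Security tab.

```python
import os
import shutil

# Assumed install location, copied from the script in this post; adjust as needed.
TESSERACT_CMD = r"C:\Program Files\Tesseract-OCR\tesseract.exe"


def describe_tesseract_setup(cmd=TESSERACT_CMD):
    """Collect basic facts about how tesseract would be located and run."""
    return {
        "cmd": cmd,
        "cmd_exists": os.path.isfile(cmd),
        # Rough hint only: on Windows this may not reflect NTFS permissions.
        "cmd_executable": os.access(cmd, os.X_OK),
        # Full path of a tesseract executable found via PATH, or None.
        "found_on_path": shutil.which("tesseract"),
        # Value of TESSDATA_PREFIX, or None if the variable is not set.
        "tessdata_prefix": os.environ.get("TESSDATA_PREFIX"),
    }


for key, value in describe_tesseract_setup().items():
    print(f"{key}: {value}")
```

If cmd_exists is True and the error still appears, the cause is probably not the tesseract_cmd setting itself.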

After spending a couple of hours, I found that changing the permissions on tesseract.exe (right-click, select Properties, go to the Security tab) and checking Full control and Modify (screenshots below) makes it work.

I hope this helps others struggling with the same problem.


[Attachments: 1582917756731.jpg, 1582917788913.jpg]


Zdenko Podobny

Feb 29, 2020, 4:19:41 AM
to tesser...@googlegroups.com
Can you replicate the problem with "pure" tesseract on the command line? E.g.:
"C:\Program Files\Tesseract-OCR\tesseract.exe" images/invoice-sample.jpg invoice-sample
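
A related check is to launch the same command from Python with subprocess, bypassing pytesseract, to see whether starting the executable from Python is itself the problem. The following is only a sketch: run_cli is a hypothetical helper, and the paths are the ones used in this thread.

```python
import subprocess
import sys


def run_cli(args, timeout=60):
    """Run a command without a shell and return (returncode, stdout, stderr)."""
    completed = subprocess.run(args, capture_output=True, text=True, timeout=timeout)
    return completed.returncode, completed.stdout, completed.stderr


# Paths taken from this thread; adjust to your installation and image.
TESSERACT_CMD = r"C:\Program Files\Tesseract-OCR\tesseract.exe"
ARGS = [TESSERACT_CMD, "images/invoice-sample.jpg", "invoice-sample"]

if __name__ == "__main__":
    try:
        code, out, err = run_cli(ARGS)
        print("exit code:", code)
        print(err or out)
    except OSError as exc:
        # OSError covers FileNotFoundError and PermissionError raised at launch.
        print("could not launch tesseract:", exc, file=sys.stderr)
```

If this runs cleanly but pytesseract still fails, the executable can be started from Python and the problem lies elsewhere.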

Zdenko


On Fri, Feb 28, 2020 at 20:31, Supharerk Thawillarp <raynus....@gmail.com> wrote:
--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tesseract-oc...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/2d9f9f66-40a5-4ce9-9f14-cca48307e9f5%40googlegroups.com.

Supharerk Thawillarp

Feb 29, 2020, 6:10:41 AM
to tesseract-ocr
No, tesseract runs successfully from the command line and writes its output to a text file.

(base) PS C:\Users\Supharerk\ocr_server> & 'C:\Program Files\Tesseract-OCR\tesseract.exe' .\images\invoice-sample.jpg invoice-sample
Tesseract Open Source OCR Engine v5.0.0-alpha.20200223 with Leptonica



However, WinError 5 arises again when running from Python (with pipenv):
(base) PS C:\Users\Supharerk\ocr_server> pipenv run python .\app2.py
Traceback (most recent call last):
  File ".\app2.py", line 10, in <module>
    d=pytesseract.image_to_data(img,output_type=Output.DICT)
  File "C:\Users\Supharerk\.virtualenvs\ocr_server-jUkFWk3u\lib\site-packages\pytesseract\pytesseract.py", line 426, in image_to_data
    }[output_type]()
  File "C:\Users\Supharerk\.virtualenvs\ocr_server-jUkFWk3u\lib\site-packages\pytesseract\pytesseract.py", line 424, in <lambda>
    Output.DICT: lambda: file_to_dict(run_and_get_output(*args), '\t', -1),
  File "C:\Users\Supharerk\.virtualenvs\ocr_server-jUkFWk3u\lib\site-packages\pytesseract\pytesseract.py", line 264, in run_and_get_output
    return output_file.read().decode('utf-8').strip()
  File "c:\users\supharerk\appdata\local\continuum\anaconda3\lib\contextlib.py", line 119, in __exit__
    next(self.gen)
  File "C:\Users\Supharerk\.virtualenvs\ocr_server-jUkFWk3u\lib\site-packages\pytesseract\pytesseract.py", line 176, in save
    cleanup(f.name)
  File "C:\Users\Supharerk\.virtualenvs\ocr_server-jUkFWk3u\lib\site-packages\pytesseract\pytesseract.py", line 136, in cleanup
    raise e
  File "C:\Users\Supharerk\.virtualenvs\ocr_server-jUkFWk3u\lib\site-packages\pytesseract\pytesseract.py", line 133, in cleanup
    remove(filename)
PermissionError: [WinError 5] Access is denied: 'C:\\Users\\SUPHAR~1\\AppData\\Local\\Temp\\tess_y3d570lt'
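
The traceback ends in remove(filename) on a file under the temp directory, so one quick check is whether Python can create and delete a file there at all. Here is a minimal standard-library sketch. The helper name can_create_and_remove and the tess_check_ prefix are made up for illustration.

```python
import os
import tempfile


def can_create_and_remove(directory=None):
    """Create a temp file, close it, then delete it; report any OSError."""
    directory = directory or tempfile.gettempdir()
    fd, path = tempfile.mkstemp(prefix="tess_check_", dir=directory)
    os.close(fd)  # release the handle before trying to delete the file
    try:
        os.remove(path)
    except OSError as exc:
        return False, f"{type(exc).__name__}: {exc}"
    return True, path


ok, detail = can_create_and_remove()
print("temp dir:", tempfile.gettempdir())
print("create/remove works:", ok, "|", detail)
```

If this succeeds, plain permissions on the temp directory are unlikely to be the cause, and an open handle on the file becomes the more likely suspect.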







Zdenko Podobny

Feb 29, 2020, 12:21:26 PM
to tesser...@googlegroups.com
This means there is a problem with pytesseract/Python permissions.

Can you post the output of pytesseract.get_tesseract_version()?

Zdenko



Supharerk Thawillarp

Feb 29, 2020, 1:04:14 PM
to tesseract-ocr
Sure

>>> pytesseract.pytesseract.tesseract_cmd = 'C:\\Program Files\\Tesseract-OCR\\tesseract.exe'
>>> pytesseract.get_tesseract_version()
LooseVersion ('5.0.0-alpha.20200223')



Zdenko Podobny

Feb 29, 2020, 3:05:39 PM
to tesser...@googlegroups.com
1. Make sure you have the latest version of tesseract.
Then try this script and post the exact, full error message:
import tempfile

import cv2
import pytesseract
from PIL import Image

from pytesseract import Output

pytesseract.pytesseract.tesseract_cmd = 'C:\\Program Files\\Tesseract-OCR\\tesseract.exe'
img = cv2.imread('images/invoice-sample.jpg')
# check temp file
temp_file = tempfile.NamedTemporaryFile(prefix='tess_')
print(temp_file.name)
image = Image.fromarray(img)
image.save(temp_file.name + '.png', format='png', **image.info)
temp_file.close()

if img.any():
    print("Image shape:", img.shape)
    data_dict = pytesseract.image_to_data(img, output_type=Output.DICT)
    n_boxes = len(data_dict['level'])
    for i in range(n_boxes):
        (x, y, w, h) = (data_dict['left'][i], data_dict['top'][i], data_dict['width'][i], data_dict['height'][i])
        cv2.rectangle(img, (x, y), (x + w, y + h), (255, 125, 125), 2)
    cv2.imshow('img', img)
    cv2.waitKey(0)
else:
    print("Can not open input file")



Zdenko



Supharerk Thawillarp

Mar 1, 2020, 8:10:57 AM
to tesseract-ocr
OK, it gave me WinError 5 again.


PS C:\Users\Supharerk\ocr_server> pipenv run python .\test_tess.py
C:\Users\SUPHAR~1\AppData\Local\Temp\tess_g9e7avw0
Image shape: (1150, 835, 3)
Traceback (most recent call last):
  File ".\test_tess.py", line 19, in <module>
    data_dict = pytesseract.image_to_data(img, output_type=Output.DICT)
  File "C:\Users\Supharerk\.virtualenvs\ocr_server-jUkFWk3u\lib\site-packages\pytesseract\pytesseract.py", line 426, in image_to_data
    }[output_type]()
  File "C:\Users\Supharerk\.virtualenvs\ocr_server-jUkFWk3u\lib\site-packages\pytesseract\pytesseract.py", line 424, in <lambda>
    Output.DICT: lambda: file_to_dict(run_and_get_output(*args), '\t', -1),
  File "C:\Users\Supharerk\.virtualenvs\ocr_server-jUkFWk3u\lib\site-packages\pytesseract\pytesseract.py", line 264, in run_and_get_output
    return output_file.read().decode('utf-8').strip()
  File "c:\users\supharerk\appdata\local\continuum\anaconda3\lib\contextlib.py", line 119, in __exit__
    next(self.gen)
  File "C:\Users\Supharerk\.virtualenvs\ocr_server-jUkFWk3u\lib\site-packages\pytesseract\pytesseract.py", line 176, in save
    cleanup(f.name)
  File "C:\Users\Supharerk\.virtualenvs\ocr_server-jUkFWk3u\lib\site-packages\pytesseract\pytesseract.py", line 136, in cleanup
    raise e
  File "C:\Users\Supharerk\.virtualenvs\ocr_server-jUkFWk3u\lib\site-packages\pytesseract\pytesseract.py", line 133, in cleanup
    remove(filename)
PermissionError: [WinError 5] Access is denied: 'C:\\Users\\SUPHAR~1\\AppData\\Local\\Temp\\tess_69cggzq3'






Zdenko Podobny

Mar 1, 2020, 11:40:26 AM
to tesser...@googlegroups.com
Hello,

I am not able to reproduce the error. It comes from here [1], where pytesseract tries to clean up temporary files.
You should report it to the pytesseract project, as there is no option to skip this code.
Maybe you can try modifying this part of the pytesseract code [2]:
finally:
    cleanup(f.name)
to
finally:
    f.close()
    cleanup(f.name)
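
The idea behind that change is to release the file handle before deleting anything, since on Windows a file that is still open typically cannot be removed. It can be sketched outside pytesseract as follows. This is only an illustration: remove_with_prefix is a hypothetical stand-in for a cleanup step and is not pytesseract's actual implementation.

```python
import glob
import os
import tempfile


def remove_with_prefix(path_prefix):
    """Remove every file whose path starts with path_prefix; skip missing ones."""
    removed = []
    for filename in glob.glob(glob.escape(path_prefix) + "*"):
        try:
            os.remove(filename)
            removed.append(filename)
        except FileNotFoundError:
            pass  # already deleted, nothing to do
    return removed


f = tempfile.NamedTemporaryFile(prefix="tess_demo_")
try:
    # Simulate an output file written next to the temp file by another program.
    with open(f.name + ".txt", "w") as out:
        out.write("result")
finally:
    # Release the handle first; with the default delete=True this also
    # deletes the temp file itself.
    f.close()
    # Then remove whatever is left over with the same name prefix.
    removed = remove_with_prefix(f.name)

print([os.path.basename(p) for p in removed])
```

Here only the simulated ".txt" output remains for the cleanup step, because closing the handle has already removed the temporary file.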

 
Zdenko



Supharerk Thawillarp

Mar 1, 2020, 12:17:06 PM
to tesseract-ocr
After digging into pytesseract.py, I found a possibly related issue with NamedTemporaryFile.

Following this Stack Overflow post (https://stackoverflow.com/questions/55081022/python-tempfile-with-a-context-manager-on-windows-10-leads-to-permissionerror), I added the delete=False argument to the NamedTemporaryFile call in pytesseract.py:


@contextmanager
def save(image):
    try:
        with NamedTemporaryFile(prefix='tess_', delete=False) as f:
            if isinstance(image, str):
                yield f.name, realpath(normpath(normcase(image)))
                return



It has been working since then.
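
For readers who want to see the delete=False pattern in isolation, here is a minimal standalone sketch. It is not the pytesseract code; temp_path is a hypothetical helper. With delete=False, closing the handle leaves the file on disk, so it can be reopened by name and must be removed explicitly afterwards.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def temp_path(prefix="tess_demo_"):
    """Yield the path of a closed temp file and remove it afterwards."""
    f = tempfile.NamedTemporaryFile(prefix=prefix, delete=False)
    f.close()  # the file stays on disk because delete=False
    try:
        yield f.name
    finally:
        try:
            os.remove(f.name)  # explicit cleanup replaces delete-on-close
        except FileNotFoundError:
            pass


with temp_path() as path:
    # The handle is closed here, so the file can be reopened by name.
    with open(path, "w") as handle:
        handle.write("hello")
    existed_inside = os.path.exists(path)

print(existed_inside, os.path.exists(path))
```

The trade-off is that nothing deletes the file automatically any more, so the cleanup in the finally block becomes essential.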

I will forward this thread and the issue to pytesseract.

Thanks for your help.








Zdenko Podobny

Mar 1, 2020, 12:19:20 PM
to tesser...@googlegroups.com
Anyway, report it to the pytesseract project so it can be fixed; otherwise the next update will bring the problem back.

Zdenko

