Tesseract privacy statement

Jonne Kurokallio

Nov 16, 2020, 2:50:19 AM
to tesseract-ocr
Hi,

I was wondering what Tesseract's position on privacy is. I'm working with data that has a relatively high security classification, and I need to make sure Tesseract OCR does not save that data or send it anywhere. Is there a documented statement on privacy?

With kind regards,
Jonne Kurokallio

Jonne Kurokallio

Nov 23, 2020, 7:46:38 AM
to tesseract-ocr
Hi,

Is there anyone that could please help me out with this one? 

With kind regards,
Jonne Kurokallio

Tom Morris

Nov 23, 2020, 9:44:05 PM
to tesseract-ocr
On Monday, November 16, 2020 at 2:50:19 AM UTC-5 Jonne Kurokallio wrote:

I was wondering what Tesseract's position on privacy is. I'm working with data that has a relatively high security classification, and I need to make sure Tesseract OCR does not save that data or send it anywhere. Is there a documented statement on privacy?

No, there isn't. Assertions like this typically come from an organization that will stand behind them. Tesseract is developed by Google and a loose confederation of independent open source developers, with no official organizational body behind it.

If you want to maximize the security of your Tesseract instance, you should only use binaries that you build yourself or that come from a party you trust, and you should audit the source code to whatever level of due diligence you think is appropriate. If you're concerned about exfiltration through the network, you could run your OCR on a machine with no external network access.
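
If a fully air-gapped machine isn't practical, one rough approximation on Linux is to start Tesseract in its own empty network namespace. The sketch below is only an illustration under some assumptions: it assumes the `unshare` tool from util-linux and the `tesseract` command-line binary are on the PATH, and that unprivileged user namespaces are enabled on the machine. The helper names `isolated_ocr_command` and `run_isolated_ocr` are made up for this example. It only addresses the network side; it says nothing about what the process writes to disk, and it is not a substitute for auditing the binary.

```python
import shutil
import subprocess
from typing import List


def isolated_ocr_command(image_path: str, lang: str = "eng") -> List[str]:
    """Build an argv that runs tesseract inside a fresh network namespace.

    Hypothetical helper for illustration. A new network namespace starts
    with no usable interfaces, so the process has no route to the outside.
    """
    return [
        # Linux only: new user namespace (so root is not required) plus
        # a new, empty network namespace.
        "unshare", "--map-root-user", "--net",
        # "stdout" as the output base makes tesseract print the text
        # instead of writing an output file.
        "tesseract", image_path, "stdout",
        "-l", lang,
    ]


def run_isolated_ocr(image_path: str, lang: str = "eng") -> str:
    """Run OCR without network access and return the recognized text."""
    for tool in ("unshare", "tesseract"):
        if shutil.which(tool) is None:
            raise RuntimeError(f"{tool} not found on PATH")
    result = subprocess.run(
        isolated_ocr_command(image_path, lang),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the command fails
    )
    return result.stdout
```

Calling `run_isolated_ocr("scan.png")` (with your own file in place of the placeholder name) should return the recognized text as a string. Building the command in a separate function makes it easy to inspect exactly what will be executed before running it on sensitive data.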

For what it's worth, I've never heard of anyone complaining about Trojans in Tesseract, but that could just mean that they are well hidden.

Tom