I've come across a few OCR projects on GitHub and elsewhere while searching for Python ML projects for OCR.
1.
I'll be trying it soon on a Manjaro VM.
If anyone has experience with this OCR, please share it here.
2.
This works well for English and Chinese. It claims to support Hindi, but my results with scanned documents have not been good.
3.
I haven't tried this yet. The details suggest it may work on single lines only, though I'm not sure.
A blog post by the author is here.
4.
It works well for Hindi, but results for Sanskrit are poor. It may need training.
I'm tired of online solutions like Google Vision or Google Drive. An offline solution would let me run OCR on whatever PDFs I have without depending on the Internet, and I could then use the output to edit/train/publish/blog/search/research, etc.
I'd like to request affluent people/programmers/coders to test these options and write up their experience or a guide for others like us. If someone is able to use or create training data for any of these OCR projects and get results similar to or better than Google Vision, that would be far better.
One could then use tkinter, etc. to create a usable GUI, or a locally deployable solution with a web UI, for elderly or non-tech-savvy Sanskrit scholars.
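For anyone who does test these, one way to put a number on "similar/better results than Google Vision" is the character error rate (CER): the edit distance between the OCR output and a hand-corrected reference, divided by the length of the reference. Below is a minimal sketch using only the Python standard library. The function names `levenshtein` and `character_error_rate` are my own and not part of any of the OCR projects above, and it counts Unicode code points, so a Devanagari consonant and its matra count as two characters.

```python
import unicodedata


def levenshtein(a: str, b: str) -> int:
    """Edit distance (insertions, deletions, substitutions) between a and b."""
    # Keep the shorter string in the inner loop so the row stays small.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """CER = edit distance / length of the reference text (lower is better)."""
    # Normalise both texts so equivalent Unicode sequences compare equal.
    ref = unicodedata.normalize("NFC", reference)
    hyp = unicodedata.normalize("NFC", hypothesis)
    if not ref:
        raise ValueError("reference text must not be empty")
    return levenshtein(ref, hyp) / len(ref)
```

To compare two engines, run both on the same page, type out (or correct) the true text once, and call `character_error_rate(true_text, ocr_output)` for each; the engine with the lower value did better on that page.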