OpenOCR - run your own Tesseract-OCR-as-a-Service

Traun Leyden

May 16, 2014, 6:06:53 PM

to tesser...@googlegroups.com

Hi all,

I want to announce OpenOCR:

https://github.com/tleyden/open-ocr

It has everything you need to run Tesseract in the cloud behind a REST API, and it was designed with horizontal scalability in mind.
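For anyone wondering what calling it from a client might look like, here is a rough sketch in Python. The `/ocr` path, the `img_url` and `engine` field names, and the helper names `build_ocr_request` and `run_ocr` are assumptions made for illustration, so please check the README in the repo for the actual request format before relying on this.

```python
import json
import urllib.request


def build_ocr_request(base_url: str, image_url: str) -> urllib.request.Request:
    """Build (but do not send) a JSON POST asking the service to OCR an image."""
    # Assumed payload shape; the field names are illustrative, not confirmed.
    payload = {"img_url": image_url, "engine": "tesseract"}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/ocr",  # assumed endpoint path
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_ocr(base_url: str, image_url: str, timeout: float = 30.0) -> str:
    """Send the request and return the response body as text."""
    request = build_ocr_request(base_url, image_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Building the request is kept separate from sending it so the payload can be inspected without a running server. With a server up, something like `run_ocr("http://localhost:8080", "http://example.com/scan.png")` would send the job, where both the host/port and the image URL are placeholders.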

I created this because I wanted to integrate OCR functionality into a mobile app, but I wanted the image processing to happen in the cloud rather than on the device, and I couldn't find any free or cheap OCR-as-a-service providers. So I figured I'd build my own.

I decided to open source everything, including all the glue code that wraps Tesseract and handles the RabbitMQ job queuing. I'm hoping that other people will find bugs and contribute fixes, which will help improve the codebase for everyone using it.

It uses Docker for containerization, so it can run on Docker-aware PaaS platforms or on Amazon EC2 machines running Docker.