caffe model for OCR

Abhinav Kumar Gupta

Sep 19, 2015, 1:31:01 AM
to Caffe Users
Hi all,

We are looking for a CNN Caffe model for the text spotting/text segmentation step of an OCR pipeline. Any leads? If anyone can share a starting point, either training data or a model, that would be great.

Thanks a lot in advance.

Thanks,
Abhinav


Seb Testeau

Sep 28, 2015, 9:57:37 AM
to Caffe Users
Hello,

I found this project, but it does no text spotting and works at the character level: https://github.com/pannous/caffe-ocr

There is also an LSTM neural network implementation, though, from the same person who made ocropy: https://github.com/tmbdev/clstm

I am curious to see what you end up using. For now, we have decided to use Tesseract.

seb

Abhinav Kumar Gupta

Sep 29, 2015, 7:51:01 AM
to Seb Testeau, Caffe Users
Hi Seb,

We are trying to replicate the results of Max Jaderberg. Here is the link: http://www.robots.ox.ac.uk/~vgg/research/text/

We haven't been successful yet, but we hope to develop something similar in the next 2-3 weeks.

We would appreciate all kinds of guidance and help in the process.

Thanks,
Abhinav

Seb Testeau

Sep 30, 2015, 12:06:15 PM
to Caffe Users, stes...@gmail.com
Hi Abhinav,

I just installed the project and I tested it on some of our images. 

The text spotting seems very powerful, but how is it better than "Real-Time Scene Text Localization and Recognition" (CVPR 2012)?

I wouldn't be able to justify adapting this for a small improvement. 

As for the OCR, it returned gibberish in our use case (subtitles in movies). Tesseract works better for us.

seb

Abhinav Kumar Gupta

Oct 1, 2015, 3:05:03 PM
to Seb Testeau, Caffe Users
Thanks Seb,

Thanks for the link. We will try both approaches and test them against our requirements.

What is the best way to generate synthetic training data for text segmentation problems? Can you suggest some tools for this?

I will keep you updated on the progress.

Thanks,
Abhinav

Pooja

Apr 4, 2016, 3:30:10 PM
to Caffe Users, stes...@gmail.com
Hi Abhinav,

Is there any progress in replicating Jaderberg's results? Have you simulated the end-to-end framework?

I'm not able to figure out which particular bigram each label signifies in the bigram_net output layers.
I guess that information is missing.

Also, I have used Tesseract in the past. Feeding it a segmented word image gives fairly good results.
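
For anyone who wants to try the word-level route, here is a minimal sketch of calling the Tesseract command-line tool on a cropped word image. It assumes Tesseract 4 or later is installed and on PATH (older 3.x builds spell the flag -psm instead of --psm). The helper names build_tesseract_command and recognize_word and the file name word.png are made up for illustration. Page segmentation mode 8 tells Tesseract to treat the image as a single word.

```python
import shutil
import subprocess


def build_tesseract_command(image_path, psm=8, lang="eng"):
    """Build the CLI argument list for recognising one cropped word image.

    Hypothetical helper for illustration; not part of Tesseract itself.
    """
    # Passing "stdout" as the output base makes tesseract print the text
    # instead of writing it to a .txt file.
    return ["tesseract", str(image_path), "stdout", "--psm", str(psm), "-l", lang]


def recognize_word(image_path, psm=8, lang="eng"):
    """Run Tesseract on a single word image and return the recognised text."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    result = subprocess.run(
        build_tesseract_command(image_path, psm=psm, lang=lang),
        capture_output=True,  # collect stdout/stderr instead of printing them
        text=True,            # decode output as str
        check=True,           # raise CalledProcessError on a non-zero exit
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # word.png is a placeholder for a word crop produced by a text spotter.
    print(recognize_word("word.png"))
```

Mode 7 (single text line) may be a better fit when the segmentation step returns whole lines rather than individual words.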

Thanks,
Pooja

Corey Nolet

May 18, 2016, 3:31:15 PM
to Caffe Users
I'm also wondering about this. I've read Jaderberg's dissertation and I'm going to implement what I can from it as well. It would be nice to know whether there is already an effort underway, so that I can redirect my efforts towards that instead of reinventing the wheel.