Training with word images part 2


Raj Julha

Aug 9, 2011, 12:02:52 PM
to ocropus
Hi

I have the following:
1. A full-page handwritten image sample
2. The bounding boxes (coordinates) of each word on the page/image
3. A text file with the corresponding ground truth for each word on
the image.

My first attempt was to create an image snippet of each word and a
corresponding text file containing the word. I thought I'd use these as
input for training, but I couldn't proceed any further because the
ocropus-align command doesn't seem to match the images to the text. Should I
programmatically create the page-level pseg file?
Or should I instead create one image per line (I have the necessary
coordinates for each line)? The reason I took the word-level route was to
keep the engine from creating line images incorrectly, since the
handwritten text does not follow a straight line.
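
To make the per-line option concrete, this is roughly how the word boxes could be merged into line boxes with their ground truth. It is only a sketch under my own assumptions: the (x0, y0, x1, y1) box format, the integer line ids, and the function name merge_words_into_lines are hypothetical and not anything OCRopus defines.

```python
from collections import defaultdict


def merge_words_into_lines(words):
    """Merge word-level boxes into one bounding box and one text string per line.

    `words` is a list of (line_id, (x0, y0, x1, y1), text) tuples, given in
    reading order within each line. This layout is an assumption made for
    illustration only.
    """
    boxes = {}
    texts = defaultdict(list)
    for line_id, (x0, y0, x1, y1), text in words:
        if line_id in boxes:
            # Grow the line box so that it also covers this word.
            bx0, by0, bx1, by1 = boxes[line_id]
            boxes[line_id] = (min(bx0, x0), min(by0, y0),
                              max(bx1, x1), max(by1, y1))
        else:
            boxes[line_id] = (x0, y0, x1, y1)
        texts[line_id].append(text)
    # One (line_id, box, ground truth) entry per line, ordered by line id.
    return [(line_id, boxes[line_id], " ".join(texts[line_id]))
            for line_id in sorted(boxes)]


if __name__ == "__main__":
    sample = [
        (0, (10, 12, 50, 30), "Dear"),
        (0, (60, 8, 120, 34), "Sir"),
        (1, (10, 50, 80, 75), "Thanks"),
    ]
    for line_id, box, text in merge_words_into_lines(sample):
        print(line_id, box, text)
```

Each resulting box could then be cropped from the page image and saved next to a text file holding the joined words. Because the box is the union of the word boxes, a slanted line gets a taller crop instead of being cut off.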

I would appreciate any input on how I can proceed to train the engine.

Thanks

Raj