no activity for a month?


Bill Janssen

Nov 18, 2009, 9:45:07 PM
to ocropus
Odd to see no new messages on this group for the past few weeks...

Bill

Thomas Breuel

Nov 19, 2009, 5:53:39 AM
to ocr...@googlegroups.com
Well, among other things, we haven't been updating the external
repository much. It's the quiet before the beta release, I suppose.

The status right now is:

-- Most of what we wanted to accomplish for beta has been
accomplished. There is still some code cleanup to be done, however.

-- We've developed a new set of labeling and transcription tools and
used it to label millions of training samples (those will be released
hopefully in a few months).

-- We're waiting for a large server to do very large model training on.

Cheers,
Tom

74yrs old

Nov 19, 2009, 8:01:42 AM
to ocr...@googlegroups.com
Re: "We've developed a new set of labeling and transcription tools and
used it to label millions of training samples"
Will these tools support UTF-8 as well as Indic languages?
-sriranga(77yrsold)

Thomas Breuel

Nov 20, 2009, 3:12:33 AM
to ocr...@googlegroups.com
Yes, the tools support Unicode, ligatures, linked scripts, and Indic languages.

Tom

74yrs old

Nov 20, 2009, 3:54:13 AM
to ocr...@googlegroups.com
Tom,
Thanks for the valuable information. I trust the beta will be released by the end of December.
-sriranga(77yrsold)

Bill Janssen

Nov 20, 2009, 12:32:41 PM
to ocropus
Looking forward to it...

Bill

Bill Janssen

Jan 13, 2010, 4:44:52 PM
to ocropus
How's access to that very large server for the model training coming
along?

Bill

Tom

Jan 13, 2010, 6:05:37 PM
to ocr...@googlegroups.com
Hi,

I should probably have sent an update...

The hardware for training is on order and should be delivered over the
next couple of weeks. We've also been busy working on additional
interactive tools for training and quality control and improving the
Python bindings. The way it looks now, you'll be able to run the
complete OCR pipeline from Python, which should make it much easier for
people to customize things.

We're using the interactive tools to identify sources of recognition
errors, which is also taking its time. Basically, with the training and
debugging, we want to get competitive performance on UNLV data without
adaptation.

Tom

Bill Janssen

Jan 15, 2010, 11:30:21 AM
to ocropus
No problem, happy to remind you :-). Nice to hear I'll be able to
drive it directly from UpLib's Python code.

Bill
