Training error when using ocropus-ctrain

Anushka Rajapaksha

Jul 13, 2011, 7:15:36 AM
to ocr...@googlegroups.com
I used ocropus-ctrain to train the neural network for Sinhala script. It gave the following error. Could you help me with this error?

anushka@anushka-Ideapad-Z460:~/ocro/ocropy$ ocropus-ctrain -K 10 chars.db -o chars.cmodel

classifier <ocrolib.mlp.AutoMlpModel instance at 0x939cf8c>
training...
=== chars.db
total 42
# determining per-class cutoff
classes 21 stats 2.0 2.0 2.0 2.0 ntrain 2
# sampling the classes
 A:2  C:2  D:2  T:2  b:2  d:2  e:2  g:2  h:2  i:2  j:2  k:2  l:2  m:2  n:2  p:2  r:2  s:2  t:2  v:2  y:2
# training 42

summary statistics:
[('A', 2), ('C', 2), ('D', 2), ('T', 2), ('b', 2), ('d', 2), ('e', 2), ('g', 2), ('h', 2), ('i', 2), ('j', 2), ('k', 2), ('l', 2), ('m', 2), ('n', 2), ('p', 2), ('r', 2), ('s', 2), ('t', 2), ('v', 2), ('y', 2)]

training 42 variants representing 42 training characters

starting classifier training
AutoMLP initial 0.408 20 4 1.0000
AutoMLP initial 0.175 40 4 1.0000
Traceback (most recent call last):
  File "/usr/local/bin/ocropus-ctrain", line 311, in <module>
    for progress in classifier.updateModel1(verbose=1):
  File "/usr/local/lib/python2.7/dist-packages/ocrolib/common.py", line 1788, in updateModel1
    for progress in self.classifier.train1(self.rows,array(self.classes,'i'),*args,**kw):
  File "/usr/local/lib/python2.7/dist-packages/ocrolib/mlp.py", line 500, in train1
    samples=training)
  File "/usr/local/lib/python2.7/dist-packages/ocrolib/mlp.py", line 387, in train
    self.init(data,cls,nhidden=nhidden,eps=eps)
  File "/usr/local/lib/python2.7/dist-packages/ocrolib/mlp.py", line 325, in init
    self.w1 = array(data[selection(xrange(len(data)),nhidden)] * eps/scale,'f')
  File "/usr/lib/python2.7/random.py", line 320, in sample
    raise ValueError, "sample larger than population"
ValueError: sample larger than population

Raj Julha

Aug 24, 2011, 10:55:42 AM
to ocropus
Hi

I had a similar issue, but the error disappeared when I set the value
for more classes. I suspect one of the classes doesn't have the minimum
number of occurrences. How did you fix your issue?

Cheers

Raj
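
One way to check the suspicion above is to count the samples per class before training. The log in the first post already prints these counts (every class there has 2). The following is only a sketch: the function name, the label list, and the threshold are hypothetical and not part of ocropus.

```python
from collections import Counter

def underrepresented(labels, min_per_class):
    """Return {class: count} for every class with fewer than min_per_class samples.

    `labels` is a plain list of class labels, one per training sample
    (hypothetical input; how you extract it from chars.db is up to you).
    """
    counts = Counter(labels)
    return {c: n for c, n in sorted(counts.items()) if n < min_per_class}

# Hypothetical labels: 'b' occurs only once.
labels = ["A", "A", "C", "C", "b"]
print(underrepresented(labels, 2))
```

An empty result means every class meets the chosen minimum.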

Tom

Oct 9, 2011, 10:05:03 AM
to ocr...@googlegroups.com
You have too few training examples.  Generally, with the MLP, you need tens of thousands of training examples total, and hundreds per class.
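
The traceback in the first post ends in random.sample, called from the init step in mlp.py to select nhidden rows of the training data. random.sample raises ValueError whenever it is asked for more distinct items than the population holds, so the failure appears whenever nhidden exceeds the number of rows passed in. Below is a minimal sketch of that behaviour, written for Python 3; the helper names and the row counts are illustrative assumptions, not ocrolib code.

```python
import random

def pick_initial_rows(n_rows, n_hidden):
    """Pick n_hidden distinct row indices out of n_rows.

    Mirrors the random.sample call that the traceback shows in the
    init step; this helper itself is hypothetical.
    """
    return random.sample(range(n_rows), n_hidden)

def can_initialize(n_rows, n_hidden):
    """Return True if n_hidden distinct rows can be drawn from n_rows."""
    try:
        pick_initial_rows(n_rows, n_hidden)
        return True
    except ValueError:
        # random.sample refuses to draw more items than the population has
        return False

print(can_initialize(1000, 40))  # plenty of rows
print(can_initialize(30, 40))    # fewer rows than hidden units (assumed numbers)
```

With far more training rows than hidden units, this check never fails.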