OCRopus 0.6 released


Tom

Aug 22, 2012, 7:33:41 PM
to ocr...@googlegroups.com
I have tagged the released version of OCRopus 0.6. Major changes relative to 0.5.x are much simpler installation, fewer dependencies, better character recognition rates, and easier training. There are also a number of sample scripts illustrating recognition and training.

Thanks to everybody who provided feedback on prereleases.

For documentation and further information, please see the project page: www.ocropus.org

Tom

Sriranga(78yrsold)

Aug 22, 2012, 9:19:49 PM
to ocr...@googlegroups.com
Tom,
I already re-installed version 0.6pre-4 yesterday, after deleting the ocropus folder of 0.6pre-3.
How do I update to version 0.6? I presume the existing ocropus folder has to be deleted and everything re-installed from scratch? Yesterday night I stayed up until 2pm without sleep (I usually go to bed at 10pm). Please do not misunderstand me.
What is the command line for updating from the existing version to the next version, viz. ocropus 0.6, without having to re-install from scratch?
Awaiting further instructions.
With warmest regards,
-sriranga(79yrs) - INDIA

--
You received this message because you are subscribed to the Google Groups "ocropus" group.
To post to this group, send email to ocr...@googlegroups.com.
To unsubscribe from this group, send email to ocropus+u...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msg/ocropus/-/784w6OSv3fMJ.
For more options, visit https://groups.google.com/groups/opt_out.
 
 

Sriranga(78yrsold)

Aug 22, 2012, 9:25:19 PM
to ocr...@googlegroups.com
Is a log of the IRC (freenode #ocropus) discussion available? If so, where?

Sriranga(78yrsold)

Aug 22, 2012, 9:44:26 PM
to ocr...@googlegroups.com
I tried to update the existing ocropus folder containing 0.6pre-4 to ocropus 0.6 by using the following command line:

dell@ubuntu:~$ cd ocropus
dell@ubuntu:~/ocropus$ sudo apt-get hg clone -r ocropus-0.6 https://code.google.com/p/ocropus
[sudo] password for dell:
E: Command line option 'r' [from -r] is not known.
dell@ubuntu:~/ocropus$
I am a newbie to Linux. Where did I make a mistake?
Awaiting valuable guidance.
w/b
-sriranga(79yrs)
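
For readers who hit the same error: `apt-get` is Ubuntu's package installer and `hg` (Mercurial) is a separate program, so the two cannot be combined on one command line. The `-r` option was rejected by `apt-get`, not by `hg`. A minimal sketch of the two separate steps follows. The tag name `ocropus-0.6` and the URL are copied from the command above and are assumptions, and the `run` helper is hypothetical: it only prints each command unless `RUN=1` is set.

```shell
#!/bin/sh
# Hypothetical helper: print each command; execute it only when RUN=1.
run() {
    if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "$*"; fi
}

# Step 1: install Mercurial itself (this is the only job for apt-get).
install_hg() {
    run sudo apt-get install mercurial
}

# Step 2: clone a tagged revision with hg, without sudo or apt-get.
# The tag and URL are assumptions copied from the post above.
clone_release() {
    run hg clone -r ocropus-0.6 https://code.google.com/p/ocropus
}

install_hg
clone_release
```

Run as written, the script only prints the two commands, so they can be checked before being executed with `RUN=1`.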

Sriranga(78yrsold)

Aug 23, 2012, 2:23:04 AM
to ocr...@googlegroups.com
Tom,

Without waiting for further guidance on my previous emails, I re-installed ocropus 0.6 from scratch after deleting the ocropus folder containing 0.6-pre4. The test run succeeded; see the attached extract of terminal0.6.
How do I start the next step, the training procedure?

Suggestion: it would be nice to state the command line that new users should use when installing ocropus for the first time, and the command line that existing users, who have already installed ocropus, should use to update to the latest version.

Awaiting further guidance.
-sriranga(79yrs)
extract of terminal0.6

Tom

Aug 23, 2012, 5:20:20 AM
to ocr...@googlegroups.com
Hi,

the next step is to follow the example in "fraktur-boxes" and do something similar for your script.  You can re-use Tesseract training files you have.

As for how to update, we're still just using Mercurial for distribution, so if you want to read more, have a look at Mercurial tutorials, like: http://mercurial.selenic.com/wiki/Tutorial

Tom

Sriranga(78yrsold)

Aug 23, 2012, 7:38:03 AM
to ocr...@googlegroups.com, Aravinda VK
Tom,
Kindly view the attached text file and confirm whether what I have done is in order. In the fraktur-boxes folder, a folder named "deu-f" was created.
I am running the script for uw3-500. It has created a "book" folder and also a text file, "book.txt". There is an error message at the bottom of the attached extract of terminal run-uw3-500 which I could not understand. What should I do?

I then ran the script run-uw3-500-parallel, which has now finished. The extract of terminal run-uw3-500 parallel is attached, together with a screenshot.
The terminal displays errors. Kindly let me know whether what I have done is correct.
I am unable to open book.h5 to view its contents.
What should I do next?

I also observe that the examples are large and take a long time to run, which will discourage users.
It would be nice if the examples used a minimal page size, so that users and newbies can get a first-hand understanding of the programs.
With Warmest regards,
-sriranga(79yrs)



extract of terminal -run-box-training
extract of terminal run-uw3-500
extract of terminal run-uw3-500 parallel
Screenshot from 2012-08-23 16:54:38.png

stinger

Aug 23, 2012, 9:21:28 AM
to ocr...@googlegroups.com
Tom,

Got the latest version of ocropus, reran run-uw3-500, and it fails here:

+ true
+ true extract character shapes into the HDF5 book.h5 database
+ true you can look at this database with ocropus-cedit book.h5
+ true
+ rm -rf book.h5
+ ocropus-lattices --extract book.h5 'book/????/??????.png'
Traceback (most recent call last):
  File "/usr/local/bin/ocropus-lattices", line 56, in <module>
    args.files = ocrolib.glob_all(args.files)
  File "/usr/local/lib/python2.7/dist-packages/ocrolib/toplevel.py", line 204, in argument_checks
    result = f(*args,**kw)
  File "/usr/local/lib/python2.7/dist-packages/ocrolib/common.py", line 509, in glob_all
    raise Exception("%s: expansion did not yield any files"%arg)
Exception: book/????/??????.png: expansion did not yield any files

In the directory I have the following files:

010001.aligned   010001.bin.png   010001.gt.txt   010001.rseg.png  010001.xheight
010001.baseline  010001.cseg.png  010001.lattice  010001.txt

Tom

Aug 23, 2012, 9:41:06 AM
to ocr...@googlegroups.com
Thanks; I fixed the file name patterns in the script.
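
As a hedged illustration of why the expansion failed: in a shell glob each `?` matches exactly one character, so `??????.png` only matches names with six characters before `.png`, while the files listed above are named like `010001.bin.png`. The sketch below checks this with a small hypothetical helper, `glob_match`. The pattern `??????.bin.png` is an assumption about what a working pattern could look like; the actual change made to the script is not shown in this thread.

```shell
#!/bin/sh
# Return success if file name $1 matches shell glob pattern $2.
glob_match() {
    case "$1" in
        $2) return 0 ;;   # unquoted expansion is treated as a pattern here
        *)  return 1 ;;
    esac
}

name="010001.bin.png"   # a file name from the directory listing above

# '??????.png' needs exactly six characters before '.png'; this name has ten.
if glob_match "$name" '??????.png'; then
    echo "old pattern: match"
else
    echo "old pattern: no match"
fi

# Hypothetical corrected pattern (an assumption, not the confirmed fix).
if glob_match "$name" '??????.bin.png'; then
    echo "new pattern: match"
else
    echo "new pattern: no match"
fi
```

The same reasoning applies to the directory part of a pattern: `00??` only matches four-character names that start with `00`.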

Sriranga(78yrsold)

Aug 23, 2012, 11:41:10 AM
to ocr...@googlegroups.com
I re-ran the corrected run-uw3-500; the following error is displayed:
book/1051/010039.bin.png =EXTRACTED= 25. E. Skordalakis, ''DGCAS as a Microprogram Develop-
book/1059/010098.bin.png =EXTRACTED= true for cases where all coordinates have
book/1061/010055.bin.png =EXTRACTED= borhood image transformations, lose much of their
book/1064/010095.bin.png =EXTRACTED= histogram. The cumulative histogram contains, in
book/1067/010058.bin.png =EXTRACTED= processors.
book/1069/010008.bin.png =EXTRACTED= 4.2. Hough transformation using the bimodal
book/1069/010107.bin.png =EXTRACTED= the adjacency list for region k. It has been con-
book/1070/010027.bin.png =EXTRACTED= 5.2. Architectural tradeoffs
book/1071/010031.bin.png =EXTRACTED= problems.
book/1071/010061.bin.png =EXTRACTED= Image Database Management, Miami Beach, FL, 18
book/1071/010112.bin.png =EXTRACTED= 20. S. L. Tanimoto and J. J. Pfeiffer, Jr. An image pro
book/1074/010040.bin.png =EXTRACTED= by the local planner to achieve current goals.
book/1074/010051.bin.png =EXTRACTED= approach to local vehicle control.
book/1075/010005.bin.png =EXTRACTED= vehicle. We have investigated the fusion of data from
book/1077/010059.bin.png =EXTRACTED= another paper in these proceedings [2].
book/1078/010076.bin.png =EXTRACTED= December, 1987.
book/1080/010090.bin.png =EXTRACTED= from the image at the predicted point and performs cross-correlation
book/1083/010056.bin.png =EXTRACTED= backtra_ck_ing as suggested in [12] to avoid outputting false
book/1090/010009.bin.png =EXTRACTED= lengths of the straight regions.
book/1094/010075.bin.png =EXTRACTED= Path based: In [5] and [7] a linear filtering approach
book/1097/010025.bin.png =EXTRACTED= only.
book/1107/010018.bin.png =EXTRACTED= [49] A.G. Hauptmann and B.F. Green. A compa_ri_son of
book/1108/010026.bin.png =EXTRACTED= similarity matching.
book/1109/010059.bin.png =EXTRACTED= the vanishing line of an image of a coplanar structure
book/1113/010043.bin.png =EXTRACTED= and several assistants or advisers. The speaker of
book/1116/010030.bin.png =EXTRACTED= uation results are shown using faces, with obvious
book/1116/010089.bin.png =EXTRACTED= to send the prepared poll to the selected advisers or
book/1117/010049.bin.png =EXTRACTED= A window is open displaying the _fi_rst unread message.
book/1123/010102.bin.png =EXTRACTED= dependent. For three or more views, shape
book/1124/010041.bin.png =EXTRACTED= Figural completion is the preattentive ability of the
book/1125/010013.bin.png =EXTRACTED= unit quate_rn_ions).
book/1126/010078.bin.png =EXTRACTED= able to carry out parallel prefix in all partitions at
book/1127/010102.bin.png =EXTRACTED= recti_fi_ed (unwarped) to present a simulated vertical
book/1130/010061.bin.png =EXTRACTED= and texture information in the model (extracted using
book/1132/010007.bin.png =EXTRACTED= sulting in analytic expressions for the con_fi_dence level of
book/1132/010040.bin.png =EXTRACTED= in the position or orientation of the data features due
book/1132/010098.bin.png =EXTRACTED= (thousands) of very simple image features (short line
book/1133/010066.bin.png =EXTRACTED= for features arising from the background, and is normally
book/1137/010010.bin.png =EXTRACTED= used for a le_as_t-squares solution). Then, for any ad-
book/1137/010023.bin.png =EXTRACTED= to explicitly recover structure, camera transformation or
book/1138/010077.bin.png =EXTRACTED= the image plane to determine rough distance from the
book/1139/010029.bin.png =EXTRACTED= different lighting conditions. The algorithms each em-
book/1139/010077.bin.png =EXTRACTED= tures for each frame. These features are dense, stable,
book/1139/010104.bin.png =EXTRACTED= which determines translational offsets between patches
book/1140/010018.bin.png =EXTRACTED= the camera undergoes pure rotation. This method does
book/1141/010022.bin.png =EXTRACTED= The problem that this chip solves is that of computing
book/1142/010118.bin.png =EXTRACTED= 62
book/1143/010021.bin.png =EXTRACTED= [36] T. Poggio, M. Fahle, and S. Edelman. F_as_t perceptual
book/1147/010030.bin.png =EXTRACTED= In recent years a number of papers have appeared in
book/1150/010008.bin.png =EXTRACTED= *** NO CSEG *** book/1150/010008.cseg.png
book/1152/010056.bin.png =EXTRACTED= Our work to date has been focused on an initial set
book/1152/010058.bin.png =EXTRACTED= industrial site. Figures 1 and 2 are two overlapping
book/1152/010060.bin.png =EXTRACTED= illustrate typical scene content. In order to obtain
book/1155/010069.bin.png =EXTRACTED= produce a few false positives that miss buildings at
book/1158/010015.bin.png =EXTRACTED= hypothesis and assigned a value of 1.6; Otherwise,
book/1158/010035.bin.png =EXTRACTED= accurate delineation. One way to visualize the
book/1158/010047.bin.png =EXTRACTED= 3.2. Open issues
book/1158/010081.bin.png =EXTRACTED= manipulates a variety of models over features in
book/1160/010026.bin.png =EXTRACTED= constraints similar to _th_ose proposed by [Nicolin
book/1161/010021.bin.png =EXTRACTED= Figure 18: User-assisted 3 point verification.
book/1162/010004.bin.png =EXTRACTED= corresponds to the image being overlaid on the
book/1164/010063.bin.png =EXTRACTED= construction method for a large-scale digital
book/1164/010084.bin.png =EXTRACTED= limits to this technique since as the number of tiles
book/1168/010023.bin.png =EXTRACTED= sets, such as those for San Francisco National
book/1169/010033.bin.png =EXTRACTED= tasks in traditional remote sensing it is clear that
book/1169/010057.bin.png =EXTRACTED= differential radial basis function, for surface
book/1169/010100.bin.png =EXTRACTED= differential radial basis function, for surface
book/1170/010108.bin.png =EXTRACTED= Mapping and Spatial Modelling for
book/1171/010046.bin.png =EXTRACTED= Understanding 57(2), March, 1993.
book/1174/010044.bin.png =EXTRACTED= in a completely parallel manner to acquire a frame of
book/1175/010006.bin.png =EXTRACTED= 3.2 Spatio-Geometric and Optical Compu-
book/1177/010046.bin.png =EXTRACTED= noise can be on-chip switching electronics which
book/1180/010028.bin.png =EXTRACTED= run (currently between $50,0_00_ and $80,0_00_) MO-
book/1183/010072.bin.png =EXTRACTED= [22] J. Dominguez, ''E_ff_ortless Internal Camera Calibration
book/1184/010045.bin.png =EXTRACTED= and Solid State Sensors, Santa Clara, CA, pp. 152-161,
book/1185/010064.bin.png =EXTRACTED= Proc. SPIE, Vol.1473, pp. 66-75, 1991.
book/1193/010046.bin.png =EXTRACTED= move a pointing device to the location of the
book/1199/010084.bin.png =EXTRACTED= gaussian is used.
book/1206/010018.bin.png =EXTRACTED= Analysis and Machine Intelligence, 7(4):384-401,
book/1208/010017.bin.png =EXTRACTED= 1. Introduction
book/1208/010032.bin.png =EXTRACTED= called the Navlab 2.
book/1215/010013.bin.png =EXTRACTED= Machine Vision Planning system to function in
book/1216/010035.bin.png =EXTRACTED= These constraints are combined in an optimiza-
book/1219/010024.bin.png =EXTRACTED= 7 Experimental Results
book/1223/010057.bin.png =EXTRACTED= [Tarabanis et al., 1991b]
book/1223/010060.bin.png =EXTRACTED= and modeling for robotic vision tasks. In Pro-
book/1229/010003.bin.png =EXTRACTED= is used to fit pinhole-model parameters to line-of-sight infor-
book/1229/010056.bin.png =EXTRACTED= The second part of the calibration procedure determines the
book/1237/010059.bin.png =EXTRACTED= or terms which are expected to be important with
book/1240/010038.bin.png =EXTRACTED= based on the access patterns to these objects. Fur-
book/1240/010045.bin.png =EXTRACTED= on the dynamic aspects of the access patterns to the
book/1241/010055.bin.png =EXTRACTED= [Joh89] D.S. Johnson, C.R. Aragon, L.A. McGeoch
book/1243/010056.bin.png =EXTRACTED= resulting matrix is fed into a preprocesssor fo
book/1244/010003.bin.png =EXTRACTED= is traced and a binary tree is const
book/1252/010028.bin.png =EXTRACTED= ''Automatic recognition of print and
book/1252/010051.bin.png =EXTRACTED= pp.35-37
book/1253/010027.bin.png =EXTRACTED= ''A Model-Based Computer Vision System
book/1260/010002.bin.png =EXTRACTED= document image. The document is organized
book/1262/010065.bin.png =EXTRACTED= determine dependence and thus perform CEO
book/1263/010018.bin.png =EXTRACTED= for natural language understanding_''_
book/1263/010022.bin.png =EXTRACTED= Engineering (1987) p. 416-422.
book/1269/010064.bin.png =EXTRACTED= A drawback of NN classifiers is the large
book/1271/010003.bin.png =EXTRACTED= category '1' (x < y).
book/1272/010076.bin.png =EXTRACTED= terested in maintaining a high recognition
book/1272/010077.bin.png =EXTRACTED= M. Sabourin, A. Mitiche, D. Thomas, and G. Nagy
book/1276/010002.bin.png =EXTRACTED= imously agree, otherwise the character is rejected. Results for tangents and
book/1282/010021.bin.png =EXTRACTED= text, along with the histogram of MST edge
book/1286/010007.bin.png =EXTRACTED= produced by our page segmentation algo-
book/1286/010051.bin.png =EXTRACTED= page).
book/1295/010009.bin.png =EXTRACTED= (i.e. motion direction and translation distance) de-
book/1305/010024.bin.png =EXTRACTED= burners.'' Further, a lighthouse could identify itself by ex-
book/1307/010074.bin.png =EXTRACTED= through Babbage and Boole right to the end of the cen-
book/1314/010062.bin.png =EXTRACTED= approach is depicted in Fig. 2.
book/1317/010023.bin.png =EXTRACTED= plug-in modules, they can be easily moved from one area
book/1318/010031.bin.png =EXTRACTED= improves the fanout of the monitored signals. The phas
book/1319/010058.bin.png =EXTRACTED= master detects a discrepancy between its count and the
book/1320/010046.bin.png =EXTRACTED= gineering from ISU in 1984. From 1977 to 1982, he was a
book/1323/010053.bin.png =EXTRACTED= segmented into spatially connected surface regions. For each
book/1325/010036.bin.png =EXTRACTED= Unstable This category is not tested by the analysis of the
book/1329/010049.bin.png =EXTRACTED= tal pixels. Therefore, the (Y) or horizontal rangel size is
book/1335/010027.bin.png =EXTRACTED= arrays of processors (_''_cells''). Communication between cells
book/1336/010071.bin.png =EXTRACTED= When we use graph contraction to construct a pyramid, tw
book/1337/010044.bin.png =EXTRACTED= (1) as well.
book/1339/010054.bin.png =EXTRACTED= have different colors. The stochastic decimation algorithm selects
book/1346/010012.bin.png =EXTRACTED= and the EHK orientation, in which the magnetic field was
book/1348/010024.bin.png =EXTRACTED= of the sheets. The two plastics that were used, and their
book/1352/010022.bin.png =EXTRACTED= guages,'' ''microprogram assemblers,'' and ''micropro-
book/1355/010089.bin.png =EXTRACTED= type is used in a particular computer. So, data generation
book/1357/010001.bin.png =EXTRACTED= have been implemented in this way have also been adap-
book/1362/010055.bin.png =EXTRACTED= 1977, pp. 80-83.
book/1362/010116.bin.png =EXTRACTED= applications programmer, and since 19
book/1363/010007.bin.png =EXTRACTED= The goal in computer vision systems is to analyze data collected from the environment
book/1367/010084.bin.png =EXTRACTED= patterns are that less time is spent pro
book/1371/010031.bin.png =EXTRACTED= minimize such effects. This may be done
book/1375/010007.bin.png =EXTRACTED= ency lists.
book/1376/010006.bin.png =EXTRACTED= codes, in conjunction with a _''_systolic'' cellular array
book/1377/010039.bin.png =EXTRACTED= tation of the bimodal memory requires a memory
book/1380/010003.bin.png =EXTRACTED= of the shortest line segment that includes all of the
book/1380/010086.bin.png =EXTRACTED= be handled). Actually, the algorithm reserves region
book/1386/010032.bin.png =EXTRACTED= complete traversability map of the entire sensed area is not
book/1387/010035.bin.png =EXTRACTED= based planner providing route information obtained from
book/1387/010064.bin.png =EXTRACTED= to travel toward a goal when the vehicle was in a clear
book/1389/010003.bin.png =EXTRACTED= 2.2 Approach
book/1392/010041.bin.png =EXTRACTED= has failed since the local nature of these methods assumes that the
book/1394/010045.bin.png =EXTRACTED= *** maxseg AND aligned lengths DIFFER*** 49 48
book/1394/010089.bin.png =EXTRACTED= model. When we create the new tempora_ry_ model
book/1395/010044.bin.png =EXTRACTED= tra_ck_er is identical. Thus an edge tra_ck_er path histo_ry_ can be
book/1398/010015.bin.png =EXTRACTED= immediately curving right).
book/1402/010020.bin.png =EXTRACTED= tions concerning the specific strengths and deficiencies
book/1406/010041.bin.png =EXTRACTED= standard deviation is less than 1%. An interesting
book/1409/010004.bin.png =EXTRACTED= SUNY at Bu_ff_alo
book/1415/010004.bin.png =EXTRACTED= by image analysis systems using this model _as_ the
book/1417/010035.bin.png =EXTRACTED= *** maxseg AND aligned lengths DIFFER*** 50 49
book/1418/010025.bin.png =EXTRACTED= Meeting, pages 931-935, 1986.
book/1418/010045.bin.png =EXTRACTED= In Proceedings of the Human Factors Society 26th
book/1418/010052.bin.png =EXTRACTED= [57] L.L. Leber, C.D. Wickens, C. Bakke, M. Sulek, and
book/1420/010013.bin.png =EXTRACTED= projection of a frontal plane is therefore d
book/1420/010059.bin.png =EXTRACTED= the vanishing line of an image of a coplanar structure
book/1422/010016.bin.png =EXTRACTED= two main vanishing points in the image, and from the
book/1423/010042.bin.png =EXTRACTED= ever, to the extent that some scene features are copla
book/1425/010035.bin.png =EXTRACTED= negotiators. Each _as_sistant is provided with a work-
book/1430/010025.bin.png =EXTRACTED= 2. Fish, R., Kraut, R., Leland, M.: _''_Quilt: a col-
book/1432/010005.bin.png =EXTRACTED= pose estimates) in a fixed-length first-in last-out
book/1435/010035.bin.png =EXTRACTED= Experimentation with these algorithms on
book/1437/010049.bin.png =EXTRACTED= algorithm for hierarchical geometric edge
book/1442/010117.bin.png =EXTRACTED= However, using view-b_as_ed representations only solves
book/1443/010072.bin.png =EXTRACTED= Second, we need to model the probability of an inter-
book/1443/010083.bin.png =EXTRACTED= multi-resolution methods), in which we use coarse data
book/1447/010091.bin.png =EXTRACTED= tions that are specific for the object cl_as_s corresponding
book/1449/010014.bin.png =EXTRACTED= technique can be used to compute qualitative properties
book/1449/010051.bin.png =EXTRACTED= for features of a particular size _as_ well _as_ an edge locator.
book/1449/010075.bin.png =EXTRACTED= scale-space for the region decomposition. The result of
book/1452/010022.bin.png =EXTRACTED= with the brightness of objects in the scene. Or the ap-
book/1453/010018.bin.png =EXTRACTED= sual modules from examples: a framework for under-
book/1458/010008.bin.png =EXTRACTED= signed to generate a straight line if the driving noise
book/1459/010040.bin.png =EXTRACTED= This is done by using shifting windows, _as_ illustrated
book/1462/010070.bin.png =EXTRACTED= views.
book/1466/010046.bin.png =EXTRACTED= buildings using stereo analysis together with
book/1469/010006.bin.png =EXTRACTED= their relationship to adjacent regions. According to
book/1469/010008.bin.png =EXTRACTED= the region is less than the lowest of its neighbors,
book/1471/010085.bin.png =EXTRACTED= input at other phases in the extraction algorithms,
book/1473/010041.bin.png =EXTRACTED= process.
book/1477/010049.bin.png =EXTRACTED= 6.1.1. Intermediate result evaluation
book/1477/010082.bin.png =EXTRACTED= that the constraint does not support a pair of
book/1478/010047.bin.png =EXTRACTED= compiled statistics on run-time, number of
book/1480/010084.bin.png =EXTRACTED= resolution panchromatic imagery has been
book/1482/010042.bin.png =EXTRACTED= Jefferey A. Shufelt and David M. McKeown.
book/1483/010002.bin.png =EXTRACTED= A Report from the DARPA Workshop
book/1488/010042.bin.png =EXTRACTED= lar application, but several general remarks are in
book/1489/010089.bin.png =EXTRACTED= possibility is optical signal communication between
book/1489/010091.bin.png =EXTRACTED= 341
book/1490/010065.bin.png =EXTRACTED= design rarely uses minimum size transistors, but is
book/1492/010014.bin.png =EXTRACTED= ing with hardware tends to extend time in graduate
book/1502/010044.bin.png =EXTRACTED= variable for the alignment task and does not
book/1502/010057.bin.png =EXTRACTED= object will trace out a conic section, an el-
book/1507/010042.bin.png =EXTRACTED= [1] D. Bennett, J. Hollerbach, and
book/1508/010033.bin.png =EXTRACTED= [17] B. H. Yoshimi and P. K. Allen
book/1519/010046.bin.png =EXTRACTED= to learn features which would allow the system
book/1520/010021.bin.png =EXTRACTED= achieve better performance than a single
book/1522/010058.bin.png =EXTRACTED= To determine more quantitative results, image/
book/1522/010065.bin.png =EXTRACTED= Also, a MANIAC network integrating the
book/1523/010039.bin.png =EXTRACTED= A central idea that this research is t_ry_ing to
book/1524/010099.bin.png =EXTRACTED= cate their resources to match a given problem,
book/1526/010034.bin.png =EXTRACTED= sic problem is that in setting up an automated
book/1526/010049.bin.png =EXTRACTED= provide a robust view of specific features so that
book/1528/010010.bin.png =EXTRACTED= sensor planning problem can be solved trivially.
book/1531/010022.bin.png =EXTRACTED= spaces appear to be unpractical. This is one of
book/1534/010061.bin.png =EXTRACTED= ceedings 1991 IEEE International Conference
book/1539/010002.bin.png =EXTRACTED= images. Two sets of calibration parameters must be me_as_ure
book/1544/010022.bin.png =EXTRACTED= theory and practice, demonstrating the impact that the sm
book/1546/010031.bin.png =EXTRACTED= not based solely on the inter-structure of the persis-
book/1547/010006.bin.png =EXTRACTED= In Section 2 we present an overview of the Self-
book/1547/010052.bin.png =EXTRACTED= the clustering problem.
book/1548/010046.bin.png =EXTRACTED= the accesses to that store as a hypergraph, since the
book/1549/010041.bin.png =EXTRACTED= The following is an example which illustrates the
book/1552/010017.bin.png =EXTRACTED= ping clustering algorithm similar to our proposed al-
book/1554/010025.bin.png =EXTRACTED= subimages.
book/1557/010034.bin.png =EXTRACTED= 4.2 Structural Information
book/1568/010075.bin.png =EXTRACTED= (Step 1)
book/1569/010076.bin.png =EXTRACTED= complete, the zones are processed through the
book/1583/010065.bin.png =EXTRACTED= than does the CNN rule, and it preserves
book/1590/010011.bin.png =EXTRACTED= izontal, some East Asian writing systems
book/1591/010070.bin.png =EXTRACTED= 2. idealize the remaining components as
book/1599/010019.bin.png =EXTRACTED= of Arabic and Nepali (written using the De
+ true
+ true compute a tree vector quantizer for the characters
+ true in book.h5
+ true
+ ocropus-tsplit -d book.h5 -o book.tsplit --maxsplit 100
loading dataset
got 40790 samples out of 40790
# classes 94
most common ~ 21245 / e 2223 / t 1593 / a 1426 / i 1377 / o 1291 / n 1275 / s 1217 / r 1150 / h 742 / ...
starting training
 pcakmeans 40790 k 81 d 0.95
 predicting 40790 1024
writing
+ true
+ true compute terminal classifiers for each VQ bucket
+ true this results in a character model that can be used for recognition
+ true
+ ocropus-tleaves -d book.h5 -s book.tsplit -o book.cmodel
loading splitter
got <ocrolib.patrec.HierarchicalSplitter instance at 0x28244d0>
#splits 81
excluding [ _\000-\037]
sizemode linerel
loading dataset
sizemode (data) linerel
splitting
0
10000
20000
30000
40000
cluster    0 len    697    . 289 / , 237 / - 128 / ~ 37 / s 2
cluster    1 len    476    ~ 395 / ' 42 / i 11 / - 11 / l 7
cluster    2 len    357    f 310 / ~ 45 / r 1 / T 1
cluster    3 len    314    ~ 299 / f 10 / t 2 / h 1 / th 1
cluster    4 len    559    ~ 243 / l 194 / 1 50 / I 34 / i 15
cluster    5 len   1016    ~ 540 / l 422 / 1 35 / f 4 / I 4
cluster    6 len   1083    ~ 1051 / i 11 / u 7 / n 4 / h 2
cluster    7 len    857    ~ 850 / i 4 / b 2 / h 1
cluster    8 len    983    ~ 960 / i 19 / n 2 / 1 1 / m 1
cluster    9 len    467    ~ 429 / i 12 / r 6 / : 6 / , 5
cluster   10 len    713    i 706 / ~ 3 / j 2 / ri 2
cluster   11 len    471    i 461 / l 5 / ~ 5
cluster   12 len    372    ~ 122 / l 51 / i 44 / ) 42 / t 23
cluster   13 len   1046    t 1033 / ~ 9 / i 2 / l 1 / th 1
cluster   14 len    290    t 254 / i 17 / ~ 14 / L 4 / l 1
cluster   15 len    382    ~ 216 / i 52 / t 48 / l 31 / f 15
cluster   16 len    951    ~ 922 / i 13 / t 7 / I 4 / n 2
cluster   17 len    642    ~ 529 / r 92 / t 8 / i 4 / n 4
cluster   18 len    353    r 271 / ~ 74 / c 2 / t 2 / m 1
cluster   19 len    994    r 763 / ~ 221 / T 5 / n 2 / o 1
cluster   20 len    166    T 72 / ~ 42 / F 32 / f 7 / 7 6
cluster   21 len    309    t 200 / ~ 60 / ( 39 / l 5 / T 4
cluster   22 len     92    ~ 90 / y 2
cluster   23 len    204    y 195 / ~ 4 / ry 2 / 9 1 / Y 1
cluster   24 len    358    v 185 / ~ 134 / V 16 / y 14 / 9 3
cluster   25 len    633    h 590 / ~ 38 / b 5
cluster   26 len    350    b 210 / h 115 / ~ 24 / D 1
cluster   27 len    252    ~ 91 / k 68 / h 28 / E 19 / L 17
cluster   28 len    345    g 340 / ~ 5
cluster   29 len    234    S 69 / 2 63 / 8 32 / 5 28 / 3 16
cluster   30 len    367    s 360 / ~ 4 / S 2 / e 1
cluster   31 len    413    s 408 / ~ 3 / 5 1 / S 1
cluster   32 len    372    s 366 / ~ 3 / S 2 / 8 1
cluster   33 len    206    ~ 123 / 4 20 / 7 18 / 3 17 / d 7
cluster   34 len    138    x 52 / ~ 30 / z 27 / s 17 / A 5
cluster   35 len    586    ~ 577 / c 2 / r 2 / t 2 / E 1
cluster   36 len    680    ~ 662 / s 6 / N 3 / y 3 / c 2
cluster   37 len    549    ~ 545 / d 3 / M 1
cluster   38 len    570    ~ 567 / M 1 / e 1 / o 1
cluster   39 len    591    ~ 577 / o 7 / m 4 / a 2 / ck 1
cluster   40 len    458    ~ 457 / o 1
cluster   41 len    494    ~ 493 / ry 1
cluster   42 len    500    ~ 442 / m 56 / o 2
cluster   43 len    486    ~ 486
cluster   44 len    347    ~ 330 / W 11 / 0 2 / g 1 / ry 1
cluster   45 len    931    ~ 565 / m 364 / rn 1 / r 1
cluster   46 len    687    ~ 663 / m 17 / a 2 / e 2 / c 1
cluster   47 len    508    ~ 506 / s 1 / m 1
cluster   48 len     65    A 63 / ~ 2
cluster   49 len    798    a 793 / s 2 / ~ 2 / g 1
cluster   50 len    330    a 207 / s 42 / ~ 23 / e 21 / n 20
cluster   51 len    367    a 347 / ~ 16 / n 2 / o 2
cluster   52 len    609    c 516 / e 67 / ~ 18 / t 3 / C 2
cluster   53 len   1013    e 1006 / c 5 / ~ 2
cluster   54 len    377    e 242 / c 75 / ~ 47 / P 3 / v 3
cluster   55 len    437    e 423 / ~ 11 / a 1 / c 1 / o 1
cluster   56 len    467    e 443 / c 18 / ~ 4 / t 2
cluster   57 len    630    o 601 / O 11 / c 4 / 0 3 / e 2
cluster   58 len    626    o 615 / ~ 8 / c 2 / O 1
cluster   59 len    445    p 413 / P 22 / ~ 7 / F 1 / n 1
cluster   60 len    529    u 431 / 9 40 / q 26 / ~ 15 / g 10
cluster   61 len    187    w 171 / ~ 14 / W 2
cluster   62 len    814    d 574 / ~ 238 / a 1 / rt 1
cluster   63 len    323    ~ 115 / a 71 / n 56 / o 48 / N 13
cluster   64 len    583    ~ 570 / t 5 / B 2 / r 2 / H 1
cluster   65 len    352    ~ 345 / w 2 / th 1 / n 1 / u 1
cluster   66 len    511    ~ 489 / O 10 / M 3 / m 2 / rt 1
cluster   67 len    673    n 422 / ~ 243 / R 3 / d 1 / h 1
cluster   68 len    847    n 558 / ~ 286 / m 2 / e 1
cluster   69 len    414    ~ 206 / n 197 / D 5 / m 2 / U 2
cluster   70 len    250    ~ 250
cluster   71 len    419    ~ 410 / b 4 / e 4 / h 1
cluster   72 len    254    ~ 252 / p 2
cluster   73 len    301    ~ 298 / Th 2 / o 1
cluster   74 len    593    ~ 588 / d 3 / e 2
cluster   75 len    762    ~ 729 / as 11 / c 6 / o 5 / s 5
cluster   76 len    673    ~ 670 / i 1 / oo 1 / 0 1
cluster   77 len    423    ~ 423
cluster   78 len    327    ~ 187 / C 53 / 6 23 / E 22 / 0 20
cluster   79 len    214    ~ 59 / N 31 / R 30 / B 24 / H 21
cluster   80 len    328    ~ 227 / M 43 / O 14 / G 12 / D 8
writing
+ true
+ true compute the per-character error rate from the classifier
+ true note that you should really do this with separate training/test
+ true sets and the -t option is convenient for that
+ ocropus-db predict -m book.cmodel book.h5
420 40790 1.02966413337
+ true
+ true use the new character model for recognition of course, this
+ true will be worse than the original model, since we 'didn'\''t' use
+ true a lot of characters for training
+ true
+ ocropus-lattices 'book/00??/??????.png' -m book.cmodel

Traceback (most recent call last):
  File "/usr/local/bin/ocropus-lattices", line 56, in <module>
    args.files = ocrolib.glob_all(args.files)
  File "/usr/local/lib/python2.7/dist-packages/ocrolib/toplevel.py", line 204, in argument_checks
    result = f(*args,**kw)
  File "/usr/local/lib/python2.7/dist-packages/ocrolib/common.py", line 509, in glob_all
    raise Exception("%s: expansion did not yield any files"%arg)
Exception: book/00??/??????.png: expansion did not yield any files
dell@ubuntu:~/ocropus/uw3-500$
Where did I make a mistake? I have also attached a copy of run-uw3-500 for scrutiny.

run-uw3-500
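
A note on the `ocropus-db predict` line in the log above, which prints `420 40790 1.02966413337`. The log reports 40790 samples, and 420 out of 40790 is about 1.03 percent, so the three numbers are consistent with "errors, total samples, error rate in percent". That reading is inferred from the arithmetic and is not confirmed anywhere in this thread. A minimal sketch of the calculation, using a hypothetical helper `error_rate`:

```shell
#!/bin/sh
# Percentage of misclassified samples: 100 * errors / total.
# $1 = number of errors, $2 = total number of samples.
error_rate() {
    LC_ALL=C awk -v e="$1" -v n="$2" 'BEGIN { printf "%.4f\n", 100 * e / n }'
}

error_rate 420 40790
```

As the script's own comments point out, this figure comes from predicting on the training data, so it should not be read as a test-set error rate.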

Tom

Aug 23, 2012, 12:06:23 PM
to ocr...@googlegroups.com
Please see my previous response: there was an error in those scripts.  Just update the repository (cd ocropus; hg pull; hg update) and try again.

Tom
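
The update steps mentioned in this reply can be sketched as a small script. It assumes the checkout lives in `~/ocropus`, as in the terminal prompts earlier in the thread, and uses a hypothetical `run` helper that only prints each command unless `RUN=1` is set.

```shell
#!/bin/sh
# Hypothetical helper: print each command; execute it only when RUN=1.
run() {
    if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "$*"; fi
}

# Update an existing Mercurial checkout in place.
# $1 is the checkout directory (assumed to be ~/ocropus here).
update_checkout() {
    run cd "$1" || return 1
    run hg pull      # fetch new changesets from the default remote
    run hg update    # move the working copy to the newest revision
}

update_checkout "$HOME/ocropus"
```

Because `hg pull` only downloads changesets and `hg update` applies them to the working copy, both steps are needed, and the existing folder does not have to be deleted first.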



Sriranga(78yrsold)

Aug 23, 2012, 2:18:54 PM8/23/12
to ocr...@googlegroups.com
Tom,
I followed your valuable guidance and re-ran the two uw3-500 scripts; see the attached files, which are self-explanatory. I presume
that everything is OK and as you expected; kindly confirm.
With warmest regards,
-sriranga(79yrs)

typescriptrunuw-3-500 parallel
extract of terminal run-uw3-500

Sriranga(78yrsold)

Aug 24, 2012, 10:36:02 AM8/24/12
to ocr...@googlegroups.com
I re-ran the fraktur-boxes script. The error displayed is as follows:

+ for image in 'deu-f/*.tif'
+ convert -depth 8 deu-f/fontfile_1.tif deu-f/fontfile_1.png
./run-box-training: line 23: convert: command not found
dell@ubuntu:~/ocropus /fraktur-boxes$

As per the script:
for image in deu-f/*.tif; do
    convert -depth 8 $image ${image%%.*}.png
done

Where did I make a mistake?

-sriranga(79yrs)



Brad Hards

Aug 24, 2012, 8:15:06 PM8/24/12
to ocr...@googlegroups.com
On Saturday 25 August 2012 00:36:02 Sriranga(78yrsold) wrote:
> re-run script Fraktur-boxes. Error displayed as follow:
>
> + for image in 'deu-f/*.tif'
> + convert -depth 8 deu-f/fontfile_1.tif deu-f/fontfile_1.png
> ./run-box-training: line 23: convert: command not found
It looks like you don't have the "convert" executable, so when the script
tries to run it, it doesn't work. "convert" comes from the imagemagick package (or
something similar, like graphicsmagick with compatibility wrappers).

On ubuntu, I'd suggest installing:
graphicsmagick
and
graphicsmagick-imagemagick-compat
packages

Let us know if that helps at all.
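The check described above (is the command on the PATH at all?) can be sketched in Python. This assumes Python 3.3 or later for `shutil.which`; the thread itself uses Python 2.7, and `missing_tools` is a hypothetical helper name.

```python
import shutil


def missing_tools(names):
    """Return the subset of `names` that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    # "convert" is the command the training script calls.
    missing = missing_tools(["convert"])
    if missing:
        print("not found on PATH:", ", ".join(missing))
    else:
        print("all required commands are available")
```

Running such a check before the training script gives a clearer message than the shell's "command not found" partway through the run.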

Brad

Sriranga(78yrsold)

Aug 24, 2012, 10:39:05 PM8/24/12
to ocr...@googlegroups.com
 Brad,
I am extremely thankful for your valuable guidance. I succeeded in converting from .tif to .png; the terminal extract is reproduced
below:
With Warmest Regards,
-sriranga(79yrs)
dell@ubuntu:~/ocropus_6.0/fraktur-boxes$ ./run-box-training
================================================================
=== This script illustrates training of a simple, initial
=== character recognizer from the kind of boxdata training
=== files used with Tesseract.
================================================================


+ for image in 'deu-f/*.tif'
+ convert -depth 8 deu-f/fontfile_1.tif deu-f/fontfile_1.png
+ for image in 'deu-f/*.tif'
+ convert -depth 8 deu-f/fontfile_2.tif deu-f/fontfile_2.png

+ for image in 'deu-f/*.tif'
+ convert -depth 8 deu-f/fontfile_3.tif deu-f/fontfile_3.png

+ for image in 'deu-f/*.tif'
+ convert -depth 8 deu-f/fontfile_4.tif deu-f/fontfile_4.png

+ for image in 'deu-f/*.tif'
+ convert -depth 8 deu-f/fontfile_5.tif deu-f/fontfile_5.png

+ for image in 'deu-f/*.tif'
+ convert -depth 8 deu-f/fontfile_6.tif deu-f/fontfile_6.png

+ for image in 'deu-f/*.tif'
+ convert -depth 8 deu-f/fontfile_7.tif deu-f/fontfile_7.png

+ for image in 'deu-f/*.tif'
+ convert -depth 8 deu-f/fontfile_8.tif deu-f/fontfile_8.png
+ ocropus-db tess2h5 deu-f/fontfile_1.png deu-f/fontfile_2.png deu-f/fontfile_3.png deu-f/fontfile_4.png deu-f/fontfile_5.png deu-f/fontfile_6.png deu-f/fontfile_7.png deu-f/fontfile_8.png -o boxdata.h5
['deu-f/fontfile_1.png', 'deu-f/fontfile_2.png', 'deu-f/fontfile_3.png', 'deu-f/fontfile_4.png', 'deu-f/fontfile_5.png', 'deu-f/fontfile_6.png', 'deu-f/fontfile_7.png', 'deu-f/fontfile_8.png']
=== deu-f/fontfile_1.png
=== deu-f/fontfile_2.png
=== deu-f/fontfile_3.png
=== deu-f/fontfile_4.png
deu-f/fontfile_4.box : 2734 : syntax error
    e 257 112 268 131P

deu-f/fontfile_4.box : 2807 : syntax error
    e 609 75 621 94g

=== deu-f/fontfile_5.png
deu-f/fontfile_5.box : 1102 : bad box dimensions
    si 750 3198 1783 3246

=== deu-f/fontfile_6.png
=== deu-f/fontfile_7.png
=== deu-f/fontfile_8.png
+ ocropus-tsplit --pca 0.95 --vq 80 -d boxdata.h5 -o boxdata.split
loading dataset
got 11959 samples out of 11959
# classes 121
most common e 1827 / n 1051 / i 808 / r 769 / a 600 / t 587 / s 508 / d 492 / u 415 / l 386 / ...
starting training
 pcakmeans 11959 k 80 d 0.95
 predicting 11959 1024
writing
+ ocropus-tleaves -Q 4 -s boxdata.split -d boxdata.h5 -o boxdata.cmodel
loading splitter
got <ocrolib.patrec.HierarchicalSplitter instance at 0x2c704d0>
#splits 80
excluding [ _\000-\037]
sizemode perchar
loading dataset
sizemode (data) perchar
splitting
0
10000
cluster    0 len    180    d 176 / h 3 / c 1
cluster    5 len    111    b 104 / h 5 / d 1 / ö 1
cluster   10 len    197    n 196 / N 1
cluster   15 len    209    o 178 / v 12 / 0 9 / e 2 / V 2
cluster    6 len    145    ä 50 / ü 39 / ö 37 / 5 4 / 4 4
cluster   11 len    171    n 169 / h 1 / t 1
cluster    1 len    103    b 61 / h 41 / d 1
cluster   16 len    179    p 82 / v 55 / P 11 / D 7 / B 5
cluster   12 len     30    M 17 / W 10 / wi 1 / ap 1 / es 1
cluster    2 len     63    h 61 / H 2
cluster    7 len    178    g 176 / q 1 / 9 1
cluster   17 len    208    s 118 / J 18 / I 14 / H 13 / Z 13
cluster   13 len    227    a 227
cluster    3 len    101    d 100 / 6 1
cluster    8 len    178    ß 40 / ü 37 / st 21 / K 13 / si 11
cluster   18 len     89    o 87 / 0 1 / O 1
cluster   14 len    366    a 359 / A 5 / d 1 / O 1
cluster    4 len    209    d 208 / b 1
cluster    9 len    165    h 127 / ß 12 / tz 10 / y 5 / ö 3
cluster   19 len     67    D 30 / v 14 / O 8 / V 6 / H 3
cluster   20 len     35    A 35
cluster   25 len    147    n 147
cluster   30 len    192    n 176 / u 9 / A 3 / tt 2 / a 1
cluster   35 len    111    H 14 / q 14 / F 12 / C 12 / 6 12
cluster   21 len     69    B 14 / K 11 / * 6 / s 6 / N 4
cluster   26 len     87    n 80 / tz 6 / g 1
cluster   31 len    107    n 80 / R 14 / u 5 / K 4 / U 2
cluster   36 len     46    ? 20 / 7 14 / 2 9 / L 1 / w 1
cluster   27 len    158    u 148 / U 7 / a 2 / n 1
cluster   22 len    231    . 219 / « 2 / * 2 / a 1 / e 1
cluster   32 len    188    n 185 / y 1 / N 1 / u 1
cluster   37 len    141    T 31 / ck 30 / D 27 / E 12 / C 10
cluster   28 len    129    u 115 / U 9 / a 1 / h 1 / n 1
cluster   33 len    103    g 103
cluster   38 len    150    z 68 / F 20 / ’ 19 / J 11 / 4 9
cluster   23 len    114    s 81 / 8 5 / g 3 / « 3 / - 3
cluster   29 len    138    u 131 / n 4 / h 1 / U 1 / ü 1
cluster   34 len     86    Q 30 / » 21 / « 13 / O 8 / N 5
cluster   39 len    330    t 328 / i 2
cluster   40 len    127    l 66 / t 32 / i 11 / : 7 / 1 3
cluster   24 len     88    st 42 / si 25 / ll 9 / a 7 / K 1
cluster   45 len     52    — 35 / = 5 / ~ 3 / - 2 / tm 1
cluster   55 len     63    S 37 / G 18 / E 6 / s 1 / Ö 1
cluster   50 len    229    i 223 / j 2 / l 2 / s 1 / t 1
cluster   41 len    261    , 255 / y 3 / e 2 / - 1
cluster   46 len    102    ) 27 / ; 25 / : 21 / x 11 / - 5
cluster   56 len     84    G 37 / E 21 / S 18 / * 3 / O 3
cluster   51 len     78    ch 78
cluster   42 len    130    k 65 / f 55 / s 5 / 5 2 / b 1
cluster   47 len    205    l 132 / 1 30 / I 10 / ! 7 / i 6
cluster   57 len     63    B 15 / P 13 / V 13 / N 9 / R 8
cluster   52 len     59    ch 59
cluster   43 len    120    s 94 / f 10 / H 6 / L 3 / l 3
cluster   48 len    240    l 182 / ( 21 / ! 14 / i 10 / k 4
cluster   58 len    178    ch 175 / Ö 2 / f 1
cluster   49 len    199    i 193 / t 5 / 4 1
cluster   53 len     96    w 92 / W 4
cluster   44 len    264    s 185 / f 67 / k 5 / e 3 / i 2
cluster   60 len    160    m 138 / M 14 / n 3 / ch 1 / o 1
cluster   59 len    121    m 109 / M 7 / ru 1 / en 1 / la 1
cluster   54 len     72    w 55 / W 12 / sp 2 / tz 1 / m 1
cluster   65 len    103    i 75 / j 15 / t 5 / f 3 / s 3
cluster   61 len     81    i 81
cluster   70 len    344    e 337 / L 5 / 9 1 / c 1
cluster   75 len    108    t 88 / k 11 / e 8 / m 1
cluster   66 len     98    r 93 / x 4 / D 1
cluster   62 len     57    i 56 / z 1
cluster   71 len    180    e 179 / c 1
cluster   67 len    113    r 112 / y 1
cluster   76 len    118    t 116 / : 1 / r 1
cluster   63 len     61    i 61
cluster   77 len    134    r 132 / Y 1 / t 1
cluster   72 len    488    e 461 / c 15 / L 11 / s 1
cluster   68 len    187    r 187
cluster   64 len     87    i 86 / t 1
cluster   78 len    140    r 140
cluster   69 len    217    e 217
cluster   73 len    341    e 340 / h 1
cluster   79 len    101    r 100 / y 1
cluster   74 len    272    e 272
writing
+ ocropus-db predict -m boxdata.cmodel boxdata.h5
19 11959 0.158876160214
+ convert deu-f/fontfile_2.tif page.bin.png
+ ocropus-gpageseg page.bin.png
page.bin.png
computing segmentation
computing column separators
computing lines
propagating labels
spreading labels
number of lines 27
finding reading order
writing lines
    26 page.bin.png 41.3 27
+ ocropus-lattices -m boxdata.cmodel page/010001.bin.png page/010002.bin.png page/010003.bin.png page/010004.bin.png page/010005.bin.png page/010006.bin.png page/010007.bin.png page/010008.bin.png page/010009.bin.png page/01000a.bin.png page/01000b.bin.png page/01000c.bin.png page/01000d.bin.png page/01000e.bin.png page/01000f.bin.png page/010010.bin.png page/010011.bin.png page/010012.bin.png page/010013.bin.png page/010014.bin.png page/010015.bin.png page/010016.bin.png page/010017.bin.png page/010018.bin.png page/010019.bin.png page/01001a.bin.png page/01001b.bin.png
loading boxdata.cmodel
got <ocrolib.patrec.LocalCmodel instance at 0x3a1bea8>
sizemode perchar
loading /usr/local/share/ocropus/en-space.model
got <ocrolib.wmodel.WhitespaceModel instance at 0x3a31518>
loading /usr/local/share/ocropus/en-mixed.lineest
got <ocrolib.lineest.TrainedLineGeometry instance at 0x3a315a8>
segmenter lineseg.DPSegmentLine()
got <ocrolib.lineseg.DPSegmentLine instance at 0x3a31680>
recognizing 27 files
page/010001.bin.png =RAW= Jizasser rinlien aus einer Quelle. Er trinkt, er wird frisch
page/010002.bin.png =RAW= F)jste nach allen ~eiten, und die Äste gehen wieder ili so
page/010003.bin.png =RAW= viele, viele kleine Zweige, aber alles endet in Pyraiiiideii-
page/010004.bin.png =RAW= ist mein größtes 2;ergniigen (~Freude).
page/010005.bin.png =RAW= ich war früherin New York oft sel)r nervös. Seitdem ich
page/010006.bin.png =RAW= Columbus, so ging es Galilei, so ging es Johann Guten-
page/010007.bin.png =RAW= laufeli, sä)wingen, fechten, boxen uiid tanzeli; kurz,
page/010008.bin.png =RAW= halle llnd habe ())ymnasiik. O, wie ist das sc)ön, i1neili
page/010009.bin.png =RAW= dem Tische sah ich Blumen, Blumensträuße (~Bou-
page/01000a.bin.png =RAW= quets), Niedaillons, Früchte und noch viele, viele andere
page/01000b.bin.png =RAW= soll ich beginnen? wo enden? ljber einen Tanz muß ich
page/01000c.bin.png =RAW= bin. Vor einer halben (~tunde kam ich aus seinen=ßause.
page/01000d.bin.png =RAW= Da war große Gesellschast. Viele interessante Personen
page/01000e.bin.png =RAW= 1nich gerettet aus deii Händen der Räuber, du hast mich
page/01000f.bin.png =RAW= Die kommen von Sn)raklls, llnd erhörte sie sagen: ,,~etzt
page/010010.bin.png =RAW= böse Frau, sie hieß (~ihr Name war) Xantippe.
page/010011.bin.png =RAW= Bella: Ach, das 2i;ort X antippe habe ich ost
page/010012.bin.png =RAW= Donnerwetter muß ein Regen kommen,i, und ging (ich
page/010013.bin.png =RAW= der PhiIosoph auf und ging aus dem Hause. Dieses
page/010014.bin.png =RAW= machte Xantippe sehr böse. ~ie nahm eine Kanne mit
page/010015.bin.png =RAW= Netzt’ ihm den nackten Fuß;
page/010016.bin.png =RAW= von tHeine, von Goethe, von ~chiller, von Riickert.
page/010017.bin.png =RAW= werde selbst siir mich sprechen. Und ich sage: Der
page/010018.bin.png =RAW= *** FAILED (no bestpath) ***
page/010019.bin.png =RAW= ser, als de=Herbst. Ich weiß, der Ci;inter l)at TheateL
page/01001a.bin.png =RAW= Konzert und Ball. Das ist sehr schön, o, ja! und schön
page/01001b.bin.png =RAW= Liszt setzte den Kranz auf den Kopf des glücklichen Niaw
+ set +x

================================================================
=== You now have a simple Fraktur model, boxdata.cmodel.
===
=== This is only an initial model.  It isn't using any baseline
=== information.  The next training step consists of retraining
=== the model by aligning text lines with ground truth (see the
=== example in uw3-500).
===
=== In addition, you probably should construct a language model.
=== You can do that with ocropus-ngraphs.
================================================================
                               [ end ]
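The two "syntax error" lines in the log above (`e 257 112 268 131P` and `e 609 75 621 94g`) have a letter stuck to the last coordinate. A minimal Python sketch for finding such lines in a Tesseract-style box file before training follows. It assumes the usual box layout of symbol, left, bottom, right, top and an optional page number; `parse_box_line` and `check_box_file_lines` are hypothetical helpers, not ocropus-db code. The criterion behind "bad box dimensions" is not given in this thread, so it is not modelled here.

```python
def parse_box_line(line):
    """Parse one box line: symbol, left, bottom, right, top[, page].

    Returns (symbol, left, bottom, right, top) or raises ValueError.
    """
    fields = line.split()
    if len(fields) not in (5, 6):
        raise ValueError("expected 5 or 6 fields, got %d" % len(fields))
    symbol = fields[0]
    try:
        left, bottom, right, top = (int(f) for f in fields[1:5])
    except ValueError:
        raise ValueError("non-numeric coordinate in %r" % line)
    return symbol, left, bottom, right, top


def check_box_file_lines(lines):
    """Return (line_number, message) pairs for lines that fail to parse."""
    problems = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # blank lines are skipped, not reported
        try:
            parse_box_line(line)
        except ValueError as err:
            problems.append((number, str(err)))
    return problems


if __name__ == "__main__":
    sample = ["e 257 112 268 131P", "e 609 75 621 94"]
    for number, message in check_box_file_lines(sample):
        print("line %d: %s" % (number, message))
```

Passing the lines of a `.box` file to `check_box_file_lines` lists the line numbers that need manual correction.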

Sriranga(78yrsold)

Aug 24, 2012, 10:44:50 PM8/24/12
to ocr...@googlegroups.com
Brad,
I forgot to add, regarding "On ubuntu, I'd suggest installing:
graphicsmagick
and
graphicsmagick-imagemagick-compat
packages":
I succeeded in installing the above packages using the Ubuntu Software Centre.
Thanks for all the help.
-sriranga(79yrs)

Sriranga(78yrsold)

Aug 25, 2012, 4:56:55 AM8/25/12
to ocr...@googlegroups.com
hi all,
When I run the script "./run-box-training", I notice that it generates three outputs, viz. (1) boxdata.h5, (2) boxdata.split and (3) boxdata.cmodel.
Out of curiosity, I would like to view the contents of these three generated outputs. Is it possible and, if so, how?
Thanks in advance.
With regards,
-sriranga(79yrs)
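A first step towards looking inside such outputs is to find out what kind of file each one is. The Python sketch below guesses the container format from the first bytes. It makes several assumptions: that `boxdata.h5` is an HDF5 file (suggested by the extension, not confirmed in this thread), that the format of the `.split` and `.cmodel` outputs is unknown, and that the signature sits at offset 0. `classify_header` and `sniff` are hypothetical helper names.

```python
import os

HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"  # 8-byte HDF5 format signature
GZIP_MAGIC = b"\x1f\x8b"               # first two bytes of a gzip stream


def classify_header(head):
    """Guess a container format from the leading bytes of a file."""
    if head.startswith(HDF5_SIGNATURE):
        return "hdf5"
    if head.startswith(GZIP_MAGIC):
        return "gzip"
    return "unknown"


def sniff(path):
    """Return "directory", "hdf5", "gzip" or "unknown" for `path`."""
    if os.path.isdir(path):
        return "directory"
    with open(path, "rb") as stream:
        return classify_header(stream.read(8))


if __name__ == "__main__":
    for name in ["boxdata.h5", "boxdata.split", "boxdata.cmodel"]:
        if os.path.exists(name):
            print(name, "->", sniff(name))
        else:
            print(name, "-> not found in the current directory")
```

If a file is reported as HDF5, the `h5ls` command-line tool or the h5py package should be able to list the datasets stored in it.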

Tom

Aug 25, 2012, 7:15:55 AM8/25/12
to ocr...@googlegroups.com
Yes, thanks.

Note that "imagemagick" gets installed according to the install instructions on the web site.   I've also updated the README file in the ocropy directory.

Tom

Sriranga(78yrsold)

Aug 25, 2012, 3:11:10 PM8/25/12
to ocr...@googlegroups.com
Hi,
I tested using only six Fraktur fontline sets (.tif, .box, .txt) and succeeded without any error.
OCRopus now supports UTF-8.

I then tested Kannada (UTF-8) training. It ran, but with a few errors, so the training is incomplete.
I am seeking help on how to overcome the errors displayed (marked in blue color). Where did I make a mistake?
The terminal extract is reproduced below:

dell@ubuntu:~$ cd ocropus_6.0/
dell@ubuntu:~/ocropus_6.0$ cd kannada-boxes/
dell@ubuntu:~/ocropus_6.0/kannada-boxes$ ./run-box-training
================================================================
=== This script illustrates training of a simple, initial
=== character recognizer from the kind of boxdata training
=== files used with Tesseract.
================================================================

+ for image in 'deu-f/*.tif'
+ convert -depth 8 deu-f/alamandidoddavva.tif deu-f/alamandidoddavva.png

+ for image in 'deu-f/*.tif'
+ convert -depth 8 deu-f/test4.tif deu-f/test4.png
+ ocropus-db tess2h5 deu-f/alamandidoddavva.png deu-f/test4.png -o boxdata.h5
['deu-f/alamandidoddavva.png', 'deu-f/test4.png']
=== deu-f/alamandidoddavva.png
=== deu-f/test4.png

+ ocropus-tsplit --pca 0.95 --vq 80 -d boxdata.h5 -o boxdata.split
loading dataset
got 8263 samples out of 8263
# classes 59
most common ~ 4874 / ದ 288 / . 263 / ವ 256 / ರ 197 / ಂ 193 / ನ 178 / ಯ 177 / ಗ 154 / ಕ 142 / ...
starting training
 pcakmeans 8263 k 80 d 0.95
 predicting 8263 1024

writing
+ ocropus-tleaves -Q 4 -s boxdata.split -d boxdata.h5 -o boxdata.cmodel
loading splitter
got <ocrolib.patrec.HierarchicalSplitter instance at 0x3eb8488>

#splits 80
excluding [ _\000-\037]
sizemode perchar
loading dataset
sizemode (data) perchar
splitting
0
cluster    0 len     41    ~ 41
cluster    5 len     66    ಜ 66
cluster   15 len    208    ~ 199 / ಈ 9
cluster   10 len    236    ~ 155 / ಮ 81
cluster    1 len      2    ~ 2
Process PoolWorker-1:
Traceback (most recent call last):
  File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 99, in worker
    put((job, i, result))
  File "/usr/lib/python2.7/multiprocessing/queues.py", line 390, in put
    return send(obj)
PicklingError: Can't pickle <type 'function'>: attribute lookup __builtin__.function failed

cluster   20 len     58    ತ 58
cluster    6 len     77    ಇ 77
cluster   11 len     74    ~ 74
cluster   16 len     20    ~ 20
cluster   21 len     48    ತ 48
cluster    7 len     21    ~ 21
cluster   17 len    168    ~ 168
cluster   12 len    184    ಯ 177 / ~ 7
cluster   22 len     56    . 56
cluster    8 len    258    ಬ 94 / ಲ 65 / ಣ 31 / ಒ 27 / ಟ 20
cluster   13 len    162    ~ 151 / ಈ 11
cluster   18 len    115    ಕ 58 / ~ 35 / ಶ 22
cluster   23 len    121    . 121
cluster    9 len     78    ~ 78
cluster   14 len     50    ~ 50
cluster   19 len    105    ~ 62 / ಹ 43
cluster   24 len     86    . 86
cluster   25 len    156    ಆ 135 / ಅ 21
cluster   30 len    416    ಂ 193 / ರ 182 / ~ 40 / ೦ 1
cluster   35 len    316    ವ 256 / ಪ 41 / ಷ 8 / ಧ 6 / ಥ 2
cluster   26 len    179    ಅ 95 / ಕ 84
cluster   40 len     53    ~ 53
cluster   31 len    262    ನ 178 / ಸ 84
cluster   36 len     76    ಡ 74 / ಢ 2
cluster   27 len     93    ~ 93
cluster   41 len     63    ~ 63
cluster   32 len    239    ಗ 154 / ~ 85
cluster   37 len    303    ದ 288 / ರ 15
cluster   42 len     36    ~ 36
cluster   28 len    222    ~ 222
cluster   33 len     79    ~ 79
cluster   38 len    246    ~ 246
cluster   43 len     52    ~ 52
cluster   29 len    113    ~ 70 / ಭ 24 / ಚ 18 / ೧ 1
cluster   34 len    231    ~ 231
cluster   39 len     28    ~ 28
cluster   44 len    102    ~ 78 / ಉ 24
cluster   45 len    148    ~ 148
cluster   50 len     27    ~ 27
cluster   55 len     35    ೕ 35
cluster   60 len     92    ~ 92
cluster   46 len    172    ~ 172
cluster   61 len      1    ~ 1
Process PoolWorker-5:

Traceback (most recent call last):
  File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 99, in worker
    put((job, i, result))
  File "/usr/lib/python2.7/multiprocessing/queues.py", line 390, in put
    return send(obj)
PicklingError: Can't pickle <type 'function'>: attribute lookup __builtin__.function failed

cluster   65 len     65    ~ 65
cluster   56 len    123    ~ 76 / ೕ 13 / : 8 / ಃ 5 / ? 5
cluster   51 len     23    ~ 23
cluster   47 len    100    ~ 100
cluster   66 len     15    ~ 15
cluster   52 len     52    ~ 52
cluster   57 len    136    ~ 136
cluster   48 len    116    ~ 116
cluster   67 len     74    ~ 74
cluster   53 len     18    ~ 18
cluster   49 len     78    ಎ 78
cluster   58 len     85    ~ 85
cluster   68 len    104    ~ 104
cluster   70 len     49    , 49
cluster   54 len    116    ಳ 99 / ~ 17
cluster   59 len     58    ~ 58
cluster   69 len     82    ~ 82
cluster   75 len     34    ~ 34
cluster   71 len     47    ~ 47
cluster   76 len     12    ~ 12
cluster   72 len     60    ~ 60
cluster   77 len    138    ~ 138
cluster   73 len    125    ~ 123 / ೯ 2
cluster   78 len     44    ~ 44
cluster   74 len     66    ~ 65 / ೕ 1
cluster   79 len     43    ~ 43
With regards,
-sriranga(79yrs)
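The `PicklingError` in the log above is raised inside `multiprocessing` while a pool worker sends its result back (`put((job, i, result))`), which suggests that the result contained a function object. The Python sketch below shows the general behaviour only and does not diagnose the ocropus-tleaves code: objects sent between processes must be picklable, and anonymous functions are not. `is_picklable` is a hypothetical helper, and the example uses Python 3, whose error message differs slightly from the Python 2.7 message in the log.

```python
import pickle


def is_picklable(obj):
    """Return True if `obj` survives pickle.dumps, False otherwise."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError, TypeError):
        return False
    return True


if __name__ == "__main__":
    # A builtin function is pickled by reference to its importable name.
    print("len:", is_picklable(len))
    # A lambda has no importable name, so pickling it fails.
    print("lambda:", is_picklable(lambda x: x))
    # A result that merely contains a function fails as well.
    print("dict with lambda:", is_picklable({"score": 1, "fn": lambda x: x}))
```

Returning only plain data (numbers, strings, arrays) from pool workers avoids this class of error.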


Untitled Document
Untitled Document 1

Sriranga(78yrsold)

Aug 26, 2012, 6:29:41 AM8/26/12
to ocr...@googlegroups.com
Tom
I followed your valuable guidance for training Kannada, but could not complete it. The modified script is also attached.
Where did I make a mistake?
With warmest regards,
-sriranga(79yrs)

Extract of terminal is reproduced below for ready reference:

dell@ubuntu:~/ocropus_6.0$ cd kannada-boxes/
dell@ubuntu:~/ocropus_6.0/kannada-boxes$ ./run-box-training
================================================================
=== This script illustrates training of a simple, initial
=== character recognizer from the kind of boxdata training
=== files used with Tesseract.
================================================================

+ for image in 'kan/*.tif'
+ convert -depth 8 kan/alamandidoddavva.tif kan/alamandidoddavva.png
+ for image in 'kan/*.tif'
+ convert -depth 8 kan/g01.tif kan/g01.png
+ for image in 'kan/*.tif'
+ convert -depth 8 kan/g02.tif kan/g02.png
+ for image in 'kan/*.tif'
+ convert -depth 8 kan/g03.tif kan/g03.png
+ for image in 'kan/*.tif'
+ convert -depth 8 kan/g04.tif kan/g04.png
+ for image in 'kan/*.tif'
+ convert -depth 8 kan/g05.tif kan/g05.png
+ for image in 'kan/*.tif'
+ convert -depth 8 kan/g06.tif kan/g06.png
+ for image in 'kan/*.tif'
+ convert -depth 8 kan/g08.tif kan/g08.png
+ for image in 'kan/*.tif'
+ convert -depth 8 kan/g09.tif kan/g09.png
+ for image in 'kan/*.tif'
+ convert -depth 8 kan/g10.tif kan/g10.png
convert convert: kan/g10.tif: Read error at scanline 3407; got 4947830 bytes, expected 4948020. (TIFFFillStrip).

dell@ubuntu:~/ocropus_6.0/kannada-boxes$ ./run-box-training
================================================================
=== This script illustrates training of a simple, initial
=== character recognizer from the kind of boxdata training
=== files used with Tesseract.
================================================================

+ for image in 'kan/*.tif'
+ convert -depth 8 kan/alamandidoddavva.tif kan/alamandidoddavva.png
+ for image in 'kan/*.tif'
+ convert -depth 8 kan/g01.tif kan/g01.png
+ for image in 'kan/*.tif'
+ convert -depth 8 kan/g02.tif kan/g02.png
+ for image in 'kan/*.tif'
+ convert -depth 8 kan/g03.tif kan/g03.png
+ for image in 'kan/*.tif'
+ convert -depth 8 kan/g04.tif kan/g04.png
+ for image in 'kan/*.tif'
+ convert -depth 8 kan/g05.tif kan/g05.png
+ for image in 'kan/*.tif'
+ convert -depth 8 kan/g06.tif kan/g06.png
+ for image in 'kan/*.tif'
+ convert -depth 8 kan/g08.tif kan/g08.png
+ for image in 'kan/*.tif'
+ convert -depth 8 kan/g09.tif kan/g09.png
+ for image in 'kan/*.tif'
+ convert -depth 8 kan/g11.tif kan/g11.png
+ for image in 'kan/*.tif'
+ convert -depth 8 kan/g12.tif kan/g12.png
+ for image in 'kan/*.tif'
+ convert -depth 8 kan/g13.tif kan/g13.png
+ for image in 'kan/*.tif'
+ convert -depth 8 kan/g14.tif kan/g14.png
+ for image in 'kan/*.tif'
+ convert -depth 8 kan/test4.tif kan/test4.png
+ ocropus-db tess2h5 kan/alamandidoddavva.png kan/g01.png kan/g02.png kan/g03.png kan/g04.png kan/g05.png kan/g06.png kan/g08.png kan/g09.png kan/g11.png kan/g12.png kan/g13.png kan/g14.png kan/test4.png -o boxdata.h5
['kan/alamandidoddavva.png', 'kan/g01.png', 'kan/g02.png', 'kan/g03.png', 'kan/g04.png', 'kan/g05.png', 'kan/g06.png', 'kan/g08.png', 'kan/g09.png', 'kan/g11.png', 'kan/g12.png', 'kan/g13.png', 'kan/g14.png', 'kan/test4.png']
=== kan/alamandidoddavva.png
=== kan/g01.png
Traceback (most recent call last):
  File "/usr/local/bin/ocropus-db", line 158, in tess_readchars
    cimage /= amax(cimage)
  File "/usr/lib/python2.7/dist-packages/numpy/core/fromnumeric.py", line 1833, in amax
    return amax(axis, out)
ValueError: zero-size array to maximum.reduce without identity

kan/g01.png 800 ’ 0 8723 0 8723

Traceback (most recent call last):
  File "/usr/local/bin/ocropus-db", line 158, in tess_readchars
    cimage /= amax(cimage)
  File "/usr/lib/python2.7/dist-packages/numpy/core/fromnumeric.py", line 1833, in amax
    return amax(axis, out)
ValueError: zero-size array to maximum.reduce without identity
kan/g01.png 820 ’ 0 8723 0 8723


=== kan/g02.png
=== kan/g03.png
Traceback (most recent call last):
  File "/usr/local/bin/ocropus-db", line 158, in tess_readchars
    cimage /= amax(cimage)
  File "/usr/lib/python2.7/dist-packages/numpy/core/fromnumeric.py", line 1833, in amax
    return amax(axis, out)
ValueError: zero-size array to maximum.reduce without identity
kan/g03.png 345 ‌ 0 5538 0 5538


=== kan/g04.png
./run-box-training: line 31:  2643 Segmentation fault      (core dumped) ocropus-db tess2h5 kan/*.png -o boxdata.h5
dell@ubuntu:~/ocropus_6.0/kannada-boxes$
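Each `zero-size array` traceback above is followed by a line such as `kan/g01.png 800 ’ 0 8723 0 8723`. The four trailing numbers look like box coordinates with equal left/right and equal bottom/top values, that is, a box with no area, although the thread does not confirm that reading. Under that assumption, a Python sketch for filtering out empty boxes before any image crop is normalized could look like this (`box_area` and `split_boxes` are hypothetical helpers, not ocropus-db functions):

```python
def box_area(left, bottom, right, top):
    """Return the pixel area of a box, or 0 if it is empty or inverted."""
    width = right - left
    height = top - bottom
    if width <= 0 or height <= 0:
        return 0
    return width * height


def split_boxes(boxes):
    """Split (symbol, left, bottom, right, top) tuples into usable and empty."""
    usable, empty = [], []
    for box in boxes:
        (usable if box_area(*box[1:]) > 0 else empty).append(box)
    return usable, empty


if __name__ == "__main__":
    boxes = [("e", 257, 112, 268, 131), ("'", 0, 8723, 0, 8723)]
    usable, empty = split_boxes(boxes)
    print("usable:", usable)
    print("empty:", empty)
```

Boxes reported as empty would then be corrected or deleted in the `.box` file by hand.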


run-box-training

Sriranga(78yrsold)

Aug 26, 2012, 7:36:44 AM8/26/12
to ocr...@googlegroups.com
Tom,
Good news:
After deleting the problematic 'g04' and 'g10' sets (tif/box/ref.txt) from the kan folder, I re-ran the script "run-box-training", after correcting it to use convert kan/g01.tif page.bin.png. The 1st stage of training for the Kannada language has now completed successfully, by the Grace of the Supreme Lord.
I have now reached the 2nd stage, "ocropus-ngraphs", for which I am searching for the command line to be used.
With warmest Regards,
-sriranga(79yrs)



Extract of terminal is reproduced below:
./run-box-training: line 31:  2255 Segmentation fault      (core dumped) ocropus-db tess2h5 kan/*.png -o boxdata.h5
dell@ubuntu:~/ocropus_6.0/kannada-boxes$ clear


dell@ubuntu:~/ocropus_6.0/kannada-boxes$ ./run-box-training
================================================================
=== This script illustrates training of a simple, initial
=== character recognizer from the kind of boxdata training
=== files used with Tesseract.
================================================================

+ for image in 'kan/*.tif'
+ convert -depth 8 kan/alamandidoddavva.tif kan/alamandidoddavva.png
+ for image in 'kan/*.tif'
+ convert -depth 8 kan/g01.tif kan/g01.png
+ for image in 'kan/*.tif'
+ convert -depth 8 kan/g02.tif kan/g02.png
+ for image in 'kan/*.tif'
+ convert -depth 8 kan/g03.tif kan/g03.png
+ for image in 'kan/*.tif'
+ convert -depth 8 kan/g05.tif kan/g05.png
+ for image in 'kan/*.tif'
+ convert -depth 8 kan/g06.tif kan/g06.png
+ for image in 'kan/*.tif'
+ convert -depth 8 kan/g08.tif kan/g08.png
+ for image in 'kan/*.tif'
+ convert -depth 8 kan/g09.tif kan/g09.png
+ for image in 'kan/*.tif'
+ convert -depth 8 kan/g11.tif kan/g11.png
+ for image in 'kan/*.tif'
+ convert -depth 8 kan/g12.tif kan/g12.png
+ for image in 'kan/*.tif'
+ convert -depth 8 kan/g13.tif kan/g13.png
+ for image in 'kan/*.tif'
+ convert -depth 8 kan/g14.tif kan/g14.png
+ for image in 'kan/*.tif'
+ convert -depth 8 kan/test4.tif kan/test4.png
+ ocropus-db tess2h5 kan/alamandidoddavva.png kan/g01.png kan/g02.png kan/g03.png kan/g05.png kan/g06.png kan/g08.png kan/g09.png kan/g11.png kan/g12.png kan/g13.png kan/g14.png kan/test4.png -o boxdata.h5
['kan/alamandidoddavva.png', 'kan/g01.png', 'kan/g02.png', 'kan/g03.png', 'kan/g05.png', 'kan/g06.png', 'kan/g08.png', 'kan/g09.png', 'kan/g11.png', 'kan/g12.png', 'kan/g13.png', 'kan/g14.png', 'kan/test4.png']
=== kan/g05.png
=== kan/g06.png
=== kan/g08.png
=== kan/g09.png
=== kan/g11.png
=== kan/g12.png
=== kan/g13.png

Traceback (most recent call last):
  File "/usr/local/bin/ocropus-db", line 158, in tess_readchars
    cimage /= amax(cimage)
  File "/usr/lib/python2.7/dist-packages/numpy/core/fromnumeric.py", line 1833, in amax
    return amax(axis, out)
ValueError: zero-size array to maximum.reduce without identity
kan/g13.png 36 | 0 10139 0 10139

=== kan/g14.png

Traceback (most recent call last):
  File "/usr/local/bin/ocropus-db", line 158, in tess_readchars
    cimage /= amax(cimage)
  File "/usr/lib/python2.7/dist-packages/numpy/core/fromnumeric.py", line 1833, in amax
    return amax(axis, out)
ValueError: zero-size array to maximum.reduce without identity
kan/g14.png 26 | 0 6069 0 6069


Traceback (most recent call last):
  File "/usr/local/bin/ocropus-db", line 158, in tess_readchars
    cimage /= amax(cimage)
  File "/usr/lib/python2.7/dist-packages/numpy/core/fromnumeric.py", line 1833, in amax
    return amax(axis, out)
ValueError: zero-size array to maximum.reduce without identity
kan/g14.png 466 ’ 0 6069 0 6069


Traceback (most recent call last):
  File "/usr/local/bin/ocropus-db", line 158, in tess_readchars
    cimage /= amax(cimage)
  File "/usr/lib/python2.7/dist-packages/numpy/core/fromnumeric.py", line 1833, in amax
    return amax(axis, out)
ValueError: zero-size array to maximum.reduce without identity
kan/g14.png 512 ’ 0 6069 0 6069


Traceback (most recent call last):
  File "/usr/local/bin/ocropus-db", line 158, in tess_readchars
    cimage /= amax(cimage)
  File "/usr/lib/python2.7/dist-packages/numpy/core/fromnumeric.py", line 1833, in amax
    return amax(axis, out)
ValueError: zero-size array to maximum.reduce without identity
kan/g14.png 1110 ’ 0 6069 0 6069


Traceback (most recent call last):
  File "/usr/local/bin/ocropus-db", line 158, in tess_readchars
    cimage /= amax(cimage)
  File "/usr/lib/python2.7/dist-packages/numpy/core/fromnumeric.py", line 1833, in amax
    return amax(axis, out)
ValueError: zero-size array to maximum.reduce without identity
kan/g14.png 1257 ’ 0 6069 0 6069


Traceback (most recent call last):
  File "/usr/local/bin/ocropus-db", line 158, in tess_readchars
    cimage /= amax(cimage)
  File "/usr/lib/python2.7/dist-packages/numpy/core/fromnumeric.py", line 1833, in amax
    return amax(axis, out)
ValueError: zero-size array to maximum.reduce without identity
kan/g14.png 1346 ’ 0 6069 0 6069


Traceback (most recent call last):
  File "/usr/local/bin/ocropus-db", line 158, in tess_readchars
    cimage /= amax(cimage)
  File "/usr/lib/python2.7/dist-packages/numpy/core/fromnumeric.py", line 1833, in amax
    return amax(axis, out)
ValueError: zero-size array to maximum.reduce without identity
kan/g14.png 1463 ’ 0 6069 0 6069


Traceback (most recent call last):
  File "/usr/local/bin/ocropus-db", line 158, in tess_readchars
    cimage /= amax(cimage)
  File "/usr/lib/python2.7/dist-packages/numpy/core/fromnumeric.py", line 1833, in amax
    return amax(axis, out)
ValueError: zero-size array to maximum.reduce without identity
kan/g14.png 1599 ’ 0 6069 0 6069

=== kan/test4.png

+ ocropus-tsplit --pca 0.95 --vq 80 -d boxdata.h5 -o boxdata.split
loading dataset
got 41735 samples out of 41735
# classes 83
most common ~ 25819 / ದ 1368 / ವ 1161 / ರ 987 / ಗ 963 / . 920 / ಯ 883 / ನ 835 / ಅ 761 / ಮ 661 / ...
starting training
 pcakmeans 41735 k 80 d 0.95
 predicting 41735 1024

writing
+ ocropus-tleaves -Q 4 -s boxdata.split -d boxdata.h5 -o boxdata.cmodel
loading splitter
got <ocrolib.patrec.HierarchicalSplitter instance at 0x4031488>

#splits 80
excluding [ _\000-\037]
sizemode perchar
loading dataset
sizemode (data) perchar
splitting
0
10000
20000
30000
40000
cluster    0 len    221    ~ 221
cluster    5 len    171    ~ 171
cluster   15 len    586    ~ 586
cluster   10 len   1833    ದ 1080 / ವ 753
cluster    1 len    521    ~ 404 / ಣ 117
cluster    6 len    712    ~ 441 / ಂ 193 / 0 71 / 9 4 / ೦ 3
cluster   16 len    503    ~ 503
cluster   11 len    589    ~ 589
cluster    2 len    491    ~ 491
cluster    7 len   1029    ರ 987 / ಠ 42
cluster   17 len    886    ~ 668 / ಇ 208 / ೧ 10
cluster   12 len    412    ~ 411 / ೕ 1
cluster    3 len    403    ~ 403
cluster    8 len    233    ವ 152 / ಪ 55 / ಹ 26
cluster   18 len    173    ~ 173
cluster   13 len    810    ಗ 809 / ~ 1
cluster    4 len    335    ~ 335
cluster    9 len    552    ಡ 189 / ಪ 177 / ಹ 145 / ಷ 41
cluster   19 len    663    ಳ 368 / ~ 286 / ೪ 9
cluster   14 len    590    ~ 436 / ಗ 154
cluster   20 len    115    ಳ 99 / ~ 16
cluster   25 len    241    ತ 241
cluster   30 len    238    ~ 238
cluster   35 len    516    ಕ 469 / ~ 47
cluster   21 len    357    ~ 279 / ಎ 78
cluster   26 len    126    ತ 126
cluster   31 len    119    ಇ 77 / ಸ 42
cluster   36 len    645    ~ 639 / 6 6
cluster   22 len    438    ~ 438
cluster   27 len    154    ತ 106 / ಹ 43 / ~ 5
cluster   32 len    218    ~ 213 / ೧ 4 / ೦ 1
cluster   37 len    488    ~ 484 / ೨ 2 / ೭ 2
cluster   23 len    919    ನ 835 / ಸ 84
cluster   28 len    116    ತ 116
cluster   33 len    361    ~ 361
cluster   29 len    358    ~ 358
cluster   38 len    428    , 426 / ' 2
cluster   24 len    600    ಸ 364 / ಶ 178 / ಕ 58
cluster   34 len    360    ~ 350 / ಘ 10
cluster   40 len    458    ~ 458
cluster   39 len   1425    ~ 1109 / ೕ 48 / ? 34 / 2 31 / ) 27
cluster   45 len    321    ~ 312 / ಊ 9
cluster   50 len    283    ~ 283
cluster   46 len    596    ~ 596
cluster   41 len    484    ~ 475 / 9 9
cluster   51 len    614    ~ 614
cluster   55 len    852    ಅ 428 / ಆ 424
cluster   47 len    705    ~ 705
cluster   42 len    267    ~ 266 / ೯ 1
cluster   52 len    920    . 920
cluster   56 len    583    ಅ 333 / ಲ 249 / ೮ 1
cluster   48 len    925    ~ 873 / - 51 / / 1
cluster   43 len    445    ~ 359 / ಕ 84 / ಔ 1 / ೯ 1
cluster   57 len    405    ಭ 125 / ಧ 120 / ~ 84 / ಥ 57 / ಫ 8
cluster   53 len    360    ~ 200 / ಚ 94 / ಜ 66
cluster   49 len    967    ~ 967
cluster   44 len   1141    ~ 1140 / ಾ 1
cluster   58 len   1236    ಬ 476 / ಜ 160 / ~ 159 / ಎ 136 / ಟ 71
cluster   54 len    429    ~ 429
cluster   60 len    989    ~ 671 / ವ 256 / ಪ 41 / ಚ 13 / ಧ 6
cluster   65 len    112    ಈ 112
cluster   70 len    142    ~ 142
cluster   59 len    364    ದ 288 / ಡ 74 / ಥ 2
cluster   61 len    462    ~ 462
cluster   66 len   1016    ~ 665 / ಮ 351
cluster   71 len     59    ~ 59
cluster   75 len    251    ~ 250 / ಘ 1
cluster   72 len    106    ಒ 72 / ಓ 31 / ೬ 3
cluster   62 len    666    ~ 585 / ಮ 81
cluster   67 len    135    ~ 135
cluster   73 len    565    ~ 565
cluster   76 len    585    ~ 583 / ಊ 2
cluster   63 len    356    ಮ 229 / ~ 127
cluster   68 len    527    ಯ 446 / ~ 79 / ಋ 2
cluster   77 len    601    ~ 596 / ಊ 5
cluster   69 len    568    ~ 474 / ಉ 94
cluster   64 len    447    ಯ 437 / ~ 10
cluster   74 len    844    ~ 844
cluster   78 len    694    ~ 677 / ಊ 17
cluster   79 len    320    ~ 319 / ು 1

writing
+ ocropus-db predict -m boxdata.cmodel boxdata.h5
62 41735 0.148556367557
+ convert kan/g01.tif page.bin.png

+ ocropus-gpageseg page.bin.png
page.bin.png
computing segmentation
computing column separators
computing lines
propagating labels
spreading labels
number of lines 80

finding reading order
writing lines
    79 page.bin.png 31.5 80
+ ocropus-lattices -m boxdata.cmodel page/010001.bin.png page/010002.bin.png page/010003.bin.png page/010004.bin.png page/010005.bin.png page/010006.bin.png page/010007.bin.png page/010008.bin.png page/010009.bin.png page/01000a.bin.png page/01000b.bin.png page/01000c.bin.png page/01000d.bin.png page/01000e.bin.png page/01000f.bin.png page/010010.bin.png page/010011.bin.png page/010012.bin.png page/010013.bin.png page/010014.bin.png page/010015.bin.png page/010016.bin.png page/010017.bin.png page/010018.bin.png page/010019.bin.png page/01001a.bin.png page/01001b.bin.png page/01001c.bin.png page/01001d.bin.png page/01001e.bin.png page/01001f.bin.png page/010020.bin.png page/010021.bin.png page/010022.bin.png page/010023.bin.png page/010024.bin.png page/010025.bin.png page/010026.bin.png page/010027.bin.png page/010028.bin.png page/010029.bin.png page/01002a.bin.png page/01002b.bin.png page/01002c.bin.png page/01002d.bin.png page/01002e.bin.png page/01002f.bin.png page/010030.bin.png page/010031.bin.png page/010032.bin.png page/010033.bin.png page/010034.bin.png page/010035.bin.png page/010036.bin.png page/010037.bin.png page/010038.bin.png page/010039.bin.png page/01003a.bin.png page/01003b.bin.png page/01003c.bin.png page/01003d.bin.png page/01003e.bin.png page/01003f.bin.png page/010040.bin.png page/010041.bin.png page/010042.bin.png page/010043.bin.png page/010044.bin.png page/010045.bin.png page/010046.bin.png page/010047.bin.png page/010048.bin.png page/010049.bin.png page/01004a.bin.png page/01004b.bin.png page/01004c.bin.png page/01004d.bin.png page/01004e.bin.png page/01004f.bin.png page/010050.bin.png
loading boxdata.cmodel
got <ocrolib.patrec.LocalCmodel instance at 0x3b52320>
sizemode perchar
loading /usr/local/share/ocropus/en-space.model
got <ocrolib.wmodel.WhitespaceModel instance at 0x3b614d0>
loading /usr/local/share/ocropus/en-mixed.lineest
got <ocrolib.lineest.TrainedLineGeometry instance at 0x3b61560>
segmenter lineseg.DPSegmentLine()
got <ocrolib.lineseg.DPSegmentLine instance at 0x3b61638>
recognizing 80 files
page/010001.bin.png =RAW= ಸಸ(ಹ೨ರ ಮತೕ (1ಂದಎಧಮಘ
page/010002.bin.png =RAW= ಇೕೕವ/1 ಎರಡ: ಕ~)ಗಳನ) ~ೕಳ:ಇಂ::ತ:: ೕಂಂದ:ತಎ ಚೕವರನ) ನಂಬ:ೕೕ~ಂದ: ಬ:ಬಸ:ಾಪ:ಬೕ? ೕಂಂದ:ತಎ ಸಸ(ಟಾ:)ದಬಗಧೕ~ಂದ:
page/010003.bin.png =RAW= ಬಯಸ:ಾಪಯೕ?
page/010004.bin.png =RAW= ವಂದಲೕ::ಬ ಔ~/1 ಸಂಬಂೕ;22ದಂ~, ನ೨ನ: ೕಂಂದ:ತಎ ಎಂದಗೕನ: ಮತೕ ದಬರ: ;:ಜಪ೨!1:ಬಎ ೕಂಂದ: ಎಂಬ ಕ:9ತ: ಚ:ೕಘ22:!ೕೕ:. ಸ೨ರ೨ಂಶ ರಎಪದ! ;
page/010005.bin.png =RAW= ವೕಳ:ೕೕ~ಂದಗ, ೕಂಂದ:ತಎ ಎನ)೯1ದ: ಸನ೨ತನ ಧಮಘ ಮತೕ ಆ ಧಮಘ ಅನ೨~ ಕ೨ಲದ( ~ಅದ: ವಂೕ1 ಸಂಪ೨ದೕ::ಬ ವಬಗಘ ಹ:ಡ:ಕ:೯1ದನ) ಒಳ/:ಂಂ~ಚ.
page/010006.bin.png =RAW= ಆಾ9ಂದ ದಬರ: ವಂೕ1 ಬ:ಬಸ:ತ~ಈಂೕ ಅವರ:~ಅವರ: ದಬವ ರ೨(2ೕ:ಬರ:, ಚ೨~:ಬವರ:, ಜನ೨ಂಗದವರ: ಅಥವ !1ಂಗದವರ: ಎಂಬ ಭೕಧಂ:2)ಚ
page/010007.bin.png =RAW= ;:ಜಪ೨ದ ೕಂಂದ:. ಆ 9ೕ~:ಬ ಂ:ಶ೨ಲ ಮೕ:ಎೕಭ೨ವದ ~ಳ:ವಳ~:ಬ!; ೕಂಂದ:ತಎ ಅನ)೯1ದ: :ೕೕವನದ 9ೕ~ ಆಗ:ಾಪ, ಏ~ಂದಗ :ೕೕವನದ ಮ:ಾ ಕ೨ೕ!ೕಶದ
page/010008.bin.png =RAW= ಹ:ಡ:ಕ೨ಟಢೕ ಟಂದಎ :ೕೕವನದ ಗ:9.
page/010009.bin.png =RAW= ವ:ೕ~ನ ೕಂ(:ೕಯೕ;, ಎ:7 ಇತರ ಕ~ಗಳನ), ೕಂಂದ:ತಎ ಸ!ಹ೨9ದಬ!1ರ:ೕೕ~ಂದ: ಬಯಸ:ಾಪಯೕ ಎಂಬ ಕ~~ೕ9ದಂ~, :::2ಸಲ:
page/01000a.bin.png =RAW= ಸ:ಲಭಢ;:ಸ:ಾಪ. :ೕೕವನದ ಕ೨ೕ!ೕಶ ಮ:~ಅಥವ ವಂೕ1ವನ) ಪಢ:ಬ:೯1ದ೨:2:)ಂದ ಅದನ) ಪಢ:ಬ:ವವಗ/1 ನ೨೯ 1 ಬದ: ಕರಧೕಕ:. 2ಖ೯ 1 ವಬಔ ವ೧ನ22ಕ
page/01000b.bin.png =RAW= ಚ೨)~ದಬದ೨ಗ ಬರ:ವ ನ೨ನತಎದ ಸ೨೯ 1 ತರಬ2)ದ:. ಆಾ9ಂದ ಪೕಹವನ) ಸಧ~ಢಪ೨!1 ಇಟೕ:ಂಳ(೯1ದ: ನಮಲ ಧಮಘ. ಅದರ ಅಥಘ ಬದ:ಕ!1ಕ)!1 ~ಾ:ೕೕಕ:
page/01000c.bin.png =RAW= (ಇಮ1)ಂದ: 9ೕ~ಯ~ಅ~~~ನ)ವ ಸಲ:~!1 ಬದ:ಕ:೯1ದ: !)
page/01000d.bin.png =RAW= :ೕೕವನ :ೕೕವದ ವ:ೕ:ೕ :ೕೕಂ:22ಪ. ಅದ: ಕಕ~~ಂಬ ೭ಂಬಮ. ನ೨ನ: ಒಂದ: ~ೕ:ಂಬನ) ~ನ)!ೕೕ:ಎೕ ಅಥವ ಒಂದ: ಸ!ವನ) ~ನ)!ೕೕ:ಎೕ, ಒ:)ನ!;
page/01000e.bin.png =RAW= ಒಂದ: :ೕೕವವನ) ನ೨ಶಪ:222ದಂ~:ಬೕ. ಎ:7 ಬ/1:ಬ :ೕೕವವಬಔರ! ; ವ೧ನವ ಇತರ :ೕೕಂ:ಗಳ!1ಂತ ::ಾ. ಅವ;:/1 ಒೕ~):ಬದ: ದಬ೯1ದ:, ~:ೕದ: ದಬ೯1ದ:
page/01000f.bin.png =RAW= ಎಂದ: ತ:ಲೕ: ವ೧ಡ:ವ ಶ~ಂ::ಪ. ಅದ: ಅವ;:/1 ಆಂ):ಕ:ಬ ಸ:ತಂಔ ಕವನ;" :ಂ:)ಚ. ಸ!ಗಳ: ~ೕವಲ ಪೕಹ ಮತೕ ಬಹ:ಷಃ ~ರಂ::ಕ ಹಂತದ ಮನಸ/
page/010010.bin.png =RAW= ಣಂಂ~ರಬಹ:ದ:. ~ೕ:ಗಳ: ಚೕಹ ಮತೕ ಭ೨ವೕ:ಗಳ: ಟಾಗಎ ೕ:ಎೕವನ) ಾ?ಪ~ಸ:ವ ಮನಸ/ ಢಎಂ~ರ:೯1ದರ ಯಂ~/1 ~ರಂ::ಕ ಹಂತದ ಬ:ೕ:ಶ
page/010011.bin.png =RAW= ಣಂಂೕ:ರಬಹ:ದ:. ವಬನವ ಪೕಹ ವಬಔವ2), ಮನಸ/ ಮತೕ ಪ:)~ೕ!1ಸ:ವ, ;:ಧಘ:)ಸ:ವ ಮತೕ ಆಂ):ಕ ವಬಡ:ವ ಕ೨ಾಮ :ೕಳವೕ:/1 ಢಎಂೕ:ದ
page/010012.bin.png =RAW= ಬ:ೕ:ಶ~:ಬಮ) ಣಂಂ~ದ)ೕ:. ಅವ;:/1 ದ೧ಪ೨ಗಲಎ ವಒಂರ: ಅವಕ೨ಶಗಳಢ ~ಕತ:ಘಂ ಶಾಂ, ಅಕತ:ಘಂ ಶಾಂ ತಥ೨ ಅ!ಥ೨ ಕತ:ಘಂ ಶಾಂ~ಅವನ:
page/010013.bin.png =RAW= :ೕೕ~ಂದಗ ವಬಡಬಹ:ದ:, ವಬಡೕ:ರಬಹ:ದ: ಮತೕ :ೕೕಗ :)ೕ~:ಬ! ; ವ೧ಡಬಹ:ದ:. ~ೕ:ಗಳ/1 ಮತೕ ಸ!ಗಳ/1 ಆಂ):ಕ:ಬ ಸ:ತಂಔ ಕಂ:ರ:೯1ೕ:2). ಅ೯ 1
page/010014.bin.png =RAW= ಸಹಜ 9ೕ~:ಬ! ; ವ~ಘಸ:ಾಢ. ಹಸ: ಕ"ಂಟದ ಮ:ಂಪ ಕ:ಳತ: ತ೨ನ: ಸ!ಹ೨9ದಬ!1ರ:ೕೕ"ಂೕ ಅಥವ ವ೧ಂಸ೨ಹ೨9ದಬ!1ರ:ೕೕ:ಂೕ ಎಂದ: ಂ:ಚ೨9ಸ:ತ
page/010015.bin.png =RAW= ಕ:ಳತ::ಂಳ(೯1~2). ಅಚೕ 9ೕ~ ಹ:! 1 ಸಹ೨. ಮನ:ಷ8;:/1 ಅಂತಹ ಂ:ವೕ:ೕಸ:ವ ಬ:ೕ:, ಶ~ಇಪ. ಸ!ಗಳ: ಮತೕ ~ೕ:ಗಳ: ತಮಲ 1:ಬಗಳ! ; ಪ೨ಪ
page/010016.bin.png =RAW= ವ೧ಡ:೯1~2),ಏ~ಂದಗ ಅ೯1ಗಳ 1:ಬಗಳ! ; ಇಚ(ಶ~ಇರ:೯1~2). ಆದಗ ಮನ:ಷ8ನ ಂ:ಷ:ಬದ! ; ಕಧ:ಬೕ :ೕೕಗ. ;:ಮ/1 ಈ ಚವಘ:ಬ! ; ಪ೨ಪವನ) ಏ~
page/010017.bin.png =RAW= ತಂಪ ಎಂದ: ;:ಮ/1 ಆ~೩:ಬಘಪ೨ಗಬಹ:ದ:. ಇರ!1, ಂ:ವ9ಸ:!ೕೕ:.
page/010018.bin.png =RAW= ಪ೨ಪ ಎಂಬ:ದ: ಮನ2;ನ ತ:ಮ:ಲಗೕ)ೕ ಢಎರತ: ಮ!ೕನಎ ಅ2). ಈ ತ:ಮ:ಲಗೕ)ೕ ವಂೕ1ಚಢ!1ನ ನಾ ಕದ೧ಣವನ) ಅ:2 ~ಪ:2ಸ:೯1ದ:. ನ೨ನ: ಸ7ವನ)
page/010019.bin.png =RAW= ಸ7ಪ೨/1ೕ ಕ೨ಣಲ: ಮನಸ/ ಶ:ಾಪ೨!1ರ:ೕೕಕ: (ಅಂದಗ ತ:ಮ:ಲಗಳರ:,ಬರದ:). (ಒ?"ಬ:- ಸಹ ದಬರ ಮನಸ/ಗಳ: ಶ:ಾಪ೨!1ಪಯಂೕ ಅವರ: ಅನ:ಾಟತರ:
page/01001a.bin.png =RAW= ಎಂದ: ಢೕಳ:ಾಚ.) ಪ೨ಪವನ) ಢಚೕ ಢ"ಚ(;:ಕಪ೨!1 ಂ:ವ9ಸ:ೕೕ~ಂದಗ ~ಅದ: ಮನಸ/ ಮತೕ ಬ:ೕ:ಶ~ಗಳ~ನ ಅಂತರ. ಬ:ೕ:ಶ~/1 ದ೧೯1ದ: ಸ9,
page/01001b.bin.png =RAW= ದ೧೯1ದ: ತಇ ಎಂದ: /:ಂೕರ:ಾಪ~ಆದರಎ ನಮ/1 ತ~ಂದ: /:ಂೕಾರಎ ಅದ(ೕ ವಬಡ:ವಂ~ ಆಗ:ಾಚ ~ಅಂದಗ ಬ:ೕ:ಶ~ಒಂದ: ವೕಳ:ಾಪ, ಆದಗ
page/01001c.bin.png =RAW= ಬ:ೕ:ಶ~ಢೕಳದಂ~ ~ೕಳ:ೕೕಕ೨ದ ಮನಸ/ ಬಂವದ( ತನ/1 :ೕೕಕ೨ದಂ~ ವ೧ಡ:ಾಪ. ಈ ಅಂತರವೕ ಪ೨ಪ. ಆ 9ೕ~ ವಬ~ದ ನಂತರ ಏೕ:ಎೕ ತಇ ವ೧~ದ ಭ೨ವ
page/01001d.bin.png =RAW= ವಒಂಡ:ಾಚ, ಏ~ಂದಗ ಬ:::ಶ~ತ೨ನ: :ಂೕ~ಾರಎ ಸ:ಮಲ;:ರ:೯1~2), ಅದ: ನ೨ನ: ಅದ: ತಇ ಎಂದ: ವೕಳಪ. ಆದರಎ ಅದನ) ಏ~ ವಬ~ಪ? ಎಂದ:
page/01001e.bin.png =RAW= ಚ:ಚೕಾ:ೕೕ ಇರ:ಾಪ. ಮನ2:ನ ಶ೨ಂ~ ಢಎೕದ ವ:ೕ: ೕ ಮನ:ಾ;:/1 ನರಕದ ಅನ:ಭವಪ೨ಗ:ಾಪ. ಮನ:ಷ8 ಪ೨ಪಕ)!1 ~:ಸ2)ಡ:೯1~2), ಪ೨ಪ~ಂದ
page/01001f.bin.png =RAW= ~:ಸ2)ಡ:ತ~ೕ:. ಅದರ ಕ:9ತ: :ೕಂ~22.
page/010020.bin.png =RAW= ಎ:7 ಯಂೕಗಗಳಎ, ;:ೕ೯ 1 ಾಾಪ೨!1 ಂ:ಮೕ1ಘ22ದಗ, ಚೕಹ, ಮನಸ/ ಮತೕ ಬ:ೕ:ಶ~ಂಬನ) ಒಗ;1~ಸ:ವ 1ೕಬಗೕ)ೕ ಆ!1ವ. ಒ:/ ಯಂೕ!1/1 ~ಆತ ಏನ:
page/010021.bin.png =RAW= ಯಂೕ:ೕಸ:ತ~ೕ:, ಏನ: ವಬತನ೨ಡ:ತ~ೕ: ಮತೕ ಏನ: ವಬಡ:ತ~ೕ: ಎ2(೯ಂ ಒಂಪೕ 9ೕ~:ಬದ೨!1ರ:ಾಪ, ಢಎಂದ೨ೕ:~ಂ::ರ:ಾಪ (ಮನಸ೨ ~ಪ೨ಚ೨ ~ಕಮಘಚ೨).
page/010022.bin.png =RAW= ನಮಲ ಂ:ಷ:ಬದ!;, ನ೨೯ 1 ಯಂೕ:ೕಸ:೯1ಚೕ ಒಂದ:, ಆದಗ ನ೨೯ 1 ಏನ: ಯಂೕ:ೕಸ:!ೕಢಯಂೕ ಅದನ) ಢೕಳ:ವ ಧ":ಬಘಂ:ರ:೯1~2), ನಮಲ ತ:ಟಗಳ: ನ೨೯1
page/010023.bin.png =RAW= ಯಂೕ:ೕ22ಾಕಕಂತ ::ಾಪ೨!1 :ೕೕಗೕನ~~ೕ ಢೕಳ:ಾವ; ನ೨೯ 1 ಆ~ದ ವಬತ: ಮತೕ ನಂತರ ನ೨೯ 1 ವಬಡ:ವ ~ಲಸ, ಮ/ ಅ~ಾತ(ಸ!~ಎಲ9 ಸಮನಎಯ~
page/010024.bin.png =RAW= ಇ2)ಢೕ ಇ2). ನ೨೯ 1 /:ಂಂದಲದ :ೕೕವನ ನಢಸ:!ೕ!ೕಢ. ನ೨೯ 1 ಇೕ:):/ರನ) ವಂೕಸ ವ೧ಡ:೯1ದರ ಪಎ~/1 ಇಮ) ~ಂೕಚ;:ೕ:ಬಢಂದಗ ನಮ ಲನ) ನ೨ವೕ
page/010025.bin.png =RAW= ವಂೕಸಪ~22"ಂಳ(ೕಾೕಢ, ಮತಾ( ~ಂೕಚ;:ೕ:ಬಢಂದಗ ನಮ/1 ಅದರ ಅ9ಢೕ ಬರ:೯1~2)!
page/010026.bin.png =RAW= ಈಗ ಒಂದ: ಹ:! 1 :ೕೕಧದಬ~~ಂದಗ ಅದ: ಪ೨ಪ ವ೧~ದಂ~ ಆಗ:೯1~2). ಏ~ಂದಗ ಅದರ ಬ:ೕ:ಶ~~ರಂ::ಕ ಹಂತದ!;ದ(, ಅದ: "ಂಲ/ವ ಮ:ಾ
page/010027.bin.png =RAW= :ಂ2):ೕೕ"ಂೕ ಅಥಪ೨ :ೕೕಡ೯ಂೕ, ನ೨ನ: ವ೧ಂಸ೨ಹ೨9ದಬ!1ರ:ೕೕ"ಂೕ ಅಥವ ಸಸ(ಹ೨9ದಬ!1ರ:ೕೕ:ಂೕ ~ಎಂದ: ಂ:ಮ~ಘಸ:ತ~ಕಎರ:೯1~2). ಅದ~ಕ
page/010028.bin.png =RAW= ಹ22ಪ೨ದ೨ಗ, ~ಕ~~ಕ :ೕೕ~~ ಈಢೕ922:ಂೕ~ಲ: ತಾ :ೕೕಧಯನ) :ಂಲ/ಾಪ ಮತೕ ~ನ)ಾಪ ಮತೕ ಅಗ7ಂ:ರ:ವಷೕ ~ಂದ ನಂತರ ಕ೨ೕ;ದ:ದನ) ::ಡ:ಾಚ.
page/010029.bin.png =RAW= ಅದ~ಕ ಢಎ:(:ಬಕತನ ಇ2). ಅದ: ಅದರ ಸಎಧಮಘ. ಅದ: ಸ:ಂದರ ಪ:)ಸರ ಾವ2)ಯನ) ಪ೨~ಸ:ಾಪ. ಮನ:ಾೕ:ಎ:/ ವ೧ಔ ತಾ ಣಂಾ:ಬಕತನೕ:ಂದ ಪ:)ಸರ
page/01002a.bin.png =RAW= ಾವ2)ಯನ) ನ೨ಶ ವಬಡ:ೕದ)ೕ:. ನ೨ನ: ಸಸ(ಹ೨:)ದಬ!1ರ:ೕೕ~ ಅಥವ ವ೧ಂ2ಣಹ೨:)ದಬಗ:ೕೕ~? ಎಂಬ ಔ~ಅವೕ:ಎ:/;:ಂದ ವಬಔ ಬರಲ: 2ಖಾ. ಆ ಕ
page/01002b.bin.png =RAW= ಏ~ ಬರ:ಾಪ?ಏ~ಂದಗ ಅವ;:/1 ಂ:ಢೕಚೕ: ವಬಡ:ವ ಬ:ೕ:ಶ~ಇಚ ಮತೕ ಅವನ: ತಾ ಣಂಾ ತ:ಂ::22:ಂೕ~)ಲ: ಇತರರನ) ೕ:ಂೕಂ::ಸಲ:
page/01002c.bin.png =RAW= ಬಯಸ~ರ:೯1ದ9ಂದ. ಆತ;:/1 ೕ:ಎೕ೯ 1 ಎಂದಗ ಏೕ:ಂದ: /:ಂೕಪ, ಏ~ಂದಗ ಆತ ಇತರರ: ಅವ;:/1 ೕ:ಎೕ೯1ಂಟ:ವ೧ಡ:,ಬರಚಂದ: ಬಯಸ:ತ~ೕ:. ಸ1ಗಳಎ ಸಹ
page/01002d.bin.png =RAW= *** FAILED (no bestpath) ***
page/01002e.bin.png =RAW= ಅದ: ಅತ(ಾಮ, ಆದಗ ಅದ: 2ಣಾಂ:2). :ೕೕವನ :ೕೕವದ ವ:ೕ:ೕ :ೕೕಂ:22ಪ ~ಅದ: ಔಕ~~:ಬ ;::ಬಮ. ಂ:ಢೕಚನ೨ಶ~:ಬ:ೕ~) ಮನ:ಷ8ನ೨!1 ನಾ ಪ೨ಔಢಂದಗ
page/01002f.bin.png =RAW= ನ೨ನ: ಬದ: ಕರ:ವ ಸಲ:ಪ೨!1 ಕಕ~~/1 ಕ;:ಾ ಹ೨;:ದ೧ಗ:ವಂ~ ೕ:ಎೕ:2:ಂಳ(೯1ದ:. ಸ!ಗಳ ೕ:ಎೕಂ:ನ ಬ/) 1 ನನ/1 ಾಾ ~ಳ:ವಳ~ಂ::2). ಆ:29ಂದ ಬದ:ಕ!1ಕ)!1
page/010030.bin.png =RAW= ~ನ)೯1ದ: ಮತೕ ~ಾ!1ಕ)!1 ಬದ:ಕ~ರ:೯1ದ: ;:ಣ೨ಘ:ಬಕ ಅಂಶಪ೨ಗ:ಾಚ.
page/010031.bin.png =RAW= ಭಗವಓ)ೕ~:ಬ!; ಕ~ಾ ಒೕ ವೕಳ:೯1ಚೕ~ಂದಗ ಒ:/ ಸ೨ಧಕ (ವಂೕ1ಕ)!1 ಹಂಬ!1ಸ:ವವನ:) ಎ:7 :ೕೕಂ:ಗಳ ಂ:ಷ:ಬದ! ; ಅನ:ಕಂಪ ಢಎಂ~ರ:ೕೕ~ಂದ:
page/010032.bin.png =RAW= (ಸವಘ ಭಎತ ಟ~ೕರಥ೨ಃ). ಆಧ(೩ಕ :ೕಳವೕ:/1ಯ~ಬ:ೕ:ಶ~ಹ9ತಪ೨ಗ:ತ~ಣಂೕಗ:ಾಪ~~ೕ3 ಬ:ೕ: ಸಎ3 ಬ:ೕ:ದಬಗ:ತ~ಣಂೕಗ:ಾಪ. ಅಂದಗ,
page/010033.bin.png =RAW= ಮನಸ/ ತಳಮಳರೕಂತ, ಶ೨ಂತ ಮತೕ ಸಎ~;::ಬಂಃತಪ೨ಗ:ಾಚ. ಇತರರ ೕ:ಎೕ೯1ಗಳ/1 ಾಂ~ಸ:ವ ಗ:ಣ ಸಹ :ೕೕ):ಬ:ತ~ಢಎೕಗ:ಾಪ. ಆಾ9ಂದ
page/010034.bin.png =RAW= ಸಸ(ಹ೨9ದಬಗ:೯1ದ: ಸಎ?.
page/010035.bin.png =RAW= ಸ೨ಂಔದ೨ಂ::ಕ ವಬಂಸ೨ಟಾ9ಗಳಎ ಸಹ ನ೨ಂ::ಗಳ: ಮತೕ :ೕಕ:ಕಗಳ: ಅಥವ ಇತರ ವಬನವ ~ೕ:ಗಳನ) ~ಾಲ: ಇಾಪಡ:೯1~2(! ಏ~? ಎ~ೕ ಆದರಎ
page/010036.bin.png =RAW= ವ೧ಂಸ ಅಂದಗ ವ೧ಂಸವೕ! ಆದಗ ಬಳ~ಂ::ಂದ ಅನ:ಕಂಪ ಧೕ)~ರ:ಾಪ. ವ೧ನವನಾೕ ಣಂೕಲ:ವ ಆದಗ ~ರಂ::ಕ ಬ:ೕ:ಶ~ಢಎಂ~ರ:ವ ಹಲ೯ 1 ಎರಡ:
page/010037.bin.png =RAW= ಕ೨!1ನ ~ೕ:ಗಳಢ. ಅ೯ 1 ~ೕ:ಗಳಂ~ ವ~ಘಸ:ಾವ. ಆದಗ ಂ:ಕಸನಪ೨ಗ:ತ~ಣಂೕದಂ~ ಅ೯1ಗಳ ಬ:ೕ:ಶ~:ೕೕ):ಬ:ತ~ಣಂೕಗ:ಾಪ ಎಂಬ:ದನ) ಗಮ;:22ದಗ
page/010038.bin.png =RAW= ಔಕ~~ಂ::ಂದ ಎಷೕ :ೕೕ:ಂೕ ಅಾನ) ಪಢದ: ಸಸ(ಹ೨9ದ೧!1ರ:೯1ದ: ಸಎ?ಢ;:ಸ:ಾಪ. ~ೕವಲ ನ೨ಲ/1:ಬ ಚಪಲವನ) ತ~ೕ: ಪ~ಸ:ವ ಸಲ:ಪ೨!1 ದ೧೯1ಪೕ
page/010039.bin.png =RAW= :ೕೕವಂ:ಧಗಳನ) ಘ೨22/:ಂಳಸ:,ಬರದ:.
page/01003a.bin.png =RAW= ಟಂದಎ ಒ:/ ಸಸ(ಹ೨9ದಬ!1ರ:ೕೕ~ೕ? ಈ ಕ~;:ಮಲ ಮನ2;ನ:7ಗ:1ೕ ಕ೨ಾಂ:22ರಬಹ:ದ:, ;:ಮ ಲ ಢಎಾಯನ) ತ~ೕ:ಪ~ಸಲ: ಇತರ :ೕೕವಗಳನ)
page/01003b.bin.png =RAW= ೕ:ಎೕಂ::ಸ:,ಬರಚಂಬ ಸಎೕ~;:ಮ/1 ಇಪ:ಬಂದ: ಭ೨ಂ:ಸ:ವ. ಟಾ!1ಾ! ; ;:ೕ೯ 1 ವಬಂಸ೨ಹ೨ರ~ಂದ ದಎರಂ:ರ:೯1ದ: ಕ೨ಾಮ ಮತೕ ಆಗ ;:ೕ೯ 1 ;:ಮಲ
page/01003c.bin.png =RAW= ಬ:ೕ:ಶ~ಯಂಂ~/1 ಶ೨ಂತಪ೨!1ರಬಹ:ದ:. ;:ೕ೯ 1 ಸಎ3ಭ೨ವೕ:ಯ:ೕ~ವರ೨!1ಾಗ, ;:ಮಲ ಬ:ೕ:ಶ~ಢೕಳ:೯1ಪೕ ಒಂದ೨ದಗ ಮತೕ ;:ಮಲ ಮನಸ/ ~ಳ
page/01003d.bin.png =RAW= ಸಂ4ಂೕಷಗಳನ) ಬ:ಬ22 ಧೕಗ ದ೨9:ಬ! ; ;:ಚೕಘ~22ದಗ ಮತೕ ;:ೕ೯ 1 ;:ಮ ಲ ಬ:ೕ: ವೕಳದ ದ೨9 ::ಟೕ ನಢದಗ ;:ೕ೯ 1 ಪ೨ಪ ವಬ~ದಂ~ ಆಗ:ಾಪ. ಕ~ಾ
page/01003e.bin.png =RAW= ವೕಳದಂ~ ಅದ: ;:ಮಲಸಎಧಮಘದ ಂ:ರ::!ಪ೨ಗ:ಾಚ. ಪಎ~/1, ಈಗ ಸ೨ಂಔದ೨ಂ::ಕ ವಬಂಸ೨ಟಾ:)ಗಳಎ ಸಸ(ಟಾರ ಆಂ):ಕ ವ೧:2:ಂಳ(ೕದ)ಗ, ಇತರ ~ೕ:ಗಳ
page/01003f.bin.png =RAW= ಬ/) 1 ಅನ:ಕಂಪ~ಂದ2), ಆದಗ ಅದ: ತಮಲ ಆಈಂೕ)~ಕ ಒೕ~)ಂಬದ2)ವಂಬ ಕ೨ರಣಕ)!1.
page/010040.bin.png =RAW= ನ೨ನ೨ಗ:ೕೕ ~ಳ22:!ೕೕ:, ೕಂಂದ:ತಎದ! ; ೕಂೕ/1 ವಬ~ಮತೕ ೕಂೕ/1 ವ೧ಡಧೕ~ಎಂಬ:~2), ಆದಗ ;:ೕ೯ 1 ;:ಮ ಲಸಎಂತದ ವಬಡಬಹ:ದ೨ದ ಮತೕ
page/010041.bin.png =RAW= ವ೧ಡ:ಬರದ ಸಂಗ~ಗಳನ) ;:ಮಲ ಬ:ೕ:ಶ~ಯ ವೕೕಲ~, ಸಂಸಲ~, ~1ಣ ಮತೕ ಬದ: ಕನ ~ಥಂ::ಕ ಗ:9 ಆಧ922 ;:ಧಘ922:ಂಕ~). ;:ಮಲಸಎಧಮಘವನ) ;:ೕ೯1
page/010042.bin.png =RAW= ಪ೨!1ಸ:೯1ದ9ಂದ ;:ೕ೯ 1 ;:ವಂಲಂ~/1 ಸವ೧ಧ೨ನ~ಂ~ರ:೯1ದನ) ಕಂಡ::ಂಳ(ಂ:9. ಅದನ) :ೕೕಗ:ಬವರ: ;:ಣಘಂ::ಸ:೯1ದ2), ;:ೕಢೕ ;:ಣಘಂ::22:ಂಳ(೯1ದ:.
page/010043.bin.png =RAW= ;:ೕ೯ 1 ತಳಮಳ~ಕ ಒಳ/ಣ~ಗಂದಗ ಅದರ ಅಥಘ ;:ೕ೯ 1 ಅದ9ಂದ ಮನ2:ನ ಶ೨ಂ~ ಕೕ)ದ::ಂಳ(ೕರ:ಂ:ಗಂದ: ಮತೕ ಅಪೕ ಪ೨ಪ. ;:ಮಲನ) ;:ೕಢೕ, ;:ೕ೯ 1 ~ನ)ವ
page/010044.bin.png =RAW= :ಂೕೕ; ಅಥವ ಹಸ: ಎಂ~ಟೕ:ಂಕ~). ;:ೕ೯ 1 ;:ಮ ಲನ) ~ನ)ವವರನ) ಸಸ(ಹ೨9ಗಳ೨ಗಲ: ಮತೕ ;:ಮಲ :ೕೕವ ಕ೨ೕ;ಸಲ: ಢೕಳ:೯1~2)ಢೕ? ;:ೕ೯ 1 ~ೕ:ಂಬನ)
page/010045.bin.png =RAW= ;:ೕವೕ :ಂಲ/!2)ಢಂದಎ ಮತೕ ;:ೕ೯ 1 ~ಂದರಎ, ~ಾ~ಾಈಂ ~ೕ:ಗಳ: :ಂ:ೕದಬಗ:೯1೯ 1 ಎಂದ: ಢೕಳ~9. ;:ೕ೯ 1 ~ಾ~ಾಗ ಒಂದ: ~ೕ: ಕ೨ೕ;ಂಬ:ಾಪ.
page/010046.bin.png =RAW= ಅಪೕ :ೕೕ~~ ಮತೕ ೯ಂಗ"~. ನ೨ನ: ಸಎತಃ ಕ~:ಬ~ರಬಹ:ದ:, ಆದಗ ಕಾ ವಬಲನ) ಅದ: ಕಾವಬಲ: ಎಂದ: /:ಂೕದ( "ಂಂಡಗ, ಅದ: ಅಪರ೨ಧ! ಅ2)ವೕ?
page/010047.bin.png =RAW= ಈಗ ಕ~ತಕ ವ೧ಂಸಗಳಎ ಪಎಗ:ಬ:ಾಢ~ಆ:29ಂದ ಸಾ ವ೧ಂಸ~ಕೕ~ ಹಂಬ!1ಸ:ೕೕಕ: ?;:ಮ ಲ ಣಂಾ:ಬನ) ಸಾ ~ೕ::ಬ ಸಲಶ೨ನಪ೨ಗಧೕ~ಂದ: ಏ~
page/010048.bin.png =RAW= ಬಯಸ:ಂ:9?
page/010049.bin.png =RAW= ೕಂಂದ:ತಎದ ದ~"ೕಂ::ಂದ ೕ:ಂೕ~ದಗ ಅದ: ಇದ(:7 ಗಮ;:ಸ:೯1ಚೕ ಇ2). ಅದ: ಬ:ಬಸ:೯1ಪೕೕ:ಂದಗ ಸನ೨ತನ ಧಮಘದ ದ೨9:ಬ! ; ನಢ:ಬ:ೕೕ~ಂದಕ!.
page/01004a.bin.png =RAW= ಆಾ9ಂದ ;:ಮ ಲ ಮನ)ನ) ಶ೨ಂತಪ೨!1 ಮತೕ ತಳಮಳರಟತಪ೨!1ರ:ವಂ~ ~ಂೕ~:ಂೕ~)ಲ: ಏನ: ವಬಡ:ೕೕ"ಂೕ ಅದನ) ವಬ~. ಮನ2;ನ ಶ:ಾ~:ಬೕ ವಂೕ1
page/01004b.bin.png =RAW= ಪಢ:ಬ:ವ ದ೨9 ಮತೕ ಅದ: ವಬನವ :ೕೕವನದ ಗ:9. ನ೨೯ 1 ಇಾಪಟೕ ವ೧~ದ 1:ಬಗಳಂದ ಈ ಸಂಸ೨ರದ ಜಂಚ೨ಟ~ಕ 22ಲ: ಕದ(, ನ೨೯ 1 ಇಾಪಟೕ ವಬಡ:ವ
page/01004c.bin.png =RAW= ಸ೨ಧೕ:ಯಂಂಚೕ ಅದ9ಂದ ಢಎರಬರಲ: ಸ೨ಾ. ಭಗವಂತ ನಮ/1 ಇದನ) ಸ೨:;ಸಲ: ಬ:::ಶ~:ಂ:)ದ)ೕ:. ಕ~9 ~ಎೕಏ22ದ)ೕ: ~ಪರಧಮಘಕಕಂತ ಸಎಧಮಘದ
page/01004d.bin.png =RAW= ಸ೨:ಬ:೯1ದ: ವ:ೕಲ:. ಸಎಧಮಘ(ಅದ: ;:ೕ೯ 1 ದಬವ ಚ೨~ ಅಥವ ಧಮಘ~ಕ ~ೕ9ರ:ಂ:9 ಎಂಬ:ದ2)) ಎಂದಗ ಅಂ~ಮಪ೨!1 ;:ಮಲ ಬ:ೕ:ಶ~ಅಥವ ಅಂತಃಸ೨:
page/01004e.bin.png =RAW= ಏನ: ಢೕಳ:ಾಚಯಂೕ ಅದ:. ಏ~ಂದಗ,1:ಬ ವಬ~ದ ನಂತರ ;:ಮ ಲ ಮನ2(ೕ ;:ಮಲ ಬ:ೕ:ಶ~ಯಂಂ~/1 :ೕಕಕ ಇ7ಥಘ ವಬ~:ಂೕ~:ೕೕಕರ:೯1ದ:.
page/01004f.bin.png =RAW= ;:ಮ/1 ;:ೕವೕ ಒಂದ: ಕ೨ಪಕ೨ರ ವಬ~:ಂಕ~)9 ~;:ಮ/1 ದಬ೯1ದ: ಅಗ7ಂ:ಚಯಂೕ ಅದನ) ~;) ಮತೕ ದಬ೯1ದ: ;:ಜಪ೨!1:ಬಎ ಅಗ7ಂ:2)೯ಂೕ ಅದನ)
page/010050.bin.png =RAW= ::ಟೕ::~. ಆ 9ೕ~:ಬ! ; ;:ಮಲ :ೕೕವನವನ) ಸರಳ, ಶ೨ಂತ ಮತೕ ಕ~ವ: :ಂ:ೕ2ರ೨:- ಮಂಂ~/1 ಆಈಂೕ)ವಂತಪ೨!1 ವಬ~:ಂಕ~).
+ set +x

================================================================
=== You now have a simple Fraktur model, boxdata.cmodel.
===
=== This is only an initial model.  It isn't using any baseline
=== information.  The next training step consists of retraining
=== the model by aligning text lines with ground truth (see the
=== example in uw3-500).
===
=== In addition, you probably should construct a language model.
=== You can do that with ocropus-ngraphs.
================================================================
dell@ubuntu:~/ocropus_6.0/kannada-boxes$

 

===========================END ====================

Tom

Aug 26, 2012, 7:47:57 AM
to ocr...@googlegroups.com
Hi,

I'm not sure what the source of that problem is.  Can you submit a bug report, please?  I'll have to see whether I can reproduce this.

Tom

Sriranga(78yrsold)

Aug 26, 2012, 9:25:36 AM
to ocr...@googlegroups.com
Which of my emails, and which reported problem, are you referring to? It is difficult to tell which problem you are looking into. I had in fact attached the relevant documents, but unfortunately they could not be sent because the mailer daemon restricts messages to 5 MB in total. Hence I stopped sending attachments to the forum. Even under "Issues" there is no provision for users to upload attached files. I had preserved everything until yesterday and deleted it all that night.
In future I shall submit any attached files directly to you, to be handled as you see fit. I tried to recreate the problem but could not recollect what I had done, owing to my old age. Sorry for disappointing you. Today I wanted to forward the relevant files generated for Kannada, but the mailer withheld them. Anyway, I am gradually picking things up, by the Grace of the Supreme Lord.
With warmest regards,
-sriranga(79yrs)


zdenko podobny

Aug 28, 2012, 4:44:10 PM
to ocr...@googlegroups.com
Hi,

I tried it on openSUSE 12.1 and Mandriva Linux 2011.0.

Installation on openSUSE went fine, without problems.
Installation on Mandriva Linux ran into a problem:

$ sudo python setup.py install
[sudo] password for zdeno: 
No protocol specified
/usr/lib64/python2.7/site-packages/gtk-2.0/gtk/__init__.py:57: GtkWarning: could not open display
  warnings.warn(str(e), _gtk.Warning)
/usr/lib64/python2.7/site-packages/matplotlib/backends/backend_gtk.py:49: GtkWarning: IA__gdk_cursor_new_for_display: assertion `GDK_IS_DISPLAY (display)' failed
  cursors.MOVE          : gdk.Cursor(gdk.FLEUR),
Traceback (most recent call last):
  File "setup.py", line 7, in <module>
    from ocrolib import default
  File "/mnt/vg_u02/download/ocropus/ocropy/ocrolib/__init__.py", line 12, in <module>
    from common import *
  File "/mnt/vg_u02/download/ocropus/ocropy/ocrolib/common.py", line 11, in <module>
    import improc
  File "/mnt/vg_u02/download/ocropus/ocropy/ocrolib/improc.py", line 8, in <module>
    import sl
  File "/mnt/vg_u02/download/ocropus/ocropy/ocrolib/sl.py", line 6, in <module>
    from pylab import mean
  File "/usr/lib64/python2.7/site-packages/pylab.py", line 1, in <module>
    from matplotlib.pylab import *
  File "/usr/lib64/python2.7/site-packages/matplotlib/pylab.py", line 264, in <module>
    from matplotlib.pyplot import *
  File "/usr/lib64/python2.7/site-packages/matplotlib/pyplot.py", line 95, in <module>
    new_figure_manager, draw_if_interactive, _show = pylab_setup()
  File "/usr/lib64/python2.7/site-packages/matplotlib/backends/__init__.py", line 25, in pylab_setup
    globals(),locals(),[backend_name])
  File "/usr/lib64/python2.7/site-packages/matplotlib/backends/backend_gtkagg.py", line 10, in <module>
    from matplotlib.backends.backend_gtk import gtk, FigureManagerGTK, FigureCanvasGTK,\
  File "/usr/lib64/python2.7/site-packages/matplotlib/backends/backend_gtk.py", line 49, in <module>
    cursors.MOVE          : gdk.Cursor(gdk.FLEUR),
RuntimeError: could not create GdkCursor object

If I understand correctly, this is because root (or any other user) cannot open a (GTK) window under the "zdeno" login. The solution is to add the following lines[1] to setup.py before the line 'from ocrolib import default':

import matplotlib
matplotlib.use('Agg')


'./run-test' produced this error (openSUSE and Mandriva):
...
  File "/usr/local/lib/python2.7/site-packages/ocrolib/morph.py", line 50, in r_dilation
    return filters.maximum_filter(image,size,origin=origin)
NameError: global name 'filters' is not defined

I fixed it by adding "from scipy.ndimage import filters" to ocrolib/morph.py.

On Mandriva I got a fatal error due to a missing "omp.h", so I needed to install lib64gomp-devel to fix it (on openSUSE this file is part of the gcc46 package).

Then I got another error (on Mandriva):
...
+ ocropus-ngraphs 'temp/????/??????.lattice'
Traceback (most recent call last):
  File "./ocropus-ngraphs", line 234, in <module>
    args.lmodel = ocrolib.findfile(args.lmodel)
  File "/mnt/vg_u02/download/ocropus/ocropy/ocrolib/toplevel.py", line 204, in argument_checks
    result = f(*args,**kw)
  File "/mnt/vg_u02/download/ocropus/ocropy/ocrolib/common.py", line 462, in findfile
    raise IOError("file '"+path+"' not found in . or /usr/local/share/ocropus/")
IOError: file '/usr/local/share/ocropus/en-mixed-4.ngraphs' not found in . or /usr/local/share/ocropus/

This is strange, because all files are installed into /usr on Mandriva (on openSUSE they are installed into /usr/local), but ./run-test expects the files in /usr/local... I created a symlink as a workaround. Then ./run-test works OK on Mandriva.


I hope this helps somebody. And I would suggest adding some kind of integrity check for downloaded models ;-)
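
A minimal sketch of what such an integrity check could look like, assuming a SHA-256 digest were published alongside each model file. Nothing in this thread indicates that OCRopus 0.6 ships such digests, and the function names below are hypothetical:

```python
import hashlib

def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as stream:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: stream.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_model(path, expected_hex):
    """Return True if the file at `path` matches the expected digest."""
    return sha256_of(path) == expected_hex.lower()
```

A truncated or corrupted download of a file such as en-mixed-4.ngraphs would then make `verify_model` return False, provided a trusted digest for that file were available to compare against.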


Best regards,
--
Zdenko

Tom

Sep 4, 2012, 7:59:36 PM
to ocr...@googlegroups.com
> If I understand correctly, this is because root (or any other user) cannot open a (GTK) window under the "zdeno" login. The solution is to add the following lines[1] to setup.py before the line 'from ocrolib import default':
>
> import matplotlib
> matplotlib.use('Agg')

I think this makes some of the debugging features fail.  Some of the scripts have workarounds that use Agg only if there is no DISPLAY variable set.  If you submit a bug report, I can try to make that usage more consistent. 
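
The conditional workaround mentioned here could be sketched roughly as follows. This is an illustration only, not the code from the OCRopus scripts; `select_backend` and `configure_matplotlib` are hypothetical helper names:

```python
import os

def select_backend(environ=None):
    """Return "Agg" when no X display is configured, else None (keep the default)."""
    if environ is None:
        environ = os.environ
    return None if environ.get("DISPLAY") else "Agg"

def configure_matplotlib(environ=None):
    """Switch matplotlib to the non-interactive Agg backend if needed.

    This has to run before pylab or matplotlib.pyplot is imported.
    """
    backend = select_backend(environ)
    if backend is not None:
        import matplotlib  # imported lazily so select_backend has no dependencies
        matplotlib.use(backend)
    return backend
```

Note that a check on DISPLAY alone would probably not catch the situation in the traceback above, where the "No protocol specified" message suggests the variable was still set under sudo but the X server refused the connection.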



> './run-test' produced this error (openSUSE and Mandriva):
> ...
>   File "/usr/local/lib/python2.7/site-packages/ocrolib/morph.py", line 50, in r_dilation
>     return filters.maximum_filter(image,size,origin=origin)
> NameError: global name 'filters' is not defined
>
> I fixed it by adding "from scipy.ndimage import filters" to ocrolib/morph.py.

I think that's fixed.

> This is strange, because all files are installed into /usr on Mandriva (on openSUSE they are installed into /usr/local), but ./run-test expects the files in /usr/local... I created a symlink as a workaround. Then ./run-test works OK on Mandriva.
>
> I hope this helps somebody. And I would suggest adding some kind of integrity check for downloaded models ;-)

I don't know how to get information about a platform's preferred install paths back to OCRopus so that it can find its files.  That's why, right now, there are hardcoded default install paths, plus an option to override them via environment variables.  If you know how to do this better, please let me know.
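
The lookup scheme described here (a hardcoded default directory that an environment variable can override) might look roughly like this. The variable name `OCROPUS_DATA` and the function name are assumptions made for illustration; the actual names in ocrolib may differ:

```python
import os

DEFAULT_DATA_DIR = "/usr/local/share/ocropus/"

def find_data_file(name, environ=None):
    """Look for `name` in the current directory, then in the data directory.

    The data directory comes from OCROPUS_DATA (assumed variable name)
    if it is set, otherwise from the hardcoded default.
    """
    if environ is None:
        environ = os.environ
    data_dir = environ.get("OCROPUS_DATA") or DEFAULT_DATA_DIR
    for directory in (".", data_dir):
        candidate = os.path.join(directory, name)
        if os.path.exists(candidate):
            return candidate
    raise IOError("file '%s' not found in . or %s" % (name, data_dir))
```

With a scheme like this, pointing the variable at wherever the distribution actually installed the models would be an alternative to creating a symlink.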

Tom

zdenko podobny

Sep 17, 2012, 10:05:41 AM
to ocr...@googlegroups.com
On Wed, Sep 5, 2012 at 1:59 AM, Tom <tmb...@gmail.com> wrote:
>> If I understand correctly, this is because root (or any other user) cannot open a (GTK) window under the "zdeno" login. The solution is to add the following lines[1] to setup.py before the line 'from ocrolib import default':
>>
>> import matplotlib
>> matplotlib.use('Agg')
>
> I think this makes some of the debugging features fail.  Some of the scripts have workarounds that use Agg only if there is no DISPLAY variable set.  If you submit a bug report, I can try to make that usage more consistent.

OK. I created issue 365.




>> './run-test' produced this error (openSUSE and Mandriva):
>> ...
>>   File "/usr/local/lib/python2.7/site-packages/ocrolib/morph.py", line 50, in r_dilation
>>     return filters.maximum_filter(image,size,origin=origin)
>> NameError: global name 'filters' is not defined
>>
>> I fixed it by adding "from scipy.ndimage import filters" to ocrolib/morph.py.
>
> I think that's fixed.

Yes, thanks.

>> This is strange, because all files are installed into /usr on Mandriva (on openSUSE they are installed into /usr/local), but ./run-test expects the files in /usr/local... I created a symlink as a workaround. Then ./run-test works OK on Mandriva.
>>
>> I hope this helps somebody. And I would suggest adding some kind of integrity check for downloaded models ;-)
>
> I don't know how to get information about a platform's preferred install paths back to OCRopus so that it can find its files.  That's why, right now, there are hardcoded default install paths, plus an option to override them via environment variables.  If you know how to do this better, please let me know.
>
> Tom


Tom

Sep 25, 2012, 4:21:31 AM
to ocr...@googlegroups.com
I should add that, as a workaround, you can always just create an X server with "tightvncserver" or "Xvfb" and set your DISPLAY environment variable to that, even on a system without a display.
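
As a rough sketch of this workaround driven from Python, assuming the `Xvfb` binary is installed and display number :99 is free (both are assumptions, and the helper names are hypothetical):

```python
import os
import subprocess

def xvfb_command(display=":99", screen="1280x1024x24"):
    """Build the command line for a virtual X server on `display`."""
    return ["Xvfb", display, "-screen", "0", screen]

def start_virtual_display(display=":99"):
    """Start Xvfb in the background and point DISPLAY at it.

    Returns the Popen handle; call .terminate() on it when done.
    """
    server = subprocess.Popen(xvfb_command(display))
    os.environ["DISPLAY"] = display
    return server
```

Anything launched afterwards from the same process inherits the new DISPLAY value. A short wait may be needed before the server accepts connections.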

Tom

Bernd Fallert

Nov 13, 2012, 8:19:57 AM
to ocr...@googlegroups.com


On Wednesday, 5 September 2012 01:59:36 UTC+2, Tom wrote:
>> If I understand correctly, this is because root (or any other user) cannot open a (GTK) window under the "zdeno" login. The solution is to add the following lines[1] to setup.py before the line 'from ocrolib import default':
>>
>> import matplotlib
>> matplotlib.use('Agg')
>
> I think this makes some of the debugging features fail.  Some of the scripts have workarounds that use Agg only if there is no DISPLAY variable set.  If you submit a bug report, I can try to make that usage more consistent.



>> './run-test' produced this error (openSUSE and Mandriva):
>> ...
>>   File "/usr/local/lib/python2.7/site-packages/ocrolib/morph.py", line 50, in r_dilation
>>     return filters.maximum_filter(image,size,origin=origin)
>> NameError: global name 'filters' is not defined
>>
>> I fixed it by adding "from scipy.ndimage import filters" to ocrolib/morph.py.
>
> I think that's fixed.

But it's not transferred to the repository. So if I check out the 0.6 version, the error is still there.

Bernd

Tom

Nov 20, 2012, 2:10:41 AM
to ocr...@googlegroups.com
Changes don't apply retroactively to old versions. To get the fix, you need to check out a more recent version.

It might be nice to have a 0.6.1 maintenance release including such bug fixes, but we haven't bothered. The current repository tip should work reasonably well.

Tom

Zdenko Podobný

Nov 13, 2012, 4:50:04 PM
to ocr...@googlegroups.com
On 13.11.2012 14:19, Bernd Fallert wrote:
>
> On Wednesday, 5 September 2012 01:59:36 UTC+2, Tom wrote:
>>
>>> './run-test' produced this error (openSUSE and Mandriva):
>>> ...
>>> File "/usr/local/lib/python2.7/site-packages/ocrolib/morph.py", line
>>> 50, in r_dilation
>>> return filters.maximum_filter(image,size,origin=origin)
>>> NameError: global name 'filters' is not defined
>>>
>>> I fixed it by adding "from scipy.ndimage import filters" to
>>> ocrolib/morph.py.
>>>
>> I think that's fixed.
>>
> But it's not transferred to the repository. So if I check out the 0.6 version,
> the error is still there.
>
> Bernd
>
I see it there[1]. You can also have a look at the commit[2]. Are you using
the correct repository (ocropy)?

[1]
http://code.google.com/p/ocropus/source/browse/ocrolib/morph.py?repo=ocropy#7
[2]
http://code.google.com/p/ocropus/source/detail?spec=svn.ocropy.ea3b103aa9b6e3cab71f939ce6c5a2a97136bcba&repo=ocropy&r=6e981b313f105386a2a6fccb925f81ed1ff8f721

