Get the coordinates of lines


Rémi Caruyer

Apr 10, 2013, 9:09:16 AM
to ocr...@googlegroups.com
Hello,

I want to retrieve the values x0, y0, width and height for each line. How can I get these values in ocropus-gpageseg?

I tried this:

In ocropus-gpageseg:

    for i,l in enumerate(lines):
        binline = psegutils.extract_masked(1-cleaned,l,pad=args.pad,expand=args.expand)
        ocrolib.write_image_binary("%s_ligne%04d.png"%(outputdir,i+1),binline)
        chaine = ocrolib.write_coord(segmentation,i+1)
        fichier = open("%s_ligne%04d.txt"%(outputdir,i+1), 'w')
        fichier.write(chaine)
        fichier.close()

And in common.py:

def write_coord(pseg,num):
    regions = ocrolib.RegionExtractor()
    regions.setPageLines(pseg)
    x0,y0,x1,y1 = (regions.x0(num),regions.y0(num),regions.x1(num),regions.y1(num))
    height = y1 - y0
    width = x1 - x0
    chaine = str(x0) + " " + str(width) + " " + str(y0) + " " + str(height)
    #chaine = "12 24 15 26"
    return chaine


But I get an error: AttributeError: RegionExtractor instance has no attribute 'comp'


How can I retrieve these coordinates?

Thank you.

Tom

Apr 10, 2013, 9:18:58 PM
to ocr...@googlegroups.com
There's a bug in the x0/y0/x1/y1 methods (I'll push a fix later).

Just use 

y0,y1,x0,x1 = regions.bbox(num) 

instead.
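
For illustration, here is a minimal sketch of how write_coord from the first post might look with that change. The helper name format_coord is hypothetical, and the (y0, y1, x0, x1) order is taken from the line above; it is worth checking against the bbox method in your own copy of ocrolib before relying on it.

```python
def format_coord(bbox):
    """Format a bounding box as the string "x0 width y0 height".

    Assumption: bbox is ordered (y0, y1, x0, x1), as in the reply above.
    Verify the order returned by bbox() in your ocrolib version.
    """
    y0, y1, x0, x1 = bbox
    width = x1 - x0
    height = y1 - y0
    return "%d %d %d %d" % (x0, width, y0, height)


def write_coord(pseg, num):
    # Imported here so that format_coord can be used without ocrolib installed.
    import ocrolib
    regions = ocrolib.RegionExtractor()
    regions.setPageLines(pseg)
    # Use bbox() instead of the x0/y0/x1/y1 methods that raise AttributeError.
    return format_coord(regions.bbox(num))
```

Keeping the string formatting in its own function makes it easy to check the output on a made-up box without running a page segmentation.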

Tom

Rémi Caruyer

Apr 11, 2013, 4:38:03 AM
to ocr...@googlegroups.com
Thank you, it works perfectly!


Mona

Jul 14, 2013, 6:17:52 AM
to ocr...@googlegroups.com

Hi Remi,

I'm looking into obtaining the bounding boxes of a page and tried out your code above. I tried this:

Added your code to the ocropus-gpageseg
Created a new python file for write_coord

I've never used Python before, but after looking up some of the commands you used, I think you're expecting to see 4 columns of numbers in a text file?

But I don't see any results. Is this alright?