Tesseract 5 training

Ali hussain

Aug 11, 2023, 5:53:22 AM
to tesseract-ocr
I use the code below to create the .tif, .gt.txt, and .box files. I need to know what the best values for xsize/ysize are. How many words should there be per line in the training_text file? If I make a line long and set xsize/ysize large enough to cover the text, can that cause any issues? Or can you tell me the ideal size? Thanks in advance.

import subprocess

# font, line_gt_text, output_directory and file_base_name are set earlier in my script
subprocess.run([
    'text2image',
    f'--font={font}',
    f'--text={line_gt_text}',
    f'--outputbase={output_directory}/{file_base_name}',
    '--max_pages=1',
    '--strip_unrenderable_words',
    '--leading=36',
    '--xsize=3500',  # page width in pixels
    '--ysize=500',   # page height in pixels
    '--char_spacing=1.0',
    '--exposure=0',
    '--unicharset_file=langdata/ben.unicharset',
])
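
To make the question concrete, here is a minimal sketch of the same call with the page size pulled out as parameters, so different xsize/ysize values can be compared. The helper name build_text2image_args, the example font name, the file names, and the second size pair are hypothetical placeholders, not recommendations; the flags themselves are only the ones from my call above. The sketch only builds and prints the argument lists and does not run text2image.

```python
def build_text2image_args(font, text_file, output_base,
                          xsize=3500, ysize=500,
                          unicharset_file="langdata/ben.unicharset"):
    """Return the text2image argument list with the page size as parameters.

    Hypothetical helper: it only reuses the flags from the original call.
    """
    return [
        "text2image",
        f"--font={font}",
        f"--text={text_file}",
        f"--outputbase={output_base}",
        "--max_pages=1",
        "--strip_unrenderable_words",
        "--leading=36",
        f"--xsize={xsize}",   # page width in pixels
        f"--ysize={ysize}",   # page height in pixels
        "--char_spacing=1.0",
        "--exposure=0",
        f"--unicharset_file={unicharset_file}",
    ]


if __name__ == "__main__":
    # The second size pair is an arbitrary placeholder for comparison.
    for xsize, ysize in [(3500, 500), (2500, 300)]:
        args = build_text2image_args("Example Font", "line.gt.txt",
                                     "out/line", xsize, ysize)
        print(" ".join(args))
```

Each returned list can be passed to subprocess.run as in my original call.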