to tesseract-ocr
The text I want Tesseract to read will contain only the most basic characters. Is there a way to fine-tune it so that it recognizes only basic upper- and lower-case letters, digits and punctuation marks? That way I could avoid, for example, 'c' being misinterpreted as '¢'. Would simply passing a new 'training_text' and 'wordlist' to tesstrain.sh achieve this?
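To make the idea concrete, here is a minimal sketch of how the replacement 'training_text' and 'wordlist' could be restricted to the basic character set before training. It assumes "basic" means ASCII letters, digits, ASCII punctuation and the space character; the helper name `keep_basic_lines` and the sample strings are hypothetical, and whether feeding the filtered files to tesstrain.sh is enough to stop '¢' being produced is exactly the open question.

```python
import string

# Assumption: "basic" characters are ASCII letters, digits,
# ASCII punctuation, and the space character.
ALLOWED = set(string.ascii_letters + string.digits + string.punctuation + " ")


def keep_basic_lines(lines):
    """Return the non-empty lines that consist only of allowed characters."""
    kept = []
    for line in lines:
        text = line.rstrip("\n")
        # Drop empty lines and any line containing a character outside ALLOWED.
        if text and all(ch in ALLOWED for ch in text):
            kept.append(text)
    return kept


if __name__ == "__main__":
    # Hypothetical sample lines; a real run would read the training text
    # and wordlist files instead.
    sample = ["The price is 5c.", "The price is 5¢.", "Hello, World!"]
    for line in keep_basic_lines(sample):
        print(line)
```

The same function could be applied to both files, since each is just a list of lines; any line containing a character such as '¢' is dropped entirely rather than edited.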