Once again, with more information:
I have a problem using Tesseract with German Fraktur.
I am working with Tesseract 3.02.02 on SUSE Linux 13.2.
First, the text to be OCR'd is real printed text from about 1930.
The print is slightly dirty, i.e. there are small dots and strokes between
the letters.
Although these are far smaller than the actual letters, they are interpreted as
normal letters.
Is it possible to pass parameters to Tesseract so that it
. either ignores glyphs that do not match the size of the majority of the
other letters,
. or only accepts letters within a given size range,
. or lets me first generate the boxes,
then correct the boxes, by hand or by program,
and finally run the recognition using the corrected boxes?
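To make the "correct the boxes by program" idea concrete, here is a minimal sketch of what I have in mind. It assumes the usual box file layout with one glyph per line (`glyph left bottom right top page`) and drops every box whose height is well below the median height. The function names and the `min_ratio` threshold are my own and purely illustrative. Whether Tesseract can then recognise from the corrected boxes is exactly my question, so the sketch covers only the correction step.

```python
from statistics import median


def parse_box_line(line):
    """Split one box file line into (glyph, left, bottom, right, top, page)."""
    # Assumed layout: "<glyph> <left> <bottom> <right> <top> <page>"
    parts = line.strip().rsplit(" ", 5)
    glyph = parts[0]
    left, bottom, right, top, page = (int(p) for p in parts[1:])
    return glyph, left, bottom, right, top, page


def drop_small_boxes(lines, min_ratio=0.4):
    """Return only the box lines whose height is at least
    min_ratio times the median box height (min_ratio is a guess)."""
    boxes = [parse_box_line(l) for l in lines if l.strip()]
    if not boxes:
        return []
    heights = [top - bottom for _, _, bottom, _, top, _ in boxes]
    limit = min_ratio * median(heights)
    kept = []
    for glyph, left, bottom, right, top, page in boxes:
        if (top - bottom) >= limit:
            kept.append(f"{glyph} {left} {bottom} {right} {top} {page}")
    return kept


if __name__ == "__main__":
    sample = [
        "A 10 20 30 60 0",
        "b 35 20 55 62 0",
        ". 58 40 61 43 0",   # tiny speck between two letters
        "c 65 20 85 50 0",
    ]
    for line in drop_small_boxes(sample):
        print(line)
```

A pure height filter like this would also throw away real punctuation such as periods, commas and hyphens, so it would need a refinement (for example, also checking the position relative to the baseline) before being useful on real pages.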
I have already tried a config file that sets the following parameters:
textord_min_xheight 24
textord_xheight_mode_fraction 0.9
textord_xheight_error_margin 0.1
textord_descx_ratio_min 0.3
tessedit_redo_xheight FALSE
This changes some things, but it does not make Tesseract ignore the dots and strokes.
Here is an example:
the attached picture is recognised as the text
15 Ellser Exdmsund Mögsgzerg