Hi everyone,
Does anyone know what method or algorithm Tesseract uses for its fully automatic page segmentation?
I am specifically interested in the segmentation step itself rather than other pre-processing steps such as deskewing or noise removal. I have searched hard for documentation that describes the sequence of steps in the process, or that says whether the algorithm is based on a particular paper, but without success.
Thank you.