However, I've hit a roadblock: I only have the images, without the exact text (ground truth) or bounding boxes. Creating this data manually seems like a massive undertaking. Is that the approach you would recommend, or are there alternatives, such as existing prepared datasets for German, that I could use?
Also, I'm curious about the volume of training data required. Is there a minimum number of images and corresponding texts that you'd consider sufficient to start seeing improved results?
Any guidance or resources you can provide would be greatly appreciated.
Atef