Improving Japanese Text Detection

Mayank Agrawal

Aug 9, 2022, 12:11:16 PM
tesseract-ocr
I currently use Google Cloud Vision OCR, but it is not able to detect the Ichi kanji (a Japanese character), and its accuracy is quite poor for vertical text (tategaki).
The same is the case with Tesseract 4.0.
I want to ask whether training Tesseract on the Ichi kanji in general and on vertical fonts would significantly improve accuracy for both. If yes, how much training data is required to achieve a significant improvement in accuracy? And which Tesseract version should I use, 4.0 or 5.0?
