I have a test set that contains only uppercase English letters and digits. However, the provided eng.traineddata sometimes returns symbols and lowercase letters. Is there a way to modify the existing traineddata file so that it only recognizes uppercase letters and digits?
Thanks in advance.
Is there a way to give TesseractEngine a hint about the expected text format? For example, can I set a format like 00XXX00 XX-000, where 0 represents a digit and X represents a letter?
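To make the format idea concrete, here is a minimal sketch in Python of what such a mask could mean as a post-processing check. It does not change what the engine recognizes; it only validates text after OCR. The helper names mask_to_regex and matches_mask are hypothetical, and the sketch assumes that 0 means a single digit, X means a single uppercase letter, and every other character in the mask must match literally.

```python
import re


def mask_to_regex(mask: str) -> re.Pattern:
    """Translate a mask such as '00XXX00 XX-000' into a compiled regex.

    Assumed convention (hypothetical, taken from the example in the question):
    '0' -> one digit, 'X' -> one uppercase letter, anything else -> literal.
    """
    parts = []
    for ch in mask:
        if ch == "0":
            parts.append("[0-9]")
        elif ch == "X":
            parts.append("[A-Z]")
        else:
            # Spaces, hyphens, etc. must appear exactly as written in the mask.
            parts.append(re.escape(ch))
    return re.compile("".join(parts))


def matches_mask(text: str, mask: str) -> bool:
    """Return True if the OCR text (ignoring surrounding whitespace) fits the mask."""
    return mask_to_regex(mask).fullmatch(text.strip()) is not None


MASK = "00XXX00 XX-000"
print(matches_mask("12ABC34 DE-567", MASK))  # True
print(matches_mask("12abc34 DE-567", MASK))  # False: lowercase letters
```

A check like this could be used to reject or flag results that do not fit the expected layout, whatever the engine itself is configured to do.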