Russian + English characters recognition

Константин Михеев

Feb 28, 2023, 7:35:40 AM
to tesseract-ocr
Hello. I have a problem recognizing the attached image. What can I change in the settings?

I tried two variants:
$res_rus = $ocr->lang('rus')->run();
$res_eng = $ocr->lang('eng')->run();

I received this text:

  1. AAC 6007896622 AD "OCK" Aeictayer NepoA NENONb30BaHMA TC aKTMBEH Ha 3aNpaLmBaemMyo AaTy Mapka 1 Mogens TPAHCMOPTHOTO CPEACTEA Ford MONDEO (kaTeropua (kaTeropms "X") B2} TocyAapeTEentbA e PerMCTPALMOHHEI 3HAK VIN WFODXXGB****x928 Homep kyzoea WFODXXGB****x928 MOLHOCTb ABMraTens 417 G Kateropun B, n.c. Het [ sk ook 11.06.1990 Camapckan o6, © Ceizpan OrpaHIUEH CIMCOK AIMLY, AOTYLLIEHHBIX K ynpaBneHmio (Aonyluero: 1 uen.) 0.46 Nnunaa s sk ok 11.06.1990 4016.89 py6.
for English, and

  1. ААС 6007896622 АО‘ОСК Действует Периодиспользования ТС активен на запрашиваемую дату Марка и модель транспортного средства Рога МОМОЕО (категория (категория ‘Х) ## Государственный моеи регистрационный знак мм учторххавееено2в Номер кузова учторххавееено2в Мощность двигателя для звеь категории В, л. нет уренек уренк се 11.06.1980 Самарская обя, г Сызрань Ограничен список лиц, допущенных к управлению (допущено: 1 чел 046 Личная уренек уренк се 11.06.1980 401689 руб.
for Russian.

I tried changing the psm, oem, and dpi parameters. What are the best settings for this image? Please help.


Lorenzo Bolzani

Feb 28, 2023, 8:22:00 AM
Hi, try rus+eng or eng+rus as the language and see which works best. You can also use more than two languages.

Or run both languages separately and keep the result with the highest confidence score. You could also consider the location of the text on the page to decide.
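A minimal sketch of the "keep the result with the highest confidence" idea, written in Python rather than the PHP wrapper from the question because the logic does not depend on the wrapper. It assumes each run was saved as Tesseract TSV output, which has a `conf` column per word, and it assumes "confidence score" means the mean word confidence. That definition is one reasonable choice, not something Tesseract prescribes. The function names `mean_word_confidence` and `pick_best` are hypothetical.

```python
import csv
import io


def mean_word_confidence(tsv_text):
    """Average the `conf` column over rows that contain a recognized word.

    Tesseract's TSV output marks non-word rows (page, block, line) with
    conf = -1 and an empty `text` field, so those rows are skipped.
    """
    reader = csv.DictReader(
        io.StringIO(tsv_text),
        delimiter="\t",
        quoting=csv.QUOTE_NONE,  # OCR text may contain stray quote characters
    )
    confidences = []
    for row in reader:
        text = (row.get("text") or "").strip()
        try:
            conf = float(row.get("conf") or -1)
        except ValueError:
            continue  # malformed row: ignore it
        if text and conf >= 0:
            confidences.append(conf)
    # No recognized words at all counts as zero confidence.
    return sum(confidences) / len(confidences) if confidences else 0.0


def pick_best(tsv_by_lang):
    """Return (best_language, scores) for a dict of {language: tsv_text}."""
    scores = {lang: mean_word_confidence(tsv) for lang, tsv in tsv_by_lang.items()}
    best = max(scores, key=scores.get)
    return best, scores
```

The TSV text for each candidate can be produced with something like `tesseract image.png stdout -l rus tsv`, and likewise for `eng` and `rus+eng`. Pass the results in as `pick_best({"rus": ..., "eng": ..., "rus+eng": ...})`. For a mixed-language page like this one, the same averaging could be done per line or per block instead of per page, which matches the suggestion to take the text's location into account.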

