- For up-to-date Windows executables (and installers), see the builds from the University of Mannheim (Stefan Weil). A Google search should dig those up quickly ("tesseract mannheim windows installer" should do, I bet). See also: https://tesseract-ocr.github.io/tessdoc/Downloads.html
- tesseract supports multi-language OCR: specify multiple languages, joined by `+`, using the `-l` command line parameter. Here's what I use at the moment:
tesseract -l eng+rus+chi_sim+chi_tra+deu+fra+spa+jpn+hin+urd+vie+osd ....etc...
which is, frankly, an almost insane combo, but that's what I'm feeding tesseract in my local tests.
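If you drive tesseract from a script, the `+`-joined `-l` value is easy to build from a list. Below is a minimal sketch of that, assuming the `tesseract` executable is on your PATH; the function names `build_tesseract_command` and `run_ocr` are made up for illustration, not part of tesseract.

```python
import subprocess


def build_tesseract_command(image_path, output_base, languages):
    """Return the argv list for one multi-language tesseract run.

    Hypothetical helper: tesseract wants all models in a single
    '+'-joined value after -l, e.g. "eng+rus+deu".
    """
    if not languages:
        raise ValueError("need at least one language or script model")
    lang_arg = "+".join(languages)
    return ["tesseract", str(image_path), str(output_base), "-l", lang_arg]


def run_ocr(image_path, output_base, languages):
    """Run tesseract; assumes the executable is installed and on PATH."""
    cmd = build_tesseract_command(image_path, output_base, languages)
    # check=True raises CalledProcessError when tesseract exits non-zero
    subprocess.run(cmd, check=True)
```

Calling `build_tesseract_command("page.png", "page", ["eng", "rus"])` gives you the argument list to inspect before you actually run anything.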
You can get a list of languages from tesseract (once installed) when you run it with the `--list-langs` command line parameter.
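A rough sketch of reading that list from a script, so you can check that a model is installed before asking for it. I'm assuming here that the output is one header line starting with "List of available languages" followed by one model name per line; verify that against your own tesseract version. The function names are hypothetical.

```python
import subprocess


def parse_list_langs(text):
    """Extract model names from `tesseract --list-langs` output.

    Assumption: a header line starting with "List of available languages",
    then one model name per line.
    """
    langs = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.lower().startswith("list of available languages"):
            continue
        langs.append(line)
    return langs


def installed_languages(executable="tesseract"):
    """Ask the local tesseract which models it can see."""
    result = subprocess.run(
        [executable, "--list-langs"],
        capture_output=True, text=True, check=True,
    )
    # Assumption: fall back to stderr in case a build writes the list there.
    return parse_list_langs(result.stdout or result.stderr)
```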
Depending on your needs, you might also want to look into using the "generic script" models instead of the language-specific ones. This is done by, for example, specifying:
tesseract -l script/Latin+script/Greek ....etc.....
tesseract -l eng+script/Greek ....etc.....
(Tip: a quick peek in your 'tessdata' language models' directory tree will quickly show you what the `--list-langs` output is distilled from; I checked the `script` subdirectory to come up with the above. My own view of it came from Windows Explorer in my local development environment, so reckon that your directory tree will be located elsewhere, but those `.traineddata` files should be available on your machine after you've installed tesseract + tessdata.)
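If you'd rather not click through the directory tree, the same peek can be sketched in a few lines. This assumes the models are stored as `*.traineddata` files, with script models in a `script` subdirectory as on my machine; `models_in_tessdata` is a hypothetical helper name and you have to supply your own tessdata path.

```python
from pathlib import Path


def models_in_tessdata(tessdata_dir):
    """List model names usable with -l, derived from the files on disk."""
    root = Path(tessdata_dir)
    names = []
    for path in root.rglob("*.traineddata"):
        # script/Greek.traineddata -> "script/Greek"; eng.traineddata -> "eng"
        rel = path.relative_to(root).with_suffix("")
        names.append(rel.as_posix())
    return sorted(names)
```

The names it returns should line up with what `--list-langs` reports, which makes for a handy sanity check of your install.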
Re math OCRing: sorry, can't help you there.
Haven't accomplished that myself yet, but one direction to investigate there would be to look into "legacy mode" as tesseract v3 had a dedicated "math mode" -- no idea how well that ever worked, so cave canem.
If I were you, I'd first attack the English (Latin) + Greek text problem and see whether I could get tesseract to produce something sensible for a couple of such test files.
Be advised: "optimizing" may be desired, but make sure you first get decent results and a workflow you like. Tuning an OCR engine is hairy business, so leave that for last; that way you'll have a dependable baseline to work from and compare your changes against.
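One simple way to do that comparison, sketched under the assumption that you keep the baseline output (or, better, a hand-checked transcription) as a text file next to each test image. It uses Python's standard `difflib`; the function names are hypothetical and the ratio is only a rough similarity score, not a proper character error rate.

```python
import difflib


def similarity(baseline_text, candidate_text):
    """Return a 0..1 similarity ratio between two OCR outputs."""
    matcher = difflib.SequenceMatcher(
        None, baseline_text, candidate_text, autojunk=False
    )
    return matcher.ratio()


def compare_runs(baseline_text, candidates):
    """Map each candidate label to its similarity against the baseline.

    `candidates` is a dict such as {"eng+script/Greek": "...ocr text..."}.
    """
    return {
        label: similarity(baseline_text, text)
        for label, text in candidates.items()
    }
```

Keep in mind that a score against an earlier run only tells you how much the output changed, not whether it got better; for that you need the hand-checked text as the baseline.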
This is generic info; you might get more detailed help from the mailing list when you provide more info about your setup and what you've been trying to accomplish so far. Your current info is a little "thin". ;-)