Dear tesseract developers,
As part of my project "Bunte Bilder aus dem Sachsenlande" (see my German blog articles:
http://art1pirat.blogspot.de/search/label/tesseract), I have created a fully corrected wordlist for tesseract. The wordlist contains words encoded with the long-s (ſ) and should be used to train deu-frak. My wordlist is also part of the "pre1900" list of the German TeX project "Trennmuster"
(http://projekte.dante.de/Trennmuster). The source can be used to produce very large wordlists for all periods of German from 1830 to the present. See
http://projekte.dante.de/Trennmuster/RepoHaupt for details.
I have also trained the OCR with the long-s, so please let me know if you would like my training material.
Finally, thank you for your great work.
With best regards,
Andreas