I showed that target 130528 (1 Thessalonians 5.28), for example, gets
excluded because one of its forms is ranked #235, while the other eight
forms all appear in the top 66.
Well, what if those 9 forms were learnt first? That is:
Χριστοῦ, κυρίου, Ἰησοῦ, ὑμῶν, μετά,
τοῦ, χάρις, ἡ, ἡμῶν
Not only could 130528 be read but also 071623.
Now if the reader learnt πάντων (just one more form), they could
read three more verses: 140318, 191325 and 272221.
Now introduce these six forms:
καί, ὑμῖν, ἀπό, εἰρήνη, πατρός, θεοῦ
and suddenly *seven* more verses are readable: 140102, 070103,
100102, 110102, 090103, 180103 and 080102.
This was just with one algorithm I'm experimenting with (which I'll
explain and provide code for soon) and there are likely others that do
better.
So instead of 100 forms giving 0 verses, we now have just 16 forms
giving us 12 entire verses from an actual corpus.
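
To make the bookkeeping concrete, here is a minimal sketch of the readability check behind those counts: a target is readable once every one of its forms has been learnt. This is not the ordering algorithm mentioned above, only the test applied after each batch of forms. The function names and the `toy-target` entry are made up for illustration; only the nine forms of 130528 come from the list above.

```python
def readable(targets, known):
    """Return the IDs of targets whose forms are all in `known`."""
    known = set(known)
    return [tid for tid, forms in targets.items() if set(forms) <= known]


def newly_readable(targets, batches):
    """Introduce forms batch by batch and record what each batch unlocks.

    Returns a list of (total forms known, newly readable target IDs).
    """
    known, seen, steps = set(), set(), []
    for batch in batches:
        known |= set(batch)
        unlocked = [t for t in readable(targets, known) if t not in seen]
        seen.update(unlocked)
        steps.append((len(known), unlocked))
    return steps


# The nine forms of 130528, as listed above.
first_nine = ["Χριστοῦ", "κυρίου", "Ἰησοῦ", "ὑμῶν", "μετά",
              "τοῦ", "χάρις", "ἡ", "ἡμῶν"]

targets = {
    "130528": first_nine,
    # Hypothetical target, NOT a real verse: it only shows one extra
    # form unlocking something new.
    "toy-target": ["ἡ", "χάρις", "πάντων"],
}

for total, unlocked in newly_readable(targets, [first_nine, ["πάντων"]]):
    print(total, unlocked)
```

With a real corpus, `targets` would map every verse ID to its list of forms, and the batches would come from whatever ordering is being tried.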
The usual caveats apply: items are considered independent and equally
easy to learn, there's no consideration of morphology, syntax or idiom,
and this is using verses as targets. We'll fix all that over time.