Hi All,
I just uploaded the first release of the WLC Word List,
https://github.com/openscriptures/morphhb. This is a compilation of the
word forms in the WLC, according to vowel form and augmented Strong
number. The consonantal form is included for searching and sorting
purposes. The separate references for each form contain the prefix and
suffix data for that instance, as IDs referring to the prefix and suffix
section at the beginning of the document. Full documentation is
included in WlcWordList.html. The full package is available in the
downloads.
My goal is to have a database-friendly catalog of words available for
morphological parsing. This will give the flexibility for recording
parsings, as well as using existing parsings of the same form to aid in
new assignments. The format contains all the WLC references for each
form, so we can get the best return on effort by parsing the forms that
occur most often first. Some of the parsings, like the prefixes and
possibly the suffixes, should be straightforward and apply to a great
many instances. See the examples in the documentation.
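For anyone who wants to try the frequency-first idea, here is a minimal
Python sketch. It does not read the actual Word List file; the FormEntry
record, the placeholder form names, the Strong numbers, and the
references in SAMPLE are all made up for illustration. It only shows
how entries that carry their own reference lists can be ranked, and how
much of the text the top few forms would cover.

```python
from typing import List, NamedTuple


class FormEntry(NamedTuple):
    """Hypothetical in-memory record for one word form (not the real schema)."""
    form: str        # vocalized form (placeholder names used below)
    strong: str      # augmented Strong number (made-up values below)
    refs: List[str]  # every reference where this form occurs


def by_frequency(entries: List[FormEntry]) -> List[FormEntry]:
    """Return entries ordered from most to fewest occurrences."""
    return sorted(entries, key=lambda e: len(e.refs), reverse=True)


def coverage(entries: List[FormEntry], top_n: int) -> float:
    """Fraction of all occurrences covered by parsing the top_n forms."""
    total = sum(len(e.refs) for e in entries)
    if total == 0:
        return 0.0
    covered = sum(len(e.refs) for e in by_frequency(entries)[:top_n])
    return covered / total


# Illustrative data only; none of these values come from the Word List.
SAMPLE = [
    FormEntry("form_a", "H0001", ["Gen.1.1"]),
    FormEntry("form_b", "H0002", ["Gen.1.2", "Gen.1.3", "Exod.2.1"]),
    FormEntry("form_c", "H0003", ["Gen.2.4", "Num.3.1"]),
]

if __name__ == "__main__":
    for entry in by_frequency(SAMPLE):
        print(entry.form, len(entry.refs))
    print("coverage of top form:", coverage(SAMPLE, 1))
```

With the sample above, parsing just the most frequent form would cover
half of the occurrences.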
The format also provides for easily identifying discrepancies. I
already have experience with this. As I was building the list, I came
across a Strong number error in Numbers. When I corrected this, it took
a singleton form and merged it into a form with more entries. I also
found a very unusual prefix in Eccl.4.10. BDB treats the word as two
separate words that are written as one in the MT, so in this case we
find a 'prefix' between the body of the word and the suffix. If anyone
notices other such discrepancies, please post to the list so
corrections can be made.
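One way to go looking for discrepancies like the Strong number error
above is to flag singleton forms that share a consonantal form with a
more common entry. The sketch below is only a heuristic of my own, not
how the list was built, and it assumes a hypothetical FormEntry record
with made-up data. A flagged singleton is a candidate for review, not
necessarily an error.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Tuple


class FormEntry(NamedTuple):
    """Hypothetical record for one word form (not the real schema)."""
    consonants: str  # consonantal form, used for grouping
    form: str        # vocalized form
    strong: str      # augmented Strong number
    refs: List[str]  # every reference where this form occurs


def singleton_candidates(
    entries: List[FormEntry],
) -> List[Tuple[FormEntry, List[FormEntry]]]:
    """Pair each one-occurrence entry with more frequent entries
    that have the same consonantal form."""
    groups: Dict[str, List[FormEntry]] = defaultdict(list)
    for entry in entries:
        groups[entry.consonants].append(entry)

    flagged = []
    for entry in entries:
        if len(entry.refs) != 1:
            continue  # only singletons are of interest here
        peers = [
            p for p in groups[entry.consonants]
            if p is not entry and len(p.refs) > 1
        ]
        if peers:
            flagged.append((entry, peers))
    return flagged


# Illustrative data only; none of these values come from the Word List.
SAMPLE = [
    FormEntry("cns_x", "form_x", "H0001", ["Gen.1.1", "Gen.1.5", "Exod.3.2"]),
    FormEntry("cns_x", "form_x", "H0002", ["Num.7.1"]),  # possible mis-tag
    FormEntry("cns_y", "form_y", "H0003", ["Gen.4.1"]),  # true singleton
]

if __name__ == "__main__":
    for single, peers in singleton_candidates(SAMPLE):
        print(single.strong, single.refs[0], "->", [p.strong for p in peers])
```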
I just discovered that the WLC has been updated from 4.14 to 4.16, so
some work lies ahead. The application I used to make the list will work
again, but then any changes will be lost. I may have to do a separate
update for the WLC and the Word List.
Peace,
David