WLC to the most literal English translation possible, word-by-word

Joshua Koehler

Oct 8, 2024, 1:07:22 AM

to Open Scriptures

Hi folks,

First of all, thank you for all the work you've done thus far with OpenScriptures! It's encouraging to see others in tech exploring this all-important dataset.

I'm working on a free language learning app to help people learn languages with the Scriptures. When the L2 (target language to learn) is Hebrew, I of course wish to use the Westminster Leningrad Codex as the source. My goal is to have the most wooden, literal, word-by-word translation of that text possible. For example, Hebraisms should be translated woodenly, even where an idiomatic rendering would arguably be the more equivalent translation. Inflected translations are ideal, so rather than the generic Strong's entry, I'd like to show the user the inflected form, e.g. "he said", "we are arising", etc.

After some searching around, scripture4all.org seems to have some PDFs that are basically what I'm trying to accomplish, but they're in a format that is difficult to parse. I've contacted the author to see if I can obtain a more parseable, raw form of that content.

The other alternative I'm considering is to parse the morphhb WLC and do the best I can with that. The downside is that I'll have to somehow reconstruct proper translations from the morphological parsing, but that's not impossible to do with some heuristic that approximates manual work.
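As a rough illustration of that second option, here is a minimal Python sketch of turning a morphology code plus a base gloss into an inflected English gloss. It assumes OSHB-style finite verb codes such as `HVqp3ms` (language, part of speech, stem, type, person, gender, number). The `PAST_FORMS` table, the function names, and the crude mapping of the perfect to an English past and the imperfect to "will ..." are all illustrative assumptions, not part of any existing dataset.

```python
# Hypothetical gloss table: base English gloss -> English past form.
# A real table would have to be built or sourced separately.
PAST_FORMS = {"say": "said", "arise": "arose"}


def pronoun(person: str, gender: str, number: str) -> str:
    """Map person/gender/number letters to an English subject pronoun."""
    if person == "1":
        return "I" if number == "s" else "we"
    if person == "2":
        return "you"
    if number == "p":
        return "they"
    return {"m": "he", "f": "she"}.get(gender, "it")


def gloss_verb(code: str, lemma: str) -> str:
    """Render a wooden English gloss for a finite verb.

    Assumed code layout (modelled on OSHB-style codes such as "HVqp3ms"):
    language, part of speech, stem, type, person, gender, number.
    The stem letter is ignored, and only the perfect ("p") and
    imperfect ("i") types are handled in this sketch.
    """
    if len(code) != 7 or code[1] != "V":
        raise ValueError(f"not a finite verb code: {code!r}")
    verb_type, person, gender, number = code[3], code[4], code[5], code[6]
    subject = pronoun(person, gender, number)
    if verb_type == "p":
        return f"{subject} {PAST_FORMS[lemma]}"
    if verb_type == "i":
        return f"{subject} will {lemma}"
    raise ValueError(f"unhandled verb type: {verb_type!r}")


print(gloss_verb("HVqp3ms", "say"))
print(gloss_verb("HVqi1cp", "arise"))
```

Stems other than the simplest one, participles, infinitives, and pronominal suffixes would each need their own rules, which is where most of the heuristic work would go.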

These are my plans, but my question to the community is whether this seems the best plan for my goals, or whether anyone knows of more suitable datasets or a better way to accomplish my end goal here.

Thanks,

Joshua

Robert Hunt

Dec 9, 2024, 9:34:28 AM

to openscr...@googlegroups.com

New Zealand.

Hi Joshua and everyone,

Sorry for the slow reply. I'm working on creating the Open English Translation of the Bible (currently just at 50% of the Bible drafted), which has a Readers' Version, and more relevant to this discussion, a Literal Version side-by-side. My aims with the Literal Version sound pretty similar to yours. You can view very early samples of the OT OET-LV at pages like https://Freely-Given.org/OBD/par/EST/C2V13.htm#Top. My LV tends to be more literal than most, because I try to avoid Hebrew -> Greek -> Latin -> English versions of Hebrew names (and I use my own transliteration scheme which is aimed at non-scholars of Hebrew and making better use of modern Unicode character sets), and also because I include words not usually translated into English.

Some of the English glosses are my own, and as a translation tool I try to show ranges of meaning with things like "in/at/on", etc. Others come from Clear.Bible's Macula Hebrew, although I'm not satisfied with many of their glosses (not sure where they originated?), plus they currently have a bug where compound words (like Beyt-Lechem) are missing glosses (which probably shows as "wwww", i.e. missing word gloss, in the OET-LV OT).

Anyway, seems we have some common goals. My repo's at https://github.com/Freely-Given-org/OpenEnglishTranslation--OET.

Blessings,
Robert.
https://OpenEnglishTranslation.Bible
https://Freely-Given.org


Joshua Koehler

Dec 31, 2024, 9:27:15 AM

to Open Scriptures

Hello Robert,

Likewise, apologies for my delayed response. In providence, a friend and I happened to discuss this project the day you responded.

I took a brief look at your work and it looks very interesting; I'm really pleased to see a project like this!

As far as I can see, there are at least two distinct parts of an open-interlinear project.

1. The translation

2. The digitization of that translation

I recently came across Jay P. Green's interlinear, and frankly it is exactly what I'm looking for, if I could acquire the rights to open-source the data and find an efficient way to digitize it.

I'm open to other possibilities though. I should emphasize my goals are a bit different, however. The parameter I'm seeking to optimize for is speed of language acquisition, not the also very worthy objective of Biblical study. For example, I find that interlinears with additional information such as multiple definitions of a word tend to slow down the more organic, imprecise goal of language acquisition.

But we may be able to still find a way to work together here, so I'm open to your thoughts.

Kind regards,

Joshua


Robert Hunt

Jan 22, 2025, 10:55:18 AM

to openscr...@googlegroups.com, Joshua Koehler

Hmmh, very interesting. I wasn't aware of the late Jay P. Green. (I'm leaving instructions for all my work to go into the Public Domain on my death -- it's open licensed at present -- but I guess the copyrights of his work are all owned by his publishers anyway. I was challenged when the latest ISV work sadly seemed to become lost after the death of its instigator and then the domain name eventually expired.)

Yes, the Open English Translation of the Bible - Literal Version (OET-LV) isn't designed to be read as such, but as a reference translation. Its main purpose in life is to help show the Bible reader the decisions that were already made by the translators in choosing words. (Many readers don't realise that even an interlinear already has many interpretive decisions in mapping/glossing words from one language to another (e.g., some substitute temple or palace for house in some places), as well as in both adding and removing words like articles.) But yes, for your purpose, something different is probably better (and my own interlinear pages seem to be formatting wrongly at present).

This week I was able to rebase on the Macula Hebrew 'nodes' XML files to work around the missing glosses on compound words in their 'low fat' files, so the generated OET-LV should be better now, although many of their glosses are not literal enough yet for my purposes (but my own more literal glossing and word reordering will have to wait until next year, after I get the first draft of the OET-RV completed -- now at 54%).

Blessings,
Robert.
https://OpenEnglishTranslation.Bible

Robert Hunt

Jan 22, 2025, 10:55:30 AM

to openscr...@googlegroups.com, Joshua Koehler

Sorry, after my previous email and now after reading Jay P. Green's strongly worded preface to his interlinear, I don't think it's something that I would like to appear to give any support to. Have you read that preface, Joshua?

I guess since I'm writing again, I should also point out that, for language acquisition, a lot of English words have changed in meaning between the early English translations and now. Sadly, however, many of those words have been carried through for various reasons into our 'modern' translations. I've started to write some of these up briefly at https://OpenEnglishTranslation.Bible/Discussion/WordEssays, plus Mark Ward's video series on 'False Friends' also mentions a number from the KJB era. Just saying: even many of our so-called 'modern' English Bible tools contain a lot of baggage from previous centuries.

Robert.


Joshua Koehler

Mar 4, 2025, 11:11:23 AM

to Open Scriptures

Dear Robert,

I owe you a large apology for my delay in replying - I was clearing out my unread emails and found your thoughtful and valuable one among them.

No, I had not read that preface. I read the whole preface this afternoon (except for the latter portion that archive.org doesn't include in the preview) and found it strongly worded indeed. While we can charitably admire what appears to be the deep conviction of the author, I join with you in finding it a style I wouldn't enthusiastically support. That being said, I'm not necessarily opposed to using it if I find it to be the most valuable dataset, despite any potential reservations regarding the author. I've actually found Young's Literal Translation to take an approach even more suitable for my purposes. When Green's and Young's differ, Young tends to take a more wooden, literal approach, which as I stated above is what I'm looking for right now. However, to my knowledge Young did not produce an interlinear. Blue Letter Bible also has an interlinear on their site; however, it is less literal (more closely following the KJV translation itself) than Green's.

Thank you for sharing the video on False Friends. I've likewise observed their existence in the AV, and have naturally looked up archaic definitions when the context seemed dissonant. However, the clear terminology and resources Dr. Ward introduces in the video are invaluable. I found the OED an especially useful resource, one that I've somewhat half-consciously grasped for in vain over the last few years.

False friends are indeed brambles in the path of language acquisition, but I believe at this point they're less significant than other hurdles, such as non-wooden translations that mask etymology. Obviously my goal would be to remove them as much as possible in the translation I use, but that's likely not a near-term goal at this point.

I also took a look at your Word Essays and found them an interesting read.

Coming back to the question of interleaving my goals with the OET, I wouldn't rule it out at this point. However, I'm currently pursuing a different initial approach to launch the tool. There will likely be a lot of experimentation, and I have a lot to learn regarding what is out there, so our paths may cross again.

With sincere thanks for all your help,

Joshua

