Dear Garima,
Thank you for your interest in CAT and Tibetan. I'll answer your questions below.
1. The short answer is that this is a good idea, but not possible in the context of Tibetan.
The long answer is that the problem is not really linked with the Tibetan words themselves. The problem has to do with the parsing of a text into words. For English, the general idea is that spaces delimit words. There are some fancy things that can be done to improve on that oversimplified algorithm, but nothing very complicated.
In Tibetan, if I have a sentence as simple as "The cat ate the mouse", here is what we get: "ཞི་མིས་ཙི་ཙི་ཟས་སོང་།".
The first thing you notice is that there are no spaces at all, so any strategy that starts from spaces and then refines that first segmentation is not applicable. You will have observed that Tibetans invented the dot to separate syllables, which is an improvement over ancient Sanskrit, which delimited neither syllables nor words, yet segmentation remains a challenge.
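To make this concrete, here is a minimal Python sketch of what the dot (the tsheg, U+0F0B) gives us for free: syllables, but not words. The function name is mine, purely for illustration.

```python
TSHEG = "\u0f0b"  # the dot that separates syllables
SHAD = "\u0f0d"   # the vertical stroke that closes the sentence

def split_syllables(text):
    """Split a Tibetan string into syllables (not words)."""
    text = text.replace(SHAD, "")
    # Drop the empty piece left after the final tsheg.
    return [s for s in text.split(TSHEG) if s.strip()]

syllables = split_syllables("ཞི་མིས་ཙི་ཙི་ཟས་སོང་།")
print(len(syllables))  # six syllables, but nothing says which ones form a word
```

Deciding which of these syllables belong together is the actual problem.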
Let me now do a gloss of the sentence up there:
"ཞི་མི ས་ ཙི་ཙི་ ཟས་ སོང་ །"
cat SUBJ mouse ate PAST
As you see, one of the tricks of the Tibetan language is to merge the subject case marker into the last syllable of a word. ཞི་མི་གིས་ becomes ཞི་མིས་, so when I split into words, I can't just take a list of words and check whether I get matches or not. I need to reconstruct the lemma from the inflected form. It is similar to sandhi as found in Sanskrit.
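As a rough illustration of that reconstruction step, here is a Python sketch that looks a form up in a word list and, if that fails, tries again with a merged ས removed. The two-word lexicon and the function name are assumptions for the example, and only this one marker is handled; a real parser has to deal with many more cases.

```python
TSHEG = "\u0f0b"
MERGED_MARKER = "\u0f66"  # the letter ས fused onto the last syllable

# Hypothetical toy lexicon of lemmas, stored without the final tsheg.
LEXICON = {"ཞི་མི", "ཙི་ཙི"}

def unmerge_case_marker(form):
    """Return (lemma, marker) for a surface form, or None if nothing matches."""
    form = form.rstrip(TSHEG)
    if form in LEXICON:
        return form, None  # plain lemma, no merged marker
    if form.endswith(MERGED_MARKER) and form[:-1] in LEXICON:
        return form[:-1], MERGED_MARKER  # lemma recovered by removing the marker
    return None

print(unmerge_case_marker("ཞི་མིས་"))
```

A direct lookup of the inflected form fails, which is why a plain list of words is not enough.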
What all this means for us is that unless you build a rather complex word parser, you can't find word boundaries. Sanskrit has exactly the same problem and, up to now, as you will see here, parsing one verse of Sanskrit yields thousands of possible parsings. In the link, these are all the possible combinations that can be obtained by making different choices. The choice of the developer behind this Sanskrit parser is to never make a choice, but to present the human reader with all the possibilities and let them choose.
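To give a feeling for why the number of parsings explodes, here is a small Python sketch that enumerates every way of grouping a list of syllables into words. It is not the algorithm of the Sanskrit parser; the function name and the `is_word` callback are assumptions, and the syllables are Latin placeholders.

```python
def all_segmentations(syllables, is_word):
    """Return every way to cut `syllables` into consecutive words.

    `is_word` receives a tuple of syllables and says whether it is acceptable.
    """
    if not syllables:
        return [[]]  # one way to segment nothing: the empty segmentation
    results = []
    for end in range(1, len(syllables) + 1):
        head = tuple(syllables[:end])
        if is_word(head):
            for rest in all_segmentations(syllables[end:], is_word):
                results.append([head] + rest)
    return results

# Worst case: every group of syllables is accepted as a word.
print(len(all_segmentations(list("abcd"), lambda w: True)))  # 8, i.e. 2 ** (4 - 1)
```

In that worst case the count doubles with each additional syllable, so a full verse quickly reaches thousands of candidates even when the lexicon rules out most groupings.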
On the other hand, my strategy in botok was to use basic heuristics to choose the most probable parsing. It gives a reading that is not perfect, but plausible nonetheless, and that can be used as a basis for improving the support of Tibetan in CAT tools.
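As an example of what a basic heuristic can look like, here is a Python sketch of greedy longest match: at each position, take the longest run of syllables found in the lexicon. This is only an illustration of the general idea, not a description of botok's actual code; the function name, the `max_len` parameter, and the placeholder lexicon are all assumptions.

```python
def longest_match(syllables, lexicon, max_len=4):
    """Greedy segmentation: always take the longest known word.

    `lexicon` is a set of syllable tuples. An unknown syllable is kept
    as a one-syllable word so that the function always makes progress.
    """
    words = []
    i = 0
    while i < len(syllables):
        for size in range(min(max_len, len(syllables) - i), 0, -1):
            candidate = tuple(syllables[i:i + size])
            if size == 1 or candidate in lexicon:
                words.append(candidate)
                i += size
                break
    return words

toy_lexicon = {("a", "b"), ("c", "d", "e")}
print(longest_match(list("abcdef"), toy_lexicon))
# [('a', 'b'), ('c', 'd', 'e'), ('f',)]
```

Unlike the exhaustive approach, this always returns exactly one parsing, which may be wrong but is cheap to compute and easy to correct by hand.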
In short, supporting the Tibetan language in CAT tools such as OmegaT comes down to finding a way to have something like botok pre-process the input text.
Making something that integrates with OmegaT, ideally in their own codebase, would be best. You will see that OmegaT uses regexes, which can be modified, to parse texts into sentences and then into words. If we could have something within OmegaT that allowed us to pre-process Tibetan text with a parser like botok, it would be marvelous.
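One simple form such pre-processing could take, sketched in Python: once a parser has grouped the syllables into words, write the text back out with a space between words, so that a regex-based tool can find them. This does not use any OmegaT API; the function name is an assumption and the syllables are Latin placeholders.

```python
import re

TSHEG = "\u0f0b"

def insert_word_spaces(words):
    """Join segmented words (tuples of syllables) into space-separated text."""
    return " ".join(TSHEG.join(word) + TSHEG for word in words)

segmented = [("a", "b"), ("c",)]  # output of some word parser
text = insert_word_spaces(segmented)

# A whitespace-based rule now recovers the word boundaries.
print(re.findall(r"\S+", text))
```

The spaces would have to be removed again when the translated segments are exported, since they are not part of normal Tibetan spelling.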
2. About the idea of having multiple sources for a translation, I think the best approach is to keep one main source, but have a menu where the variants can be consulted, with something like checkboxes to indicate that, for a specific word or sentence, a particular version is followed.
No question is dumb. It only shows that something in the write-up of the ideas is not clear, so thanks for asking!
As for the BDRC project, the person to contact is Élie, whom you can reach directly on Slack. (I believe you know how to get there; otherwise, please tell me.)
I hope this answers your questions. I'll also post this answer about the CAT tools in the Slack channel so others can benefit from it as well.