On 7/1/2009 3:44 PM, Alexandre Arkhipov wrote:
> This is to add a vote for this feature; I've just heard from my
> colleagues in St Petersburg that the two major problems restraining
> them from using FLEx are (1) no convenient way to display alternative
> analyses in text simultaneously before the homonymy is resolved, and
> (2) no convenient way to mark borrowings and their origin.
Alexandre:
On item (1) above, please tell us more about what your colleagues expect
to see or would like to see. What might they consider to be a
convenient way to see the alternatives?
Thanks,
--Andy
Thursday, July 2, 2009, 5:00:29 PM, you wrote:
Alexandre> I will forward your question to them, but I can tell you what they
Alexandre> already tried. They were entering all the alternative analyses for
Alexandre> homographs in one field separated by slashes -- producing something
Alexandre> like
Alexandre> can
Alexandre> be.able.Prs/container.Nom.Sg
Alexandre> so that they can have a printout of a very roughly analyzed text and
Alexandre> yet always have a correct analysis (although among some incorrect
Alexandre> ones) for each morpheme. But this means they have to rework most of
Alexandre> the dictionary as the text analysis progresses.
I'm not sure how practical a request this is when considering the screen
area. The current method is for people to hop through the analysis a word
at a time and select the correct sense from a drop-down list. At times
there can be a dozen alternatives, and I'm not sure anyone would want all
twelve parading across the screen separated by slashes.
Or are they thinking in terms of a printed view?
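
For reference, the slash convention Alexandre describes is easy to take apart again after the fact. The following is only a minimal sketch in Python; the function names are made up for illustration and nothing here is part of FLEx. It assumes "/" never occurs inside a gloss itself.

```python
def split_alternatives(gloss_field, separator="/"):
    """Split a combined gloss field into its alternative analyses.

    Assumes the separator never appears inside a single gloss.
    """
    return [alt.strip() for alt in gloss_field.split(separator) if alt.strip()]


def is_ambiguous(gloss_field, separator="/"):
    """A field is ambiguous when it holds more than one alternative."""
    return len(split_alternatives(gloss_field, separator)) > 1


# The example from the thread: "can" glossed with two alternatives.
alternatives = split_alternatives("be.able.Prs/container.Nom.Sg")
print(alternatives)
```

A post-process over an exported text could use something like this to list the words that still need disambiguation.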
--
Doug mailto:Doug_...@sil.org
Language Software Coordinator
SIL Africa Area
Oleg, I think I understand what you are asking for. What I'm still
puzzled about is the motivation behind the request. Please help me
understand why you all want this capability in the printed output. What
do you plan to do with the printed result? Maybe have someone go
through and circle the correct analysis in context? If so, after they
do this, what is the plan?
Thanks,
--Andy
> Having them all displayed at once for each occurrence would render
> the text unreadable. I'd then suggest displaying two or three most
> probable analyses plus a sort of an ellipsis (...) to indicate there
> are more of them.
This is an interesting idea.
One thing I have asked for is that the display would make some visual
distinction between words that have only one parse and those that
have more than one. That way, when I go back to the text to work on
ambiguities, I can quickly see which ones are the ambiguous ones.
The unambiguous guesses also need to be confirmed, but I would
separate that process from checking the ambiguous ones.
I hadn't been asking to have more than one analysis displayed; just
make it visually clear when a word has more than one.
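
To make the two display ideas above concrete (show only the first few analyses followed by an ellipsis, and visibly flag any word with more than one parse), here is a rough sketch in Python. It is an illustration only, not how FLEx works: the function name, the "*" flag and the default of three analyses are all assumptions, and the list of analyses is assumed to be already ordered from most to least probable.

```python
def format_analyses(word, analyses, max_shown=3, marker="*"):
    """Return a one-line display string for a word and its analyses.

    `analyses` is assumed to be ordered from most to least probable.
    Words with more than one analysis get `marker` appended, and any
    analyses beyond `max_shown` are replaced by "(...)".
    """
    shown = list(analyses[:max_shown])
    if len(analyses) > max_shown:
        shown.append("(...)")
    flag = marker if len(analyses) > 1 else ""
    return f"{word}{flag}\t{' / '.join(shown)}"


print(format_analyses("can", ["be.able.Prs", "container.Nom.Sg"]))
print(format_analyses("walk", ["walk.Prs"]))
```

The flag alone would cover the request to make ambiguous words easy to spot; the truncation is only needed if the alternatives themselves are shown.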
For anyone who has ever worked with the CARLAstudio program, or any
of the tools behind it (AMPLE, etc.), there is a "manual
disambiguator" that takes the output of AMPLE and displays it as
text, but with all the ambiguities showing. The user then moves
through the text and, for each ambiguous word, chooses one of the
alternatives. The program then edits the underlying (SFM) database
so that now it has only that one choice. It might be illustrative to
see what that display is like. I think that one shows a max of 5
ambiguities, but makes it possible to get to all of them.
That tool is intended for choosing among the ambiguities
and recording those choices in the database, which is then used for
other things. It is about the data, not about printing, so it seems
a little different from what you guys are wanting.
-Beth
> Well, this applies not only to print view, but to analysis/gloss, too.
> Basically for each ambiguous wordform in text there should be several
> analyses at once, and these should also be exported into XML etc.
Actually, it sounds to me like you're most concerned with exporting
analyses whether they have been approved or not, is that right? It's
not about printing a document that shows them, and it doesn't even
seem to me that having them displayed in the Interlinear view is
necessarily crucial to your task. Is that correct? You just want
them accessible for a post-process that would apply to what you
export. I would think that exporting all guessed analyses would be
easier than finding a good way to display them.
As to whether the CARLA tools would be useful for this, it's true
they were designed in order to transfer texts between two languages,
but in order to get to that step, you have to be able to parse each
language. Many people have set the tools up to parse one language,
and then simply stopped there, because that gave them a lot of value,
in terms of helping them understand the language, getting better
spell-checking than one can from a list, and creating interlinear text.
If you were to use the CARLA tools to parse your texts, the result
would be an SFM database that contained all the possible parses for
each word, with no limit on the number of parses. (The separate manual
disambiguator tool only takes that database and displays it in a
human-friendly way.) You could then convert that SFM database into
an XML format using some text-transforming tool. I don't know
whether the CARLA tools would be a better solution than
FLEx--FLEx is certainly a very good way to manage your lexical
database. There is also a (not-so-trivial) way to export a FLEx
dictionary so it can be used with CARLAstudio parsing. I think there
are documents about this that ship with FLEx.
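
As a rough idea of what such an SFM-to-XML conversion could look like, here is a small Python sketch using only the standard library. The markers are hypothetical: it assumes each word starts with a \w field and each parse sits on its own \a line, which is not necessarily how the actual AMPLE output is laid out, so the real file format would need to be checked first. The XML element names are likewise made up.

```python
import xml.etree.ElementTree as ET


def sfm_to_xml(sfm_text, record_marker="w", analysis_marker="a"):
    """Convert simple SFM text into an XML tree.

    Hypothetical layout: "\\w form" starts a word record and each
    following "\\a parse" line adds one analysis to that word.
    """
    root = ET.Element("text")
    current = None
    for line in sfm_text.splitlines():
        line = line.strip()
        if not line.startswith("\\"):
            continue  # skip blank lines and anything that is not a field
        marker, _, value = line[1:].partition(" ")
        value = value.strip()
        if marker == record_marker:
            current = ET.SubElement(root, "word", form=value)
        elif marker == analysis_marker and current is not None:
            ET.SubElement(current, "analysis").text = value
    return root


sample = "\\w can\n\\a be.able.Prs\n\\a container.Nom.Sg\n\\w walk\n\\a walk.Prs\n"
root = sfm_to_xml(sample)
print(ET.tostring(root, encoding="unicode"))
```

Every parse is kept, so an ambiguous word simply ends up with several analysis elements under it.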
But I do agree that it would be good if FLEx could have an export
option that would export all possible guesses.
-Beth