Interlinear Gloss Text in LIFT


Svetlana Tchistiakova

Jun 7, 2012, 1:52:59 PM
to LexiconInterchangeFormat, br...@linguistlist.org
Dear LIFTers,

The LEGO project, which uses a restricted version of LIFT that still
validates against the official LIFT schema, has recently encountered
the issue of encoding interlinear gloss text (IGT). I was wondering if
anyone had come up with a way to do this consistently, given the
current version of LIFT. Having looked at the existing structure of
LIFT, we were unable to find a set of elements within the LIFT
framework that would accurately encode this type of information, but
we are open to any suggestions.

If not, there is a paper by Bow, Baden, and Bird (2003) of E-MELD,
proposing an XML standard: http://emeld.org/workshop/2003/bowbadenbird-paper.html
Are there plans to implement something comparable in LIFT, or is
something already in the works that you would recommend we
implement?

To provide an example, here is a portion of the original XML from a
trilingual (native language-English-French) lexicon with which the
LEGO project is currently working. The elements included here are
<lx/>, which holds the native-language entry word; <t/>, which holds
tonal information; <c/>, which holds grammatical information; <d/> and
<dfr/>, which hold the English and French definitions of the word
respectively; and <eGroup/>, which holds an example of usage in the
native language with translation into English and French.

The IGT itself is housed in <lGroup/> with <l/> holding the native
language morphemic breakdown, <lg/> holding the corresponding
morphemic gloss in English, and <lgfr/> holding the corresponding
morphemic gloss in French.

<lxGroup>
  <lx>a´la´ko´</lx>
  <t>H(L)</t>
  <cGroup>
    <c>Conj</c>
    <dGroup>
      <d>so long as; let's hope</d>
      <dfr>pourvu que; en autant que</dfr>
      <eGroup>
        <e>I´ i´ se´ ta´ala`, a´la´ko´ i´ i´ ya´a´ki´i´ to´ i´fa´ k?´?´ni`n?o´ l?`.</e>
        <g>You can go, so long as you take good care of your old father.</g>
        <gfr>Tu peux partir, pourvu que tu continues de prendre bon soin de ton vieux père.</gfr>
      </eGroup>
      <lGroup>
        <l>a´la´-ko´</l>
        <lg>God-say</lg>
        <lgfr>Dieu-dire</lgfr>
      </lGroup>
    </dGroup>
  </cGroup>
</lxGroup>
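
For illustration only, here is a minimal Python sketch of how the three tiers in <lGroup/> line up morpheme by morpheme. It is not part of any LEGO or LIFT tooling; the function name align_morphemes is hypothetical, and it assumes that hyphens in <l/>, <lg/>, and <lgfr/> mark morpheme boundaries and nothing else.

```python
import xml.etree.ElementTree as ET

# The <lGroup/> from the sample entry above.
L_GROUP = """<lGroup>
<l>a´la´-ko´</l>
<lg>God-say</lg>
<lgfr>Dieu-dire</lgfr>
</lGroup>"""


def align_morphemes(l_group_xml):
    """Split each tier on hyphens and pair the pieces position by position.

    Assumption: a hyphen always marks a morpheme boundary. Raises
    ValueError if the tiers do not contain the same number of morphemes.
    """
    group = ET.fromstring(l_group_xml)
    tiers = {}
    for tag in ("l", "lg", "lgfr"):
        tiers[tag] = group.findtext(tag, default="").strip().split("-")

    lengths = {len(parts) for parts in tiers.values()}
    if len(lengths) != 1:
        raise ValueError("tiers have different morpheme counts: %r" % tiers)

    # One (vernacular, English gloss, French gloss) tuple per morpheme.
    return list(zip(tiers["l"], tiers["lg"], tiers["lgfr"]))


triples = align_morphemes(L_GROUP)
for vernacular, english, french in triples:
    print(vernacular, english, french)
```

The length check matters because the flat format carries the alignment only implicitly: a gloss that itself contains a hyphen would silently shift every later morpheme.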


Under Bow, Baden, and Bird's proposal, with the addition of the
current LIFT usage of <form/>, the above example would look as
follows:

<interlinear-text>
  <item type="title">
    <form lang="eng">
      <text>morphemic representation</text>
    </form>
  </item>
  <phrases>
    <phrase>
      <item type="gls">
        <form lang="eng">
          <text>so long as; let's hope</text>
        </form>
      </item>
      <item type="gls">
        <form lang="fra">
          <text>pourvu que; en autant que</text>
        </form>
      </item>
      <words>
        <word>
          <item type="txt">
            <form lang="xxx">
              <text>a´la´ko´</text>
            </form>
          </item>
          <morphemes>
            <morph>
              <item type="txt">
                <form lang="xxx">
                  <text>a´la´</text>
                </form>
              </item>
              <item type="gls">
                <form lang="eng">
                  <text>God</text>
                </form>
              </item>
              <item type="gls">
                <form lang="fra">
                  <text>Dieu</text>
                </form>
              </item>
            </morph>
            <morph>
              <item type="txt">
                <form lang="xxx">
                  <text>ko´</text>
                </form>
              </item>
              <item type="gls">
                <form lang="eng">
                  <text>say</text>
                </form>
              </item>
              <item type="gls">
                <form lang="fra">
                  <text>dire</text>
                </form>
              </item>
            </morph>
          </morphemes>
        </word>
      </words>
    </phrase>
  </phrases>
</interlinear-text>
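
As a rough sketch of how the <morphemes/> portion of that structure could be generated, the Python below builds it with the standard library's xml.etree.ElementTree. The helper names make_item and build_morphemes are hypothetical, the morpheme triples are typed in by hand from the example, and "xxx" is kept as the placeholder vernacular language code used above. This is only one possible way to produce the markup, not an existing LIFT or FLEx API.

```python
import xml.etree.ElementTree as ET


def make_item(parent, item_type, lang, value):
    """Append <item type=...><form lang=...><text>value</text></form></item>."""
    item = ET.SubElement(parent, "item", {"type": item_type})
    form = ET.SubElement(item, "form", {"lang": lang})
    ET.SubElement(form, "text").text = value
    return item


def build_morphemes(triples, vernacular_lang="xxx"):
    """Build a <morphemes/> element from (text, English, French) triples."""
    morphemes = ET.Element("morphemes")
    for txt, eng, fra in triples:
        morph = ET.SubElement(morphemes, "morph")
        make_item(morph, "txt", vernacular_lang, txt)
        make_item(morph, "gls", "eng", eng)
        make_item(morph, "gls", "fra", fra)
    return morphemes


# Morphemes of the sample headword, copied by hand from the example.
morphemes = build_morphemes([("a´la´", "God", "Dieu"), ("ko´", "say", "dire")])
print(ET.tostring(morphemes, encoding="unicode"))
```

Each <morph/> keeps its text and both glosses together, so the alignment that the flat <lGroup/> only implies through hyphen position becomes explicit in the nesting.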

Thank you ahead of time for your advice!

Svetlana Tchistiakova
The LEGO Project
LINGUIST List

John Hatton

Jun 8, 2012, 12:17:04 AM
to lexiconinter...@googlegroups.com, br...@linguistlist.org
Hi Svetlana,
FieldWorks Language Explorer (FLEx) has implemented something based on that
paper, and FLEx imports and exports it, and SayMore exports it (from
SayMore's native re-use of a restricted ELAN format), while XLingPaper
imports it.

I don't know how much documentation the FLEx development team has done on
exactly what they've implemented. At my suggestion that ".xml" wasn't
helpful to users, they helpfully now use the extension ".FLExText", though
I'm sure we'd all welcome a product-neutral name for the format.

From my perspective (I'm no longer part of the FLEx team, so I don't speak
for them), it would be great if the LEGO team worked with the FLEx team to
further that format.


John Hatton
SIL International Language Software Development, PALASO, and SIL Papua New
Guinea



Svetlana Tchistiakova

Jun 11, 2012, 10:26:42 AM
to LexiconInterchangeFormat
Hi John (and other LIFTers),

Thanks so much for your response! I took a look at FLEx and it looks
like when IGT is exported on its own, FLEx exports it into a format
based on the Bow, Baden, Bird proposal. When exporting the lexicon
into LIFT, it looks like FLEx creates separate <entry/> elements for
each root/affix of the IGT in order to be able to conform to valid
LIFT. Unfortunately, this method would contradict the original
authorial intent of the trilingual lexicon we're working on, so we'd
prefer to find an alternative method.

Does anyone know if there are plans to implement IGT within the
headword's entry in LIFT using either the Bow, Baden, Bird proposal or
a different method? If there are no plans to implement this now, have
others included IGT within LIFT <entry/> elements in a different
manner?

Best,

Svetlana