microformat for igt

Robert Forkel

Jul 28, 2009, 6:27:39 AM
to eltk
not sure whether this topic actually belongs here, but anyway: i'm in
the process of adding examples (mainly IGT) to http://wals.info/. the
data i'm faced with can, with reasonable effort, be massaged into some
kind of html, but getting something along the lines of GOLD would
probably - at least for now - be a lot more work. this got me thinking
whether an xhtml-based microformat [1] might be a useful thing to have
(plus a suitable reader for this format in eltk?). it could probably
be as simple as

<table class="IGT">
<tr class="phrase"><td class="morpheme"> ...
<tr class="gloss"><td> ...
<caption class="translation" xml:lang="en">
</table>
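
a rough sketch of what a reader for such a table could look like, in python with only the standard library. this is not eltk code: the function name read_igt, the returned dict layout and the sample data are made up for illustration, and it assumes the input is well-formed xml without an xhtml namespace declaration.

```python
import xml.etree.ElementTree as ET

# key under which ElementTree stores the xml:lang attribute
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# made-up, well-formed sample that follows the proposed class names
SAMPLE = """<table class="IGT">
<tr class="phrase"><td class="morpheme">hund</td><td class="morpheme">-e</td></tr>
<tr class="gloss"><td>dog</td><td>-PL</td></tr>
<caption class="translation" xml:lang="en">dogs</caption>
</table>"""


def read_igt(xhtml):
    """Parse one IGT table into a plain dict (hypothetical layout)."""
    table = ET.fromstring(xhtml)
    if table.get("class") != "IGT":
        raise ValueError("not an IGT table")

    def cells(row_class):
        # text of every <td> in the first row carrying the given class
        row = table.find("tr[@class='%s']" % row_class)
        if row is None:
            return []
        return ["".join(td.itertext()).strip() for td in row.findall("td")]

    caption = table.find("caption[@class='translation']")
    has_caption = caption is not None
    return {
        "morphemes": cells("phrase"),
        "glosses": cells("gloss"),
        "translation": "".join(caption.itertext()).strip() if has_caption else None,
        "lang": caption.get(XML_LANG) if has_caption else None,
    }


if __name__ == "__main__":
    print(read_igt(SAMPLE))
```

morphemes and glosses come back as parallel lists, so pairing them up is just a zip over the two.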

what do you think?
regards,
robert

[1] http://microformat.org/

scott farrar

Jul 29, 2009, 1:01:30 AM
to el...@googlegroups.com
Hi Robert

Yes, I think this is a good idea. Let me know, and my team will write the xhtml reader for you. Is there any way to know what the various authors mean by their glosses?


<tr class="gloss"><td> ...

My solution for now is to create a supplementary file, such as the following:

#This is the Leipzig Glossing Rules termset from the abbreviations found here:
#
#http://www.eva.mpg.de/lingua/resources/glossing-rules.php
#
#abbreviation, term name, gold entity, (optional comment)
#----------------------------------
1,       first person,  FirstPerson
2,      second person,  SecondPerson
3,       third person,  ThirdPerson
A,       agent, agent, agent-like argument of canonical transitive verb
ABL,     ablative, AblativeCase
ABS,     absolutive, AbsolutiveCase
ACC,     accusative, AccusativeCase
ADJ,     adjective, Adjectival
ADV,     adverbial, Adverbial
....


I then read this file into my RDF framework, and use it to process any new IGT, like your proposed XHTML.
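
A minimal sketch of reading such a file, in Python rather than the RDF framework mentioned above (which is not shown here). The names Term, parse_termset and annotate_gloss are hypothetical. It assumes one entry per line, '#' for comment lines, and commas separating the first three fields, with anything after the third comma treated as the optional comment. The annotate_gloss helper additionally assumes that glosses separate their parts with '-', '.' or '=', as in the Leipzig Glossing Rules.

```python
import re
from typing import Dict, List, NamedTuple, Optional, Tuple


class Term(NamedTuple):
    abbreviation: str
    name: str
    gold_entity: str
    comment: Optional[str]


# a few entries copied from the file above
SAMPLE = """#abbreviation, term name, gold entity, (optional comment)
#----------------------------------
1,       first person,  FirstPerson
A,       agent, agent, agent-like argument of canonical transitive verb
ABL,     ablative, AblativeCase
ACC,     accusative, AccusativeCase
....
"""


def parse_termset(text: str) -> Dict[str, Term]:
    """Map each abbreviation to its Term, skipping comments and blanks."""
    terms = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # split at most three times so commas inside the comment survive
        fields = [f.strip() for f in line.split(",", 3)]
        if len(fields) < 3:
            continue  # not a full entry, e.g. the "...." elision line
        comment = fields[3] if len(fields) > 3 else None
        terms[fields[0]] = Term(fields[0], fields[1], fields[2], comment)
    return terms


def annotate_gloss(gloss: str, terms: Dict[str, Term]) -> List[Tuple[str, Optional[str]]]:
    """Pair each part of a gloss with its GOLD entity, or None if unknown."""
    parts = [p for p in re.split(r"[-.=]", gloss) if p]
    return [(p, terms[p].gold_entity if p in terms else None) for p in parts]


if __name__ == "__main__":
    terms = parse_termset(SAMPLE)
    print(annotate_gloss("dog-ACC", terms))
```

Parts that are not in the termset (lexical glosses, or abbreviations an author made up) come back with None, so they can be collected and reviewed separately.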



scott

Robert Forkel

Jul 29, 2009, 2:12:03 AM
to el...@googlegroups.com
On Wed, Jul 29, 2009 at 7:01 AM, scott farrar<sofa...@gmail.com> wrote:
> Hi Robert
>
> Yes, I think this is a good. Let me know, and my team will write the xhtml
> reader for you. Is there any way to know what the various authors mean by
> their glosses?

as far as i have understood, the notation used in the glosses is just
that: notation - without much hope of a complete termset, in
particular when working with many different languages.

another reason i was thinking about xhtml as a stepping stone to GOLD
was that there is some HTML used within the gloss, e.g. <b>, <i>,
underlines and such. i'm not entirely sure, but i think not all of
this could be covered by unicode glyphs. in any case, it would be nice
if the reader could take care of this as well :)
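
one way a reader might handle this, sketched in python with the standard library: return each cell both as plain text and with its inline markup kept as a string. the helper names cell_text and cell_markup are made up, and this assumes the cell is well-formed xml.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def cell_text(cell):
    """Plain text of a cell, with inline tags such as <b> or <i> dropped."""
    return "".join(cell.itertext())


def cell_markup(cell):
    """Inner content of a cell with its inline markup preserved."""
    parts = [escape(cell.text or "")]
    for child in cell:
        # tostring() serializes the child element plus the text that follows it
        parts.append(ET.tostring(child, encoding="unicode"))
    return "".join(parts)


if __name__ == "__main__":
    # made-up gloss cell with bold and italic parts
    cell = ET.fromstring("<td>dog<b>-PL</b>.<i>x</i></td>")
    print(cell_text(cell))
    print(cell_markup(cell))
```

keeping both forms lets the plain text be matched against a termset while the markup stays available for display.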