reviving discussions


Nick Thieberger

Sep 17, 2013, 2:27:52 AM
to interlinear-xml, igt-...@googlegroups.com
Dear List members,

I am writing to let you know about several initiatives that are of interest in the production, representation and linking of interlinear glossed text corpora. I also want to ask if you know of any other advances that could lead to a standard format for IGT.

First, I am running a conference in early December in Melbourne (http://paradisec.org.au/2013ParadisecToolsMethods.html) with a workshop at which formats for IGT will be discussed. 

I hope that we will arrive at a draft standard format for IGT, which could be an XML schema that converts readily to an annotation graf format. Thomas Schmidt's Exmaralda already does a good job of converting between various formats.

Recent initiatives I have learned about include the following:

Alexandre Arkhipov's group in Moscow have been working on various ways of converting from Flex to other formats.

TypeCraft is a multi-lingual on-line database of linguistically-annotated natural language text, embedded in a collaboration and information tool.

POIO (http://media.cidles.eu/poio/)
Peter Bouda's work. The library's parser creates an annotation graph from the input files, which the user can then query via the graf-python API.

Jan Strunk has also developed a system for processing Toolbox files using ToolboxSearch by Taras Zakharko (https://bitbucket.org/tzakharko/toolboxsearch/) and his own Python Toolbox library. For ELAN he uses a Python ELAN API that he has written himself.

EOPAS (http://www.eopas.org)
EOPAS is a player for IGT with media. It uploads and validates XML from Toolbox or ELAN, and includes a transcoder for uploaded media.

All the best,

Nick

***********************
Nick Thieberger PhD
ARC QEII Fellow
School of Languages and Linguistics
The University of Melbourne
Parkville, VIC 3010, Australia
+61 3 8344 8952
http://languages-linguistics.unimelb.edu.au/thieberger/

The Oxford Handbook of Linguistic Fieldwork
Edited by Nicholas Thieberger Available now through all good bookshops, or direct from Oxford University Press at: http://ukcatalogue.oup.com/product/9780199571888.do

http://www.linguistics.unimelb.edu.au/research/projects/greatthings.html

Director, Pacific and Regional Archive for Digital Sources in Endangered Cultures (PARADISEC) http://paradisec.org.au

Co-Director, Resource Network for Linguistic Diversity
http://www.rnld.org

Editor, Language Documentation & Conservation Journal  
http://www.nflrc.hawaii.edu/ldc/
*********************************

Sebastian Nordhoff

Sep 19, 2013, 4:31:30 AM
to interli...@googlegroups.com
Hi Nick,
very interesting. Is there some kind of wiki or similar where information
about these efforts, their strengths and differences can be stored? Should
we create one?
Best
Sebastian

On Tue, 17 Sep 2013 08:27:52 +0200, Nick Thieberger <th...@unimelb.edu.au>
wrote:
> ToolboxSearch by Taras Zakharko (https://bitbucket.org/tzakharko/toolboxsearch/)

Nick Thieberger

Mar 3, 2016, 2:05:46 PM
to interlinear-xml, igt-...@googlegroups.com

Hi,

I'm looking for feedback from your collective wisdom. I am approaching the issue of IGT from the perspective of archival storage of the files and of representation of the IGT in some viewer (of which EOPAS is a current example). I intend to store files in the PARADISEC collection in an IGT format (with a new datatype .ixt) that will then trigger a viewer to present the text and media together in the way that EOPAS has done until now. So the time is right for determining what format the files will be in. I appreciate all the discussion about formats that reflect a deeper structural relationship, but am thinking XML will be the way to store these files. 

Nick

Han Sloetjes

Mar 8, 2016, 6:08:23 AM
to interli...@googlegroups.com, igt-...@googlegroups.com
Hi Nick,

I think it would be good to come to a decision for such a format and a .ixt datatype sounds fine to me.
This discussion took place quite a while ago, and I would have to dig up old conversations to refresh my memory. But maybe the December 2013 conference resulted in a draft standard format? Or maybe there is an even more recent draft? I don't remember seeing such a draft, but if there is one it would be helpful to post it here (again).

Or maybe you can briefly summarize/list the most important open issues that need a decision in order to come to a definition of a (proposed standard) IGT format?

Regards,
Han
--
You received this message because you are subscribed to the Google Groups "Interlinear XML" group.
To unsubscribe from this group and stop receiving emails from it, send an email to interlinear-x...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Nick Thieberger

Nov 8, 2016, 4:44:57 PM
to interlinear-xml, igt-...@googlegroups.com
We have developed a viewer for IGT in the PARADISEC collection. It detects media and IGT files and displays them together in an EOPAS-like viewer. You can see an example here (if you are logged in to the PARADISEC catalog): http://catalog.paradisec.org.au/viewer/#/NT1/98007. Click on audio and then on interlinear.

The viewer is written in JavaScript and is available on GitHub (https://github.com/MLR-au/pdsc-collection-viewer/).

Nick
