scott farrar
Aug 26, 2009, 1:14:58 PM
to el...@googlegroups.com
Nothing like a little vapor-ware planning; reactions welcome. Think Django+extjs+ELTK. The topics are listed below:
1. Design issues
2. Common themes
3. Use cases
1. Design issues
=========
Data sets live in the cloud, both as RDF files (linked data) hosted remotely (perhaps on the user's site) and in DBs on the server(s) for access efficiency.
Include a strong Web services component, though this is perhaps orthogonal to the actual Web apps.
2. Common themes
=========
A common theme across all use cases is the ability to hover over, click on and drag/drop GOLD and COPE categories.
Another common theme is to display citation and data provenance (only for legacy data), e.g., like in ODIN, where the original PDF is always retrievable.
Pull-down menus should be used for categories with many members.
3. Use Cases
========
Use Case (Creation):
-------------
Various sorts of linguistic data can be created using an editor, something like Google Docs, where multiple users can edit the same data. The editor should have an upload function that uses ELTK's readers, e.g., for uploading LIFT XML or CSV files for sign/word lists. Users should be able to create simple text files offline and then upload them when they have a net connection.
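To make the upload idea concrete, here is a rough sketch of parsing an uploaded CSV word list. It uses only Python's standard csv module and is not ELTK's reader API; the function name read_wordlist and the column names "form" and "gloss" are assumptions made up for illustration.

```python
import csv
import io


def read_wordlist(csv_text):
    """Parse a CSV word list with 'form' and 'gloss' columns (hypothetical layout)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    entries = []
    for row in reader:
        form = (row.get("form") or "").strip()
        gloss = (row.get("gloss") or "").strip()
        if not form:
            continue  # skip rows that have no word form
        entries.append({"form": form, "gloss": gloss})
    return entries


# A file the user might have typed up offline and uploaded later.
sample = "form,gloss\nkitabu,book\nmtu,person\n"
wordlist = read_wordlist(sample)
```

A real uploader would presumably dispatch on file type (LIFT XML, CSV, plain text) and hand off to the matching reader.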
Fundamental data types should be enumerated and edited in a common way: lexicon, paradigm, wordlist, phonetic inventory, IGT, termset, COPE, etc.
Use Case (Visualization):
---------------
Data can be viewed along a number of dimensions. E.g., a lexicon can be viewed according to various sort orders (by stem, by root, by feature); IGT can be viewed as a lexicon.
Data sets can be viewed for basic stats: termsets used, theory used, number of individual data items contained in the set.
A list of the docs the user owns should be displayed (like the Google Docs main menu).
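As a sketch of the "same lexicon, different sort order" idea: the entry layout (stem, root, features) and the names SORT_KEYS and view_lexicon are invented for this example and do not reflect any existing ELTK data model.

```python
# Hypothetical lexicon entries; the field names are assumptions.
lexicon = [
    {"stem": "walked", "root": "walk", "features": ("V", "PST")},
    {"stem": "books", "root": "book", "features": ("N", "PL")},
    {"stem": "ate", "root": "eat", "features": ("V", "PST")},
]

# One key function per supported view.
SORT_KEYS = {
    "stem": lambda entry: entry["stem"],
    "root": lambda entry: entry["root"],
    "feature": lambda entry: entry["features"],
}


def view_lexicon(entries, by="stem"):
    """Return a new list of entries sorted for the requested view."""
    return sorted(entries, key=SORT_KEYS[by])


by_root = [e["stem"] for e in view_lexicon(lexicon, by="root")]
by_feature = [e["stem"] for e in view_lexicon(lexicon, by="feature")]
```

Because sorted() is stable, entries that share a feature bundle keep their original relative order, which may or may not be what a user expects from a "by feature" view.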
Use Case (Search/query):
---------------
Search should not only be by string, but also by (1) concept, (2) term used for a particular concept, and (3) example ("show me more stuff like X").
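A toy sketch of the difference between searching by concept and searching by term. The TERM_TO_CONCEPT table is hard-coded here purely for illustration; in the real thing the mapping would presumably come from the ontology and the user's termset. The record layout and function names are likewise assumptions.

```python
# Illustrative mapping from gloss terms to concept labels (assumed, not loaded from GOLD).
TERM_TO_CONCEPT = {
    "PST": "PastTense",
    "past": "PastTense",
    "NOM": "NominativeCase",
}

# Hypothetical annotated records: each gloss is a list of gloss terms.
records = [
    {"id": 1, "gloss": ["walk", "PST"]},
    {"id": 2, "gloss": ["eat", "past"]},
    {"id": 3, "gloss": ["dog", "NOM"]},
]


def search_by_concept(recs, concept):
    """Find records that use any term mapped to the given concept."""
    return [
        r for r in recs
        if any(TERM_TO_CONCEPT.get(term) == concept for term in r["gloss"])
    ]


def search_by_term(recs, term):
    """Find records that use exactly this term."""
    return [r for r in recs if term in r["gloss"]]


past_ids = [r["id"] for r in search_by_concept(records, "PastTense")]
pst_ids = [r["id"] for r in search_by_term(records, "PST")]
```

The concept search finds both 'PST' and 'past' records, while the term search only finds records that use that exact label.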
Use Case (Manipulation):
---------------
Users are able to process data, e.g., by sorting it, changing annotation terminology (e.g., globally replacing 'PST' with 'past'), and validating it (see next use case).
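The global terminology change could look roughly like this. The sketch assumes glosses are plain strings whose morphemes are separated by '-', '.', '=' or whitespace; the function name replace_term is made up.

```python
import re

# Morpheme/feature delimiters assumed for this sketch; the capture group
# makes re.split keep the delimiters so the gloss can be reassembled.
DELIMITERS = re.compile(r"([-.=\s])")


def replace_term(gloss, old, new):
    """Replace whole gloss terms only, leaving longer strings that contain `old` alone."""
    parts = DELIMITERS.split(gloss)
    return "".join(new if part == old else part for part in parts)


updated = replace_term("3SG.PST go-PST", "PST", "past")
```

Matching whole terms rather than substrings avoids rewriting a label like 'PSTX' by accident, which a naive string replace would do.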
Use Case (Validation):
--------------
Users can check whether uploaded files are syntactically well-formed, but also whether the data are semantically valid (according to the ontology). So, for instance, it could be highlighted as a potential error if ACC and NOM are marked on the same morpheme.
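A minimal sketch of the ACC/NOM check. It assumes Leipzig-style glosses (words separated by spaces, morphemes by '-', stacked features by '.') and hard-codes the set of case terms; a real validator would presumably get the mutually exclusive categories from the ontology. The names CASE_TERMS and find_case_conflicts are hypothetical.

```python
# Case labels treated as mutually exclusive in this sketch (assumed, not read from GOLD).
CASE_TERMS = {"NOM", "ACC"}


def find_case_conflicts(gloss_line):
    """Return (morpheme, clashing_terms) pairs for morphemes marked with more than one case."""
    conflicts = []
    for word in gloss_line.split():
        for morpheme in word.split("-"):
            tags = set(morpheme.split("."))
            clash = tags & CASE_TERMS
            if len(clash) > 1:
                conflicts.append((morpheme, sorted(clash)))
    return conflicts


flagged = find_case_conflicts("dog-NOM.ACC see-PST")
clean = find_case_conflicts("dog-NOM cat-ACC see-PST")
```

The result is meant to be shown as a highlight rather than a hard failure, since the post describes these as potential errors.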
Validation includes spell checking of glosses.