Online Linguistic Database rough draft

Joel

Feb 27, 2010, 3:17:37 AM
to Online Linguistic Database
Hey All,

An incomplete version of the OLD is now online.

URL: http://jdunham.webfactional.com/

Currently you can view the Home, People, About and Help pages without
being logged in. Add Form and Search Form, however, require
authentication.

Email me if you want a username/password to explore the authentication-
required parts of the system.

The Add/Search File/Collection scripts have not been written, so
clicking on those links will give you a 404 Not Found error.

Feel free to experiment with the system, adding Forms, searching for
Forms, adding Speakers, etc. JUST DON'T ADD ANY REAL/SENSITIVE DATA
-- THIS IS JUST A DEMO.

Any feedback will be truly appreciated.

Thanks,
Joel

P.S. The formatting/visuals are definitely works in progress and can
be easily modified.

Joel

Mar 10, 2010, 1:46:11 AM
to Online Linguistic Database
Hey Everyone,

Some more news.

A. The OLD has been updated.

i. Those of you who found bugs when using Internet Explorer should
find them fixed now. I tested this with IE8 but I don't have IE7, so
if someone is using IE7, could you check that you can add Forms and
that the "+" buttons on the Add Form and Search Form pages work?
(The site does not work with IE6, but that, in my humble opinion, is
IE6's problem.)

ii. I have tested the OLD without issue in:
- Firefox 3.6 (on Mac)
- Firefox 3.6.2pre (a.k.a. "Namoroka", on Ubuntu)
- Safari 4.0.4 (on Mac)
- Chrome 5.0.307.11 beta (on Ubuntu)
- Opera 10.10 (on Ubuntu)
Please let me know if you find bugs on these or different browsers.

B. Katie Sardinha emailed me some comments which I think may
stimulate some discussion, so I'm going to quote her here ("""...""")
and reply to her comments. Don't hesitate to give your two cents.

"""
1. Regarding the tags which may be applied to various forms: when a
researcher is associating tags with forms, will they do this through a
drop-down menu or "check the box" process, so they can see all the
already-available tags? A "check the box" interface, where all the
already-available tags show up in a window and a user checks off the
ones he or she wants to associate with a specific form, seems
especially simple to use (obviously easier to use than entering tags
by hand for each form) but maybe more difficult to implement. What
system is the BLD using?
"""

Short answer. Both dropdown menus (a.k.a. select boxes) and checkboxes
are used. Dropdowns are used when only one option can be selected and
checkboxes when more than one can. Log in and go to
http://jdunham.webfactional.com/form/add to see.

Long answer. This brings up a fundamental decision that I made about
the OLD early on, namely that the OLD is a corpus and not a
dictionary. That is, it is fundamentally a database of elicitation
events and not a database of, say, words abstracted from the occasions
on which they were uttered. This means that each OLD Form represents
the information gathered at an elicitation event in which there was
one speaker, one elicitor, one date of elicitation and one elicitation
method. If researcher A elicits a word or sentence which is
identical in all linguistic respects (same form, morphology, category,
meaning) to one entered into the OLD by researcher B, then A must
decide for herself whether or not to add another Form that is
identical to B's linguistically but which might differ in terms of
properties of the event of elicitation. Usually it will be in A's
best interest to add the near-duplicate Form in order to accurately
document her elicitation or story or whatever. Of course, OLD Forms
can also be morphemes, and these are almost always abstractions
from events of their use. When entering morphemes, researchers should
avoid entering event-specific information like date of elicitation,
elicitor, speaker, etc. I think we will ultimately have to create a
feature where one can just click a button in order to view all the
Forms that contain a particular morpheme (or word).

So, different tags have different methods for association to Forms. A
Form can have zero or more keyword tags, so checkboxes are used. But
a Form (being a record of an event) can only have one speaker, one
elicitor, one elicitation method and one category, so HTML select boxes
are used.
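
To make the one-versus-many distinction concrete, here is a minimal
sketch in Python. It is not the OLD's actual code: the class name, the
field names and the widget_for helper are all made up for
illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Form:
    """One elicitation event. All field names are illustrative only."""
    transcription: str
    # Single-valued properties: at most one value, so a select box fits.
    speaker: Optional[str] = None
    elicitor: Optional[str] = None
    elicitation_method: Optional[str] = None
    category: Optional[str] = None
    # Multi-valued property: zero or more values, so checkboxes fit.
    keywords: List[str] = field(default_factory=list)


def widget_for(field_name: str) -> str:
    """Pick the input widget for a tag field (hypothetical helper)."""
    return "checkboxes" if field_name == "keywords" else "select"
```

With this shape, widget_for("keywords") gives "checkboxes" and
widget_for("speaker") gives "select", which mirrors what you see on
the Add Form page.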


"""
2. Another point regarding the tagging system: Last summer I worked
on a tagging system for an online document library concerning Inuit
language materials. One of the major difficulties to deal with was
how to make tags that were not too broad, the problem being that some
tags (e.g. "revitalization") could be applied to most of the documents
in the library and thus weren't very informative, while others were
very specific and perhaps "too" informative. We decided to design the
system using nested categories (where lower-level tags bear an "is a
kind of" relationship to higher-level tags). For example, if the
following tags were available:

Community Programs
a. Language Nest
b. Master-Apprentice
c. On the Land

we could tag a given document about Master-Apprentice programs with
the tag "Master-Apprentice", and the higher-level tag "Community
Programs" would automatically become associated with the form as well.
If the document was about community language programs more generally,
we could choose to just use only the higher-level tag. This format
had its advantages, such as making it easy to associate higher-level
tags (which may associate with a large number of forms). It had its
disadvantages too - some terms seemed classifiable under multiple
headings, and I think the system might be harder to change.

I wanted to bring this model up just to get thinking about it (and
alternatives!) in the context of working on an informative tagging
system for the OLD. I'd be interested in helping out with this stuff,
I find it pretty interesting.
"""

The idea of a hierarchy of tags is intriguing, though it may be more
trouble to implement than it is worth. I'd like to know whether other
people would find this a useful feature and also some concrete
elicitation-documentation examples to illustrate.
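
For the sake of discussion, the behaviour Katie describes (choosing a
lower-level tag automatically brings in its higher-level tag) could be
sketched roughly as below. This is only an illustration in Python, not
something the OLD does: the PARENT mapping and expand_tags are
hypothetical names, and the tag names are the ones from Katie's
example.

```python
from typing import Dict, Optional, Set

# Child tag -> parent tag (None marks a top-level tag).
PARENT: Dict[str, Optional[str]] = {
    "Community Programs": None,
    "Language Nest": "Community Programs",
    "Master-Apprentice": "Community Programs",
    "On the Land": "Community Programs",
}


def expand_tags(chosen: Set[str]) -> Set[str]:
    """Return the chosen tags plus every ancestor of each chosen tag."""
    expanded: Set[str] = set()
    for tag in chosen:
        current: Optional[str] = tag
        # Walk up the hierarchy until the top or an already-added tag.
        while current is not None and current not in expanded:
            expanded.add(current)
            current = PARENT.get(current)
    return expanded
```

Here expand_tags({"Master-Apprentice"}) also yields "Community
Programs", while expand_tags({"Community Programs"}) yields only
itself. Note that a single-parent mapping like this cannot express a
tag that belongs under several headings, which is one of the
disadvantages Katie mentions.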

In my experience with the Blackfoot Language Database (BLD), the
keyword tags (which are basically general-purpose tags) were not
extensively used. To a certain extent the functionality of a tag
hierarchy can be achieved with disjunctive searches. For example, in
the BLD we had morpho-syntactic category tags for different types of
verbs (vai, vii, vta, vti, and vrt). If I wanted to search for a
pattern in the verbs, I could just use a disjunctive regular expression
search, "(vai|vii|vti|vta|vrt)", or the "any of x, y, or z" search
restrictor functionality currently built into the OLD.
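
As a rough illustration of the disjunctive search, here it is in plain
Python rather than through the OLD's search page. The (ID, category)
pairs are invented sample data, and "n" and "adv" are just placeholder
non-verb tags.

```python
import re

# The disjunctive pattern from above: any one of the verb categories.
VERB_CATEGORY = re.compile(r"(vai|vii|vti|vta|vrt)")

# Hypothetical (ID, category) pairs standing in for stored Forms.
forms = [(1, "vai"), (2, "n"), (3, "vta"), (4, "vrt"), (5, "adv")]

# fullmatch requires the whole category string to be one of the options.
verb_ids = [
    form_id
    for form_id, category in forms
    if VERB_CATEGORY.fullmatch(category)
]
```

After running this, verb_ids holds the IDs 1, 3 and 4. Testing
membership in the set {"vai", "vii", "vti", "vta", "vrt"} would give
the same result, which is roughly what the "any of x, y, or z"
restrictor amounts to.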

I'd like to hear whether others think the ability to organize tags
into (arbitrarily complex?) hierarchical structures would be a useful
feature for the OLD, and whether such a feature would provide
functionality beyond what sophisticated search options could provide.


"""
3. It is not obvious to me what the "ID" field refers to (i.e. the ID
field located on the "Tags" page under Keywords, Syntactic Categories,
etc.)? Will it be necessary for people working in the OLD to know and
use this field, or is it more for administrative/organizational
purposes?
"""

Every entity in the OLD has an integer ID that is unique within its
class. I.e., every Form has an ID that it shares with no other Form.
Users cannot change the ID. This means you can change every detail of
a Form and the system will still recognize it as an updated version of
its previous self. You can see this in action by updating a Form and
then clicking on the "History" button beneath it: the system will show
you all previous versions in reverse chronological order.

The ID of a particular entity has further uses. You can use it to
unambiguously refer to a Form (or File, or Collection, or User, or
Keyword, etc.). You can also use it to quickly view a particular
entity using the address bar of your browser. For example, entering
http://jdunham.webfactional.com/form/view/2 will give you the Form
with ID=2.
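
If it helps to picture what the ID buys you, here is a small sketch in
Python. It is not how the OLD is implemented: VersionedForm and its
methods are hypothetical, and only the /form/view/<ID> address pattern
is taken from the example above.

```python
from dataclasses import dataclass, field
from typing import Dict, List

BASE_URL = "http://jdunham.webfactional.com"


@dataclass
class VersionedForm:
    """A Form whose ID stays fixed while its content changes."""
    id: int
    data: Dict[str, str]
    _previous: List[Dict[str, str]] = field(default_factory=list)

    def update(self, **changes: str) -> None:
        """Archive the current version, then apply the changes."""
        self._previous.append(dict(self.data))
        self.data.update(changes)

    def history(self) -> List[Dict[str, str]]:
        """Previous versions, most recent first."""
        return list(reversed(self._previous))

    def view_url(self) -> str:
        """Address-bar URL for viewing this Form by its ID."""
        return f"{BASE_URL}/form/view/{self.id}"
```

However many times update is called, the id never changes, so
history() can always line up the older versions behind the current
one, and view_url() for the Form with ID 2 gives
http://jdunham.webfactional.com/form/view/2.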


"""
4. I think this may already be a feature of the BLD - but I think it's
important for people working in the OLD to be able to tag forms they
find personally relevant, without these tags being visible to other
OLD users.
"""

I'm not sure I agree that this is a good feature. But maybe I am just
daunted by the prospect of implementing it. This would involve making
tags user-dependently invisible on the Tags page as well as on the
Search Forms page and the Form View pages.
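
Just to show the kind of check that would be involved, here is a
minimal sketch in Python. It assumes each tag records an optional
owner; the Tag class, the private_to field and visible_tags are all
hypothetical, and nothing like this exists in the OLD at present.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Tag:
    name: str
    # None means visible to everyone; otherwise only this user sees it.
    private_to: Optional[str] = None


def visible_tags(tags: List[Tag], username: str) -> List[Tag]:
    """Return the tags that the given user is allowed to see."""
    return [
        tag for tag in tags
        if tag.private_to is None or tag.private_to == username
    ]
```

The filter itself is simple. The work is in applying it consistently
on every page that lists, searches or displays tags.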

Besides the implementation challenge, I don't really see the purpose
of this in a collaborative environment. As it stands, only OLD
Administrators and Researchers can add, view or search for Forms.
When Learner (another type of OLD User) features are implemented, a
restricted view of Form data will definitely be implemented and this
could involve making all or certain subsets of various tags
invisible. But making Researcher A's tags invisible to all but A -- I
don't really see the point. If somebody feels strongly about this, I
invite you to make a case for it.


"""
1) When people are adding forms, they may at that time realize that
they need to add a new "Keyword", "Elicitation Method", or "Source"
which is relevant to their form but doesn't already exist (presumably
this will be more of a problem in the early stages of the OLD). They
would have three options at this point: a) they could go back to the
page where they could add new Keywords, Elicitation Methods, or
Sources and add them. Unfortunately this would disrupt the form they
were trying to add (assuming they would at least have to refresh the
page they were working on to have the addition show up). One way to
make this work would be to allow people to "save their progress" while
in the process of adding a new form (or collection, etc.). b) They
could finish filling in the form and after it's saved, go and add the
new Keyword, Elicitation Method, or Source. They would then have to
go back and edit the already-saved form so as to associate it with the
new addition. This would work but would be time-consuming. c) People
could add new Keywords, Elicitation Methods, or Sources directly from
the page where they add forms, collections, etc. The problem I can
see with this option is that it could lead to an inflation of new
keywords which are very specific (or who knows, it may not). It might
be nice to allow people to add new "Sources" in this way, however, as
there can never be too many of those.
"""

This is a good observation. My recommendation is that in our
documentation for users we describe the following procedure for
solving this problem:

i. Add the Form without the as-yet-non-existent tag
ii. Open the Tags page in a new browser tab (right-click or hold
CTRL and left-click) and add the tag you need
iii. Go back to your original browser tab, click the "Update"
button under the Form and add the tag

The great thing about a web app is that tabbed browsing lets you have
as many OLD windows open as you like. A new tab retains all the
"state": you will still be logged in, and using tab x to interact with
the system will not affect tab y. In fact, I often have multiple OLD
tabs open when adding Forms. This is useful in situations where you
want to cross-reference one Form from another via its ID.

In short, I think the OLD + procedure (i-iii) is more than adequate to
handle the challenge raised.

"""
In regards to the Categories (under Secondary Data): it might be
helpful to have a visual way for people to reference what they mean in
case they forget or there is ambiguity (i.e. what does the category
'D' stand for again? Determiner? Demonstrative?). Perhaps there
could be a link next to this field that would open a new tab in the
browser where the Categories would be displayed next to their meanings
(so that the current "add form" page wouldn't be disrupted).
"""

This is probably a good idea. At the very least, the names of the
(syntactic) categories could be links to specific locations on the Tags
page and then users could open up the page in a new tab. Or I could
put the tag's description in the "title" attribute of the
tag-name-as-link -- this results in the effect you see when you let
the mouse hover over one of the menu buttons. With the tags in select
boxes, the above options aren't available, so a link that opens a small
"window" for referencing tags and their descriptions might be the way
to go.
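
For the places where a category name is shown as a link, the "title"
idea could look something like the sketch below (Python generating the
HTML). The /tags address, the #category-... anchor format and the
category_link helper are assumptions made for illustration, not the
OLD's real URLs or code.

```python
from html import escape


def category_link(name: str, description: str,
                  tags_page: str = "/tags") -> str:
    """Build a link to a category's entry on the Tags page.

    The description goes in the title attribute, so it appears as a
    tooltip on hover; target="_blank" opens the Tags page in a new
    tab so the Add Form page is left undisturbed.
    """
    href = f"{tags_page}#category-{name}"
    return (
        f'<a href="{escape(href, quote=True)}" '
        f'title="{escape(description, quote=True)}" target="_blank">'
        f"{escape(name)}</a>"
    )
```

For example, category_link("D", "Determiner") produces a link whose
text is D, whose tooltip reads Determiner and whose address is
/tags#category-D.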


OK, great comments, Katie. Thanks so much. Please add more comments
and we'll keep this discussion going!

Joel
