Several questions about range and field

TrnsltLife

Apr 15, 2009, 9:25:20 PM
to LexiconInterchangeFormat
I'm working on a script to convert the data from WordNet 3.0
(wordnet.princeton.edu) into LIFT format. The basics are coming along
fine. I've got my entries, senses, definitions, and relations between
words working. I can load the file into Lexique Pro 3.0 (beta) and
it works great even with its 156,563 words. I can use the lexical and
semantic relation links to jump from one word to another in the blink
of an eye, which is great. Here's a sample file with just the word
"tan":

<?xml version="1.0"?>
<lift version="0.12">
<entry id="eng:wn3:tan:n">
<lexical-unit><form lang="eng"><text>tan</text></form></lexical-unit>
<sense id="eng:wn3:tan:n:1">
<grammatical-info value="noun"/>
<definition><form lang="eng"><text>a browning of the skin resulting
from exposure to the rays of the sun</text></form></definition>
<relation type="synonym" ref="eng:wn3:suntan:n:1"/>
<relation type="synonym" ref="eng:wn3:sunburn:n:1"/>
<relation type="synonym" ref="eng:wn3:burn:n:2"/>
<relation type="derivationally-related" ref="eng:wn3:tan:v:2"/>
<relation type="hypernym" ref="eng:wn3:hyperpigmentation:n:1"/>
</sense>
<sense id="eng:wn3:tan:n:2">
<grammatical-info value="noun"/>
<definition><form lang="eng"><text>a light brown the color of topaz</text></form></definition>
<relation type="synonym" ref="eng:wn3:topaz:n:3"/>
<relation type="hypernym" ref="eng:wn3:light brown:n:1"/>
</sense>
<sense id="eng:wn3:tan:n:3">
<grammatical-info value="noun"/>
<definition><form lang="eng"><text>ratio of the opposite to the
adjacent side of a right-angled triangle</text></form></definition>
<relation type="synonym" ref="eng:wn3:tangent:n:2"/>
<relation type="hypernym" ref="eng:wn3:trigonometric function:n:1"/>
</sense>
</entry>
<entry id="eng:wn3:tan:v">
<lexical-unit><form lang="eng"><text>tan</text></form></lexical-unit>
<sense id="eng:wn3:tan:v:1">
<grammatical-info value="verb"/>
<definition><form lang="eng"><text>treat skins and hides with tannic
acid so as to convert them into leather</text></form></definition>
<relation type="derivationally-related" ref="eng:wn3:tanner:n:2"/>
<relation type="derivationally-related" ref="eng:wn3:tanning:n:3"/>
<relation type="derivationally-related" ref="eng:wn3:tannery:n:1"/>
<relation type="hypernym" ref="eng:wn3:convert:v:2"/>
<relation type="hyponym" ref="eng:wn3:bark:v:5"/>
</sense>
<sense id="eng:wn3:tan:v:2">
<grammatical-info value="verb"/>
<definition><form lang="eng"><text>get a tan, from wind or sun</text></form></definition>
<relation type="synonym" ref="eng:wn3:bronze:v:2"/>
<relation type="derivationally-related" ref="eng:wn3:tan:n:1"/>
<relation type="derivationally-related" ref="eng:wn3:tanning:n:1"/>
<relation type="hypernym" ref="eng:wn3:discolor:v:3"/>
<relation type="hyponym" ref="eng:wn3:suntan:v:1"/>
</sense>
</entry>
<entry id="eng:wn3:tan:a">
<lexical-unit><form lang="eng"><text>tan</text></form></lexical-unit>
<sense id="eng:wn3:tan:a:1">
<grammatical-info value="adjective"/>
<definition><form lang="eng"><text>of a light yellowish-brown color</text></form></definition>
<relation type="similar-to" ref="eng:wn3:chromatic:a:3"/>
</sense>
</entry>
</lift>



However, there is some data in WordNet that I don't yet know how to
represent in LIFT, partly because I can't find (or haven't recognized)
an example of how to use custom fields and ranges.

1. Part of WordNet's data is a list of verbal subcategories for each
verb sense. I want to encode these inside my <sense> tags but I don't
know what the best way to do it is (this could be classed as a
semantic or a grammatical topic, but either way I don't know what tag
should be used). I think I should define a range to list the possible
subcategory values, and so I should have code like this in a
<header></header> at the top of my LIFT file:

<range id="wn3:subcat">
<range-element id="1"><label><form lang="eng"><text>Something ----s</text></form></label></range-element>
<range-element id="2"><label><form lang="eng"><text>Somebody ----s</text></form></label></range-element>
...
<range-element id="34"><label><form lang="eng"><text>It ----s that CLAUSE</text></form></label></range-element>
<range-element id="35"><label><form lang="eng"><text>Something ----s INFINITIVE</text></form></label></range-element>
</range>

Then inside my <sense></sense> tags, I should have some kind of tag
with a name="wn3:subcat" and value="1"...value="35", depending. But I
will need multiple of these tags per sense. Can anyone give me a
suggestion as to what tag I should be using for this?

2. In a similar vein, WordNet has some very minimal "semantic
domains", which I will define in a range also.

<range id="wn3:semanticdomain">
<range-element id="00"><label><form lang="eng"><text>adj.all</text></form></label><description><form lang="eng"><text>all adjective clusters</text></form></description></range-element>
<range-element id="01"><label><form lang="eng"><text>adj.pert</text></form></label><description><form lang="eng"><text>relational adjectives (pertainyms)</text></form></description></range-element>
<range-element id="02"><label><form lang="eng"><text>adv.all</text></form></label><description><form lang="eng"><text>all adverbs</text></form></description></range-element>
...
</range>

I want to name the range wn3:semanticdomain, and not use the default
"semantic_domain" range, in case I want to use that for a more full-
fledged set of semantic domains sometime later. How should I reference
this range in my <sense></sense> area? Is it just the same idea as in
question 1?

3. I understand how to define a field in the header (I think), but can
someone give me an example of how to use a field in the "body" of the
LIFT file?

4. Can a field make reference to a range? And can there be multiple
fields of the same type (e.g. could I use fields to list the multiple
verb subcategories per sense that I asked about in question 1)?

I know this is a lot of questions, but I hope someone can help me
understand these things a little better.

Thanks,

Jeremy

Martin Hosken

Apr 30, 2009, 5:54:42 PM
to LexiconInter...@googlegroups.com
Dear Jeremy,

Sorry for being slow on moderating your message in. I think things should be fine now.

> However, there is some data in WordNet that I don't yet know how to
> represent in LIFT, partly because I can't find (or haven't recognized)
> an example of how to use custom fields and ranges.
>
> 1. Part of WordNet's data is a list of verbal subcategories for each
> verb sense. I want to encode these inside my <sense> tags but I don't
> know what the best way to do it is (this could be classed as a
> semantic or a grammatical topic, but either way I don't know what tag
> should be used). I think I should define a range to list the possible
> subcategory values, and so I should have code like this in a
> <header></header> at the top of my LIFT file:
>
> <range id="wn3:subcat">
> <range-element id="1"><label><form lang="eng"><text>Something ----s</text></form></label></range-element>
> <range-element id="2"><label><form lang="eng"><text>Somebody ----s</text></form></label></range-element>
> ...
> <range-element id="34"><label><form lang="eng"><text>It ----s that CLAUSE</text></form></label></range-element>
> <range-element id="35"><label><form lang="eng"><text>Something ----s INFINITIVE</text></form></label></range-element>
> </range>

I would use a trait for this:

<sense>
<trait name="wn3:subcat" value="1"/>
</sense>
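
Since each verb sense can carry several subcategory frames, the trait can simply be repeated. A minimal sketch, assuming the trait's key attribute is spelled name (as in the LIFT schema as I understand it) and that each value matches a range-element id in the wn3:subcat range. The values 1 and 2 are placeholders for illustration, not the actual WordNet frames for this sense:

```xml
<sense id="eng:wn3:tan:v:1">
<grammatical-info value="verb"/>
<!-- one trait per subcategory frame; name matches the range id -->
<!-- values below are illustrative placeholders, not real WordNet data -->
<trait name="wn3:subcat" value="1"/>
<trait name="wn3:subcat" value="2"/>
</sense>
```

A tool that knows nothing about wn3:subcat can still read these as name/value pairs, which is the main advantage over a free-form field.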

>
> Then inside my <sense></sense> tags, I should have some kind of tag
> with a name="wn3:subcat" and value="1"...value="35", depending. But I
> will need multiple of these tags per sense. Can anyone give me a
> suggestion as to what tag I should be using for this?
>
> 2. In a similar vein, WordNet has some very minimal "semantic
> domains", which I will define in a range also.
>
> <range id="wn3:semanticdomain">
> <range-element id="00"><label><form lang="eng"><text>adj.all</text></form></label><description><form lang="eng"><text>all adjective clusters</text></form></description></range-element>
> <range-element id="01"><label><form lang="eng"><text>adj.pert</text></form></label><description><form lang="eng"><text>relational adjectives (pertainyms)</text></form></description></range-element>
> <range-element id="02"><label><form lang="eng"><text>adv.all</text></form></label><description><form lang="eng"><text>all adverbs</text></form></description></range-element>
> ...
> </range>
>
> I want to name the range wn3:semanticdomain, and not use the default
> "semantic_domain" range, in case I want to use that for a more full-
> fledged set of semantic domains sometime later. How should I reference
> this range in my <sense></sense> area? Is it just the same idea as in
> question 1?

Again, use a trait:

<trait name="wn3:semanticdomain" value="01"/>
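
For completeness, here is a sketch of how the range definition and the trait might fit together in one file. It assumes that ranges sit inside a <ranges> wrapper in the header and that the trait's key attribute is name; both are worth checking against the LIFT schema for the version you target. The value 00 (adj.all) is used only for illustration:

```xml
<?xml version="1.0"?>
<lift version="0.12">
<header>
<!-- assumption: range definitions are grouped under <ranges> -->
<ranges>
<range id="wn3:semanticdomain">
<range-element id="00"><label><form lang="eng"><text>adj.all</text></form></label></range-element>
</range>
</ranges>
</header>
<entry id="eng:wn3:tan:a">
<lexical-unit><form lang="eng"><text>tan</text></form></lexical-unit>
<sense id="eng:wn3:tan:a:1">
<grammatical-info value="adjective"/>
<!-- trait name = range id, trait value = range-element id -->
<trait name="wn3:semanticdomain" value="00"/>
</sense>
</entry>
</lift>
```

Keeping the range id distinct from the default semantic_domain range leaves that name free for a fuller set of domains later.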

> 3. I understand how to define a field in the header (I think), but can
> someone give me an example of how to use a field in the "body" of the
> LIFT file?

A field is a multitext, so it takes one or more forms. It can also contain traits, other fields, and annotations.
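
A rough sketch of a field in the body, assuming the body-level attribute is type and that it matches a field defined in the header. The field name wn3:note, the trait wn3:source, and the text content are all hypothetical:

```xml
<sense id="eng:wn3:tan:n:3">
<grammatical-info value="noun"/>
<!-- hypothetical custom field; its type should match a header definition -->
<field type="wn3:note">
<!-- a field is a multitext: one form per language -->
<form lang="eng"><text>free text for this sense goes here</text></form>
<!-- a trait nested inside the field, per the description above -->
<trait name="wn3:source" value="wordnet-3.0"/>
</field>
</sense>
```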

> 4. Can a field make reference to a range? And can there be multiple
> fields of the same type (e.g. could I use fields to list the multiple
> verb subcategories per sense that I asked about in question 1)?

You could, but a far, far better way is to use a trait. You can do anything with a field, but because fields are free-form, no other tool will give them any meaning. So only use fields when there is no other way.

> I know this is a lot of questions, but I hope someone can help me
> understand these things a little better.

Congratulations on bringing in all of Wordnet.

Yours,
Martin
