Status: New
Owner: ----
Labels: Type-Defect Priority-Medium
New issue 97 by chmeyer.de: WordNet: semicolon-separated parts of the gloss
are split into multiple definitions
http://code.google.com/p/uby/issues/detail?id=97
Checked the synset {car, auto, automobile, machine, motorcar} and found
that its gloss is split into two definitions:
<Synset id="WN_Synset_15951">
  <Definition>
    <TextRepresentation languageIdentifier="eng"
        writtenText="a motor vehicle with four wheels"/>
  </Definition>
  <Definition>
    <TextRepresentation languageIdentifier="eng"
        writtenText="usually propelled by an internal combustion engine"/>
  </Definition>
IMHO, this is rather misleading, since the WordNet gloss is meant to be read
as a whole rather than as two interchangeable definitions. What is the
motivation for keeping the parts separate? If we keep the current behavior, we
should definitely document it prominently and inform everyone who works with
definitions (e.g., for WSD, WSA, etc.), since my approach until today was to
consider only the first UBY-WordNet definition...
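
Until this is decided, a consumer-side workaround could be to rejoin the parts
before using them. Below is a minimal Python sketch of that idea, not UBY code:
the function name rejoin_gloss and the "; " separator are my own assumptions,
and it relies on the Definition elements appearing in the original gloss order.
The closing </Synset> tag is added only so the excerpt parses on its own.

```python
import xml.etree.ElementTree as ET

# Excerpt from this report, closed with </Synset> so that it parses standalone.
SNIPPET = """
<Synset id="WN_Synset_15951">
  <Definition>
    <TextRepresentation languageIdentifier="eng"
        writtenText="a motor vehicle with four wheels"/>
  </Definition>
  <Definition>
    <TextRepresentation languageIdentifier="eng"
        writtenText="usually propelled by an internal combustion engine"/>
  </Definition>
</Synset>
"""


def rejoin_gloss(synset_xml, separator="; "):
    """Join all definition parts of one synset into a single gloss string.

    Hypothetical helper: assumes the parts are listed in gloss order and
    were originally separated by semicolons.
    """
    synset = ET.fromstring(synset_xml)
    parts = [
        text_rep.get("writtenText", "").strip()
        for text_rep in synset.iterfind("./Definition/TextRepresentation")
    ]
    # Skip empty parts so the result has no dangling separators.
    return separator.join(part for part in parts if part)


gloss = rejoin_gloss(SNIPPET)
print(gloss)
```

This only hides the problem for code that reads definitions; it does not tell
us whether a given synset really has one gloss or several distinct definitions.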
Comments welcome.