About WSD using WordNetResource and WordNetSenseKeySenseInventory


jvi...@expansion.com.mx

Jun 20, 2016, 6:10:52 PM
to DKPro WSD users
Hello!
I'm using DKPro WSD and WordNet 3.0.

What I want to do is use the Lesk algorithm to disambiguate a word, given the sentence as context, and then use that ID to get the WordNetResource Entity and retrieve its hypernyms.

So far it works pretty well. The problem is that after I disambiguate the word, I get an object whose node ID is in a different format from the one WordNetResource uses:

-WordnetResource (Entity): buy#13253751
-WordNetSenseKeySenseInventory (Word): buy%1:21:00::

I have tried everything to convert one ID into the other but haven't managed it yet. Any ideas?
Thank you for your time!
Some code:

I try to match the IDs like this:
Word wword = lesk.getBestSense(theWord, context, POS.NOUN);
ent = wordnet.getEntityById(wword.getSenseKey());

where lesk is an instance of the following class:

public class WsdLeskAlgorithm {
    private SenseTaxonomy inventory;
    private LexicalSemanticResource wordnet = null;
    private SimplifiedLesk lesk = null;

    public WsdLeskAlgorithm(LexicalSemanticResource wordnet)
            throws IOException, JWNLException, LexicalSemanticResourceException {
        this.wordnet = wordnet;
        // This sense inventory identifies senses by WordNet sense keys.
        this.inventory = new WordNetSenseKeySenseInventory(
                new ClassPathResource("wordnet/wordnet_properties.xml").getInputStream());

        this.lesk = new SimplifiedExtendedLesk(inventory, new DotProduct(),
                new MostObjects(), new StringSplit(),
                new StringSplit());

        System.out.println("Inventory: " + inventory.toString());
    }

    // Maps every candidate sense of the word to the score assigned by Lesk.
    public Map<Word, Double> getAllSenses(String word, String context, POS pos)
            throws SenseInventoryException, JWNLException {
        Map<Word, Double> probMap = new HashMap<>();

        if (lesk != null && wordnet != null) {
            Map<String, Double> senseProbmap = lesk.getDisambiguation(word, pos, context);
            for (String s : senseProbmap.keySet()) {
                // Look up the Word object for each sense key returned by Lesk.
                Word disamWord = ((WordNetResource) wordnet).getDict().getWordBySenseKey(s);
                probMap.put(disamWord, senseProbmap.get(s));
            }
        } else {
            throw new RuntimeException("Lesk or Wordnet was null!!");
        }
        return probMap;
    }

    // Returns the highest-scoring sense, or null if no sense scores above 0.0.
    public Word getBestSense(String word, String context, POS pos)
            throws SenseInventoryException, JWNLException, LexicalSemanticResourceException {
        Word best = null;

        Map<Word, Double> probMap = this.getAllSenses(word, context, pos);
        if (probMap.size() == 1) {
            return new ArrayList<>(probMap.keySet()).get(0);
        } else if (probMap.size() > 1) {
            double current = 0.0;
            for (Word w : probMap.keySet()) {
                if (probMap.get(w) > current) {
                    best = w;
                    current = probMap.get(w);
                }
            }
        }

        return best;
    }
}
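
As a side note on getBestSense: because the running maximum starts at 0.0, the method returns null when there are several candidate senses and none of them scores above zero. Below is a minimal sketch of an alternative selection in plain Java, independent of DKPro WSD. The class name, the method name, and the example keys and scores are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical helper: picks the highest-scoring key from a score map. */
class BestSenseSketch {

    /** Returns the key with the highest score, or an empty Optional for an empty map. */
    static <K> Optional<K> argMax(Map<K, Double> scores) {
        return scores.entrySet().stream()
                .max(Map.Entry.comparingByValue())
                .map(Map.Entry::getKey);
    }

    public static void main(String[] args) {
        // Made-up sense labels and scores, for illustration only.
        Map<String, Double> scores = new LinkedHashMap<>();
        scores.put("senseA", 0.25);
        scores.put("senseB", 0.75);
        System.out.println(argMax(scores).orElse("none")); // prints "senseB"
    }
}
```

Returning an Optional makes the "no candidate at all" case explicit to the caller instead of passing a null on to the entity lookup.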

Tristan Miller

Jun 21, 2016, 9:56:24 AM
to dkpro-w...@googlegroups.com
Greetings.

On 21/06/16 12:10 AM, jvi...@expansion.com.mx wrote:
> I'm using DKPro WSD and WordNet 3.0.
>
> What I want to do is use the Lesk algorithm to disambiguate a word,
> given the sentence as context, and then use that ID to get the
> WordNetResource Entity and retrieve its hypernyms.
>
> So far it works pretty well. The problem is that after I disambiguate
> the word, I get an object whose node ID is in a different format from
> the one WordNetResource uses:
>
> -WordnetResource (Entity): buy#13253751
> -WordNetSenseKeySenseInventory (Word): buy%1:21:00::
>
> I have tried everything to convert one ID into the other but haven't
> managed it yet. Any ideas?

It looks like your code is trying to mix and match different interfaces
to WordNet. In some places you are using the DKPro LSR interfaces (from
the si.lsr module) and in other places you are using the extJWNL
interface (from the si.wordnet module). You should pick one or the
other, since the sense identifiers returned by these two interfaces
aren't compatible. If your program doesn't need to use any lexical
semantic resources other than WordNet (say, GermaNet, Wikipedia, or
Wiktionary), then I would advise that you use the extJWNL interface, as
this one returns the native WordNet sense keys and synset IDs.
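
To illustrate why one identifier cannot be turned into the other by string manipulation alone: a WordNet sense key such as buy%1:21:00:: has the form lemma%ss_type:lex_filenum:lex_id:head_word:head_id and does not contain the synset offset. The offset has to be looked up, for example in WordNet's index.sense file, whose lines have the form "sense_key synset_offset sense_number tag_cnt". What follows is only a rough sketch in plain Java, not part of DKPro WSD or DKPro LSR. The class and method names are made up, the lemma#offset format is inferred from the single example in the question, and the sample line pairs the two IDs from the question on the assumption that they denote the same sense (its sense_number and tag_cnt values are placeholders).

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: map sense keys to "lemma#offset" IDs via index.sense lines. */
class SenseKeyLookup {

    /** Extracts the lemma, i.e. everything before the '%' separator of a sense key. */
    static String lemmaOf(String senseKey) {
        int sep = senseKey.indexOf('%');
        if (sep < 0) {
            throw new IllegalArgumentException("Not a sense key: " + senseKey);
        }
        return senseKey.substring(0, sep);
    }

    /** Builds a map from sense key to synset offset from lines in index.sense format. */
    static Map<String, String> loadIndex(List<String> lines) {
        Map<String, String> offsets = new HashMap<>();
        for (String line : lines) {
            // Each line: sense_key synset_offset sense_number tag_cnt
            String[] fields = line.trim().split("\\s+");
            if (fields.length >= 2) {
                offsets.put(fields[0], fields[1]);
            }
        }
        return offsets;
    }

    /** Returns "lemma#offset" (assumed ID format), or null if the key is not indexed. */
    static String toLemmaOffsetId(String senseKey, Map<String, String> index) {
        String offset = index.get(senseKey);
        return offset == null ? null : lemmaOf(senseKey) + "#" + offset;
    }

    public static void main(String[] args) {
        // Illustrative line built from the two IDs in the question; the trailing
        // sense_number and tag_cnt fields are placeholders, not real data.
        List<String> sample = Arrays.asList("buy%1:21:00:: 13253751 1 0");
        Map<String, String> index = loadIndex(sample);
        System.out.println(toLemmaOffsetId("buy%1:21:00::", index)); // prints "buy#13253751"
    }
}
```

Two caveats for anyone adapting this: synset offsets are only unique within a part of speech, so a real lookup would also have to take the ss_type field (the digit after the %) into account, and index.sense pads offsets to eight digits, which may or may not match how the other interface prints them.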

Regards,
Tristan

--
Tristan Miller, Research Scientist
Ubiquitous Knowledge Processing Lab (UKP-TUDA)
Department of Computer Science, Technische Universität Darmstadt
Tel: +49 6151 162 5296 | Web: https://www.ukp.tu-darmstadt.de/


Jose Luis Vieyra Sagaon

Jun 21, 2016, 10:39:33 PM
to dkpro-w...@googlegroups.com
Thank you!
I'm now using your LsrSenseInventory so that I work with only that interface!
Just a few bean XML problems left and I'll be done!

Thank you for your time :)


--
You received this message because you are subscribed to a topic in the Google Groups "DKPro WSD users" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/dkpro-wsd-users/vZbLau9L-FQ/unsubscribe.
To unsubscribe from this group and all its topics, send an email to dkpro-wsd-use...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.



--
Mto. Ing. José Luis Vieyra S.
DataScience Manager @ GEx
Ext. 75725



Jose Luis Vieyra Sagaon

Jun 22, 2016, 11:25:48 AM
to dkpro-w...@googlegroups.com
Hello!

@Tristan:

I've been trying to make it work. However, when I run it like this:
LsrSenseInventory lsr = new LsrSenseInventory("wordnet/wordnet_properties.xml","en");

I get an exception asking for a resources.xml. I checked your code for an example of the XML but found none; moreover, all the tests are ignored!

Do you have a working example of LsrSenseInventory?

Thank you for your time!

The link to the tests: https://github.com/dkpro/dkpro-lsr/blob/1b579f22560fc16d245d743dc2daf22b8cc5d571/de.tudarmstadt.ukp.dkpro.lexsemresource.core-asl/src/test/java/de/tudarmstadt/ukp/dkpro/lexsemresource/core/ResourceFactoryTest.java

And the exception I get:

Caused by: java.lang.RuntimeException: de.tudarmstadt.ukp.dkpro.lexsemresource.exception.ResourceLoaderException: Unable to locate configuration file [resources.xml] in [DKPro workspace not available, Classpath: resources.xml]
    at dl.nlp.uima.annotators.wordnet.WordnetHypernymAnnotator.initialize(WordnetHypernymAnnotator.java:51)
    at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.initializeAnalysisComponent(PrimitiveAnalysisEngine_impl.java:262)
    ... 38 more

Tristan Miller

Jun 22, 2016, 12:19:45 PM
to dkpro-w...@googlegroups.com
Dear Jose,

On 22/06/16 05:25 PM, Jose Luis Vieyra Sagaon wrote:
> I've been trying to make it work. However, when I run it like this:
> LsrSenseInventory lsr = new
> LsrSenseInventory("wordnet/wordnet_properties.xml","en");
>
> I get an exception asking for a resources.xml. I checked your code for
> an example of the XML but found none; moreover, all the tests are
> ignored!
>
> Do you have a working example of LsrSenseInventory?

I'm afraid that DKPro LSR isn't very well documented -- or at least, not
any more. This library requires that you create a special XML file that
documents the location of your WordNet installation. I think an example
file may have been provided back when the project was hosted on Google
Code, but I can't find one on its current home on GitHub. (I've raised
an issue about this here: <https://github.com/dkpro/dkpro-lsr/issues/8>)

To give a quick answer: First you need to make sure to set an
environment variable $DKPRO_HOME, whose value is some folder in your
home directory. (For example, ~/dkpro.) Then you need to create a file
named
$DKPRO_HOME/de.tudarmstadt.ukp.dkpro.lexsemresource.core.ResourceFactory/resources.xml.
The contents should be as follows:

<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://www.springframework.org/schema/beans
                           http://www.springframework.org/schema/beans/spring-beans-2.5.xsd">

    <bean class="org.springframework.beans.factory.config.PropertyPlaceholderConfigurer">
    </bean>

    <bean id="wordnet-en" lazy-init="true"
          class="de.tudarmstadt.ukp.dkpro.lexsemresource.wordnet.WordNetResource">
        <constructor-arg
            value="${HOME}/share/WordNet/WordNet-3.0/extjwnl18_properties.xml"/>
    </bean>
</beans>

Change "${HOME}/share/WordNet/WordNet-3.0/extjwnl18_properties.xml" to
the location of your extJWNL properties file.
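
If the "Unable to locate configuration file" exception persists, it may help to check that the file really is where the path above says it should be. Here is a minimal sketch in plain Java that only builds that path from $DKPRO_HOME and tests whether a file exists there. The class and method names are invented for illustration, and the sketch does not reproduce DKPro LSR's own lookup logic (which, judging by the exception message, also tries the classpath).

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper: checks for resources.xml under $DKPRO_HOME. */
class DkproHomeCheck {

    // Folder name taken from the path given in this message.
    static final String FACTORY_DIR =
            "de.tudarmstadt.ukp.dkpro.lexsemresource.core.ResourceFactory";

    /** Builds <dkproHome>/<FACTORY_DIR>/resources.xml. */
    static Path resourcesXmlPath(String dkproHome) {
        return Paths.get(dkproHome, FACTORY_DIR, "resources.xml");
    }

    public static void main(String[] args) {
        String dkproHome = System.getenv("DKPRO_HOME");
        if (dkproHome == null || dkproHome.isEmpty()) {
            System.err.println("DKPRO_HOME is not set in this process's environment");
            return;
        }
        Path config = resourcesXmlPath(dkproHome);
        System.out.println(config + (Files.isRegularFile(config) ? " found" : " is missing"));
    }
}
```

Running it from the same shell (or the same Eclipse launch configuration) as the application shows whether the variable is actually visible to the JVM.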

Then, with your $DKPRO_HOME environment variable set, launch your DKPro
WSD application (or Eclipse, if you're running from there). You should
now be able to create an LsrSenseInventory as follows:

LsrSenseInventory lsr = new LsrSenseInventory("wordnet","en");

You probably noticed that this is a lot of work! DKPro LSR is a
somewhat antiquated component -- it's about ten years old; it is not
under very active development and so not a lot of work has gone into
documenting it and making it convenient to use. As I mentioned in my
previous e-mail, you should consider instead using the native extJWNL
interface to WordNet, which you can find in the si.wordnet module of
DKPro WSD. Then you won't need to fiddle around with any
arbitrarily-located configuration files, other than the extJWNL
properties file itself.

Jose Luis Vieyra Sagaon

Jun 22, 2016, 3:58:08 PM
to dkpro-w...@googlegroups.com
Thank you very much, Tristan. It's working now!
