Apache Stanbol enhancers


Thang Chi Duong

Aug 27, 2013, 4:43:43 AM
to lmf-...@googlegroups.com
Hi,

I'm using LMF to implement a semantic search for course data. I've run into a problem with the Apache Stanbol DBPediaEnhancer when I try to enhance this text: "This course concentrates on aspects of Java that best demonstrate object-oriented principles and good practice, you’ll gain a solid basis for further study of the Java language and object-oriented software development."
DBPediaEnhancer identifies Java as an island, not a programming language. However, the Stanbol demo at dev.iks-project.eu:8081/enhancer/chain correctly identifies Java as a programming language.

I then tried to fix this problem by installing the DBpedia disambiguation-mlt engine via the Felix Web console. However, this didn't work because it requires the DisambiguatorEngine (LMF only provides the EntityhubLinkingEngine), and when I tried to install or update a package in the Bundle tab of Felix, it required authentication. I tried the default account admin/pass123, but it didn't work.

So, my questions are:
  • How can I make DBpediaEnhancer identify Java as a programming language in the above text?
  • How do I install the DisambiguatorEngine?
  • And what is the username/password for the Felix Web console?

Finally, I would like to thank the LMF developers. In my opinion, LMF is extremely helpful.

Thang

Rupert Westenthaler

Aug 27, 2013, 8:13:16 AM
to lmf-...@googlegroups.com
Hi Thang,

Let me try to answer your questions:


On Tuesday, 27 August 2013 10:43:43 UTC+2, Thang Chi Duong wrote:
So, my questions are 
  • How to make DBpediaEnhancer identify Java as a programming language in the above text?
The label for "Java" the programming language is "Java (programming language)" in DBpedia. There are some additional redirects (including "Java Programming" ...) that should be extracted for the above text with the context '... study of the Java language and object-oriented ...'. But the Wikipedia/DBpedia requirement that labels be unique does not help with linking in this case.

There is also an entity for "Java (software platform)" with a redirect from "Java™". This is actually suggested for the first mention of Java in your text. However, because the label does not exactly match the one used in the text, the EntityhubLinking engine sets the confidence to 0.64.

BTW: when linking against Freebase "Java" is linked 1. with the computer language 2. with the island and 3. with a French rap group.

  • How to install DisambiguatorEngine?
Just add the bundle via the Felix Web console. You might also need to relax the configuration of the EntityhubLinking engine to get more suggestions for disambiguation.

  • And what is the username/password for Felix Web console?
Leave the username and password empty. Make sure to deactivate security in the LMF when configuring Stanbol. Otherwise you will not be able to access the Felix Web console.

best
Rupert
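
[Editor's sketch] The label-matching effect Rupert describes can be illustrated with a toy example. This is not Stanbol's scoring: it uses Python's difflib as a stand-in similarity measure, and it assumes the island's DBpedia label is simply "Java". The other two labels are the ones quoted in the reply. It only shows why a bare mention "Java" matches an unqualified label better than a label carrying a parenthetical qualifier.

```python
from difflib import SequenceMatcher


def label_similarity(mention, label):
    """Return a 0..1 string similarity between a mention and an entity label.

    Stand-in measure for illustration only; NOT the EntityhubLinking formula.
    """
    return SequenceMatcher(None, mention.lower(), label.lower()).ratio()


def rank(mention, labels):
    """Sort candidate labels by similarity to the mention, best first."""
    scored = [(label, label_similarity(mention, label)) for label in labels]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# "Java" as the island's label is an assumption; the other two labels
# are taken from the thread.
candidates = [
    "Java (programming language)",
    "Java (software platform)",
    "Java",
]

ranked = rank("Java", candidates)
for label, score in ranked:
    print(f"{score:.2f}  {label}")
```

With a purely label-based measure like this, the unqualified label always wins for the mention "Java", which is why additional context (for example a disambiguation engine) is needed to prefer the programming language.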

Thang Chi Duong

Aug 28, 2013, 5:18:08 AM
to lmf-...@googlegroups.com
Thank you for your reply. I was able to modify Stanbol so that it identifies Java as a programming language. However, Java as an island is still ranked first.

Another question: when I use the "language" enhancement chain for the previous text, it doesn't return any entity, only the raw RDF output. As a result, the fn:stanbol function also returns nothing.

However, the "language" enhancement chain at http://dev.iks-project.eu:8083/enhancer/chain/language can extract the "en" language entity.

Where is this entity stored? And how can I make my local Stanbol server extract this language entity?

Best regards,
Thang
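
[Editor's sketch] To compare the raw output of a local chain with the demo server, the same text can be posted to both endpoints. The sketch below only builds the HTTP request for a named chain, following the /enhancer/chain/<name> path used in the thread. The base URL http://localhost:8080 and the helper name build_enhancer_request are assumptions for illustration; an LMF installation may expose Stanbol under a different host, port, or path.

```python
from urllib.request import Request


def build_enhancer_request(base_url, chain, text, accept="application/rdf+xml"):
    """Build a POST request that sends plain text to a Stanbol enhancement chain.

    base_url is hypothetical here; adjust it to wherever Stanbol is reachable.
    """
    url = f"{base_url.rstrip('/')}/enhancer/chain/{chain}"
    return Request(
        url,
        data=text.encode("utf-8"),
        headers={
            "Content-Type": "text/plain; charset=utf-8",
            "Accept": accept,  # serialization requested for the enhancement RDF
        },
        method="POST",
    )


course_text = (
    "This course concentrates on aspects of Java that best demonstrate "
    "object-oriented principles and good practice, you\u2019ll gain a solid "
    "basis for further study of the Java language and object-oriented "
    "software development."
)

# Assumed local address; the demo server in the thread uses a different host.
req = build_enhancer_request("http://localhost:8080", "language", course_text)
print(req.get_method(), req.full_url)
```

Passing req to urllib.request.urlopen and reading the response body would return the raw RDF, which can then be compared side by side with the demo server's response for the same chain.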