100 Q&A's on HowNet -- 008. What is the HowNet-based tool called Sense_Colony_Tester and what tasks can it be applied to?


xish...@gmail.com

Aug 22, 2014, 12:06:24 PM
to how...@googlegroups.com

Each kind of bacteria forms its own colony, based on the same morphology. Similarly, each concept has its own colony. The HowNet-based Sense_Colony_Tester (SCT) represents, and thus displays, the relevancy of a concept, or sense, to the other concepts in a given context.

SCT is bilingual: it handles English and Chinese. Let’s look at the following demo.

Suppose your text to be tested is:

 I usually have bread and butter and homemade jam for breakfast.

Enter your text in the “Source Text” box, select English or Chinese, and then click “submit”.

 
                                              Fig. 1

In the “Result Tree” box, all the concepts, or senses, of the given text are shown. We call this kind of processing a text-CT, as shown in Fig. 2. When only the words are displayed, rather than their concepts or senses, we call it a text-X-ray, as shown in Fig. 1.

                                              Fig. 2

SCT’s critical function is to give two statistical results: (1) a Relevancy Count for every concept of each word in the tested text; (2) a Distance Value for every concept, measured from its immediate relevant neighbor.

                                              Fig. 3

Therefore, SCT can be applied to WSD (word sense disambiguation), especially for discourse-level ambiguities. SCT is employed in the HowNet-based machine translation system. Fig. 3 shows that SCT chooses the sense “eat” for the word “have”, cf. Fig. 4. Likewise, Fig. 5 shows that SCT chooses the sense “food” for the word “jam”, cf. Fig. 6. The Relevancy Count of the “food” sense of “jam” is 0004, the highest among its three senses. You can probably guess which senses are the voters for it: “eat” and “drink” of the word “have”, “food” of “bread and butter”, and “eat” in the word “breakfast”.

                                              Fig. 4

                                              Fig. 5

                                              Fig. 6
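
The post does not say how SCT actually computes a Relevancy Count, so the following is only a minimal sketch of the voting idea described for “jam”. It assumes that each sense collects one vote from every relevant sense of the other words in the text. The sense labels in `SENSES` and the relation in `RELEVANT` are made up for illustration and are not taken from HowNet.

```python
# Toy sense inventory: word -> candidate sense labels.
# Hypothetical labels for illustration only, NOT real HowNet definitions.
SENSES = {
    "have": ["own", "eat", "drink"],
    "bread and butter": ["food", "livelihood"],
    "jam": ["food", "crowd", "block"],
    "breakfast": ["eat"],
}

# Hypothetical symmetric relevance relation between sense labels.
RELEVANT = {
    frozenset(["eat", "food"]),
    frozenset(["drink", "food"]),
}


def is_relevant(a, b):
    """Two senses are relevant if identical or listed in RELEVANT (an assumption)."""
    return a == b or frozenset([a, b]) in RELEVANT


def relevancy_counts(senses):
    """For each (word, sense), count the relevant senses of all OTHER words."""
    counts = {}
    for word, word_senses in senses.items():
        for sense in word_senses:
            counts[(word, sense)] = sum(
                1
                for other, other_senses in senses.items()
                if other != word
                for voter in other_senses
                if is_relevant(sense, voter)
            )
    return counts


def disambiguate(senses):
    """Pick, for each word, the sense with the highest count (ties: first listed)."""
    counts = relevancy_counts(senses)
    return {
        word: max(word_senses, key=lambda s: counts[(word, s)])
        for word, word_senses in senses.items()
    }


if __name__ == "__main__":
    counts = relevancy_counts(SENSES)
    print(counts[("jam", "food")])
    print(disambiguate(SENSES))
```

With this toy data, the “food” sense of “jam” gets four votes (“eat” and “drink” of “have”, “food” of “bread and butter”, and “eat” of “breakfast”), and `disambiguate` selects “eat” for “have” and “food” for “jam”. The data was chosen to mirror the example in the post; it does not reproduce SCT's real computation.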

 

SCT can also serve as an effective tool for other tasks based on meaning computation, such as text summarization, text topic identification, and text categorization. Figs. 7, 8 and 9 show the statistical results for the concepts of a given text. They suggest that the topic of the text may be some kind of medicine and its relation to a symptom and an infection. The following concepts have the highest Relevancy Counts (more than 0010):

medicine:     0024 (0025)               treat:            0017
symptom:      0024                      infection:        0018
cell:         0024                      fungus:           0012
skin:         0022                      cell membrane:    0012

                                              Fig. 7

                                              Fig. 8

                                              Fig. 9
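
How SCT turns Relevancy Counts into a topic is not described in the post. As a rough sketch, one could simply keep the concepts whose count exceeds a threshold and rank them. The counts below are copied from the list above (for “medicine”, which is listed as 0024 (0025), the value 24 is used); the function name `topic_concepts` and the ranking rule are assumptions made for illustration.

```python
# Relevancy Counts as listed in the post; "medicine" appears as 0024 (0025), 24 is used here.
RELEVANCY_COUNT = {
    "medicine": 24,
    "symptom": 24,
    "cell": 24,
    "skin": 22,
    "infection": 18,
    "treat": 17,
    "fungus": 12,
    "cell membrane": 12,
}


def topic_concepts(counts, threshold=10):
    """Return concepts with a count above `threshold`, highest count first.

    Ties are broken alphabetically; this tie-break is an arbitrary choice.
    """
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [concept for concept, count in ranked if count > threshold]


if __name__ == "__main__":
    print(topic_concepts(RELEVANCY_COUNT))
    print(topic_concepts(RELEVANCY_COUNT, threshold=20))
```

Raising the threshold narrows the candidate topic concepts: with `threshold=20`, only “cell”, “medicine”, “symptom” and “skin” remain.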

 

 

 

 

 
 
 