Each kind of bacterium forms its own colony, whose members share the same morphology. Similarly, each concept has its own colony. The HowNet-based Sense_Colony_Tester (SCT) represents, and thus displays, the relevancy of a concept, or sense, to the other concepts in a given context.
SCT is bilingual: it handles both English and Chinese. Let’s look at the following demo.
Suppose your text to be tested is:
“I usually have bread and butter and homemade jam for breakfast.”
After you input your text into the “Source Text” Box, and select
In the “Result Tree” box, all the concepts or senses of the given text will be shown. We call this processing a text-CT, as shown in Fig. 2. If only words, rather than concepts or senses, are displayed, it is called a text-X-ray, as shown in Fig. 1.
Fig. 2
SCT’s key function is to produce two statistical results: (1) a Relevancy Count for every concept of each word in the tested text; (2) a Distance Value for every concept, measured from its immediate relevant neighbor.
Fig. 3
SCT can therefore be applied to word sense disambiguation (WSD), especially for discourse-level ambiguities. SCT is employed in the HowNet-based machine translation system. Fig. 3 shows that SCT chooses the sense “eat” for the word “have” (cf. Fig. 4), and Fig. 5 shows that it chooses the sense “food” for the word “jam” (cf. Fig. 6). The Relevancy Count of the “food” sense of “jam” is 0004, the highest among its three senses. The voters for this sense are “eat” and “drink” from the word “have”, “food” from “bread and butter”, and “eat” from the word “breakfast”.
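The voting idea behind the “jam” example can be made concrete with a small sketch. This is only a toy illustration, not HowNet's actual relevance computation: the `SENSES` table, the hand-written `RELATED` relation, and the function names are all assumptions introduced here, and `sense_b` and `sense_c` are placeholders for the two senses of “jam” that are not named above.

```python
# Toy illustration only: this is NOT how HowNet computes relevance.
# Candidate senses per word. "sense_b" and "sense_c" are placeholders
# for the two senses of "jam" that the text does not name.
SENSES = {
    "have": ["eat", "drink"],
    "bread and butter": ["food"],
    "jam": ["food", "sense_b", "sense_c"],
    "breakfast": ["eat"],
}

# Hypothetical, hand-written relevance relation between sense labels.
RELATED = {
    frozenset({"food", "eat"}),
    frozenset({"food", "drink"}),
    frozenset({"food"}),  # "food" is relevant to "food"
    frozenset({"eat"}),   # "eat" is relevant to "eat"
}


def is_related(a, b):
    """Return True if the two sense labels are listed as relevant."""
    return frozenset({a, b}) in RELATED


def relevancy_count(word, sense):
    """Count the senses of the OTHER words that vote for this sense."""
    return sum(
        1
        for other, senses in SENSES.items()
        if other != word
        for voter in senses
        if is_related(sense, voter)
    )


def choose_sense(word):
    """Pick the candidate sense with the highest relevancy count."""
    return max(SENSES[word], key=lambda s: relevancy_count(word, s))


if __name__ == "__main__":
    for word in SENSES:
        best = choose_sense(word)
        print(f"{word}: {best} ({relevancy_count(word, best):04d})")
```

Under these assumptions the “food” sense of “jam” collects four votes (“eat” and “drink” from “have”, “food” from “bread and butter”, “eat” from “breakfast”), which agrees with the count of 0004 reported above. The counts for the other words depend entirely on the made-up relation and should not be read as SCT output.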
Fig. 4
Fig. 5
Fig. 6
SCT can also serve as an effective tool for text summarization, topic identification, text categorization, and other tasks based on meaning computation. Figs. 7, 8 and 9 show that the topic of the given text may be some kind of medicine and its relation to a symptom and an infection. In the statistical results, the following concepts have the highest Relevancy Counts (above 0010):
medicine: 0024 (0025)    treat: 0017
symptom:  0024           infection: 0018
cell:     0024           fungus: 0012
skin:     0022           cell membrane: 0012
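As a minimal sketch of how such a listing could be used for topic identification, the snippet below parses the counts shown above and ranks the concepts that exceed a threshold. The parsing pattern, the function names, and the tie-breaking rule (alphabetical) are assumptions made for illustration; the parenthesized value after “medicine” is left out because its meaning is not explained here.

```python
import re

# The listing as given in the text, two "concept: count" pairs per line.
LISTING = """\
medicine: 0024 (0025) treat: 0017
symptom: 0024 infection: 0018
cell: 0024 fungus: 0012
skin: 0022 cell membrane: 0012
"""

# A concept name (letters and spaces), a colon, then a 4-digit count.
PAIR = re.compile(r"([A-Za-z][A-Za-z ]*?):\s*(\d{4})")


def parse_counts(text):
    """Map each concept name to its Relevancy Count as an integer."""
    return {name.strip(): int(count) for name, count in PAIR.findall(text)}


def top_concepts(counts, threshold=10):
    """Return (concept, count) pairs above the threshold, highest first.

    Ties are broken alphabetically; that choice is arbitrary.
    """
    kept = [(name, n) for name, n in counts.items() if n > threshold]
    return sorted(kept, key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    for name, n in top_concepts(parse_counts(LISTING)):
        print(f"{name}: {n:04d}")
```

Raising the threshold, for example `top_concepts(counts, threshold=20)`, narrows the list to the concepts most likely to name the topic: here “cell”, “medicine”, “symptom” and “skin”.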
Fig. 7