A visual thesaurus interface for Wikipedia

han...@gmail.com

Feb 26, 2008, 4:19:54 AM
to zh_wik...@googlegroups.com
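A research team from Japan recently released a visual thesaurus interface
for Wikipedia. It currently has only English and Japanese versions; give it
a try.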

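Also, since the visualization is based on natural-language word frequencies,
applying it to Chinese would probably require people who work on word
segmentation to get involved.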
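
To illustrate why word frequencies depend on segmentation, here is a minimal Python sketch. It is not taken from the Wikipedia Thesaurus project: the dictionary `TOY_DICT`, the helper `segment`, and the sample strings are all hypothetical, and forward maximum matching is only one simple segmentation strategy among many.

```python
from collections import Counter

# Hypothetical toy dictionary; a real segmenter would need a large lexicon
# that also covers regional vocabulary variants.
TOY_DICT = {"維基百科", "視覺化", "詞彙", "界面", "自然語言", "詞頻"}
MAX_WORD_LEN = max(len(w) for w in TOY_DICT)


def segment(text, dictionary=TOY_DICT, max_len=MAX_WORD_LEN):
    """Forward maximum matching: at each position take the longest dictionary word."""
    tokens = []
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # Fall back to a single character when no dictionary word matches.
            if size == 1 or candidate in dictionary:
                tokens.append(candidate)
                i += size
                break
    return tokens


english = "wikipedia thesaurus for wikipedia"
chinese = "維基百科的視覺化詞彙界面"

print(Counter(english.split()))   # whitespace splitting is enough for English
print(chinese.split())            # a single undivided token: no spaces to split on
print(Counter(segment(chinese)))  # counts over dictionary-based tokens
```

Without a segmentation step the Chinese string is counted as one token, so any frequency statistic built on it would be meaningless.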

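Because the Chinese encyclopedia accommodates the different written
vocabulary of various regions, this kind of word-segmentation research could
also be regarded as a brand-new field.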

The original message is reproduced below for your reference.

To Wikipedia researchers,

My name is Dr. Kotaro Nakayama from Osaka University in Japan. I would
like to announce our latest visualization system for Wikipedia
Thesaurus, a huge-scale "association" thesaurus, based on a Java applet.
It allows users to see the network of relations among concepts (articles)
in Wikipedia.

http://wikipedia-lab.org:8080/WikipediaThesaurusV2
(Please click on the "NetVis" to show the network for a concept)

We did not just visualize the link structure of Wikipedia, but we
calculated the relatedness for all concepts in Wikipedia using pfibf
(Path Frequency - Inversed Backward link Frequency) analogous to tf-idf
in information retrieval (See "Wikipedia Mining for An Association Web
Thesaurus Construction" in WISE2007 for more information if you are
interested).
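
For readers unfamiliar with the tf-idf analogy, the following is a minimal Python sketch of plain tf-idf only. It is not pfibf and not the authors' implementation (see the WISE2007 paper for that); the function name `tf_idf` and the toy corpus are assumptions made for illustration, and it assumes every document is non-empty.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    n_docs = len(docs)
    # Document frequency: the number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical toy corpus, already tokenized.
docs = [
    ["wikipedia", "thesaurus", "network"],
    ["wikipedia", "dictionary"],
    ["wikipedia", "ontology", "network"],
]
for doc_weights in tf_idf(docs):
    print(doc_weights)
```

A term that occurs in every document ("wikipedia" here) gets weight zero, while a rarer term such as "thesaurus" is weighted up. By analogy, the "inversed backward link frequency" part of pfibf presumably plays a similar down-weighting role for concepts that are linked from very many articles.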

We are conducting several projects such as Wikipedia Thesaurus (Since
2005), Wikipedia Bilingual Dictionary, Wikipedia API and Wikipedia
Ontology. Detailed information about our activities can be found under
the following URL.

http://wikipedia-lab.org

There is no doubt that Wikipedia is an invaluable corpus for knowledge
extraction, and this is the time to boost the research by
collaborating with each other. I hope our research is of interest to you.

Sincerely yours,
Kotaro Nakayama

--
*Liao <http://zhongwen.com/cgi-bin/zipux2.cgi?b5=%E5%BB%96>,Han
<http://zhongwen.com/cgi-bin/zipux2.cgi?b5=%E6%BC%A2>-Teng
<http://zhongwen.com/cgi-bin/zipux2.cgi?b5=%E9%A8%B0>*
DPhil student at the OII <http://people.oii.ox.ac.uk/hanteng/about/>(web)
needs you <http://people.oii.ox.ac.uk/hanteng/>(blog)

shi zhao

Feb 26, 2008, 9:36:04 PM
to zh_wik...@googlegroups.com
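It seems to generate semantic associations from internal links, disambiguation pages, and redirects, so it probably has little to do with word segmentation?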
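
A link-only approach of the kind guessed at here could look roughly like the Python sketch below. This is an illustration of the idea and not the project's actual method: the data in `LINKS` and `REDIRECTS`, the helper names, and the choice of Jaccard overlap as the score are all assumptions.

```python
# Hypothetical toy data: article -> titles it links to, plus a redirect table.
LINKS = {
    "Tokyo": {"Japan", "Osaka", "Nippon"},
    "Osaka": {"Japan", "Tokyo", "Kansai"},
    "Paris": {"France", "Seine"},
}
REDIRECTS = {"Nippon": "Japan"}


def resolve(title, redirects=REDIRECTS):
    """Follow redirects until reaching a title that is not itself a redirect."""
    seen = set()
    while title in redirects and title not in seen:
        seen.add(title)
        title = redirects[title]
    return title


def relatedness(a, b, links=LINKS):
    """Jaccard overlap of the two articles' redirect-resolved link targets."""
    targets_a = {resolve(t) for t in links.get(a, ())}
    targets_b = {resolve(t) for t in links.get(b, ())}
    union = targets_a | targets_b
    return len(targets_a & targets_b) / len(union) if union else 0.0


print(relatedness("Tokyo", "Osaka"))
print(relatedness("Tokyo", "Paris"))
```

Nothing in this sketch looks at article text, so no word segmentation is needed; whether the real system works this way is exactly the open question in this thread.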

han...@gmail.com

Feb 27, 2008, 2:53:43 AM
to zh_wik...@googlegroups.com
I don't think so. It looks like it does use natural-language word frequencies, not just links.

shi zhao wrote:
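> It seems to generate semantic associations from internal links,
> disambiguation pages, and redirects, so it probably has little to do with
> word segmentation?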
>
> 2008/2/26, han...@computer.org <mailto:han...@computer.org>
> <han...@gmail.com <mailto:han...@gmail.com>>:
>
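> A research team from Japan recently released a visual thesaurus interface
> for Wikipedia. It currently has only English and Japanese versions; give
> it a try.
>
> Also, since the visualization is based on natural-language word
> frequencies, applying it to Chinese would probably require people who work
> on word segmentation to get involved.
>
> Because the Chinese encyclopedia accommodates the different written
> vocabulary of various regions, this kind of word-segmentation research
> could also be regarded as an entirely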
