Article: Learning Concept Hierarchies from Text Corpora using Formal Concept Analysis

Aug 31, 2005, 9:52:39 PM
JAIR is pleased to announce the publication of the following article:

Cimiano, P., Hotho, A. and Staab, S. (2005)
"Learning Concept Hierarchies from Text Corpora using Formal Concept Analysis",
Volume 24, pages 305-339.

For quick access via your WWW browser, use this URL:

We present a novel approach to the automatic acquisition of taxonomies
or concept hierarchies from a text corpus. The approach is based on
Formal Concept Analysis (FCA), a method mainly used for the analysis
of data, i.e. for investigating and processing explicitly given
information. We follow Harris' distributional hypothesis and model
the context of a certain term as a vector representing syntactic
dependencies which are automatically acquired from the text corpus
with a linguistic parser. On the basis of this context information,
FCA produces a lattice that we convert into a special kind of partial
order constituting a concept hierarchy. The approach is evaluated by
comparing the resulting concept hierarchies with hand-crafted
taxonomies for two domains: tourism and finance. We also directly
compare our approach with hierarchical agglomerative clustering as
well as with Bi-Section-KMeans as an instance of a divisive clustering
algorithm. Furthermore, we investigate the impact of using different
measures weighting the contribution of each attribute as well as of
applying a particular smoothing technique to cope with data
sparseness.
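
To make the FCA step in the abstract concrete, here is a minimal sketch of
how formal concepts can be enumerated from a small term/attribute table.
It is an illustration only, not the authors' implementation: the toy
context, the attribute names, and the helper functions (formal_concepts,
subconcept_pairs, and so on) are all assumptions made for this example,
and the brute-force enumeration is only practical for tiny inputs. In the
article the attributes come from parsed syntactic dependencies; here they
are typed in by hand.

```python
from itertools import combinations

# Hypothetical toy context: each term maps to the attributes it occurs with.
# In the article these would be derived from a parsed corpus; here they are
# hand-written purely for illustration.
context = {
    "hotel": {"bookable"},
    "apartment": {"bookable", "rentable"},
    "car": {"bookable", "rentable", "driveable"},
    "bike": {"bookable", "rentable", "driveable", "rideable"},
    "excursion": {"bookable", "joinable"},
}


def common_attributes(objects, context, all_attributes):
    """Attributes shared by every object in `objects`."""
    shared = set(all_attributes)
    for obj in objects:
        shared &= context[obj]
    return frozenset(shared)


def objects_having(attributes, context):
    """Objects that carry every attribute in `attributes`."""
    return frozenset(o for o, attrs in context.items() if attributes <= attrs)


def formal_concepts(context):
    """Enumerate all (extent, intent) pairs by closing every object subset.

    Brute force over all subsets, so exponential in the number of objects.
    """
    all_attributes = set().union(*context.values())
    objects = list(context)
    concepts = set()
    for size in range(len(objects) + 1):
        for subset in combinations(objects, size):
            intent = common_attributes(subset, context, all_attributes)
            extent = objects_having(intent, context)
            concepts.add((extent, intent))
    return concepts


def subconcept_pairs(concepts):
    """(sub, super) pairs: a concept lies below another if its extent is a
    proper subset of the other's extent. This is the lattice order."""
    return {
        (low, high)
        for low in concepts
        for high in concepts
        if low[0] < high[0]
    }


concepts = formal_concepts(context)
order = subconcept_pairs(concepts)

# Print concepts from the most general (largest extent) to the most specific.
for extent, intent in sorted(concepts, key=lambda c: -len(c[0])):
    print(sorted(extent), "<->", sorted(intent))
```

Each printed pair is one node of the lattice: a set of terms together with
the attributes they all share. Ordering the nodes by extent inclusion, as
subconcept_pairs does, gives the partial order from which a concept
hierarchy could then be derived.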

The article is available via:

-- (also see

-- World Wide Web: The URL for our World Wide Web server is
For direct access to this article and related files try:

-- Anonymous FTP from Carnegie-Mellon University (USA):
The compressed PostScript file is named

For more information about JAIR, visit our WWW or FTP sites, or

Steven Minton
JAIR Managing Editor
