Title: Clustering and Structural Analysis of High Dimensional Data on Manifold
Abstract: The density and geometry of high-dimensional biological and image data (often assumed to lie on a noisy non-linear manifold) are leveraged in many clustering algorithms, but the interplay between them is often overlooked. The overarching goal is to harness this interplay to better understand the data and to improve clustering results.
The first part of the talk covers the "Balanced Centrality" of a graph and how it can be used to identify the most clusterable and well-shaped parts of the clusters. A new balanced centrality method, motivated by the concept of "Justified Representation" from social choice theory, will be introduced, along with a novel clustering algorithm based on it.
The second part of the talk covers the intrinsic structural analysis of the "core" of each cluster. Various experiments that reveal the underlying geometry of different parts of each cluster will be presented to explain the improvements in clustering results.