Submitted by pscully on Fri, 02/12/2010 - 10:31.
02/18/2010 - 16:10
02/18/2010 - 17:30
STA/BST 290: Fushing Hsieh (Statistics, UC Davis)
Time, temperature and data cloud geometry
Thursday, February 18th, 2010 at 4.10pm, MSB 1147 (Colloquium Room)
Refreshments: 3.30pm, MSB 4110 (Statistics Lounge)
Speaker: Fushing Hsieh (Statistics, UC Davis)
Title: Time, temperature and data cloud geometry
Abstract: We begin by discussing what constitutes the global geometric information of a possibly high-dimensional data cloud when no prior knowledge is available. We explain why the ideas of time and temperature are essential to extracting global information from a given similarity or affinity measurement. We recognize that two fundamental questions, 1) how many clusters are there? and 2) why is going from local to global possible?, must be embedded onto the temperature trajectory, while their resolutions can be developed by incorporating temporal dynamics through regulated random walks. At each temperature we construct an ensemble of regulated random walks to produce a node-removal recurrence time profile, which reveals not only the number of clusters but also cluster membership. Collectively, we are able to derive a distribution of the number of clusters and a connectivity matrix. By varying the temperature from very low to very high, we arrive at a trajectory of phase transitions that manifests the global geometry by revealing separated hard-core clusters and showing their emergence into soft-core clusters.
Comparisons with existing spectral clustering algorithms are made through computer experiments and real-data analysis.
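For readers who want a feel for how temperature can control what a random walk "sees" in a data cloud, the following is a rough illustrative sketch, not the speaker's regulated random walk. It assumes a Boltzmann-style weighting exp(-d/T) of pairwise distances (an assumption made here, not stated in the abstract) and omits node removal and recurrence times entirely. The function names and the toy data are hypothetical.

```python
import math
import random

def transition_matrix(dist, temperature):
    """Row-stochastic matrix with weights exp(-d_ij / T) and no self-loops.

    The exp(-d / T) weighting is an illustrative assumption.
    """
    n = len(dist)
    rows = []
    for i in range(n):
        weights = [0.0 if i == j else math.exp(-dist[i][j] / temperature)
                   for j in range(n)]
        total = sum(weights)
        rows.append([w / total for w in weights])
    return rows

def cross_cluster_fraction(points, labels, temperature, steps=2000, seed=0):
    """Fraction of random-walk steps that jump between differently labeled points."""
    rng = random.Random(seed)
    dist = [[math.dist(p, q) for q in points] for p in points]
    P = transition_matrix(dist, temperature)
    nodes = range(len(points))
    node, crossings = 0, 0
    for _ in range(steps):
        nxt = rng.choices(nodes, weights=P[node])[0]
        if labels[nxt] != labels[node]:
            crossings += 1
        node = nxt
    return crossings / steps

# Toy data cloud: two tight, well-separated groups (hypothetical example).
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels = [0, 0, 0, 1, 1, 1]

low = cross_cluster_fraction(points, labels, temperature=0.5)
high = cross_cluster_fraction(points, labels, temperature=100.0)
print(f"low T: {low:.3f}  high T: {high:.3f}")
```

At a low temperature the walk stays almost entirely inside one group, while at a high temperature it hops between groups freely, which is the qualitative separated-to-merged picture the abstract describes along the temperature trajectory.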