k-means, MML, mixtures

When I crashed Bovy's office today, he was working on a minimum message length (MML) application for the k-means data clustering algorithm. Because k-means is not usually thought of as a data model, it is a little strange to apply MML to it, but we are interested in comparing PCA to k-means and assessing scaling and other properties from the point of view of data compression or data summary. We also discussed our usual basket of topics, most notably implementing an MCMC-optimized mixture-of-Gaussians model, which would have some advantages over expectation maximization (EM): inferential ones, though not speed.
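To make the "k-means as data compression" idea concrete, here is a minimal sketch of a crude two-part message length for a k-means summary. It is not Bovy's formulation. Everything specific in it is my own assumption for illustration: the function names `kmeans` and `message_length_bits`, the fixed `bits_per_param` cost for each center coordinate, the single isotropic Gaussian used to code the residuals, the `precision` to which the data are quoted, and the deterministic farthest-point initialization.

```python
import numpy as np


def _assign(X, centers):
    """Index of the nearest center (squared Euclidean distance) for each row of X."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)


def kmeans(X, k, n_iter=100):
    """Plain Lloyd iterations with farthest-point initialization (an assumption)."""
    centers = [X[0]]
    for _ in range(1, k):
        # Next center: the point farthest from all centers chosen so far.
        d2 = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[d2.argmax()])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        labels = _assign(X, centers)
        new = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new, centers):
            break
        centers = new
    return centers, _assign(X, centers)


def message_length_bits(X, centers, labels, precision=0.01, bits_per_param=32):
    """Crude two-part code length, in bits, for X summarized by k-means.

    Part one states the centers and the per-point labels; part two states
    the residuals under one isotropic Gaussian, quantized to `precision`.
    """
    n, d = X.shape
    k = len(centers)
    resid = X - centers[labels]
    # Floor the variance so the residual cost never drops below the quantization scale.
    sigma2 = max(np.mean(resid ** 2), precision ** 2)
    model_bits = k * d * bits_per_param
    label_bits = n * np.log2(k)
    resid_bits = n * d * (0.5 * np.log2(2 * np.pi * np.e * sigma2) - np.log2(precision))
    return model_bits + label_bits + resid_bits


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    # Toy data: three well-separated blobs in two dimensions.
    X = np.vstack([rng.normal(loc=c, scale=0.5, size=(100, 2))
                   for c in ([0, 0], [10, 0], [0, 10])])
    for k in range(1, 7):
        centers, labels = kmeans(X, k)
        print(k, round(message_length_bits(X, centers, labels), 1))
```

The point of the sketch is only the trade-off: adding centers costs bits for the centers and the labels but saves bits on the residuals, so the total message length can be compared across values of k, or in principle against a PCA summary coded the same way.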
