Bovy has re-written all our "infer d-dimensional distribution functions when you have noisy data with missing values" code in C and it appears to be much faster than the (heroic) code written by Roweis and Blanton back in the day when we were all discovering the value of pair coding
. Bovy and I spent some time discussing split and merge
, which is a method for exploring mixture-of-gaussians models when you think you might be stuck in a local minimum.
Bovy and I also discussed the problem of comparing millions of SDSS spectra to one another in finite time. We figured out that the full N-squared calculation would take a year even if we coded it in machine language, so we want to do the full comparison only after trimming the tree with some reliable heuristics. We came up with a straw plan, but I am suspicious about its reliability (that is, we don't want to trim valid leaves) and effectiveness (that is, we want to massively speed things up).
No comments:
Post a Comment