probabilistic Cannon

The biggest conceptual issue with The Cannon (our data-driven model of stellar spectra) is that the system is a pure optimization (or frequentist, or estimator) system: We presume that the training-data labels are precise and accurate, and we obtain, for each test-set spectrum, best-fit labels. In reality, our labels are noisy; there are stars that could be used for training but have only partial labels (log g only, from asteroseismology, for example); and we don't have zero knowledge about the labels of the unlabeled spectra. This calls for Bayes. Foreman-Mackey drew a graphical model in the morning and suggested variational inference. Late in the afternoon, David Sontag (NYU) drew that same model and made the same suggestion! Sontag also pointed out that there are some new ideas in variational inference that might make this an interesting project in the computer-science-meets-statistics literature too. Any takers?
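To make the noisy-label worry concrete, here is a minimal toy sketch (not The Cannon code, and not the variational inference suggested above). It assumes a hypothetical setup with one label per star, one flux pixel per spectrum, a linear relation between them, and a known Gaussian noise level on the training labels; every name and number in it is made up for illustration. It shows that training as if noisy labels were exact biases the fitted slope low, and that even a simple moment-based correction, standing in here for a proper probabilistic treatment, recovers it approximately.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical toy problem: all values below are illustrative assumptions.
n_stars = 5000
true_offset, true_slope = 1.0, 2.0
label_sigma = 0.5  # assumed known noise on the training labels
flux_sigma = 0.1   # assumed noise on the observed flux

true_labels = rng.normal(0.0, 1.0, n_stars)
noisy_labels = true_labels + rng.normal(0.0, label_sigma, n_stars)
flux = true_offset + true_slope * true_labels + rng.normal(0.0, flux_sigma, n_stars)

# Estimator-style training: least squares treating the noisy labels as exact.
design = np.vstack([np.ones(n_stars), noisy_labels]).T
naive_offset, naive_slope = np.linalg.lstsq(design, flux, rcond=None)[0]

# Moment-based correction: the variance of the noisy labels is the true
# label variance plus label_sigma**2, so the naive slope is attenuated by
# the ratio (true variance) / (noisy variance).
noisy_var = np.var(noisy_labels)
reliability = (noisy_var - label_sigma**2) / noisy_var
corrected_slope = naive_slope / reliability

print(f"true slope:      {true_slope:.3f}")
print(f"naive slope:     {naive_slope:.3f}")
print(f"corrected slope: {corrected_slope:.3f}")
```

In this toy the correction needs only the label noise variance. Partial labels and prior knowledge about test-set labels have no such shortcut, which is part of why the fully probabilistic model is attractive.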
