The Cannon with noisy and missing labels

Dan Foreman-Mackey showed up in Heidelberg today. Hans-Walter Rix and I interviewed him about his ideas around training The Cannon when the training-set objects have noisy labels. We came up with some simple ideas, and Foreman-Mackey thinks it might even be possible to just sample the whole thing. I'm skeptical, but Jonathan Weare (Chicago) and Charles Matthews (Chicago) both said the same thing to me when I spoke in Chicago this past spring.

In related news, Anna-Christina Eilers (MPIA) is working with The Cannon in a context in which some of her training-set objects are completely missing some labels. We discussed how to simultaneously optimize the internal model parameters and the missing label values. It should work, but the conversation really reminded me of the regrettable point that The Cannon is just a maximum-likelihood system!
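
A minimal sketch of what that simultaneous optimization could look like, under loud assumptions: a toy model that is linear in the labels (not the real Cannon vectorizer), equal Gaussian noise on every pixel, and plain alternating least squares as the optimizer. The function names `design` and `fit_with_missing_labels` are hypothetical, not part of any Cannon code base. Each pass first fits the per-pixel coefficients with the current labels, then re-fits only the missing label entries with those coefficients held fixed, so the chi-squared cannot go up.

```python
import numpy as np

def design(labels):
    # ASSUMPTION: linear-in-labels design matrix [1, label_1, ..., label_K].
    return np.hstack([np.ones((labels.shape[0], 1)), labels])

def fit_with_missing_labels(flux, labels, missing, n_iter=50):
    """Alternate between model coefficients and missing label values.

    flux:    (N, P) training spectra
    labels:  (N, K) training labels; entries where `missing` is True are ignored
    missing: (N, K) boolean mask of missing label entries
    """
    labels = labels.copy()
    # Initialize each missing entry at the mean of the known values of that label.
    for k in range(labels.shape[1]):
        labels[missing[:, k], k] = labels[~missing[:, k], k].mean()
    chi2 = []
    for _ in range(n_iter):
        # Step 1: labels fixed, least-squares fit of coefficients at every pixel.
        theta, *_ = np.linalg.lstsq(design(labels), flux, rcond=None)  # (K+1, P)
        # Step 2: coefficients fixed, least-squares fit of the missing labels, star by star.
        for n in np.flatnonzero(missing.any(axis=1)):
            m = missing[n]
            # Subtract the offset and the contribution of the known labels.
            resid = flux[n] - theta[0] - labels[n, ~m] @ theta[1:][~m]
            sol, *_ = np.linalg.lstsq(theta[1:][m].T, resid, rcond=None)
            labels[n, m] = sol
        chi2.append(np.sum((flux - design(labels) @ theta) ** 2))
    return theta, labels, chi2

# Synthetic demonstration (all numbers here are made up for illustration).
rng = np.random.default_rng(0)
N, K, P = 200, 3, 50
true_labels = rng.normal(size=(N, K))
true_theta = rng.normal(size=(K + 1, P))
flux = design(true_labels) @ true_theta + 0.01 * rng.normal(size=(N, P))
missing = np.zeros((N, K), dtype=bool)
missing[:20, 2] = True  # the first 20 stars lack the third label
start = np.where(missing, np.nan, true_labels)
theta, labels, chi2 = fit_with_missing_labels(flux, start, missing)
```

This is still a point estimate: it returns single best-fit values for the coefficients and the missing labels, with no uncertainties on either.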

In a long conversation in Rix's office, Melissa Ness, Rix, and I drew a graphical model for stellar parameter estimation in the age of Gaia. Rix has an intuition that we are going to want to use different parameters when we are combining photometry, spectroscopy, and astrometry. I think he is right. Is our probabilistic graphical model publishable?
