how precise is The Cannon?

It was a low-research day! But Hans-Walter Rix (MPIA) called me to discuss the endless problem that we don't have uncertainties we believe on the labels output by The Cannon. The context is Ness's work to measure the abundance variability within open clusters (which are famously close to single-abundance populations).

Our formal uncertainties with The Cannon are tiny, but under-estimated because they don't properly account for the choices we made in optimizing the internals. Our cross-validation uncertainties are much better, but they are still over-estimates, because they effectively include systematic terms that go beyond precision. That is, if we only care about precision within a single cluster, the cross-validation uncertainty is an over-estimate. And we can see that empirically, because with a single-abundance fit we get chi-squared values that are much smaller than the number of degrees of freedom.
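To make the chi-squared argument concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the function name `single_abundance_chi2`, the synthetic cluster, and the made-up values of 0.02 dex for the true precision and 0.05 dex for the cross-validation uncertainty. It is not The Cannon's code. It fits one shared abundance to the cluster members by inverse-variance weighting and compares chi-squared to the degrees of freedom.

```python
import random


def single_abundance_chi2(abundances, sigmas):
    """Fit one shared abundance to cluster members; return (chi2, dof)."""
    weights = [1.0 / s ** 2 for s in sigmas]
    # The inverse-variance weighted mean is the best-fit single abundance.
    mean = sum(w * x for w, x in zip(weights, abundances)) / sum(weights)
    chi2 = sum(w * (x - mean) ** 2 for w, x in zip(weights, abundances))
    dof = len(abundances) - 1  # one fitted parameter
    return chi2, dof


# Synthetic single-abundance cluster (all numbers are made up).
rng = random.Random(42)
true_precision = 0.02  # dex: the scatter the measurements really have
cv_sigma = 0.05        # dex: an over-estimated cross-validation uncertainty
n_stars = 200
abundances = [rng.gauss(0.0, true_precision) for _ in range(n_stars)]

chi2, dof = single_abundance_chi2(abundances, [cv_sigma] * n_stars)
print(f"chi2 = {chi2:.1f} for {dof} degrees of freedom")
```

When the assumed uncertainties are too large by a factor f, chi-squared comes out near the degrees of freedom divided by f squared, which is the signature described above.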

My view is that we should use the open clusters themselves to set the uncertainties. This sounds circular: How can we estimate the intrinsic abundance spreads if we set our observational uncertainties assuming that the spreads are zero? But it isn't: For one, different open clusters are different in their abundance spreads. For two, there are long tails of abundance differences even in the best clusters. For three, even if there were neither of these effects, we would still get great upper limits!
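One hedged way to sketch this idea: because the observed variance in a cluster is the sum of the measurement variance and any intrinsic variance, the observed scatter is an upper limit on each. The example below estimates that scatter robustly, so that a long tail of discrepant stars does not dominate. The use of the median absolute deviation is my assumption here, not a method from the discussion, and the function name `robust_scatter` and all numbers are hypothetical.

```python
import random
import statistics


def robust_scatter(abundances):
    """Gaussian-equivalent scatter from the median absolute deviation."""
    med = statistics.median(abundances)
    mad = statistics.median([abs(x - med) for x in abundances])
    # 1.4826 converts a MAD to a standard deviation for Gaussian data.
    return 1.4826 * mad


# Synthetic cluster: a tight core plus a long tail (made-up numbers).
rng = random.Random(7)
core = [rng.gauss(0.0, 0.03) for _ in range(380)]  # dex
tail = [rng.gauss(0.0, 0.30) for _ in range(20)]   # dex
abundances = core + tail

sigma_upper = robust_scatter(abundances)
naive = statistics.stdev(abundances)
print(f"robust scatter = {sigma_upper:.3f} dex, naive stdev = {naive:.3f} dex")
```

The robust estimate stays close to the core scatter, while the naive standard deviation is inflated by the tail. Either way, the number is an upper limit on the per-star precision, not a measurement of it.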

The long-term solution is to go fully Bayesian. I became motivated to work on this now. I owe ideas about this to various people, including Jonathan Weare (Chicago).
