fitting a curve to data (on the sky)

With Price-Whelan and Jagannath, and also now with Lang for our April-Fools' attempt (may fail), we are facing the problem of comparing a line of points on the sky (stars or measurements) with a line of points generated by a model (a sampling of a curve). This is a classic issue in astronomy (or any science) and yet there is no widely accepted probabilistic description of this problem in general (that is, when the curve is arbitrary and can do things like cross and kink). In astronomy, if the curve is an orbit, it lives in phase space too, so really the comparison is being done in six-dimensional space, but where some of the dimensions are unmeasured for most of the points or stars.

Polemical notes I could make, but will restrain myself from making, include the following: There are many odd and wrong choices in the literature for comparing streams of stars to orbit calculations; we should correct those. In general, looking at the distance between a data point and the closest point on the curve is the wrong thing to do; the investigator should be marginalizing over the choice of point on the curve. Distance must be defined with a metric, which is like the inverse of the covariance matrix describing the data-point uncertainty, or the inverse of that covariance convolved with a model width or uncertainty. There is no difference in principle (or in practice) between missing data and badly measured data, so there is a covariance approach that deals with the 6-d fit on the 2-d sky if you think about it right.
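To make the marginalization point concrete, here is a minimal sketch in Python. It is my illustration under stated assumptions, not the approximate likelihood discussed in this post: I assume Gaussian uncertainties, a curve represented by K samples with equal prior weight each, and a model width that simply adds to the data covariance. The function names (`log_terms`, `log_like_marginal`, `log_like_closest`) and the toy curve are hypothetical. An unmeasured dimension is handled by giving it an enormous variance.

```python
import numpy as np

def log_terms(x, C, curve, V=None):
    """Gaussian log-density of data point x about each curve sample.

    x: (D,) data point; C: (D, D) data covariance; curve: (K, D) samples;
    V: optional (D, D) model width, added to C (convolving two Gaussians).
    """
    T = C if V is None else C + V
    delta = curve - x                                  # (K, D) offsets
    # chi-squared with the inverse covariance acting as the metric
    chi2 = np.einsum("kd,dk->k", delta, np.linalg.solve(T, delta.T))
    _, logdet = np.linalg.slogdet(T)
    return -0.5 * (chi2 + logdet + x.size * np.log(2.0 * np.pi))

def log_like_marginal(x, C, curve, V=None):
    """Marginalize over which sample generated x (equal prior weight each)."""
    lt = log_terms(x, C, curve, V)
    m = lt.max()                                       # stable log-sum-exp
    return m + np.log(np.exp(lt - m).sum()) - np.log(len(lt))

def log_like_closest(x, C, curve, V=None):
    """The closest-point shortcut: keep only the best-matching sample."""
    return log_terms(x, C, curve, V).max()

# Toy example (assumed, not real data): a 3-d curve, a point measured in 2-d.
t = np.linspace(0.0, 1.0, 200)
curve = np.stack([t, t**2, np.sin(t)], axis=1)
x = np.array([0.5, 0.4, 0.0])        # third entry is only a placeholder
C = np.diag([0.01, 0.01, 1e12])      # huge variance stands in for "unmeasured"
print(log_like_marginal(x, C, curve), log_like_closest(x, C, curve))
```

The two numbers differ because the closest-point version ignores how much of the curve is consistent with the data point. The huge-variance dimension changes the likelihood only by a constant that does not depend on the curve, so it drops out of any comparison between models.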

Lang and I spent much of the day discussing these issues and coming up with an approximate likelihood for the problem we are working on. I have so much to say about all these issues I should write a paper, but I almost certainly won't.


  1. Does it count as research if you'll never write it up?

  2. Steve: You ask a deep question!