model spectrum grids, combining data

Fouesneau made progress this week—and showed some results today—on a challenging project: Take a grid of stellar spectra (a grid in mass, age, reddening, metallicity, and so on), find groups of spectra that are (from our perspective) identical, combine them, and modify the code to deal. Identical from our perspective means cannot be distinguished with percent-level six-band photometry in the PHAT bands. Modify the code to deal means do all the index-wrangling so that we can do our raw chi-squared calculations on the minimal set of model spectra, but do all our probabilistic inference on the full set. Well not the minimal set, since finding that is (probably) NP-hard. Anyway, it is a non-trivial search problem (we are using ball trees!) and it is a non-trivial code change, but Fouesneau is close.
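The index-wrangling can be sketched roughly as follows. This is a toy illustration under loud assumptions, not Fouesneau's code: the names `group_models`, `full_grid_chi_squared`, and `frac_tol` are hypothetical, the grouping is a greedy brute-force pass (a ball tree would replace the inner loop in a real grid), and greedy grouping makes no attempt at the minimal set.

```python
def group_models(model_fluxes, frac_tol=0.01):
    """Greedily group models that agree within frac_tol in every band.

    Returns (rep_indices, rep_of): rep_indices holds the full-grid index of
    each representative, and rep_of[i] is the position in rep_indices of the
    representative standing in for full-grid model i.
    """
    rep_indices = []
    rep_of = []
    for i, flux in enumerate(model_fluxes):
        for k, j in enumerate(rep_indices):
            ref = model_fluxes[j]
            # "Identical" here means every band agrees at the percent level.
            if all(abs(f - r) <= frac_tol * abs(r) for f, r in zip(flux, ref)):
                rep_of.append(k)
                break
        else:
            # No existing representative is close enough: start a new group.
            rep_indices.append(i)
            rep_of.append(len(rep_indices) - 1)
    return rep_indices, rep_of


def chi_squared(data, sigma, model):
    """Raw chi-squared of one model against the observed band fluxes."""
    return sum(((d - m) / s) ** 2 for d, m, s in zip(data, model, sigma))


def full_grid_chi_squared(data, sigma, model_fluxes, rep_indices, rep_of):
    """Compute chi-squared on representatives only, then expand to the full grid."""
    rep_chi2 = [chi_squared(data, sigma, model_fluxes[j]) for j in rep_indices]
    return [rep_chi2[k] for k in rep_of]


# Toy six-band grid (made-up numbers): models 0 and 1 differ by ~0.1 percent.
model_fluxes = [
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    [1.001, 2.001, 3.001, 4.001, 5.001, 6.001],
    [2.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.5],
]
data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
sigma = [0.01 * d for d in data]  # percent-level photometry

rep_indices, rep_of = group_models(model_fluxes)
chi2_full = full_grid_chi_squared(data, sigma, model_fluxes, rep_indices, rep_of)
```

The expensive chi-squared call happens once per representative, while `chi2_full` still has one entry per model in the full grid, so priors and marginalizations over mass, age, and so on can be applied to every grid point.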

Over after-work drinks, Foreman-Mackey, Fouesneau, Weisz, and I discussed the (different) fitting project in which we combine spectroscopy and photometry. The question of weighting came up: How do we relatively weight the spectroscopy and the photometry, given that there are thousands of spectral pixels but only a few bands of imaging? The answer is: You don't! Weighting the chi-squared values before combining them is like taking the likelihood to a power, which makes little sense. The concerning issue is that we don't trust the spectroscopy as much as the imaging, so the larger number of pixels is disturbing (spectroscopy always dominates the chi-squared calculation). My answer is that you have to deal with that lack of trust by complexifying the model. The flexible spectrophotometric calibration functions we are fitting along with the spectral properties (see yesterday's post) parameterize our distrust, and also effectively downweight the spectra in their importance in the combined fit: A good chunk of the spectral information is being drawn away from the spectral properties and on to the nuisance parameters.
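To make the "likelihood to a power" point concrete, here is a small numerical sketch, assuming independent Gaussian noise and ignoring the normalization constants of the likelihood. The numbers and the weight `w` are made up for illustration. Multiplying the spectroscopic chi-squared by `w` is the same as raising the spectroscopic likelihood to the power `w`, which in turn matches inflating every spectroscopic uncertainty by a factor of `1 / sqrt(w)`.

```python
import math


def chi_squared(data, sigma, model):
    """Chi-squared for independent Gaussian uncertainties."""
    return sum(((d - m) / s) ** 2 for d, m, s in zip(data, model, sigma))


def log_like(data, sigma, model):
    """Gaussian log-likelihood, up to an additive constant."""
    return -0.5 * chi_squared(data, sigma, model)


# Made-up spectral pixels and photometric bands, compared to a flat toy model.
spec_data = [1.0, 1.2, 0.9, 1.1]
spec_sigma = [0.1, 0.1, 0.1, 0.1]
spec_model = [1.0, 1.0, 1.0, 1.0]

phot_data = [2.1, 2.9]
phot_sigma = [0.1, 0.1]
phot_model = [2.0, 3.0]

# The unweighted combination: log-likelihoods simply add.
log_like_total = (log_like(spec_data, spec_sigma, spec_model)
                  + log_like(phot_data, phot_sigma, phot_model))

# A hypothetical down-weighting of the spectroscopy by w ...
w = 0.25
weighted_chi2 = w * chi_squared(spec_data, spec_sigma, spec_model)

# ... equals the chi-squared with every sigma inflated by 1 / sqrt(w).
inflated_sigma = [s / math.sqrt(w) for s in spec_sigma]
inflated_chi2 = chi_squared(spec_data, inflated_sigma, spec_model)
```

Seen this way, an ad hoc weight is an unstated claim about the noise. Putting that distrust into explicit calibration or noise parameters says the same kind of thing, but in a form that can be fit and checked.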


  1. Hi David,

    we had this problem when fitting spectral energy distributions of the torus emission in AGN. Weighting the spectroscopic data less than the photometric data, as is usually done, is equivalent to increasing the variance of the spectroscopic data. In essence, this means that you trust the spectroscopic data less than the photometric data. In the unrealistic case that we have good estimates of the uncertainties in both the photometric and the spectroscopic data, there is no problem and one can do inference using the Gaussian likelihood. One can even try to apply a Bayesian weighting of the data to compensate for the number of points, but in the end everything reduces to having good estimates of the uncertainties.


  2. Gravitational lensing people have the same issue. Thousands of imaging pixels, and a couple of dynamics numbers, but intuitively the imaging isn't thousands of times more informative.

    The answer is that, according to the simplistic models we use, the imaging *is* thousands of times more informative. Get a better model.

    However, there is an interpretation of raising likelihoods to a power that makes sense, so you can continue to use the simplified model and believe the results. I would like to discuss this when I visit.