priors on quasar spectra

Hennawi, Tsalmantza, and I had a long conversation about why our likelihood optimization does better at measuring quasar redshifts than our posterior-PDF optimization. In the latter, we use a highly informative prior PDF: That any new quasar spectrum must look exactly like some quasar we have seen before. This is the hella-informative data-driven prior. It turns out it is too strict for our problem: We end up over-weighting quasar spectra that fit the continuum well, at the expense of the narrow features that best return the right redshift. This raises a great philosophical point, one I used to discuss with Roweis extensively: You don't necessarily want to model all of the features of your data well. You want to model well the parts of your data that matter most to your questions of interest. So if we want to use ultra-informative priors, we ought to also up-weight the informative features of the data, and remove the uninformative. This is done, traditionally, by filtering—which is terribly heuristic and hard to justify technically—but which has been done more quantitatively in some domains, once notably by Panter and collaborators in MOPED.
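For readers who want some background on MOPED, here is a minimal sketch of the idea for a single parameter of interest (the redshift). It is not the method used in the work described above. It assumes a toy spectrum (a power-law continuum plus one narrow Gaussian line, both defined in the rest frame), uncorrelated Gaussian noise, and hypothetical names and numbers throughout (`model_spectrum`, `moped_weights`, the line at 1549 Å, the noise level). The MOPED weight vector for one parameter is b = C⁻¹ ∂μ/∂z, normalized so that bᵀ C b = 1; the pixels with large weight are the ones that carry the redshift information.

```python
import numpy as np

def model_spectrum(wave, z, line_rest=1549.0, line_width=3.0):
    """Toy quasar spectrum (hypothetical): power-law continuum plus one
    narrow Gaussian emission line, both functions of rest wavelength."""
    rest = wave / (1.0 + z)
    continuum = (rest / 1500.0) ** -1.5
    line = 2.0 * np.exp(-0.5 * ((rest - line_rest) / line_width) ** 2)
    return continuum + line

def moped_weights(wave, z, sigma, dz=1e-4):
    """MOPED weight vector for a single parameter z, assuming a diagonal
    noise covariance C = diag(sigma**2)."""
    # Central finite-difference derivative of the mean spectrum w.r.t. z.
    dmu_dz = (model_spectrum(wave, z + dz) - model_spectrum(wave, z - dz)) / (2.0 * dz)
    cinv_dmu = dmu_dz / sigma**2                    # C^{-1} dmu/dz
    return cinv_dmu / np.sqrt(dmu_dz @ cinv_dmu)    # so that b^T C b = 1

# Assumed observed-frame wavelength grid (Angstroms) and per-pixel noise.
wave = np.linspace(3800.0, 9200.0, 2000)
sigma = np.full_like(wave, 0.1)
z_fid = 2.0

b = moped_weights(wave, z_fid, sigma)

# Fraction of the squared weight that sits within 50 Angstroms of the line.
line_obs = 1549.0 * (1.0 + z_fid)
near_line = np.abs(wave - line_obs) < 50.0
frac = np.sum(b[near_line] ** 2) / np.sum(b ** 2)
print(f"fraction of squared weight near the line: {frac:.3f}")

# The compressed datum for an observed spectrum `flux` would be y = b @ flux.
```

In this toy case nearly all of the weight lands on the narrow line and very little on the continuum, which is the quantitative version of up-weighting the informative features. A real application would need the full noise covariance and would have to handle nuisance parameters such as continuum amplitude and slope.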

1 comment:

  1. Hi David,

    This would be interesting from a physical perspective as well. Quasar practitioners know that there is a hierarchy of lines, some of which deliver better redshifts than others. For example, high ionization lines like CIV come from near the black hole and can trace outflows and/or winds that result in line asymmetries and/or self-absorption. Both result in poor redshifts. Balmer lines and low ionization lines tend to come from further out, and so are better tracers of the systemic frame. Low ionization lines are collisionally de-excited in the BLR, and hence are the best tracers of systemic. However, an information-theory approach might figure out how best to weight the various pieces of information. All of these line shifts are correlated, and the hope is that our models somehow understand those correlations, but my feeling is that our models are not as smart as they could be. We are only demanding a good fit, but we have not rewarded or penalized for accuracy about the thing we are most interested in, i.e. the redshift. Maybe you could send us some background on MOPED?