SVM, ML, hierarchical, and torques

Fadely dropped in on Brewer, Foreman-Mackey, and me for the day. He has very nice ROC curves for the performance of (1) a support vector machine (with the kernel trick), (2) our maximum-likelihood template method, and (3) our hierarchical Bayesian template method, for separating stars from galaxies using photometric data. His curves show various things, including the following: Hierarchical Bayesian inference always beats maximum likelihood (as we expect; it is almost provable). An SVM trained on a random subsample of the data beats everything (as we should expect; Bernhard Schölkopf at Tübingen teaches us this), probably because it is data-driven. An SVM trained much more realistically on the better part of the data (a high signal-to-noise subsample) does much worse than the hierarchical Bayesian method. This latter point is important: SVMs rock and are awesome, but if your training data differ in S/N from your test data, they can perform very badly. I would argue that this is the generic situation in astrophysics (because your well-labeled data are your best data).

In the afternoon, David Merritt (RIT) gave a great talk about stellar orbits near the central black hole in the Galactic Center. He showed that relativistic precession has a significant effect on the statistics of the random torques to which the orbits are subject (from stars on other orbits). This has big implications for the scattering of stars onto gravitational-radiation-relevant inspirals. He showed some extremely accurate N-body simulations in this hard regime (short-period orbits, intermediate-period precession, but very long-timescale random-walking and scattering).

1 comment:

  1. Regarding flexible (non-parametric) data-driven models, there is an enormous phase space to explore between the completely unrealistic "random sample" and the standard spectroscopic subsets that are currently employed as training data. We have shown the effects of using different follow-up strategies for semi-supervised photometric supernova classification (http://arxiv.org/abs/1103.6034); broadly, as you increase the magnitude limits of the follow-up survey, you observe a lot fewer SNe but classify unlabeled data a lot better! Of course, more sophisticated follow-up strategies, such as active learning, have begun to permeate the literature.

    I recently attended a workshop at KICP for photometric SN classification, and there was lengthy discussion about template fitting vs. non-parametric models. This was a follow-up workshop to the SN Classification Challenge that was run by DES people last year. I was pleasantly surprised that people there were aware of the limitations of template fitting and open to other (or hybrid) approaches. My statement that 'the more we observe, the more we realize that we don't understand' was well taken!