Group meeting featured discussions led by our two distinguished visitors this week, Brendon Brewer (Auckland) and Iain Murray (Edinburgh). Brewer described how he and Foreman-Mackey intend to re-do some exoplanet population inference with something that he calls "joint-space sampling" and in the context of what you might call "likelihood-free inference" (although Brewer objects to even that label) or "approximate Bayesian computation" (a label I despise, because aren't all Bayesian computations approximate?).
The idea is that we have the counts of transiting systems with 0, 1, 2, or more transiting planets. What is the true distribution of planetary-system multiplicities implied by those counts? Brewer calls it joint-space sampling because answering this question requires sampling in the population parameters and in all the parameters of all the individual systems, jointly. The result of the posterior inference, of course, depends on everything we assume about the systems (radius and period distributions, detectability, and so on). One point we discussed is what is lost or gained by restricting the data: in principle we should always use all the data, as opposed to just the summary statistics (the counts of systems). That said, the approach of Brewer and Foreman-Mackey is going to be fully principled, subject to the (strange) constraint that all you get (as data) are the counts.
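To make the joint-space idea concrete, here is a toy sketch, not Brewer and Foreman-Mackey's actual model: I invent a Poisson multiplicity distribution, a single known detection probability, and all the numbers. The point is only that the sampler state really does contain the population parameter and every system's latent multiplicity at once:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: per-system detected planet counts.  All numbers invented;
# the real analysis would use the Kepler counts and a detectability
# model depending on radius and period.
true_lam, p_detect, n_sys = 2.0, 0.3, 2000
m_true = rng.poisson(true_lam, n_sys)    # latent true multiplicities
d_obs = rng.binomial(m_true, p_detect)   # observed (detected) multiplicities

# Joint-space Gibbs sampler: the state is (lam, m_1..m_n), that is,
# the population parameter AND every system's latent multiplicity.
a, b = 1.0, 1.0   # Gamma prior on the population mean multiplicity lam
lam, chain = 1.0, []
for step in range(3000):
    # m_i | d_i, lam : thinned-Poisson conjugacy gives
    # m_i = d_i + Poisson(lam * (1 - p_detect))
    m = d_obs + rng.poisson(lam * (1.0 - p_detect), n_sys)
    # lam | m : Gamma-Poisson conjugacy
    lam = rng.gamma(a + m.sum(), 1.0 / (b + n_sys))
    chain.append(lam)

print(np.mean(chain[500:]))  # posterior mean, close to true_lam = 2.0
```

In this toy both conditional updates happen to be conjugate; in the real problem the per-system updates would themselves be expensive MCMC moves, which is what makes the joint space so large.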
Murray followed this up by suggesting a change to the ABC or LFI methods we usually use. Usually you do adaptive sampling from the prior, and reject samples that don't reproduce the data (accurately enough). But since you have done lots of data simulations anyway, you could instead learn the conditional probability of the parameters given the data, and evaluate it at the value of the data you have. His point is that, in general, you can learn these conditional probabilities with deep learning (as his RNADE code and method does routinely).
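A toy illustration of the contrast (everything here is invented for the example; a real application would replace the least-squares stand-in with a neural conditional density estimator like RNADE):

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy forward model: the "data" is the mean of 10 draws from
# Normal(theta, 1).  One observed dataset:
x_obs = 1.5

# Simulate many (theta, data) pairs from prior times forward model.
thetas = rng.normal(0.0, 2.0, size=200_000)   # draws from the prior
xs = rng.normal(thetas[:, None], 1.0, size=(200_000, 10)).mean(axis=1)

# (1) Classic rejection ABC: keep only the near-matches.
accepted = thetas[np.abs(xs - x_obs) < 0.05]

# (2) Murray's suggestion: use ALL the simulations to learn
# p(theta | x), then evaluate at x_obs.  As a crude stand-in for a
# neural density estimator such as RNADE, fit a linear-Gaussian
# conditional by least squares (adequate for this conjugate toy).
A = np.vstack([np.ones_like(xs), xs]).T
coef, *_ = np.linalg.lstsq(A, thetas, rcond=None)
resid_sd = np.std(thetas - A @ coef)
mu_learned = coef[0] + coef[1] * x_obs   # learned E[theta | x_obs]

# Both should agree with the analytic posterior mean, about 1.46.
print(accepted.mean(), mu_learned, resid_sd)
```

Note that approach (2) throws away no simulations, which is exactly Murray's efficiency argument against rejection.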
Murray also told us about a project with Huppenkothen and Brewer to make a hierarchical generalization of our Magnetron project published here. In this, they hope to hierarchically infer the properties of all bursts (and the mixture components or words that make them up). The challenge is to take the individual-burst inferences and combine them subsequently. That's a common problem here at CampHogg; the art is deciding where to "split" the problem into separate inferences, and how to preserve enough density of samples (or whatever) to pass on to the data-combination stage.
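The split-then-combine step can be sketched like this (a toy Gaussian stand-in, not the actual Magnetron model; it assumes the per-burst sampling was done under a flat interim prior, so the hierarchical reweighting reduces to evaluating the population density at the stored samples):

```python
import numpy as np

rng = np.random.default_rng(2)

# Fake per-burst results: each of 50 "bursts" was fit separately
# (under a flat interim prior), leaving K = 500 posterior samples of
# a single per-burst parameter.  Toy Gaussian numbers throughout.
true_mu, true_sd, sigma_obs = 3.0, 0.5, 1.0
n_bursts, K = 50, 500
truths = rng.normal(true_mu, true_sd, n_bursts)
obs = truths + rng.normal(0.0, sigma_obs, n_bursts)
samples = obs[:, None] + rng.normal(0.0, sigma_obs, (n_bursts, K))

def log_hyper_like(mu, sd):
    """Combination stage: with a flat interim prior, the marginal
    likelihood of the hyperparameters is, per burst, the average of
    the population density N(sample | mu, sd) over that burst's
    stored posterior samples."""
    w = np.exp(-0.5 * ((samples - mu) / sd) ** 2) / sd
    return np.log(w.mean(axis=1)).sum()

# Crude grid posterior over the population mean (sd held fixed here,
# just to keep the example short).
mus = np.linspace(2.0, 4.0, 81)
logp = np.array([log_hyper_like(m, true_sd) for m in mus])
print(mus[np.argmax(logp)])  # peaks near true_mu = 3.0
```

The "preserve enough density of samples" worry shows up directly here: if K is too small, or the population density lands in the tails of a burst's samples, the per-burst averages get noisy and the combination stage degrades.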
This was followed by Malz telling us about his (only just started) project to infer the redshift density of galaxies given noisy photometric-redshift measurements. We realized in the conversation that we should think about multiple representations by which surveys could output probabilistic redshift information, including quantile lists, which are fixed-length representations of pdfs. As I was saying that we need likelihoods but most surveys produce posteriors, Murray pointed out that posteriors are in general much easier to produce and represent (very true), so we should really think about how we can work with them. I agree completely.
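A quantile list in the sense above might look like the following (the toy distribution and the quantile grid are both my own inventions, not anything a survey actually outputs):

```python
import numpy as np

rng = np.random.default_rng(3)

# Stand-in for one galaxy's photometric-redshift posterior: samples
# from a skewed toy distribution (invented, not a real photo-z pdf).
z_samples = rng.gamma(4.0, 0.25, size=100_000)

# The quantile-list representation: the redshifts at a fixed grid of
# quantiles.  Every galaxy is stored as the same number of floats.
qs = np.linspace(0.05, 0.95, 19)
z_at_q = np.quantile(z_samples, qs)

# Reconstruction: interpolating the stored (z, q) pairs recovers an
# approximate CDF (and, by differencing, a pdf) from the list.
def cdf_approx(z):
    return np.interp(z, z_at_q, qs)

median_back = np.interp(0.5, qs, z_at_q)  # median, straight off the grid
print(z_at_q[0], median_back, cdf_approx(1.0))
```

The appeal for surveys is the fixed length: every galaxy's pdf becomes the same short vector of floats, however complicated its shape, which is much easier to store and serve than samples or per-object functional fits.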