sampling and degeneracies

Brendon Brewer (UCSB), Foreman-Mackey, and I worked on sampling for most of the day, with some calls to Lang. I want to start a challenge called MCMC High Society, a bake-off for samplers that can handle ten thousand parameters. So far, on the benchmark problem we have, Brewer's DNest sampler is crushing emcee! That might not be surprising, because we deliberately chose a problem with massive combinatoric degeneracies (separated by large low-probability valleys), and emcee is not awesome in that situation. Insights from the day include the following:

- You should measure your sampler's autocorrelation time in the data space (prediction space), not the parameter space (at least if you aren't a realist, and I am not).

- Ensemble samplers (like emcee) that build proposals from pairs of walkers should keep track of the acceptance fraction as a function of the walker pair involved in each proposal.

- Multi-modality in the posterior is the norm, not the exception, and it is hard for any sampler to handle; that's the whole deal.

- Samplers that estimate the Bayes integral as they sample might be slow on easy problems, but they might be much better on hard problems, and it is hard problems that are game changers.
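A toy numpy illustration of the prediction-space point (the chain construction and names are mine, not anything we ran today): two parameters share a badly mixing degenerate direction, but their sum, standing in for a predicted data value, mixes fast, so the parameter-space autocorrelation time is far more pessimistic than the prediction-space one.

```python
import numpy as np

def integrated_autocorr_time(x, c=5):
    """Integrated autocorrelation time of a 1-D chain, via the FFT
    autocovariance and Sokal's automatic windowing heuristic
    (stop at the first lag M with M >= c * tau(M))."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    n = len(x)
    f = np.fft.fft(x, n=2 * n)
    acf = np.fft.ifft(f * np.conjugate(f))[:n].real
    acf /= acf[0]
    taus = 2.0 * np.cumsum(acf) - 1.0
    for m in range(1, n):
        if m >= c * taus[m]:
            return taus[m]
    return taus[-1]

def ar1(n, rho, rng):
    """Gaussian AR(1) series with lag-1 correlation rho."""
    x = np.empty(n)
    x[0] = rng.normal()
    for t in range(1, n):
        x[t] = rho * x[t - 1] + np.sqrt(1 - rho**2) * rng.normal()
    return x

rng = np.random.default_rng(7)
n = 20000
fast = ar1(n, 0.0, rng)    # well-mixed direction (the "prediction")
slow = ar1(n, 0.98, rng)   # degenerate direction, tau ~ (1+rho)/(1-rho) ~ 99
a = 0.5 * (fast + slow)    # two correlated parameters, each polluted
b = 0.5 * (fast - slow)    # by the slow degenerate direction

tau_param = integrated_autocorr_time(a)      # sees the slow direction
tau_pred = integrated_autocorr_time(a + b)   # a + b == fast: mixes quickly
print(tau_param, tau_pred)
```

Judged in parameter space the chain looks terrible; judged on the prediction it is fine, which is the point.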
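For the walker-pair bookkeeping, here is a minimal serial stretch-move sampler (the Goodman & Weare 2010 move that emcee uses) extended with a per-pair acceptance matrix; the function and variable names are illustrative, not emcee's API.

```python
import numpy as np

def stretch_move_with_pair_stats(log_prob, p0, nsteps, a=2.0, seed=0):
    """Toy affine-invariant stretch-move ensemble sampler that also
    records, for every (walker k, helper walker j) pair, how often
    proposals built from that pair were accepted."""
    rng = np.random.default_rng(seed)
    walkers = np.array(p0, dtype=float)
    nwalk, ndim = walkers.shape
    logp = np.array([log_prob(w) for w in walkers])
    accepted = np.zeros((nwalk, nwalk))  # accepted[k, j]: k moved, j helped
    proposed = np.zeros((nwalk, nwalk))
    for _ in range(nsteps):
        for k in range(nwalk):
            j = rng.integers(nwalk - 1)
            j = j + (j >= k)                         # helper walker j != k
            z = (1 + (a - 1) * rng.random())**2 / a  # stretch, g(z) ~ 1/sqrt(z)
            prop = walkers[j] + z * (walkers[k] - walkers[j])
            lp = log_prob(prop)
            proposed[k, j] += 1
            # acceptance includes the z^(ndim-1) Jacobian factor
            if np.log(rng.random()) < (ndim - 1) * np.log(z) + lp - logp[k]:
                walkers[k], logp[k] = prop, lp
                accepted[k, j] += 1
    with np.errstate(invalid="ignore"):
        frac = accepted / proposed  # NaN where a pair was never proposed
    return walkers, frac

log_gauss = lambda x: -0.5 * np.sum(x**2)
p0 = np.random.default_rng(3).normal(size=(8, 3))
walkers, frac = stretch_move_with_pair_stats(log_gauss, p0, nsteps=300)
print(np.nanmean(frac))  # overall acceptance; frac resolves it by pair
```

On a problem with isolated modes, rows or columns of `frac` that collapse toward zero would flag walker pairs stranded across a low-probability valley.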
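On estimating the Bayes integral as you sample: the simplest member of that family is Skilling-style nested sampling (DNest is a far more capable diffusive variant). This sketch, whose rejection-sampling inner loop only works on easy low-dimensional problems, shows the shape of the computation; all names here are mine.

```python
import numpy as np

def nested_sampling_logZ(log_like, sample_prior, nlive=100, nsteps=600, seed=1):
    """Toy nested sampling: estimate log Z = log integral of L(x) pi(x) dx
    by shrinking the enclosed prior mass X by ~exp(-1/nlive) per step and
    summing the shell contributions w_i * L_i."""
    rng = np.random.default_rng(seed)
    live = np.array([sample_prior(rng) for _ in range(nlive)])
    logL = np.array([log_like(p) for p in live])
    logZ = -np.inf
    log_w = np.log(1.0 - np.exp(-1.0 / nlive))  # first shell's prior mass
    for _ in range(nsteps):
        worst = np.argmin(logL)
        logZ = np.logaddexp(logZ, log_w + logL[worst])
        # replace the worst live point by a prior draw above its likelihood;
        # real samplers do this step cleverly instead of by rejection
        while True:
            p = sample_prior(rng)
            if log_like(p) > logL[worst]:
                break
        live[worst], logL[worst] = p, log_like(p)
        log_w -= 1.0 / nlive
    # remaining prior mass times the mean likelihood of the live points
    m = logL.max()
    log_mean_L = m + np.log(np.mean(np.exp(logL - m)))
    return np.logaddexp(logZ, -nsteps / nlive + log_mean_L)

# Unit Gaussian likelihood, uniform prior on [-5, 5]: Z ~ 1/10
log_like = lambda x: -0.5 * x**2 - 0.5 * np.log(2 * np.pi)
sample_prior = lambda rng: rng.uniform(-5.0, 5.0)
print(nested_sampling_logZ(log_like, sample_prior))  # near log(0.1), with MC scatter
```

The work here goes into the evidence, which a plain MCMC never gives you; that is the overhead on easy problems and the payoff on hard ones.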
