MCMC initialization and convergence

Fouesneau (UW) and I discussed and adjusted the initialization for his ensemble-sampling (emcee) fits of King models to young stellar clusters in the PHAT data. Our consistent experience is that you should initialize the ensemble in a small ball in parameter space and then burn it in until it is a fair sampling. We also looked at the autocorrelation times, which are not stably measured in short chains; the estimates only settle down once the chains are long enough that the sampling is properly converged. All of the experience we have developed in MCMC sampling for inference in typical astronomy problems ought to be passed on to the community somewhere! After Foreman-Mackey finishes his current exoplanet paper, we might take a couple weeks and write the how to do MCMC document.
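
To make the small-ball recipe concrete, here is a minimal, dependency-free sketch. It is not emcee's code: it re-implements a stretch-move ensemble sampler (the kind of move emcee uses by default) in plain Python, and everything named here is an assumption for illustration: the unit-Gaussian `log_prob` stands in for a real posterior such as the King-model one, and `small_ball`, `run`, and `autocorr_time` are hypothetical helpers. The autocorrelation estimate uses a self-consistent window (stop summing once the lag exceeds a few times the running estimate), which is one common choice, not the only one.

```python
import math
import random


def log_prob(theta):
    # Hypothetical target: independent unit Gaussians (stand-in for a real posterior).
    return -0.5 * sum(t * t for t in theta)


def small_ball(guess, nwalkers, scale, rng):
    # Scatter walkers in a tight Gaussian ball around a single initial guess.
    return [[g + scale * rng.gauss(0.0, 1.0) for g in guess] for _ in range(nwalkers)]


def run(log_prob, walkers, nsteps, rng, a=2.0):
    # Stretch-move ensemble sampler, updating walkers one at a time.
    walkers = list(walkers)
    ndim = len(walkers[0])
    logp = [log_prob(w) for w in walkers]
    chain = []
    for _ in range(nsteps):
        for k in range(len(walkers)):
            j = rng.randrange(len(walkers) - 1)
            if j >= k:
                j += 1  # partner walker, never k itself
            # Draw z with density proportional to 1/sqrt(z) on [1/a, a].
            z = ((a - 1.0) * rng.random() + 1.0) ** 2 / a
            prop = [xj + z * (xk - xj) for xk, xj in zip(walkers[k], walkers[j])]
            lp = log_prob(prop)
            log_accept = (ndim - 1) * math.log(z) + lp - logp[k]
            if math.log(1.0 - rng.random()) < log_accept:
                walkers[k], logp[k] = prop, lp
        chain.append(list(walkers))  # inner lists are replaced, never mutated
    return chain


def autocorr_time(series_list, c=5.0):
    # Integrated autocorrelation time from the walker-averaged autocorrelation
    # function. Returns (tau, windowed); windowed is False when the chain was
    # too short for the window criterion, i.e. the estimate is not trustworthy.
    n = len(series_list[0])
    centered = []
    for x in series_list:
        m = sum(x) / n
        d = [v - m for v in x]
        centered.append((d, sum(v * v for v in d)))
    tau = 1.0
    for lag in range(1, n // 2):
        rho = sum(
            sum(d[i] * d[i + lag] for i in range(n - lag)) / norm
            for d, norm in centered
        ) / len(centered)
        tau += 2.0 * rho
        if lag >= c * tau:
            return tau, True
    return tau, False


rng = random.Random(42)
nwalkers = 20
start = small_ball([0.5, -0.5], nwalkers, scale=1e-4, rng=rng)
burn_in = run(log_prob, start, 500, rng)       # discard these steps
chain = run(log_prob, burn_in[-1], 1500, rng)  # keep these

# Per-walker time series of the first parameter.
series = [[step[k][0] for step in chain] for k in range(nwalkers)]
tau, windowed = autocorr_time(series)
pooled = [v for s in series for v in s]
mean = sum(pooled) / len(pooled)
var = sum((v - mean) ** 2 for v in pooled) / len(pooled)
print(f"tau = {tau:.1f} (windowed: {windowed}), mean = {mean:.2f}, var = {var:.2f}")
```

If `windowed` comes back `False`, or `tau` is not small compared to the chain length, the sensible reading is that the chain is too short to say anything about convergence yet, which is the instability described above.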


  1. A proper initialization before running MCMC should make convergence to the target distribution faster and reduce your burn-in time. For the Kaggle Dark Worlds competition, I randomly generated 5,000 sample halos and chose the one that maximized the log-posterior before starting the MCMC chain. It makes a big difference! My question for you: is there an a priori method for guesstimating your burn-in time before running the sampler?

    Iain Murray's approach to the Dark Worlds competition serves as a good primer for doing MCMC properly. Feel free to browse my IPython notebook, which was inspired by his approach: https://bitly.com/103OS2k

  2. "After Foreman-Mackey finishes his current exoplanet paper, we might take a couple weeks and write the how to do MCMC document."

    Is that the same as the "WTF is MCMC" document that's on GitHub?

    BTW the small-ball-around-good-guess approach works great in emcee, but not necessarily for other things. And sometimes finding a good guess is really hard.

  3. If all the initializations are from a small ball, aren't you worried about fooling yourself? If there are multiple ideas for initializing, I would try them all. Do the statistics/predictions of the chains agree? If not, you'll have to do something about it. One idea is to improve mixing somehow. Another idea is to "burn in" with annealed importance sampling (AIS) and weight the results from each run with its AIS weight.

    (BTW I was in "quick hack" mode with my Kaggle Dark Worlds submission. It's not an example of best MCMC practice. I'm sure someone like Brendon or DFM would have done it better!)

  4. Iain: Yes, we try all the things we can think of. But if someone doesn't want to go through the pain, and is using emcee (i.e., this particular ensemble sampler), and just wants an answer today, we advise small ball. *Everyone* is in "quick hack" mode when it comes to astronomy!
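
One way to make the "do the chains agree?" check from comment 3 concrete is the Gelman-Rubin statistic, which compares the scatter between independently initialized runs to the scatter within each run. The sketch below is only an illustration under stated assumptions: it uses a plain random-walk Metropolis sampler rather than emcee, and the `unimodal` and `bimodal` targets, the start points, and the helper names are all hypothetical. Values close to 1 are consistent with the runs agreeing (not proof of convergence); values well above 1 mean the runs disagree and something has to be done about it.

```python
import math
import random
import statistics


def metropolis(log_prob, x0, nsteps, step, rng):
    # Plain 1-D random-walk Metropolis; returns the full chain.
    x, lp = x0, log_prob(x0)
    out = []
    for _ in range(nsteps):
        y = x + step * rng.gauss(0.0, 1.0)
        lpy = log_prob(y)
        if math.log(1.0 - rng.random()) < lpy - lp:
            x, lp = y, lpy
        out.append(x)
    return out


def gelman_rubin(chains):
    # chains: equal-length lists of one scalar parameter, one list per run.
    n = len(chains[0])
    means = [statistics.fmean(c) for c in chains]
    within = statistics.fmean([statistics.variance(c) for c in chains])
    between_over_n = statistics.variance(means)
    var_hat = (n - 1) / n * within + between_over_n
    return math.sqrt(var_hat / within)


def unimodal(x):
    # Hypothetical easy target: unit Gaussian.
    return -0.5 * x * x


def bimodal(x):
    # Hypothetical hard target: unit Gaussians at -10 and +10 (log-sum-exp).
    a, b = -0.5 * (x - 10.0) ** 2, -0.5 * (x + 10.0) ** 2
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


rng = random.Random(0)
starts = [-12.0, -8.0, 8.0, 12.0]  # deliberately different initializations


def rhat_for(target):
    runs = [metropolis(target, x0, 4000, 2.0, rng) for x0 in starts]
    kept = [r[2000:] for r in runs]  # drop the first half as burn-in
    return gelman_rubin(kept)


rhat_easy = rhat_for(unimodal)
rhat_hard = rhat_for(bimodal)
print(f"R-hat unimodal: {rhat_easy:.3f}, bimodal: {rhat_hard:.1f}")
```

On the bimodal target each run stays stuck in the mode nearest its starting point, so the statistic comes out far from 1; a set of walkers all started in one small ball would never reveal this on its own.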