MaxEnt 2013, day 1

Today was the first day of MaxEnt 2013 in Canberra, Australia. There were many great talks, including talks on exponential-family probability distributions and their generalizations, image reconstruction from geophysical and medical imaging, model selection via marginalized likelihood, and inference and decision making for networks and flow. There were also amusing discussions of the "surprise test paradox" and the "black raven paradox" and other simple probability arguments that are very confusing. These conversations carried into dinner, at which Iain Murray (Edinburgh) and Brewer and I argued about their relevance to our understanding of inference.

The most productive part of the day for me was lunch (and a bit beyond), during which Murray, Brewer, and I argued with various attendees about the topics I brought to Australia to discuss with Murray. Among the things that came up were GPLVM (by Neil Lawrence) as a possible tool for Vakili and me on galaxy image priors, Gaussian Processes for not just function values but also derivatives (including higher derivatives) and the potential this has for "data summary", and MCMC methods to find and explore badly multi-modal posterior pdfs. We spent some significant time discussing how to make likelihood calculation more efficient or adaptive. In particular, Murray pointed out that if you are using the likelihood in a Metropolis–Hastings accept/reject step, how precisely you need to know it depends on the value of the random number draw; in principle the draw should be passed into the likelihood function! We also spent some time talking about how large-scale structure is measured. Murray had some creative ideas about how to make better use of the existing cosmological simulations in the data analysis.
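To make the point about passing the random draw into the likelihood concrete, here is a minimal sketch in Python. It is a toy and not anything that was shown at the meeting. It assumes a flat prior, a symmetric Gaussian proposal, and a Gaussian log-likelihood that is a sum of non-positive terms, so that any partial sum is an upper bound on the full value. The names `log_like_terms`, `lazy_log_like`, and `metropolis` are hypothetical. The uniform draw is made first and turned into a threshold; the likelihood routine stops summing as soon as its upper bound falls below that threshold, because at that point rejection is already certain.

```python
import math
import random


def log_like_terms(theta, data, sigma=1.0):
    """Yield per-datum Gaussian log-likelihood terms (constants dropped).

    Every term is <= 0, so a partial sum is an upper bound on the full sum.
    """
    for d in data:
        yield -0.5 * ((d - theta) / sigma) ** 2


def lazy_log_like(theta, data, threshold):
    """Sum terms until the upper bound drops below `threshold`.

    Returns (log_like, n_terms_used); log_like is None on early rejection.
    """
    total = 0.0
    for n, term in enumerate(log_like_terms(theta, data), start=1):
        total += term
        if total < threshold:
            return None, n  # the full sum can only be smaller: reject now
    return total, len(data)


def metropolis(data, theta0, n_steps, step=0.5, seed=0):
    """Metropolis sampler (flat prior assumed) that draws u before the likelihood."""
    rng = random.Random(seed)
    theta = theta0
    # The current point needs its exact log-likelihood for later comparisons.
    ll, _ = lazy_log_like(theta, data, -math.inf)
    chain, terms_used = [], 0
    for _ in range(n_steps):
        proposal = theta + rng.gauss(0.0, step)
        log_u = math.log(1.0 - rng.random())  # u in (0, 1], so log is finite
        # Accept iff log L(proposal) >= log L(current) + log u.
        threshold = ll + log_u
        new_ll, n = lazy_log_like(proposal, data, threshold)
        terms_used += n
        if new_ll is not None:
            theta, ll = proposal, new_ll
        chain.append(theta)
    return chain, terms_used
```

In this sketch an accepted proposal always costs a full likelihood evaluation, since the exact value is needed at later steps; any saving comes only from proposals that can be rejected early. The accept/reject decisions are identical to those of an ordinary Metropolis sampler given the same random numbers.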


  1. Can you actually get much from the less-accurate likelihood? The main saving is going to be if the proposed point is better, so you only need to know that before accepting, but then you need the likelihood at that point as part of the computation for every step until the next acceptance.

    1. Good point: the implementation can be cumbersome if only bounding the likelihood of accepted points, and one might have to go back and refine them. I don't do it. Hogg had a situation where refining the likelihood computation to high precision looked *expensive*, but could easily be successively bounded. It seemed a good fit to the idea I learned from Gareth Roberts.

      There is a saving from noticing that the random draw is not extreme enough to accept a bad proposal. Acceptance probabilities (depending on method) are often <0.5, allowing proposals with large step-sizes. (For standard dumb Metropolis, 0.234 is an accepted target rate.)

      A related idea is to make larger sloppy proposals, but reject many based on any cheap approximation (not necessarily a bound) before checking properly. References to two-stage accept rules were reviewed in §2.1.3, p30 of my PhD dissertation. A slice sampling variant was buried on p40. http://homepages.inf.ed.ac.uk/imurray2/pub/07thesis/murray_thesis_2007.pdf
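A minimal sketch of a two-stage accept rule of the kind mentioned in the last comment. This is the standard "delayed acceptance" form for a symmetric proposal, which may differ in detail from the variants reviewed in the thesis. The function name `two_stage_chain` and its arguments are hypothetical, and `log_p_cheap` stands for any cheap approximation to the expensive log-density `log_p` (it does not need to be a bound). A proposal is first screened using only the cheap approximation; the expensive density is evaluated only for proposals that survive, and the second-stage ratio divides out the first-stage ratio so that the chain still targets `log_p`.

```python
import math
import random


def two_stage_chain(log_p, log_p_cheap, x0, n_steps, step=2.0, seed=0):
    """Two-stage Metropolis with a symmetric Gaussian proposal.

    Returns (chain, n_expensive), where n_expensive counts calls to log_p.
    """
    rng = random.Random(seed)
    x, lp, lc = x0, log_p(x0), log_p_cheap(x0)
    chain, n_expensive = [], 1
    for _ in range(n_steps):
        x_new = x + rng.gauss(0.0, step)
        # Stage 1: screen with the cheap approximation only.
        lc_new = log_p_cheap(x_new)
        log_r1 = lc_new - lc
        if math.log(1.0 - rng.random()) <= log_r1:
            # Stage 2: pay for the expensive density and correct for stage 1.
            lp_new = log_p(x_new)
            n_expensive += 1
            log_r2 = (lp_new - lp) - log_r1
            if math.log(1.0 - rng.random()) <= log_r2:
                x, lp, lc = x_new, lp_new, lc_new
        chain.append(x)
    return chain, n_expensive
```

For example, with `log_p = lambda x: -0.5 * x * x` and a deliberately wrong approximation such as `log_p_cheap = lambda x: -0.5 * (x / 1.3) ** 2`, the samples should still follow a unit Gaussian, while `n_expensive` stays below the number of steps because some proposals are rejected at the first stage.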