I gave a second talk today at Computing the Universe, this time on nuisance-parameter models. I continued my discussion of Gaussian processes and then made the following points: You must fit your nuisance parameters simultaneously with your parameters of interest, and you must marginalize them out; you don't want to estimate anything you don't care about. Don't fit and subtract, or you will be subject to over- or under-fitting. The most conservative thing you can do is implement a model with a huge number of free parameters; it is not conservative to restrict the freedom of the nuisance model. Gaussian processes need not be processes of spatial position or time only; they can be processes of other features as well.
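To make the simultaneous-fit-and-marginalize point concrete, here is a minimal sketch; it is not from the talk, and the toy data, squared-exponential kernel, and all hyperparameter choices are invented for illustration. The slope is the parameter of interest, the GP realization is marginalized analytically by the GP marginal likelihood, and the kernel hyperparameters are fit simultaneously rather than fit-and-subtracted:

```python
import numpy as np
from scipy.optimize import minimize

# Invented toy data: a linear trend (the slope is the parameter of
# interest) plus correlated structure we treat as nuisance.
rng = np.random.default_rng(42)
t = np.sort(rng.uniform(0.0, 10.0, 50))
y = 0.7 * t + np.sin(t) + 0.1 * rng.standard_normal(50)

def neg_log_like(theta):
    """Negative GP marginal likelihood: the GP function values are
    marginalized analytically, so only the slope and the nuisance
    hyperparameters remain, and they are all fit together."""
    slope, log_amp, log_scale, log_jitter = theta
    r = y - slope * t  # residual with respect to the mean model
    d = t[:, None] - t[None, :]
    K = np.exp(2 * log_amp) * np.exp(-0.5 * d**2 / np.exp(2 * log_scale))
    K += np.exp(2 * log_jitter) * np.eye(len(t))
    sign, logdet = np.linalg.slogdet(K)
    return 0.5 * (r @ np.linalg.solve(K, r) + logdet)

result = minimize(neg_log_like, x0=[1.0, 0.0, 0.0, -2.0],
                  method="Nelder-Mead")
print("slope, with the GP nuisance marginalized:", result.x[0])
```

Optimizing here is just a stand-in for the last step; to truly marginalize the nuisance hyperparameters themselves you would sample them (with MCMC, say) rather than maximize.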
Again, great talks for the rest of the day; highlights included the following: Abel (Stanford) showed that if you reinterpret n-body simulations as being a set of points that define the corners of mass-filled tetrahedra (rather than mass points), both the effective resolution and the accuracy of some calculations improve. He also talked about sensible and clever mesh-refinement tricks that deliver very high precision at low additional cost. Norvig (Google) talked about the engineering at Google that permits huge operations on huge clusters with lots of contributing people. Press (Texas) gave a tour of all the things that he thought were essential to data analysis in astrophysics but are not in the third edition of Numerical Recipes. He talked about graphical models, kernel methods, Hamiltonian MCMC, the lasso (and other L1 regularizations), Dirichlet processes, the multi-armed bandit, and methods for the solution of systems of linear and non-linear equations. Skillman (Stanford) talked about an immense (trillion-particle) simulation that he has done and fully released on the web for anyone to use. His talk was hilarious in part because the data set is so damned huge. He has built great tools for interacting with the data efficiently. Schneider (Livermore) talked about our MBI entry to the GREAT3 weak-lensing competition. He did a great job of explaining the hierarchical inference and of describing The Tractor and our importance-sampling inference method.
Interestingly, I noticed a few weeks ago that Steve Roberts was recommending the exact opposite approach to GP fitting for astronomical time series: e.g., fit the long-baseline noise first with one GP, then fit a second GP with shorter-timescale correlations to mop up the residuals. At first I thought this was entirely pragmatic, but tonight I learned from some colleagues that there is theoretical motivation for dividing the model into separately constrained parts (sometimes the restricted model will have a higher Bayes factor, for instance; and more generally there can be a bias-variance trade-off to consider); it's related to the 'cut' function in JAGS.
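For contrast with the simultaneous fit above, here is a minimal sketch of the sequential scheme as I understand it; the kernels, the fixed hyperparameters, and the toy data are my own assumptions, not Roberts's actual pipeline:

```python
import numpy as np

def sqexp(t1, t2, amp, scale):
    """Squared-exponential kernel matrix."""
    d = t1[:, None] - t2[None, :]
    return amp**2 * np.exp(-0.5 * d**2 / scale**2)

def gp_mean(t, y, amp, scale, jitter=1e-3):
    """GP predictive mean at the training points."""
    K = sqexp(t, t, amp, scale) + jitter * np.eye(len(t))
    return sqexp(t, t, amp, scale) @ np.linalg.solve(K, y)

# Invented toy series: a slow drift plus fast wiggles. Hyperparameters
# are fixed by hand here; in practice each stage would fit its own.
rng = np.random.default_rng(0)
t = np.sort(rng.uniform(0.0, 20.0, 200))
y = np.sin(0.3 * t) + 0.2 * np.sin(4.0 * t) + 0.05 * rng.standard_normal(200)

# Stage 1: fit the long-baseline structure alone.
slow = gp_mean(t, y, amp=1.0, scale=5.0)

# Stage 2: a second, short-timescale GP mops up the residuals; no
# information flows back into stage 1, which is the 'cut'-like step.
fast = gp_mean(t, y - slow, amp=0.3, scale=0.3)
```

The simultaneous alternative would fit a single GP with the summed kernel, sqexp(..., 1.0, 5.0) + sqexp(..., 0.3, 0.3), so that the two components trade off against each other inside one likelihood.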