In a low-research day, I gave an MPIA Galaxy Coffee talk on the sampling-in-catalog-space paper with Brewer and Foreman-Mackey. I emphasized the big-picture issues of having probabilistic deblending results, understanding source populations below the plate limit, and treating catalogs like any other kind of inferential measurement. It is a great project with good results, but it was also computationally expensive, both in the sense of compute cycles and in the sense of us (or really Brewer) having to know a large amount about high-end sampling methods (reversible jump meets nested sampling) and how to write efficient code. For the future, scaling is an interesting thing to think about: Could we scale up (possibly with a Gibbs or online approach) to a large data set?
"having to know a large amount about high-end sampling methods (reversible jump meets nested sampling) and how to write efficient code."
I'm writing a C++ library that implements this in general and allows for the kind of obvious efficiencies that I used in StarField. For example, when proposing to change n stars out of a total of N, don't recompute the whole likelihood from scratch.
It's nowhere near done but lives in RJObject on GitHub. The second branch is where it's at.
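For concreteness, here is a minimal C++ sketch of the bookkeeping the comment describes: keep a rendered model image around, and when a proposal changes only a few stars, subtract their old contributions and add the new ones instead of re-rendering all N stars. Every name here (Star, StarFieldLikelihood, applyStar, the Gaussian PSF and Gaussian pixel noise) is a hypothetical illustration, not the actual RJObject or StarField code.

```cpp
// Hypothetical sketch of incremental likelihood updates -- not RJObject/StarField code.
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// One point source in the image.
struct Star { double x, y, flux; };

class StarFieldLikelihood {
public:
    StarFieldLikelihood(std::vector<double> data, int nx, int ny, double sigma)
        : data_(std::move(data)), model_(data_.size(), 0.0),
          nx_(nx), ny_(ny), sigma_(sigma) {}

    // Render or un-render one star's PSF contribution (sign = +1 to add, -1 to remove).
    // A real implementation would loop only over the PSF's footprint, not the whole image.
    void applyStar(const Star& s, double sign) {
        const double w = 2.0;  // assumed circular Gaussian PSF width, in pixels
        for (int j = 0; j < ny_; ++j) {
            for (int i = 0; i < nx_; ++i) {
                const double dx = i - s.x, dy = j - s.y;
                model_[static_cast<std::size_t>(j) * nx_ + i] +=
                    sign * s.flux * std::exp(-0.5 * (dx * dx + dy * dy) / (w * w));
            }
        }
    }

    // A proposal that changes only n stars out of N touches only those n
    // contributions; the other N - n stars are never re-rendered.
    void proposeUpdate(const std::vector<Star>& removed, const std::vector<Star>& added) {
        for (const Star& s : removed) applyStar(s, -1.0);
        for (const Star& s : added)   applyStar(s, +1.0);
    }

    // Gaussian log-likelihood of the data given the current model image.
    double logLikelihood() const {
        const double twoPi = 6.283185307179586;
        double logL = 0.0;
        for (std::size_t k = 0; k < data_.size(); ++k) {
            const double r = data_[k] - model_[k];
            logL += -0.5 * std::log(twoPi * sigma_ * sigma_)
                    - 0.5 * r * r / (sigma_ * sigma_);
        }
        return logL;
    }

private:
    std::vector<double> data_, model_;
    int nx_, ny_;
    double sigma_;
};
```

With this kind of bookkeeping, the cost of a proposal scales with the number of stars being changed (and, with a truncated PSF, only with the pixels they touch) rather than with the total number of stars N, which is what makes reversible-jump moves on large catalogs affordable.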