2016-08-29

#AstroHackWeek, day one

They say you shouldn't mess with the timeline! #AstroHackWeek was so busy and full, I ended up not blogging properly during the week, and am writing these blog posts after the fact, based on telegraphic notes taken day-of. This is not uncommon here at Hogg's Research, and, for that, I apologize: Even when I write a post after the fact, I (misleadingly) date it for the day to which it corresponds (and give it a time stamp of one minute before midnight). One of the many reasons that this blog should not be seen as a precise historical document is that these after-the-fact blog posts can certainly be contaminated by present knowledge.

Today was the kick-off day for #AstroHackWeek, our now-annual meeting at which participants learn about computational data analysis, and also work on their own computational data analysis projects. This year we had the meeting at the Berkeley Institute for Data Science (and partially supported it with the Moore-Sloan Data Science Environment that spans UCB, UW, and NYU). It was organized (beautifully) by Kyle Barbary (UCB) and Phil Marshall (SLAC).

In the morning today, both Jake VanderPlas (UW) and I spoke about the basics of probabilistic inference. Then we had what Phil Marshall calls a “stand-up”, at which every participant introduced her or himself, said what they wanted to learn, and said what they knew well and could help with. They also said what they wanted to do or produce, if there was a well-defined plan.

In the stand-up and early in the hack session, Adrian Price-Whelan (Princeton) talked about joining matched sequential colormaps into diverging colormaps, with one option (or, really, style) emphasizing values near zero, and one emphasizing values far from zero. He had immediate success, and showed some nice results. One amusing thing that might bear fruit later in the week is that the author of the (currently ascendant) Viridis colormap is apparently the owner of one of the BIDS desks in our vicinity this week. The conditions on a colormap are many and in tension: There are b/w printer issues, colorblindness issues, no-saturate-to-white-or-black issues, small-scale resolution issues, and so on.
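To make the joining idea concrete, here is a minimal sketch; it is not Price-Whelan's code. It assumes each sequential map is just a list of RGB triples running dark to light, and it assumes (my reading, not something stated above) that the darker end is the one that reads as "emphasized". The function names and the toy blue and red ramps are all hypothetical stand-ins for real, perceptually matched colormaps.

```python
def lerp_color(c0, c1, t):
    """Linearly interpolate between two RGB triples, with t in [0, 1]."""
    return tuple(a + (b - a) * t for a, b in zip(c0, c1))


def sequential_ramp(dark, light, n):
    """Return n colors running from `dark` to `light` (toy sequential map)."""
    return [lerp_color(dark, light, i / (n - 1)) for i in range(n)]


def join_diverging(neg_ramp, pos_ramp, emphasize="far"):
    """Join two dark-to-light ramps into one diverging list of colors.

    emphasize="far":  dark at both ends, light in the middle.
    emphasize="near": light at both ends, dark in the middle.
    The first half of the output is for negative values, the second for positive.
    """
    if emphasize == "far":
        return list(neg_ramp) + list(reversed(pos_ramp))
    if emphasize == "near":
        return list(reversed(neg_ramp)) + list(pos_ramp)
    raise ValueError("emphasize must be 'far' or 'near'")


def luminance(rgb):
    """Rough brightness of an RGB triple (Rec. 709 weights, no gamma handling)."""
    r, g, b = rgb
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


# Hypothetical matched pair: dark blue to white, and dark red to white.
blues = sequential_ramp((0.0, 0.0, 0.5), (1.0, 1.0, 1.0), 5)
reds = sequential_ramp((0.5, 0.0, 0.0), (1.0, 1.0, 1.0), 5)

far = join_diverging(blues, reds, emphasize="far")
near = join_diverging(blues, reds, emphasize="near")
```

The two styles differ only in which end of each ramp meets at zero. Linear RGB ramps like these would fail most of the conditions listed above (they are not perceptually uniform, for one), so the interesting work is in choosing the two input maps, not in the join.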

I started (perhaps foolishly) two hacks. The first, which I started with Dalya Baron (TAU) and Matt Mechtley (ASU), was to make the demonstration I have of low-photon-rate, direct molecular imaging much more realistic. My demo, which I mention here, is an extreme toy, and there are many ways to make it less toy-like and to improve the internal engineering. I spent time with Baron and Mechtley getting them up to speed on what works and what doesn't, and what needs to change. The easiest change to make first is to go from one-dimensional angle sampling to full three-dimensional sampling in Euler angles (or, equivalently, projection matrices).
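One thing to watch in that change: drawing all three Euler angles uniformly does not give uniformly distributed orientations. A sketch of one standard fix, assuming a z-y-z Euler convention (my choice for illustration, not necessarily what the demo uses): draw the first and third angles uniformly, and draw the cosine of the middle angle uniformly. The helper names here are hypothetical.

```python
import math
import random


def rot_z(a):
    """Rotation by angle a about the z axis."""
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def rot_y(b):
    """Rotation by angle b about the y axis."""
    c, s = math.cos(b), math.sin(b)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


def matmul(A, B):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def random_rotation(rng):
    """Draw a rotation matrix uniformly over orientations (z-y-z Euler angles)."""
    alpha = rng.uniform(0.0, 2.0 * math.pi)
    beta = math.acos(rng.uniform(-1.0, 1.0))  # uniform in cos(beta), not in beta
    gamma = rng.uniform(0.0, 2.0 * math.pi)
    return matmul(rot_z(alpha), matmul(rot_y(beta), rot_z(gamma)))


def project(R, point):
    """Rotate a 3D point by R and drop z, giving its 2D image-plane position."""
    x, y, z = point
    return tuple(R[i][0] * x + R[i][1] * y + R[i][2] * z for i in range(2))


rng = random.Random(42)
R = random_rotation(rng)
xy = project(R, (1.0, 0.0, 0.0))
```

The first two rows of `R` are the projection matrix in the sense used above: they take a 3D position in the molecule frame to a 2D position on the detector.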

My second hack is to somehow, some way, build an MCMC sampler that can successfully and believably sample from all the modes in the multimodal posterior pdfs that we get in standard radial-velocity fitting problems (think: finding exoplanets and binary stars by measuring precise radial velocities). When the observations are sparse, the number of qualitatively different orbital solutions is large, and no sampler that I know of convincingly samples them all. Very late in the day, over coffee, Adrian Price-Whelan, Dan Foreman-Mackey (UW), and I had a very good idea: Sample exactly in a linear problem (mixture of sinusoids) that can be sampled more-or-less analytically, transform those samples into samples at the nearest points in the parameter space of the orbit-fitting problem, and then use importance sampling to get a provably correct (in the limit) sampling from the true posterior pdf. We have a plan for tomorrow!
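The importance-sampling step can be illustrated on a toy problem. This is only a sketch, not the radial-velocity sampler: it assumes a one-dimensional bimodal target (standing in for the true posterior pdf) and a broad Gaussian proposal (standing in for the samples that would come from the exactly sampleable linear problem). All names and numbers are made up for the illustration.

```python
import math
import random


def log_target(x):
    """Unnormalized log-density of a toy bimodal target with modes at -3 and +3."""
    a = -0.5 * (x - 3.0) ** 2
    b = -0.5 * (x + 3.0) ** 2
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def log_proposal(x, sigma):
    """Log-density of the zero-mean Gaussian proposal with width sigma."""
    return -0.5 * (x / sigma) ** 2 - math.log(sigma * math.sqrt(2.0 * math.pi))


def importance_sample(n, sigma=4.0, seed=0):
    """Draw n proposal samples and return them with normalized weights."""
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, sigma) for _ in range(n)]
    log_w = [log_target(x) - log_proposal(x, sigma) for x in xs]
    m = max(log_w)  # subtract the max before exponentiating, for stability
    ws = [math.exp(lw - m) for lw in log_w]
    total = sum(ws)
    return xs, [w / total for w in ws]


def weighted_mean(values, weights):
    """Self-normalized importance-sampling estimate of an expectation."""
    return sum(v * w for v, w in zip(values, weights))


xs, ws = importance_sample(20000)
mean_x = weighted_mean(xs, ws)                     # target mean is 0
mean_x2 = weighted_mean([x * x for x in xs], ws)   # target second moment is 10
ess = 1.0 / sum(w * w for w in ws)                 # effective sample size
```

The weighted samples cover both modes because the proposal does; no chain ever has to hop between them. The catch is the one that matters in practice: if the proposal misses a mode, or is much broader than the target, the effective sample size collapses and the "in the limit" guarantee is of little use.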

2 comments:

  1. I've done something like what you propose for RV fitting before (fit a bunch of sinusoids at multiples of the fundamental period, and then invert the Hansen coefficients to get the eccentricity), and I can testify that it works well on toy problems! I wasn't looking for robust sampling; I just had some Lomb-Scargle code and some Hansen coefficient expressions sitting on my hard drive and was too lazy to write a true Kepler solver. But: aren't you just pulling the posterior mode-identification problem back into the problem of choosing a suitable frequency grid for your mixture model of sinusoids and then figuring out all the roots of the non-linear equations that relate orbital parameters to sinusoid amplitudes and phases? Is that problem easier than identifying all the posterior modes (at least for a well-tuned RV MCMC, not for a generic sampler that doesn't know anything about Keplerian orbits)?

    Replies
    1. You are right! As you will see in tomorrow's post, we ended up dropping this idea.
