Fergus, Neil Zimmerman (MPIA), and I chatted on the phone a bit today. Zimmerman wants to write a proposal to build a forward model of a coronagraphic spectrograph (think P1640 or GPI); he has the intuition (which I share) that if you built such a model, you could calibrate the data far better. Right now calibration is performed before and after observing; the science frames are expected to be in agreement with the calibration meta-data or some interpolation of it, and the data are "corrected" or transformed from the instrument coordinates (two-dimensional pixels) to some calibrated object (three-dimensional imaging spectroscopic voxels). But since flexure and temperature changes can be non-trivial, and since the science frames contain so many photons, it would be better to learn the calibration of the spectrograph from the appropriate combination of the calibration and science data, and it would be better to perform the comparison between models and data at the pixel level. That's a theme of this blog, of course. We discussed what kinds of small, toy, demonstration systems could show this, convincingly enough for a proposal, relevant to the real thing, but easy to set up and use as a little sandbox.
I spent the day at STScI, giving a talk about hierarchical inference, hosted by Lou Strolger (STScI), and also chatting with various people. There is so much going on at STScI and JHU; it was a busy day! One theme of my conversations was calibration (of course); CampHogg and STScI are aligned in wanting to make calibration simultaneously more precise and less time-consuming (as in less consuming of observing time). Another theme was the short life of JWST; as a non-serviceable facility with expendables, it has a finite lifetime. This puts pressure not just on calibration, but also on every possible science program. We have to use this facility efficiently. That's a challenge to the whole community, but especially the many teams at STScI.
Who would have thunk it: I have spent the last 25 years doing astrophysics in some form or another, and now I am preparing to co-write a paper on computing the determinants of matrices. Foreman-Mackey and I met with Mike O'Neil (NYU) and Sivaram Ambikasaran (NYU) (both Applied Math) today about making determinant calculations fast. The crazy thing is that linear algebra packages out there are happy to make matrix inversion fast, but they uniformly discourage, disparage, or express incredulity about the computation of determinants. I understand the issues—determinants have ungodly units and therefore ungodly magnitudes—but we need to compute them if we are going to compute Gaussian probability densities. Our matrices are somewhat sparse, but the key idea behind Ambikasaran's method is that the matrices are smooth (columns are good at predicting other columns), or, equivalently, that the matrices have low-rank sub-matrices inside them. Plus fastness.
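To make concrete why the determinant shows up, here is a minimal sketch of a Gaussian log-density evaluation. It is not Ambikasaran's method: it is the plain dense Cholesky approach, which gets the log-determinant from the diagonal of the factor and so never forms the (possibly absurdly large or small) determinant itself. The kernel, the jitter term, and all the names are illustrative assumptions.

```python
import numpy as np

def gaussian_logpdf(y, mean, cov):
    """Log of the multivariate Gaussian density N(y | mean, cov)."""
    r = y - mean
    # cov = L @ L.T with L lower-triangular (dense, O(n^3) factorization).
    L = np.linalg.cholesky(cov)
    # log det(cov) = 2 * sum(log(diag(L))); det(cov) itself is never formed.
    logdet = 2.0 * np.sum(np.log(np.diag(L)))
    # Solve L z = r, so that r^T cov^{-1} r = z^T z.
    z = np.linalg.solve(L, r)
    return -0.5 * (z @ z + logdet + r.size * np.log(2.0 * np.pi))

# Toy problem (assumed): a smooth squared-exponential kernel plus white noise.
x = np.linspace(0.0, 10.0, 200)
cov = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2) + 1e-2 * np.eye(x.size)
y = np.sin(x)
logp = gaussian_logpdf(y, np.zeros_like(y), cov)
```

The dense factorization scales as the cube of the matrix size, which is exactly why a method that exploits smoothness or low-rank sub-blocks is interesting for big problems.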
MJ Vakili (NYU) showed me today what he has been working on to generate a data-driven prior probability distribution over galaxies. It is great work. He finds that he can do a shapelet decomposition, truncate it, and then do a dimensionality reduction (again, as it were), and then fit the resulting distribution of components with a mixture of Gaussians. We have yet to show that the model is good, but when he samples from it, the samples look like actual galaxies. The point is this: If you want to measure a shear map (or anything else, for that matter) from galaxy images, you can't do proper inference if you don't have a prior over galaxy images. So we are playing around with the possibility of making one.
In the brown-bag today, Craig Lage (NYU) showed detailed simulations he is doing of the "Bullet Cluster". He is doing by-hand likelihood optimization, with an hours-long simulation inside the loop! But the results are gorgeous: He can reproduce all the large-scale features, and a lot of the small-scale details. He says it isn't a challenge to CDM, but it is a challenge to theories in which there is no dark matter. One of his goals is to test dark-matter interactions; it looks very promising for that.
On the airplane home from "The AD", I wrote in our paper about a data-driven model for the Kepler focal plane. I wrote about the following issue: This model is a data-driven, flexible model of the pixels telemetered down from the spacecraft. As such, the model doesn't contain anything within it that could be interpreted as the "flat-field" or as the "point-spread function" or a "source", let alone a source "brightness". But it is a good model! The question is: How to extract photometry? We have a plan, but it is debatable. The fundamental issue is that data-driven models are, almost by definition, uninterpretable (or at least not straightforwardly interpretable). Insane.
I spoke to the NYUAD Physics Department and related parties in a research seminar about inference and data-driven models, and then in the early evening I gave a public talk for the NYU Abu Dhabi Institute. In the latter forum I spoke about Dark Matter: What we know about it, how we know it, and what might come next. I got great questions and a lively multi-hour discussion with audience members (from a remarkable range of backgrounds, I might add) followed my talk.
I put lots and lots of (proverbial) red ink onto two papers. One is Hou's paper on diffusive nested sampling (with the stretch move) to compute fully marginalized likelihoods and inform decision-making about exoplanets and follow-up. The method is principled and accurate (but very slow). Hou has implemented, and clearly explained, a very complicated and valuable piece of software.
The other is Lang's paper on building new, better, deeper, and higher-resolution co-adds (combined imaging) from the WISE Satellite data. He included in the paper some of our philosophy about what images are and how they should be modeled and interpreted, which pleased me greatly. He is also delivering a data set of enormous value. Got infrared needs?
Jasper Hasenkamp (NYU) gave the brown-bag, about fixing anomalies between large-scale-structure cosmology results and cosmic-microwave-background cosmology results using mixed dark matter—the standard CDM model plus a small admixture of a (possibly partially thermalized) neutrino-like species. The model seems to work well and will make new predictions, including (in principle) for accelerator experiments. Mark Wyman (NYU) has also worked on similar things.
At the "No More Tears" phone-con (about Kepler planet-searching), we talked about wavelets with Bekki Dawson (Berkeley) and other exoSAMSI participants. In our MCMC meeting, we worked on finishing Hou's nearly finished paper on nested sampling, and we quizzed Goodman about mixing the stretch move (the underlying engine of emcee) with Metropolis-Hastings to capitalize on the observation that in most likelihood functions there are "fast" and "slow" parameters, where the "fast" parameters can be changed and the likelihood call re-made quickly, while the "slow" parameters require some large, expensive re-calculation. This is generic, and we came up with some generic solutions. Some of them are even permitted mathematically. Foreman-Mackey is thinking about these in the context of running n-body simulations within the inference loop.
In other news, Lang delivered a draft paper about his work on the WISE imaging, and Fadely had some ideas about finding nails for our factor-analysis hammer.
Bonaca (Yale), Geha (Yale), Johnston (Columbia), Kuepper (Columbia), and Price-Whelan all came to NYU to visit CampHogg to discuss stream-fitting. We (like everyone on Earth, apparently) want to use streams to constrain the potential and accretion history of the Milky Way. Kuepper and Bonaca are working on simulation methods to make fake streams (quickly) and compare them to data. Price-Whelan is working out a fully probabilistic approach to generating stream data with every star carrying a latent variable which is the time at which it was released from the progenitor (this is my "Bread and Butter" project started at the end of this summer after sessions with Binney, Bovy, Rix, Sanders, and Sanderson). We have hopes of getting good inferences about the Milky Way potential (or acceleration field or mass density) and its evolution with time.
Price-Whelan and Foreman-Mackey spent some time coming up with very clever Gibbs-like strategies for sampling the per-star latent parameters (release time and orbit for each star) in an inner loop with an outer loop sampling the potential and progenitor parameters that are shared by all stars. In the end, we decided to de-scope and write a paper with brute-force sampling and a small data set. Even at small scope, such a paper (and software) would be state-of-the-art, because what we are doing treats properly missing data and finite observational uncertainties, which will be a first (unless Bovy or Sanders has scooped us?).
At lunch, I asked the team to say what a stream really constrains: Is it the potential, or the acceleration field, or the density? Clarity on this could be useful for guiding methods, expansions, parameterizations, and so on. In the afternoon, Geha, Johnston, and I also talked about joint funding opportunities and outlined a proposal.
I had a short conversation today with NYUAD (Abu Dhabi) undergraduate Jeffrey Mei about some work he has been doing with me to infer the extinction law from the standard stars in the SDSS spectroscopy. This is one of my ideas and seems to be working extremely well. He has built a generative model for the spectroscopy and it produces results that look plausibly like a dust attenuation law with some tantalizing features that could be interstellar bands (but probably aren't). I outlined a possible paper which he is going to start writing.
Late in the day, Paul Chaikin (NYU) gave a talk about the astounding experiments he has been doing with people in physics, chemistry, and biology to make artificial systems that act like life. He has systems that show motility, metabolism, self-reproducibility, and evolution, although he doesn't have it all in one system (yet). The systems make beautiful and very clever use of DNA, specific binding, enzymes, and techniques for preventing non-specific or wrong binding. Absolutely incredible results, and they are close to making extremely life-like nano-scale or microscopic systems.
In our semi-weekly arXiv coffee, Fed Bianco (NYU) showed us some papers about Pluto, including an occultation study (Pluto occults a background star) and the implications for Pluto's atmosphere. But then we got onto occultations and she showed us some amazing Kuiper-Belt-object occultation data she has from fast cameras on Hawaii. The coolest thing (to me) is that the occulters are so tiny, the occultations look different from different observatories, even Haleakala to Mauna Kea! Tycho Brahe would have loved that: The effect could have been used to prove (pretty much) the heliocentric model.
I spent a good chunk of the afternoon at the brand-new Simons Center for Data Analysis with applied mathematicians Leslie Greengard (Simons, NYU) and Mike O'Neil (NYU), talking about big matrices and inverting them and getting their determinants. Their codes are super-good at inverting (or, equivalently, providing operators that multiply by the inverse), even the ten-million by ten-million matrices I am going to need to invert, but not necessarily at computing determinants. We discussed and then left it as a homework problem. The context was cosmology, but this problem comes up everywhere that Gaussian Processes are being used.
[This is my 2^11th research blog post. That's a lot of posts over the last nearly-9 years! I'll be an old man when I post my 2^12th.]
At the brown-bag talk today, Gruzinov (NYU) talked about modeling pulsars using what he calls "Aristotelian Electrodynamics", which is an approximation valid when synchrotron radiation losses are so fast that charged particles essentially move along magnetic field lines. He claims to be able to compute realistic predictions of pulsar light-curves in the Fermi bandpass, which, if true, is a first, I think. He argued that all pulsars should live in a four-dimensional family, parameterized by two angles (viewing and dipole-misalignment), one spin period, and one magnetic dipole moment. If it all bears out, pulsars might be the new standard candles in astronomy!
In the afternoon, Foreman-Mackey and I went on the BayCEP phonecon of the exoSAMSI group, where we discussed hierarchical inference and approximations thereto. There are various projects close to doing a proper hierarchical probabilistic inference of the distribution of planets in various parameters. Eric Ford (PSU) is even implementing some of the ideas in this old paper.
At breakfast, I went through with Barclay and Quintana (Ames) my list of all the effects that lead to variability in Kepler light-curves. These include intrinsic stellar variability, stellar variability from other stars that overlap the target, and stellar variability transferred to the target by electronics issues. They include stellar proper motion, parallax, and aberration. They include variations in spacecraft temperature, pointing, and roll angle. And so on. The list is long! I am trying to make sure we understand what our pixel-level model covers and what it doesn't. I am spending a lot of my writing time on our data-driven pixel-level model getting the assumptions, capabilities, and limitations clearly specified.
While Foreman-Mackey and Barclay set off on a tangent to measure (or limit) exoplanet masses using n-body models of exoplanet systems observed by Kepler, I had a great phone call with Schaefer (CMU), Cisewski (CMU), Weller (CMU), and Lang about using Approximate Bayesian Computation (ABC) to ask questions about the universality of the high-mass initial mass function (IMF) in stellar clusters observed in the PHAT survey. The idea behind ABC is to do a kind of rejection sampling from the prior to make an approximation to posterior sampling in problems where it is possible to generate data sets from the model (and priors) but impractical or impossible to write down a likelihood function.
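For concreteness, here is a minimal sketch of rejection ABC on a toy problem. It is not the PHAT analysis. The toy is a single power-law mass function with slope alpha, and everything about it is an assumption made for illustration: the uniform prior, the summary statistic (mean log mass), and the tolerance eps. The point is only that the code simulates data and compares summaries; it never evaluates a likelihood.

```python
import numpy as np

def simulate_masses(alpha, n, rng):
    """Draw n masses from dN/dm proportional to m**(-alpha), m >= 1."""
    u = rng.uniform(size=n)
    # Inverse-CDF sampling: P(M > m) = m**(-(alpha - 1)).
    return (1.0 - u) ** (-1.0 / (alpha - 1.0))

def summary(masses):
    """Summary statistic (an assumed choice): mean of log mass."""
    return np.mean(np.log(masses))

def abc_rejection(observed, n_draws, eps, rng):
    """Keep prior draws whose simulated summary lands within eps of the data's."""
    s_obs = summary(observed)
    accepted = []
    for _ in range(n_draws):
        alpha = rng.uniform(1.5, 3.5)          # draw from the (assumed) prior
        fake = simulate_masses(alpha, observed.size, rng)
        if abs(summary(fake) - s_obs) < eps:   # compare summaries, not likelihoods
            accepted.append(alpha)
    return np.array(accepted)

rng = np.random.default_rng(42)
observed = simulate_masses(2.35, 500, rng)     # fake "data" with a known slope
samples = abc_rejection(observed, n_draws=5000, eps=0.02, rng=rng)
```

The accepted `samples` approximate posterior draws for alpha; shrinking eps improves the approximation at the cost of a lower acceptance rate, and the choice of summary statistic is where all the hard decisions hide.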
The reason we got this conversation started is that way back when we were writing Weisz et al on IMF inference, we realized that some of the ideas about how high-mass stars might form in molecular clouds (and thereby affect the formation of other less-massive stars) could be written down as a data-generating process but not as a computable likelihood function. That is, we had a perfect example for ABC. We didn't do anything about it from there, but maybe a project will start up on this. I think there might be quite a few places in astrophysics where we can generate data with a mechanistic model (a simulation or a semi-analytic model) but we don't have an explicit likelihood anywhere.
At the end of the day, Sarah Ballard (UW) gave a great Physics Colloquium on habitable exoplanets and asteroseismology, and how these two fields are related. They are related because you only know the properties of the exoplanet as well as you can understand the properties of the star, and asteroseismology rocks the latter. She mentioned anthropics momentarily, which reminded me that we should be thinking about this: The anthropic argument in exoplanet research is easier to formulate and think about than it is in cosmology, but figuring it out on the easier problem might help with the harder one.
In the morning, to steady our thoughts, Foreman-Mackey, Barclay, and I wrote on the blackboard our model for the Kepler pixel-level data. This is the data-driven model we wrote up in the white paper. The idea is to fit pixels with other pixels, but taking as "other" pixels only those that are far enough away that they can't be being affected by the same star. These other pixels will share in spacecraft issues (like temperature and pointing issues) but not share in stellar variability or exoplanet transit effects, because different stars are independently variable. A key idea of our model, which Foreman-Mackey mocked up for our white paper, is that we avoid over-fitting the pixels by using a train-and-test framework, in which we "learn" the fit coefficients using data not near (in time) to the pixel value we are trying to predict (or de-trend).
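A minimal sketch of the idea on fake data, in case it helps; this is not Foreman-Mackey's code. The linear least-squares fit, the size of the exclusion window, and the toy light-curve are all assumptions. Each target-pixel sample is predicted from other pixels using coefficients learned only from samples far away in time.

```python
import numpy as np

def detrend_pixel(target, predictors, window):
    """Predict each target sample from other pixels, training only on
    samples more than `window` time steps away from the predicted one."""
    n_times = target.size
    times = np.arange(n_times)
    # Constant column so the fit can absorb an offset.
    A = np.column_stack([predictors, np.ones(n_times)])
    prediction = np.empty(n_times)
    for t in range(n_times):
        train = np.abs(times - t) > window      # exclude nearby-in-time samples
        coeffs, *_ = np.linalg.lstsq(A[train], target[train], rcond=None)
        prediction[t] = A[t] @ coeffs
    return prediction

# Fake data (assumed): a shared spacecraft trend seen by every pixel,
# plus a transit-like dip that appears only in the target pixel.
rng = np.random.default_rng(0)
n_times, n_pred = 300, 8
systematics = 0.1 * np.cumsum(rng.normal(size=n_times))
predictors = (systematics[:, None] * rng.uniform(0.5, 2.0, n_pred)
              + 0.01 * rng.normal(size=(n_times, n_pred)))
transit = np.zeros(n_times)
transit[140:150] = -0.5
target = 1.5 * systematics + transit + 0.01 * rng.normal(size=n_times)

residual = target - detrend_pixel(target, predictors, window=20)
```

Because the window is longer than the dip, the coefficients used to predict the in-transit samples never see the transit, so the dip survives in `residual` while the shared trend is removed.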
In the evening, I started writing up this model and our results. We are all ready to write paper zero on this.
In the morning, Barclay and I decided to divide and conquer: He would write up the scope for a limb-darkening paper, and I would write up a plan for a re-calibration of the Kepler data (using Foreman-Mackey's regression model built for the white paper). We both then failed to complete our tasks (there's always tomorrow!). In the afternoon, I discussed large-scale structure measurements with Walsh and Tinker, who are looking at extensions to the halo-occupation model. One extension is to reconsider dependences for halo occupation on other halo parameters (other than mass). Another is to look at more large-scale-structure observables.
Tom Barclay (Ames) found himself in New York this week and we discussed our dormant project to measure stellar limb darkening using exoplanet transits. I love this project, because it is a way to image a star without high angular resolution imaging! We discussed applications for such measurements and (for the millionth time) the scope of our possible first paper. We also discussed many other things, including "what's next for Kepler?", and MCMC troubles, and searches for long-period planets. We also had a telecon about search with Bekki Dawson (Berkeley) and other exoSAMSI-ites. Late in the day, Hou reminded me that I owe him comments on his paper on nested sampling!
In an almost zero-research day, Wendy Freedman (OCIW) gave a great and inspiring talk about measuring the Hubble Constant with local measurements of Cepheid stars and supernovae. She demonstrated the great value of moving to infrared observations and argued convincingly that systematic uncertainties in the measurements are now down around the few-percent level. Of course the interesting thing is that these local measurements consistently get Hubble Constants a tiny bit higher (Universe a tiny bit smaller or younger) than the cosmic-microwave-background-based inferences. Freedman argued strongly that this tension should be pursued and tested, because (a) it provides a fundamental, independent test of the cosmological model, and (b) it could conceivably point to new physics. I agree.
The hypothesis-combination trick mentioned yesterday did indeed work, speeding up Foreman-Mackey's exoplanet search code by a factor of about 30 and keeping the same result, which is that we can detect Earth-like exoplanets on year-ish orbits around at least some Sun-like stars in the Kepler data. Now are there any? This speed-up, combined with the four orders of magnitude Foreman-Mackey got this weekend, makes for a total speed-up of 10^5.5, all in one crazy week. And then at lunch, Mike O'Neil (NYU) told us that given the form of our Gaussian Process kernel, he could probably get us another substantial speed-up using something called a "fast Gaussian transform". If this turns out to be true (we have to do some math to check), our exoplanet search could get down to less than a minute per star (which would be dandy).
In other overwhelming-force news, Fadely delivered psf-model fits to thousands of stars in HST WFC3 IR-channel data, showing that we can do a very good job of modeling the pixel-level data in preparation for flat-field determination. And Hou delivered a complete manuscript draft on his diffusive nested ensemble sampler. So it was a great day of hard work by my team paying off handsomely.
I worked on large-galaxy photometry with Patel for part of the day. She is dealing with all our problem cases; we have good photometry for almost any large, isolated galaxy, but something of a mess for some of the cases of overlapping or merging galaxies. Not surprising, but challenging. I am also working on how to present the results: What we find is that with simple, (fairly) rigid galaxy models we get excellent photometry. How to explain that, when the models are too rigid to be "good fits" to the data? It has to do with the fact that you don't have to have a good model to make a good photometric measurement, and the fact that simple models are "interpretable".
In the afternoon, we had a breakthrough in which we realized that Foreman-Mackey's exoplanet search (which he sped up by a factor of 10^4 on the weekend with sparse linear algebra and code tricks) can be sped up by another large factor by separating it into single-transit hypothesis tests and then hypothesis tests that link the single transits into periodic sets of transits. He may try to implement that tomorrow.
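Here is a toy sketch of that separation, under strong assumptions that are mine and not the real code's: white Gaussian noise instead of the Gaussian-process likelihood, and a box-shaped transit of fixed, known depth and duration. The expensive step (a log-likelihood gain for a single transit at every possible start time) is done once; each period-and-phase hypothesis is then just a sum of precomputed numbers.

```python
import numpy as np

def single_transit_scores(flux, sigma, duration, depth):
    """ln L(box transit starting at index i) - ln L(no transit), for every i,
    assuming white Gaussian noise and a unit out-of-transit flux."""
    r = flux - 1.0
    # Per-sample gain from modelling the sample as 1 - depth instead of 1.
    per_point = (r ** 2 - (r + depth) ** 2) / (2.0 * sigma ** 2)
    # 'valid' convolution with ones: scores[i] = sum(per_point[i:i + duration]).
    return np.convolve(per_point, np.ones(duration), mode="valid")

def periodic_search(scores, periods):
    """Link single-transit scores into periodic sets; return the best
    (total score, period, phase), with period and phase in samples."""
    best = (-np.inf, None, None)
    for period in periods:
        for phase in range(period):
            total = scores[phase::period].sum()
            if total > best[0]:
                best = (total, period, phase)
    return best

# Fake light-curve (assumed numbers): transits every 300 samples from index 50.
rng = np.random.default_rng(1)
n, sigma, duration, depth = 2000, 0.002, 5, 0.01
flux = 1.0 + sigma * rng.normal(size=n)
for start in range(50, n - duration, 300):
    flux[start:start + duration] -= depth

scores = single_transit_scores(flux, sigma, duration, depth)
best = periodic_search(scores, periods=range(200, 401))
```

In this white-noise toy the single-transit scores add exactly, because the samples are independent. With correlated noise the sum is presumably only an approximation unless the transits are well separated compared to the correlation time.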
I spent some time with Kilian Walsh (NYU) discussing halo occupation (putting galaxies into dark-matter halos). With Jeremy Tinker (NYU) he is looking at whether enriching the halo parameters (currently only mass and concentration) with environment information will improve halo-occupation models. As per usual, I asked for plots that show how well they are doing without the environment.
In the afternoon, Foreman-Mackey delivered results of a (very limited) search for Earth-like exoplanets in the light-curve of a bright Kepler G dwarf star, using his Gaussian-process likelihood function. In the limited search, he was able to re-find an injected Earth-like transit with a 300-day period. That's extremely promising. It is not clear whether things will get a lot worse when we do a fuller search or go to fainter stars.