Today was cosmology day at the meeting, with (among other interesting contributions) a nice discussion by Marinucci of needlets, a class of compact wavelets on the sphere. These have extremely nice properties for statistical analyses of fields on the sphere. Once again I am amazed at all the great things you can do with linear functions of your data! In my off time, I got very specific advice from Baines about MCMC improvements.
Today was the second day of the Statistical Frontiers of Astrophysics meeting in Tokyo. Among other very interesting talks, Xiao-Li Meng and Paul Baines walked us through some ideas in the modern use of MCMC. They are particularly interested in problems of data augmentation, where parameters are introduced for every data point (that is, there are more parameters than data points). They had a number of basic tricks for MCMC that are almost always a good idea; I have lots to do to my code when I get home (and lots of new literature to read).
Sam Roweis spoke at the NYU Computer Science Colloquium, and Lam Hui spoke at the NYU Astrophysics Seminar today. Roweis spoke about fitting models to data, where the data are taken by sensors with unknown properties (for example microphones with unknown thresholds, saturation, nonlinearity, and frequency response, or sensors on wandering robots of unknown position). His point was that if you have enough sensor readings from differently unreliable or unknown sensors, you can still do very well if you take a generative modeling approach. The demos were nice. Hui spoke about modifications to gravity and what they might do to dynamics. In particular, he noted that most current modifications to gravity look very much like a scalar–tensor theory, where there is (effectively) a different value of G in high-density regions than in low-density regions. If this is going on in our Universe, there ought to be lots of dynamical signatures; hopefully we can rule out large classes of models of that type.
The most attentive of readers would know that Price-Whelan and I have been working on assessing the bandwidth necessary for transmitting astronomical image information. That is, we have determined the precision to which you have to record pixel values to preserve all the scientific information in an image, where we assess that information for both bright sources and faint sources
below the noise. Today we pulled down a RAW (unprocessed) HST ACS image from the HST Archive and showed that, in fact, the ACS raw data are telemetered within one bit of the minimum bandwidth. That is, of the 16 bits in the integer representation of raw ACS pixel data, at most one bit is wasted. That's pretty close to optimal, and we will say so in the paper we hope to submit within a week or two.
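For concreteness, here is a toy version of the counting argument (the `bits_needed` function and the specific numbers are illustrative, not the actual ACS values): quantization steps much finer than the noise carry no extra scientific information, so the bit budget is roughly the base-2 log of the dynamic range divided by a noise-scaled step.

```python
import numpy as np

def bits_needed(dynamic_range, sigma_noise, steps_per_sigma=2.0):
    """Rough bit budget for recording pixel values: quantization steps
    finer than ~sigma/steps_per_sigma add no scientific information,
    so the needed number of bits is about log2(range / step size)."""
    step = sigma_noise / steps_per_sigma
    return int(np.ceil(np.log2(dynamic_range / step)))

# Illustrative numbers only: a detector spanning 65536 counts with
# ~5-count noise needs roughly 15 bits, one fewer than a 16-bit integer.
print(bits_needed(65536.0, 5.0))  # → 15
```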
Spent time with Johnston today at Columbia and she encouraged me to dust off my manuscript on perturbations of cold streams by compact substructures. We realized that there are already morphological features in the known streams that could be analyzed, at least roughly, in terms of perturbations by substructures. I started to get her to agree that the smoothness and straightness of streams like GD-1 already make them interesting for the dark-matter model. But we both agreed that we can't be quantitative about that until we understand how the disruption of streams by substructure affects their detectability.
I participated in Ronnie Jansson's PhD defense today. He has tested Galactic magnetic field models, and their implications for ultra-high-energy cosmic rays. He showed that none of the GMF models are reasonable, and that there is a substantial probability that a large fraction of the UHECRs come from a small number of sources. Great stuff and congratulations, Dr Jansson.
I spoke about decision making in the CCPP brown-bag lunchtime talk today. The crowd was very suspicious about the point that the number of data points and number of parameters are both irrelevant or nonexistent quantities, except in the case of linear least-squares fitting. No other research to report, as I spent the day preparing for class and a thesis defense.
On Sunday, Lang came into NYC and we discussed his projects involving imaging data pipelines and the photometry and deblending of galaxies. Lang would like to make deblending a probabilistic operation, but this requires (in our outlook) a generative model of a galaxy image—a model that is good at the pixel level! This doesn't exist; exponential and de Vaucouleurs models are not
good fits to the data in any sense of that term. There are shapelets and sechlets, which (on the other hand) do work at the pixel level but have far too much freedom. So we came up with the simplest possible model: other galaxies. Can we find a small number of galaxies that act as archetypes to fit other galaxies, and then use those for probabilistic deblending? We shall see.
Schiminovich came to NYU for the day and we talked Fireball and GALEX and the quasar–photon cross-correlation. If we could detect this angular cross-correlation, could we separate it into scattering, reflection, recombination, and star-formation components? Once again we came up with many new projects to do with GALEX; but we should probably just finish the ones we have started!
Sam Roweis has just moved from Google to NYU to start as a tenured faculty member. Congratulations! And, more important, congratulations to me, because this is one of the best things to happen in my work life! We meet every Thursday to talk and work; today we discussed, among other things, some of the questions that came up in Lang's thesis defense about how data-analysis pipelines could adapt to the data they have seen and, effectively, learn. The ideas or settings or choices we put in at the beginning would just be
initial conditions that get replaced with objective knowledge generated by operation.
I realized (duh) that there is a huge literature out there on MCMC engineering, and I tried to read a bit of it. There are lots of ways I ought to be able to speed up my exoplanet MCMC (and all our other MCMCs) with these tricks. Of course they all make the code much less understandable, so there are trade-offs. I discussed MCMC in general with Price-Whelan, who might start doing some research on the subject, and the tricks with Bovy, who is our resident sampler.
In many of the problems we are interested in, we would like to have a chain or sampling of models that are consistent with all data so far and then, as new data come in, we would like to trim from the chain those models that the new data
rule out (really, disfavor), and then extend the chain with new models that are consistent with everything. I have a strong intuition that we can do this without re-starting from scratch. I discussed this with Lang, and a short-term goal is to convert the exoplanet project over to this mode.
This idea is not unrelated to the fact that as new data come in, you can treat them as an entirely new experiment, for which the prior probability distribution is the posterior probability distribution from the previous data.
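The prior-becomes-posterior point is easiest to see in a conjugate toy problem. Here is a sketch (the `update_gaussian` function is my own illustrative naming, and the numbers are made up): updating a Gaussian mean on all the data at once gives the same answer as updating on the first batch and then feeding that posterior in as the prior for the second.

```python
import numpy as np

def update_gaussian(mu0, var0, data, var_noise):
    """Conjugate update for a Gaussian mean with known noise variance.
    The returned posterior can be fed back in as the prior for new data."""
    n = len(data)
    var_post = 1.0 / (1.0 / var0 + n / var_noise)
    mu_post = var_post * (mu0 / var0 + np.sum(data) / var_noise)
    return mu_post, var_post

rng = np.random.default_rng(42)
data = rng.normal(3.0, 1.0, size=100)
mu_a, var_a = update_gaussian(0.0, 100.0, data, 1.0)       # all data at once
mu1, var1 = update_gaussian(0.0, 100.0, data[:50], 1.0)    # first batch
mu_b, var_b = update_gaussian(mu1, var1, data[50:], 1.0)   # posterior as prior
assert np.allclose([mu_a, var_a], [mu_b, var_b])
```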
I spent airport time yesterday and morning time today making human-readable output from my exoplanet discovery code. I realized that by far the best way to see that the MCMC has converged is to show that there are no correlations between any parameter and the link number. I confirm this visually, but this could be tracked quantitatively; a great upgrade for the next generation of code. I also thought—and spoke with Roweis—about model selection, which everyone treats as trivial, but certainly is not.
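The quantitative version of that visual check is simple enough to sketch here (the `drift_check` name and the 0.1 threshold are my illustrative choices, not a rigorous test): correlate each parameter against link number and flag any residual trend, which a converged, well-mixed chain should not show.

```python
import numpy as np

def drift_check(chain, threshold=0.1):
    """Correlate each parameter with link number. chain has shape
    (n_links, n_params). Returns |correlation| per parameter and a
    pass/fail flag; the threshold is an illustrative choice."""
    links = np.arange(chain.shape[0])
    corrs = np.array([np.corrcoef(links, chain[:, j])[0, 1]
                      for j in range(chain.shape[1])])
    return np.abs(corrs), bool(np.all(np.abs(corrs) < threshold))

rng = np.random.default_rng(0)
stationary = rng.normal(size=(5000, 3))                    # looks converged
drifting = stationary + np.linspace(0, 5, 5000)[:, None]   # still burning in
print(drift_check(stationary)[1], drift_check(drifting)[1])
```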
It is with tears in my eyes that I report the intellectually exciting and (obviously) successful PhD defense of my friend and student Dustin Lang today at Toronto. It was truly an old-school defense, with a lively debate among the committee (half astronomers and half computer scientists) and the candidate, filled with deep discussion of the problem at issue (Astrometry.net and all its engineering and scientific aspects) and also the problems of the future (roboticising, learning from, and making more principled all of astronomy and the physical sciences). Lang has only scratched the surface of this, of course, but it is a deep scratch he has made. I also learned, once again, even more clearly than ever before, that astronomers and computer scientists have a lot to talk about. The heartiest congratulations are due to Dr Lang.
Later in the day I spoke at CITA, which is just about the liveliest place for theoretical astrophysics in the world.
One of the reasons I am interested in the exoplanet radial velocity stuff is that I am interested in model complexity: how to compare, combine, and decide among models of different complexity. I realized that the correct hierarchy of complexities is not based purely on the number of planets, but also on whether or not they are permitted to travel on orbits of nonzero eccentricity. I modified my code to respect this additional level of complexity and am re-running it all.
I had conversations today with Wu, who is responding to the referee report on her paper on Spitzer observations, and Bovy, who is responding to comments we solicited on his paper on the local standard of rest. In both cases the comments are very constructive; both papers will be much improved by responding to them.
I failed to admit in the previous post that I was fitting the radial velocity data with pure sinusoids, not proper planet-induced radial velocity models. I switched to proper models today, after taking a short refresher on the mean anomaly, eccentric anomaly, and true anomaly. I wrote (not completely) stupid python code to do all these things and got my MCMC running again. So far, it looks like it might work. Of course I am reinventing the wheel here.
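For the record, the anomaly chain goes mean anomaly → eccentric anomaly (via Kepler's equation, here solved by Newton iteration) → true anomaly. Below is a minimal sketch of the standard textbook recipe, not my actual code; the function names are mine.

```python
import numpy as np

def eccentric_anomaly(M, e, tol=1e-12):
    """Solve Kepler's equation M = E - e sin(E) by Newton iteration."""
    E = M if e < 0.8 else np.pi  # standard starting guess
    for _ in range(100):
        dE = (E - e * np.sin(E) - M) / (1.0 - e * np.cos(E))
        E -= dE
        if abs(dE) < tol:
            break
    return E

def true_anomaly(M, e):
    """Mean anomaly -> eccentric anomaly -> true anomaly."""
    E = eccentric_anomaly(M, e)
    return 2.0 * np.arctan2(np.sqrt(1 + e) * np.sin(E / 2),
                            np.sqrt(1 - e) * np.cos(E / 2))

# Sanity check: on a circular orbit (e = 0) all three anomalies coincide.
print(np.isclose(true_anomaly(1.0, 0.0), 1.0))  # → True
```

This is exactly the distinction between the pure sinusoids I was fitting before (which are only correct at e = 0) and a proper planet-induced radial velocity curve.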
Inspired by Schwab (Sternwarte Heidelberg), I spent all of Friday and a chunk of the weekend fitting models to precise radial velocity data on stars measured for the purpose of discovering exoplanets. Schwab came to me because I had expressed confidence that an MCMC approach would be not just useful but necessary. Having said this, I had to make it work.
As always, the issue is with initialization, search, and convergence of the MCMC algorithm. The algorithm is simple and provably correct, but the proofs don't tell you how long it will take to converge to a fair sampling of the posterior distribution function. Furthermore, that convergence is a very strong function of choices you make about stepping (directions and sizes), and there is (at present) no objective way to set that. Indeed, this is a great area of research, and there are probably results there I don't know about.
One thing I did, which worked extremely well, is implement a trick suggested by Phil Marshall: I started the MCMC working on the prior alone and slowly increased the relative weight of the likelihood, so that only after a long burn-in period was the system sampling the true posterior. More on all this later, because all this recommends working out and writing down some
lessons learned about MCMC in practice.
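The Marshall trick, as I understand it, looks something like this in a Metropolis sampler (a toy sketch with made-up names and a made-up one-dimensional problem, not my exoplanet code): target prior times likelihood-to-the-beta, with beta ramped from 0 to 1 over the burn-in.

```python
import numpy as np

def annealed_metropolis(log_prior, log_like, x0, n_burn, n_samples, step):
    """Metropolis sampler targeting prior * likelihood**beta, with beta
    ramped from 0 to 1 over the burn-in; after burn-in it samples the
    true posterior."""
    rng = np.random.default_rng(1)
    x, chain = x0, []
    lp, ll = log_prior(x), log_like(x)
    for i in range(n_burn + n_samples):
        beta = min(1.0, i / n_burn)          # likelihood weight
        xp = x + step * rng.normal()
        lpp, llp = log_prior(xp), log_like(xp)
        if np.log(rng.uniform()) < (lpp + beta * llp) - (lp + beta * ll):
            x, lp, ll = xp, lpp, llp
        if i >= n_burn:
            chain.append(x)
    return np.array(chain)

# Toy problem: broad Gaussian prior, narrow Gaussian likelihood at 4.
chain = annealed_metropolis(lambda x: -0.5 * (x / 10.0) ** 2,
                            lambda x: -0.5 * ((x - 4.0) / 0.1) ** 2,
                            x0=0.0, n_burn=2000, n_samples=5000, step=0.3)
print(np.mean(chain))
```

Early on the chain wanders freely under the prior, so it does not get stuck in whatever local mode it happened to start near; by the end it is sampling the posterior proper.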
[I broke the rules this week by posting only four posts. That's because on Wednesday, I got nothing done. Unfortunately, the rules demand that I admit this.]
Tao Jiang (NYU) brought me back up to date on his project to measure very small-scale galaxy–galaxy cross-correlation functions in the SDSS Main sample (that is, normal, low-redshift galaxies). His goal is to measure the merger rate as a function of mass, color, and mass ratio.
In the rest of my research time today I worked on some crazy celestial mechanics ideas; conversations with Christian Schwab (Heidelberg Sternwarte) got me thinking about how one explores the space of possible model fits to exoplanet radial velocity data.
I started writing something today for the upcoming symposium in honor of Roger Blandford (my PhD advisor). What I am working on is a bit off the rails; I am supposed to be talking about cosmology, but I might use the opportunity to talk about scientific reasoning, since that has always been one of Blandford's hobby interests.