The scariest thing about Hallowe'en this year was how little research I got done! Maryam Modjaz (NYU) gave a nice blackboard talk at lunch about the supernova zoo, and the hopes we have for indirect determination of stellar and binary progenitors. For prompt supernovae, local gas-phase metallicity could be extremely informative; however, you often have to wait years to get a good spectrum. In the afternoon I had a short conversation with NYU undergraduate Gregory Lemberskiy, who is looking at some issues in image model initialization and optimization.
Phil Marshall and I spent the day pair-coding a tiny Python package that fits a set of mixture-of-Gaussian models of increasing complexity to a PanSTARRS-produced catalog cutout image. It is insanely fast! (We did this outside of Lang's Tractor for reasons that are too annoying to explain here.) Our success means that in principle, any survey that is doing any kind of galaxy fitting can also, very easily, fit general non-parametric models to everything: That is, we could have a variable-complexity description of every imaging patch. I love this just because I am a geek. Marshall loves this because maybe, just maybe, the mixture-of-Gaussians description of a patch of imaging will be close to a compact sufficient statistic for performing other kinds of fits or analyses. He has weak and strong gravitational lensing in mind, of course. More on this soon, I very much hope!
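For the curious, here is a minimal sketch of the general idea, not the package itself. It assumes a background-subtracted cutout and treats pixel fluxes as weights on pixel positions in a weighted EM fit, which ignores pixel noise entirely. The function names (`gaussian_2d`, `weighted_moments`, `fit_mog`) and the initialization are hypothetical choices made for illustration.

```python
import numpy as np

def gaussian_2d(xy, mean, cov):
    """Evaluate a normalized 2-d Gaussian at the (N, 2) array of positions xy."""
    d = xy - mean
    chi2 = np.einsum("ni,ij,nj->n", d, np.linalg.inv(cov), d)
    return np.exp(-0.5 * chi2) / (2.0 * np.pi * np.sqrt(np.linalg.det(cov)))

def weighted_moments(xy, w):
    """Weighted sum, mean, and covariance of pixel positions."""
    s = max(w.sum(), 1e-12)
    mean = (w[:, None] * xy).sum(axis=0) / s
    d = xy - mean
    # small diagonal floor keeps the covariance invertible
    cov = np.einsum("n,ni,nj->ij", w, d, d) / s + 1e-3 * np.eye(2)
    return s, mean, cov

def fit_mog(image, K, n_iter=200, seed=0):
    """Fit a K-component mixture of Gaussians to an image by weighted EM."""
    rng = np.random.default_rng(seed)
    ny, nx = image.shape
    yy, xx = np.mgrid[0:ny, 0:nx]
    xy = np.column_stack([xx.ravel(), yy.ravel()]).astype(float)
    w = np.clip(image.ravel(), 0.0, None)  # EM needs non-negative weights
    total, mean0, cov0 = weighted_moments(xy, w)
    # start near the image centroid with a ladder of sizes (an assumption)
    amps = np.full(K, 1.0 / K)
    means = mean0 + 0.5 * rng.normal(size=(K, 2))
    covs = np.array([cov0 * (k + 1.0) / K for k in range(K)])
    for _ in range(n_iter):
        # E-step: responsibility of each component for each pixel
        p = np.array([a * gaussian_2d(xy, m, c)
                      for a, m, c in zip(amps, means, covs)])
        resp = p / np.maximum(p.sum(axis=0), 1e-300)
        # M-step: flux-weighted moments per component
        for k in range(K):
            s, means[k], covs[k] = weighted_moments(xy, resp[k] * w)
            amps[k] = s / total
    model = total * sum(a * gaussian_2d(xy, m, c)
                        for a, m, c in zip(amps, means, covs))
    return amps, means, covs, model.reshape(image.shape)
```

Calling `fit_mog(image, K)` for K = 1, 2, 3, ... and comparing the residuals `image - model` gives the variable-complexity description; how to choose when to stop adding components is left open here.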
[In other news, Brian Cox thinks I am guilty of scientific misconduct. Fortunately, Sarah Kendrew (MPIA) doesn't.]
Marshall showed up for a couple of days of hacking. He appealed to my sense of irresponsibility to convince me not to work on much more pressing things, like letters of recommendation, mid-term exams, Hallowe'en costumes, and other matters that don't constitute research (consult rules at right). We discovered that the PanSTARRS preliminary catalogs might not have the angular resolution (or, perhaps, deblending aggression) we would like in order to use them to find gravitational lenses. The experience will lead to feedback that will (we hope) improve the PanSTARRS software, but it also reminds me of the great opportunities that could be afforded if we had a probabilistic framework for producing catalogs from imaging. I should work on that!
It is Open Access Week and for that reason, SUNY Albany libraries held an afternoon-long event. I learned a lot at the brown-bag discussion about how open access policies could dramatically improve the abilities of librarians to serve their constituents, and dramatically improve the ability of universities to generate and transmit knowledge. The horror stories about copyright, DRM, and unfair IP practices were, well, horrific. In the afternoon I gave a seminar about the openness of our group at NYU, including this blog, our web-exposed SVN repo, and our free data and code policies (obeyed where we are permitted to obey them; see above). It was great, and a great reminder that librarians are currently—in many universities—the most radical intellectuals, with sharp critiques of the conflicts and interactions between institutions of higher learning and institutions of commerce.
On the train home, I tried out importance sampling for my posterior PDF over catalogs project. Not a good idea! The prior is so very, very large.
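One way to see the problem is with a toy, sketched here under the assumption that the importance sampler draws from the prior and weights by the likelihood. The uniform unit-box prior, the narrow Gaussian likelihood, and the function names are all illustrative choices, not the catalog model.

```python
import numpy as np

def effective_sample_size(log_w):
    """Kish effective sample size from unnormalized log importance weights."""
    w = np.exp(log_w - np.max(log_w))
    return w.sum() ** 2 / (w ** 2).sum()

def ess_prior_sampling(ndim, nsamples=10000, width=0.05, seed=0):
    """Draw from a uniform prior on the unit box; weight by a narrow likelihood."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(0.0, 1.0, size=(nsamples, ndim))  # prior draws
    log_like = -0.5 * np.sum(((x - 0.5) / width) ** 2, axis=1)
    return effective_sample_size(log_like)
```

Evaluating `ess_prior_sampling(ndim)` for ndim = 1, 2, 4, 6 shows the effective number of samples collapsing as the prior volume grows relative to the posterior; catalog space is far bigger than six dimensions.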
In separate conversations with Hou and with Foreman-Mackey, I found myself discouraging each of them from looking into serious sparse methods for Gaussian processes. Both students are potentially matrix-inversion-limited (or determinant-computation-limited). We could definitely benefit from having matrix representations (even if it means approximations) that have sparse inverses (or analytic inverses). But going there is (at this point) a research project in sparse methods, not an astronomy project. So I am going to hold off until we really need it. That said, there is a huge literature now of very promising techniques.
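As a concrete textbook case of a matrix representation with a sparse inverse (not something either student is using, as far as this entry says): the exponential kernel on sorted one-dimensional inputs is dense, but its inverse is tridiagonal, because the corresponding process is Markov. The squared-exponential kernel has no such luck. The kernel names and the helper below are illustrative.

```python
import numpy as np

def exponential_kernel(t, tau=1.0):
    """Dense covariance matrix K_ij = exp(-|t_i - t_j| / tau)."""
    return np.exp(-np.abs(t[:, None] - t[None, :]) / tau)

def squared_exponential_kernel(t, tau=1.0):
    """Dense covariance matrix K_ij = exp(-(t_i - t_j)^2 / (2 tau^2))."""
    return np.exp(-0.5 * (t[:, None] - t[None, :]) ** 2 / tau ** 2)

def off_tridiagonal_fraction(M):
    """Largest |entry| outside the three central diagonals, relative to max |entry|."""
    i, j = np.indices(M.shape)
    mask = np.abs(i - j) > 1
    return np.max(np.abs(M[mask])) / np.max(np.abs(M))

# sorted, irregularly spaced sample times (made up for the example)
t = np.array([0.0, 0.7, 1.5, 2.1, 3.4, 4.0, 5.2, 6.9])
sparse_case = off_tridiagonal_fraction(np.linalg.inv(exponential_kernel(t)))
dense_case = off_tridiagonal_fraction(np.linalg.inv(squared_exponential_kernel(t)))
```

Here `sparse_case` is zero up to round-off while `dense_case` is not, which is the kind of structure a banded solver could exploit.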
Seminars filled my research time today. At lunch, Sergei Dubovsky (NYU) talked about conjectures for the fundamental action of strings in string theory. He gave a tiny bit of motivation for why 26 dimensions is the preferred dimensionality of spacetime in string theory. Apparently the 26 dimensions are expected to be 25 spacelike and one timelike, which seems odd to me, but causality is a bitch (as they say). Causality featured prominently in his talk, because he is experimenting with different causal structures for the dynamics on the worldsheet of the fundamental string.
In the afternoon, Michele Vallisneri (JPL) told us about detecting gravitational radiation with LISA-like missions. They face the same issue there that Lang and I have been talking about for astronomical imaging: It is no longer possible to think about there being the catalog of sources from the data. There will always be many qualitatively different explanations of the data stream, and secure scientific conclusions will always have to be based on marginalizations over that catalog-space probability distribution function. He said interesting things about the hardware injections of false events into the LIGO data stream to stress-test the analyses and about the software engineering aspects of these huge projects.
Lang and I spent the weekend at the Googleplex for the Google Summer of Code Mentor Summit. We did some work on our source detection paper, and had a lot of conversations about open-source software. Things I learned of relevance to astronomers include: The Software Freedom Conservancy (and other organizations that are similar, like SPI) can provide a non-profit umbrella for your project, making it possible for you to raise money for your project as a non-profit organization. The commercial movie industry uses a lot of open-source computer vision and graphics (which is really physics) software, even in blockbuster films (like Smurfs). The Climate Code Foundation is trying to do some of the things advocated in Weiner's white paper, but for climate science. The semantic web dream of many an astronomer (though not me; I am suspicious of all metadata) has been realized in the music space, in open-source projects like MusicBrainz.
Highlights for me today included El Ghaoui (Berkeley) talking about text document clustering and searching, and Das (NASA Ames) talking about sparse matrix methods in Gaussian processes. In the former, I realized it is incredibly easy to make sparse models of authors' use of words; I have always wanted to launch a project to figure out, in multi-author papers, which authors wrote which paragraphs. If you want to do that project with my consulting, send me email! In the latter, I learned about some inferential techniques for finding sparse approximations to the inverses of sparse matrices. In general the inverse is not sparse if the matrix is sparse, but you can come up with sparse approximations to the matrix and inverse pair. I also spoke today, about my comprehensive image modeling projects, and flashed some of Fergus's results.
Day two (but day one for me) of the 2011 Conference on Intelligent Data Understanding at Mountain View was today. Many of the talks are about airport safety and global climate. I learned, for example, that there are far more forest fires going on in Canada and Russia than in past years, probably as a result of global climate change. Highlights of the day include comments by Basu (Google) about the issue that useful algorithms have to be simple so that they can be scaled up and run on the highly parallel cloud (MapReduce and all that), and comments by Djorgovski (Caltech) about needing to take humans out of the loop for transient follow-up. On both points, I agree completely. At the end of the day there was a panel discussion about data-driven science and interdisciplinary data science with me as one of the panelists. I opened my discussion about how we work successfully with applied mathematicians (like Goodman) by saying that I can't understand any paper that begins with the word "Let".
As my loyal reader knows, Fergus and I have been working on data from Oppenheimer's (AMNH) 1640 coronograph. Fergus's model is a data-driven, empirical model of the speckle pattern as a function of wavelength, informed by—but not fully determined by—the physical expectation that the pattern should grow (in an angular sense) with wavelength. Fergus's model is very simple but at the same time competitive with the official data pipeline. Nonetheless, we had to make a lot of decisions about what we can and can't hold fixed, and what we can and can't assume about the observations. We resolved many of these issues today in a long meeting with Oppenheimer and Douglas Brenner (AMNH).
Complications we discussed include the following: Sometimes in the observing program, guiding fails and the star slips off the coronograph stop and into the field. That definitely violates the assumptions of our model! The spectrograph is operating at Cassegrain on the Palomar 200-inch, so as the telescope tracks, the gravitational load on the instrument changes continuously. That means we can't think of the optics as being rigid (at the wavelength level) over time. When the stars are observed at significant airmass, differential chromatic refraction means that the star cannot be centered on the coronograph stop simultaneously at all wavelengths. The planets or companions to which we are sensitive are not primarily reflecting light from the host star; these are young planets and brown dwarfs that are emitting their own thermal energy; this has implications for our generative model.
One more general issue we discussed is the obvious point made repeatedly in computer vision but rarely in astronomy that astronomical imaging (and spectroscopy too, actually) is a bilinear problem: There is an intensity field created by superposing many sources and an instrumental convolution made by superposing point-spread-function basis functions. The received image is the convolution of these two unknown functions; since convolution is linear, this makes the basic model bilinear: a product of two linear objects. The crazy thing is that any natural model of the data will have far more parameters than pixels, because the PSF and the scene both are (possibly) arbitrary functions of space and time. Astronomers deal with this by artificially reducing the number of free parameters (by, for example, restricting the number of basis functions or the freedom of the PSF to vary), but computer vision types like Fergus (and, famously, my late colleague Sam Roweis) aren't afraid of this situation. There is no problem in principle with having more parameters than data!
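The bilinearity is easy to check numerically. This is a one-dimensional toy, assuming a scene with one free amplitude per pixel and a PSF built from three Gaussian basis functions; the basis, the sizes, and the function names are made up for illustration and are not Fergus's model.

```python
import numpy as np

def make_basis(widths=(1.0, 2.0, 3.0), half_size=5):
    """Rows are normalized Gaussian PSF basis functions on a small pixel grid."""
    x = np.arange(-half_size, half_size + 1)
    basis = np.array([np.exp(-0.5 * x ** 2 / w ** 2) for w in widths])
    return basis / basis.sum(axis=1, keepdims=True)

def model_image(scene, coeffs, basis):
    """Image = scene convolved with the PSF, where PSF = coeffs @ basis."""
    psf = coeffs @ basis
    return np.convolve(scene, psf, mode="same")

rng = np.random.default_rng(0)
basis = make_basis()
s1, s2 = rng.random(64), rng.random(64)  # two scenes, one amplitude per pixel
c1, c2 = rng.random(3), rng.random(3)    # two sets of PSF coefficients

# linear in the scene at fixed PSF, and linear in the PSF at fixed scene
linear_in_scene = np.allclose(
    model_image(2 * s1 + 3 * s2, c1, basis),
    2 * model_image(s1, c1, basis) + 3 * model_image(s2, c1, basis))
linear_in_psf = np.allclose(
    model_image(s1, 2 * c1 + 3 * c2, basis),
    2 * model_image(s1, c1, basis) + 3 * model_image(s1, c2, basis))
# but not jointly linear: doubling both quadruples the image
quadratic_jointly = np.allclose(
    model_image(2 * s1, 2 * c1, basis), 4 * model_image(s1, c1, basis))
```

Even in this toy there are 64 + 3 parameters for 64 pixels, so the parameter count exceeds the data count before the PSF is allowed to vary at all.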
Only conversation today, making me worry about my ability to still actually do anything! Gabe Perez-Giz (NYU) and I spent an inordinate amount of time talking about how GPS works, and whether recent claims about systematic error in the OPERA neutrinos could be correct. We also talked about his work on classifying very odd test-particle orbits in the Kerr spacetime.
In the afternoon, Goodman, Hou, Foreman-Mackey, and I talked about Gaussian Processes, among other things. Hou's model of stellar variability is a very restricted (and physically motivated) GP, while Foreman-Mackey's Stripe 82 calibration model involves a (not physically motivated) GP for interpolation and error propagation. Goodman pointed us to the idea of the copula, which apparently destroyed the world recently.
Not much research got done today, but I did have a nice long chat about instrumentation and calibration with Nick Konidaris (Caltech), who is building the SED Machine, and attended a blackboard talk on relativistic turbulence by Andrew MacFadyen (NYU).
I spent the morning writing in Tsalmantza and my HMF paper. This is my top priority for October. In the afternoon, Feryal Ozel (Arizona) gave a great talk about getting precise neutron-star mass and radius information to constrain the nuclear equation of state at high densities. She is doing very nice data analysis with very nice data and can rule out huge classes of equation-of-state models. She also showed some nice results on neutron-star masses from other areas, and (after the talk) showed me some hierarchical inferences about neutron-star mass distribution functions as a function of binary companion type.
Maayane Soumagnac (UCL) visited for a few hours to discuss her projects on classification and inference in the Dark Energy Survey. She is using artificial neural networks, but wants to pit them against, or compare them with, Bayesian methods that involve modeling the data probabilistically. I told her about what Fadely, Willman, and I are doing and perhaps she will start doing some of the same, but focused on photometric redshifts. The key idea is to infer the galaxy type, luminosity, and redshift priors hierarchically; that is, to use the data on many galaxies to construct the best priors to use for each individual galaxy. Any such system makes photometric redshift predictions but also makes strong predictions or precise measurements of many other things, including galaxy metallicities, star-formation rates, and properties as a function of redshift and environment.
One of the things we discussed, which definitely requires a lot more research, is the idea of hybrid methods between supervised and unsupervised. Imagine that you have a training set, but the training set is incomplete, small, or unreliable. Then you want to generate your priors using a mixture of the training data and all the data. Hierarchical methods can be trained on all the data with no supervision—no training set—but they can also be trained with supervision, so hierarchical methods are the best (or simplest, anyway) places to look at hybrid training.
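A toy version of the hybrid idea, sketched under strong assumptions: a single scalar property per object, a Gaussian population prior with unknown mean and variance, Gaussian measurement noise of known size, and a training set treated as noise-free. None of this is the Fadely and Willman model; the function names are hypothetical. The fit is plain EM, and it runs with or without the training set.

```python
import numpy as np

def fit_population(x, sigma, z_train=None, n_iter=500):
    """EM estimate of a Gaussian prior N(mu, s2) from noisy data.

    x, sigma : noisy measurements and their known uncertainties (no labels)
    z_train  : optional trusted true values (the training set)
    """
    x = np.asarray(x, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    z_train = np.zeros(0) if z_train is None else np.asarray(z_train, dtype=float)
    everything = np.concatenate([x, z_train])
    mu, s2 = everything.mean(), everything.var()
    for _ in range(n_iter):
        # E-step: posterior mean and variance of each unlabeled true value
        v = 1.0 / (1.0 / s2 + 1.0 / sigma ** 2)
        m = v * (mu / s2 + x / sigma ** 2)
        # labeled objects enter with zero posterior variance
        means = np.concatenate([m, z_train])
        variances = np.concatenate([v, np.zeros_like(z_train)])
        # M-step: update the population prior
        mu = means.mean()
        s2 = np.mean((means - mu) ** 2 + variances)
    return mu, s2

def posterior_for_object(x_i, sigma_i, mu, s2):
    """Posterior mean and variance for one object under the learned prior."""
    v = 1.0 / (1.0 / s2 + 1.0 / sigma_i ** 2)
    return v * (mu / s2 + x_i / sigma_i ** 2), v
```

With `z_train=None` this is the fully unsupervised case; passing a small training set simply adds those objects to the same likelihood, which is why the hierarchical setting is a natural place to mix the two.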
I got up exceedingly early in the morning, highly motivated to write a short theory paper—that's theory of data analysis, of course—about the posterior probability distribution over catalogs. I have become motivated to write something like this because I have started to become concerned about various bits of wrongness in my old paper with Turner on faint source photometry. The principal results or conclusions of the paper are not wrong, but the language is all wrong (the terms likelihood and measurement are used wrongly) and the priors are improper. I asked Phil Marshall what you do about a paper that is wrong; he said:
Write a new paper correcting it!
One of the key things that has to be fixed in the problem is that the explanation of an image in terms of a catalog is—or should be—properly probabilistic. That means that if the image is not high in signal to noise, there are necessarily many even qualitatively different catalogs that could explain the image. This means describing or sampling a posterior distribution over models with varying complexity (or number of parameters or number of sources). That's not a trivial problem, of course.
The nice thing, if I can do it all, is that the new paper will not just resolve the issues with Hogg & Turner, it will also generalize the result to include positional and confusion noise, all in one consistent framework. The key idea is that for any source population you care about (faint galaxies, say, or Milky-Way stars), it is very easy to write down a proper and informative prior over catalog space (because, as Marshall often reminds me, we can simulate imaging data and the catalogs they imply very accurately).
After giving a seminar at the Cavendish lab in Cambridge UK by videocon, I spent most of my research day talking with Fergus, discussing his model of residual speckle patterns in coronograph images from Oppenheimer's group. Fergus's model is extremely general, but has strong priors which "regularize" the solution when there isn't much data or the data are troublesome. Because it is so general, the model is in fact very simple to write down, has analytic derivatives, and can be optimized quickly by some (pretty clever) Matlab code. We decided that he should run on everything and then we should meet with Oppenheimer's group to decide what's next. I think we might be able to improve the sensitivity of coronographs generally with these models.
Roman Rafikov (Princeton) spent the day at NYU and we had long discussions about the state and future of exoplanets. This in addition to his nice talk on the subject. I realized that between the image modeling I am doing with Fergus, the transit discoveries with Schiminovich, and the MCMC work with Goodman and Hou, more than half my current work is related to exoplanets.
I don't think I can count a nice lunch with Christopher Stumm at Etsy as research. But I did meet up with Schiminovich afterwards to figure out how we are going to write the papers we need to write. I am so stoked to have the full list of time- and angle-tagged photons from the full observing history of GALEX.
The highlight of my research day was a long conversation with Hou and Goodman about stars, stellar oscillations (linear and non-linear), modeling those with Gaussian processes and the like, and next-generation Markov-Chain Monte Carlo methods. On the latter, the idea is to use an ensemble sampler (or a pair of them) to perform very high-quality proposals, for applications where posterior (likelihood or prior) calls are expensive.
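For reference, here is a bare-bones ensemble sampler using the published Goodman and Weare stretch move, in which each walker's proposal is built from the position of another walker. This is the existing affine-invariant method, offered only as background; it is not the next-generation scheme discussed above, and the function name and defaults are illustrative.

```python
import numpy as np

def stretch_move_sampler(log_prob, walkers, n_steps, a=2.0, seed=0):
    """Ensemble MCMC with the stretch move, updating one walker at a time."""
    rng = np.random.default_rng(seed)
    walkers = np.array(walkers, dtype=float)
    n_walkers, ndim = walkers.shape
    lp = np.array([log_prob(w) for w in walkers])
    chain = np.empty((n_steps, n_walkers, ndim))
    for step in range(n_steps):
        for k in range(n_walkers):
            # choose a different walker j uniformly at random
            j = rng.integers(n_walkers - 1)
            if j >= k:
                j += 1
            # draw z from g(z) proportional to 1/sqrt(z) on [1/a, a]
            z = ((a - 1.0) * rng.random() + 1.0) ** 2 / a
            proposal = walkers[j] + z * (walkers[k] - walkers[j])
            lp_new = log_prob(proposal)
            # acceptance includes the z^(ndim - 1) volume factor
            log_accept = (ndim - 1) * np.log(z) + lp_new - lp[k]
            if np.log(rng.random()) < log_accept:
                walkers[k] = proposal
                lp[k] = lp_new
        chain[step] = walkers
    return chain
```

Because proposals are built from the ensemble itself, a badly scaled target (say, variances of 1 and 100 along the two axes) needs no hand-tuned step sizes. Each proposal still costs one `log_prob` call, which is exactly the expense the new ideas are meant to address.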
I didn't do much research today, but I did remind myself how to get a new Mac computer working. The answer: Fink. As much as I hate compiling everything from scratch (which Fink does), the pre-built binaries never work on new hardware or operating systems, so I found that all the shortcuts I wanted to take were useless. The key commands were:
sudo fink selfupdate
sudo fink update-all
sudo fink install numpy-py27
sudo fink install scipy-py27
sudo fink install matplotlib-py27
sudo fink install texlive
sudo fink install imagemagick
(All these run after installing Xcode from the App Store.) Setting up a new computer is annoying and a headache, but actually, I think maybe it is research.
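A quick sanity check after an install like this is to confirm that the Python packages actually import. The helper below is a hypothetical convenience, not part of Fink; it only reports which module names can be imported by the interpreter that runs it.

```python
import importlib

def check_imports(names):
    """Return {module_name: True/False} according to whether each one imports."""
    status = {}
    for name in names:
        try:
            importlib.import_module(name)
            status[name] = True
        except ImportError:
            status[name] = False
    return status
```

For the packages installed above, the call would be `check_imports(["numpy", "scipy", "matplotlib"])`, run with the Fink-installed Python.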