cross-validation, exoplanet spectroscopy

We managed to get some research done in storm-ravaged lower Manhattan: Jake VanderPlas (UW) and I continued to work on cross-validation and its relationship to the Bayes integral. I think we nearly have some conditions under which they compute essentially the same thing. Rob Fergus came over to our undisclosed location at lunch time and we discussed the calibration of his spectrophotometric measurements of exoplanets. He finds (not surprisingly) that the complexity he uses for the PSF model affects the detailed shape of the planet spectra he extracts; this happens because the planets are a tiny, tiny fraction of the total light in the image. We discussed how to express this issue in the paper. It is a kind of systematic error, in some sense.


cross validation and Bayes

Jake VanderPlas (UW) had the misfortune to be staying in a NYU guest apartment when Hurricane Sandy hit on Monday, taking out subway and then power and then water. In between fulfilling basic human needs for ourselves and our neighbors, we worked on the relationship between cross-validation and Bayes integrals. I think we have something to say here. It might not be original, but it is useful in understanding the relationships of methods. We both wrote some equations and then tried to develop a concordance today. We started a document. While we sheltered in the only location in lower Manhattan with power and internet, I also spoke by Skype with Lang about flexible sky models for The Tractor. Today ended with a discussion on long-term future discounted free-cash flow, about which I really must write an essay sometime very soon.
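One textbook way to see why the two quantities are related at all: by the chain rule, the Bayes integral (the marginal likelihood) is a product of predictions of each datum from the data before it, while leave-one-out cross-validation scores predictions of each datum from all the others. The sketch below is only an illustration of that identity, not the concordance mentioned above. The toy model (Gaussian data with known noise and a Gaussian prior on the mean), the values of sigma and tau, and the helper name log_predictive are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(3)
sigma, tau = 1.0, 2.0   # known noise sd and prior sd on the mean (toy values)
y = rng.normal(loc=0.7, scale=sigma, size=20)

def log_predictive(y_new, y_seen):
    """log p(y_new | y_seen) for y ~ N(mu, sigma^2) with prior mu ~ N(0, tau^2)."""
    precision = 1.0 / tau**2 + len(y_seen) / sigma**2   # posterior precision of mu
    mean = (np.sum(y_seen) / sigma**2) / precision      # posterior mean of mu
    var = sigma**2 + 1.0 / precision                    # predictive variance
    return -0.5 * (np.log(2.0 * np.pi * var) + (y_new - mean) ** 2 / var)

# Bayes integral via the chain rule: predict each datum from the ones before it.
log_evidence = sum(log_predictive(y[n], y[:n]) for n in range(len(y)))

# Leave-one-out cross-validation: predict each datum from all the others.
log_loo = sum(log_predictive(y[n], np.delete(y, n)) for n in range(len(y)))

# Direct evaluation of the same integral: y ~ N(0, sigma^2 I + tau^2 1 1^T).
n_data = len(y)
cov = sigma**2 * np.eye(n_data) + tau**2 * np.ones((n_data, n_data))
_, logdet = np.linalg.slogdet(cov)
log_evidence_direct = -0.5 * (
    n_data * np.log(2.0 * np.pi) + logdet + y @ np.linalg.solve(cov, y)
)

print(log_evidence, log_evidence_direct, log_loo)
```

The first two printed numbers should agree to rounding error. The third differs because every cross-validation prediction is conditioned on N-1 data points, while the chain-rule terms are conditioned on anywhere from zero to N-1 of them.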


matching point sets to point sets

In between seminars by Baldauf (Zurich) and Goldreich (Caltech), I worked on a proposal I intend to send to Andreas Küpper (Bonn) about doing probabilistic fitting of tidal tails. I am trying to get a summary document together and then discuss how to share data and code. With his model for steady-state tidal disruption and my probability foo, I think we can learn more (and more accurately) from the Palomar 5 stream than anyone has before. I hope he agrees!


biomimetics, lucky imaging, statistics, etc

The Physics Colloquium today was by Jasna Brujic (NYU), who talked about biomimetic packing: This is creation of artificial tissue from artificial cells, non-living analogs of biological systems. I learned things about random packing and the statistical properties of heterogeneous materials. In particular, I was interested to learn that although you can hold a (frictionless, perfect) sphere with four points of contact, if you have a frustrated random solid packing of (frictionless, perfect) spheres, each sphere will have, on average, six points of contact. As you remove points of contact, the bulk and shear moduli drop to zero.

I discussed fast imaging (lucky imaging) with Federica Bianco (NYU), Foreman-Mackey, and Fadely. Bianco is interested in data analysis methods that lead to unbiased photometry; this is hard when the PSF is a very strong function of time and space, as it is in fast imaging. We have some ideas, and some code that is half-done.

Guangtun Zhu (JHU) dropped in to discuss recent work. He showed beautiful spectral models of quasars, beautiful average absorption spectra, and beautiful galaxy–gas cross-correlation functions from quasar absorption spectra. We argued that he could, on the one hand, analyze detailed issues with SDSS spectroscopic calibration and, on the other hand, determine the fraction of the cosmic matter density in a range of atomic species. Nice diversity there!


classification, meet sampling

At astronomy meets applied math group meeting (Goodman, Hou, Foreman-Mackey, Fadely, myself) we discussed Hou's insertion of Goodman's stretch move (the basis of our popular product emcee) into Brewer's nested sampling. We think we have some improvements for the method, and Hou is meeting our functional tests, so we are about to apply the method to exoplanet systems. After that we discussed an idea we have been kicking around for a year or so: If an MCMC sampler is stuck in a small number of optima and can't easily transition from one optimum to another, then we should split the parameter space up into regions such that there is only one optimum per region. Then we can sample each region independently and recombine the individual-region samplings into one full sampling. We worked out a method that involves k-means (to do the clustering to find optima) and SVM (to do the splitting of the space). In principle we could make a very general, very flexible sampler this way.
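A minimal sketch of the clustering-then-splitting step, under loud assumptions: the two-dimensional fake samples, the farthest-first k-means initialization, the hand-rolled linear SVM (hinge loss minimized by subgradient descent), and every function name and tuning value here are illustrative stand-ins, not the method from the group meeting. It covers only the splitting of the space. Sampling each region and recombining would additionally need the relative probability mass of each region, which this does not attempt.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the output of a stuck sampler: draws piled up in two separated optima.
samples = np.vstack([
    rng.normal(loc=[-4.0, 0.0], scale=0.5, size=(200, 2)),
    rng.normal(loc=[4.0, 1.0], scale=0.5, size=(200, 2)),
])

def kmeans(x, k, n_iter=20):
    """Lloyd's algorithm with farthest-first initialization; returns (centers, labels)."""
    centers = [x[0]]
    for _ in range(k - 1):
        d2 = np.min([((x - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(x[d2.argmax()])
    centers = np.array(centers)
    for _ in range(n_iter):
        d2 = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        centers = np.array([x[labels == j].mean(axis=0) for j in range(k)])
    return centers, labels

def fit_linear_svm(x, y, lam=1e-2, lr=0.1, n_iter=500):
    """Soft-margin linear SVM by subgradient descent on the hinge loss; y in {-1, +1}."""
    w = np.zeros(x.shape[1])
    b = 0.0
    for _ in range(n_iter):
        active = y * (x @ w + b) < 1.0   # points inside or violating the margin
        grad_w = lam * w - (y[active, None] * x[active]).sum(axis=0) / len(x)
        grad_b = -y[active].sum() / len(x)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

centers, labels = kmeans(samples, k=2)
w, b = fit_linear_svm(samples, np.where(labels == 1, 1.0, -1.0))

def region(theta):
    """Index (0 or 1) of the region containing parameter vector(s) theta."""
    return (np.asarray(theta) @ w + b > 0).astype(int)

print(centers)
print(np.bincount(region(samples)))
```

With more than two optima, the same idea would need a multi-class splitter (for example one-versus-rest), and a kernel SVM if the regions are not separable by hyperplanes.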


more streams

I spent the afternoon at Columbia, starting with Pizza Lunch, in which the second-year graduate students described their upcoming projects. Price-Whelan described a project with Johnston finding streams among distance-indicating stars. The three of us followed the lunch with a half-hour discussion of this. Even a single good distance-indicator in a cold stream could substantially improve our inferences about the Milky Way. It relates strongly to the fitting discussion we had with Küpper yesterday; with a good probabilistic inference framework, every new piece of data, no matter how oddly obtained, is useful. It also relates strongly to Foreman-Mackey's project to find fainter and more elusive distance indicators.

I then had a long chat with Schiminovich, which ended up with us making some plots that strongly confirm our suspicions that we can do a better job of modeling the GALEX focal plane than we have been so far. Late in the evening I submitted Lang and my mixture-of-Gaussian note to PASP. It will appear on the arXiv this week.


fitting cold streams

I had a great conversation today with Kathryn Johnston (Columbia), Joo Yoon (Columbia), and Andreas Küpper (Bonn) about cold streams of tidal origin. Küpper has results that show many things, including the following: (a) It is possible to model the detailed morphologies of tidal streams quickly without even N-body simulations. (b) The wiggly expected morphologies near the parent body are mainly due to epicyclic motions, not variable tidal stripping. (c) Plausibly the non-uniformities in density in the Palomar 5 stream can be due to these epicyclic effects. (d) There is awesome morphology expected in the position–radial-velocity plane, and Marla Geha (Yale) has the data to (possibly) test it. Nice work and the kind of work that generates multiple questions and projects for every issue it resolves. We discussed better ways to do the fitting of models to data.


the fundamental plane

There were talks today by Jim Stone (Princeton), talking about accretion disk physics from simulations, and Lauren Porter (UCSC), talking about the fundamental plane of elliptical galaxies from semi-analytic modeling. Interestingly, Porter was trying to understand the observations of the dependence of galaxy properties as a function of distance from the plane, not within the plane. It is so intriguing that the FP has been around for decades and never really been explained in terms of galaxy formation. Porter finds that it is natural for the duration of star formation to be related to position off the plane. When not in seminars, I was working on my mixture-of-Gaussians paper.


mixture-of-Gaussian galaxies

I actually did real research today, writing and making figures for Lang and my paper on mixture-of-Gaussian approximations to standard two-dimensional galaxy intensity models (think exponential and de Vaucouleurs). I tweaked the figures so their notation matches the paper, I made figure captions, I adjusted the text, and I got the to-do items down to one day's hard work. I am so close! People: Don't use the de Vaucouleurs profile; use my approximation. It is so much better behaved. Details to hit arXiv very soon, I hope.


station keeping

Mired in bureaucracy, the only research of note happened in long conversations with Goodman, Hou, and Foreman-Mackey on nested sampling, and with Yike Tang on weak lensing, and with Kilian Walsh on measuring the noise in images, and Fadely and Foreman-Mackey on initializing a mixture of factor analyzers.


stars in Stripe 82 and GALEX

Fadely and I continued work with Willman, Preston, and Bochanski today. Preston got Fadely's code working on Haverford computers. Willman found more than ten-thousand possibly relevant spectra in SDSS Stripe 82 to test our star–galaxy classifications. Bochanski found lots of relevant HST imaging and even found that SExtractor has been run on everything we need. We spent lunch planning our next steps. Willman challenged us to make sure we have a very good case that high-quality star–galaxy separation is necessary for real, planned science projects. I think we do, but it was a good reality check on this big effort.

In the afternoon, Schiminovich and Gertler and I talked about obtaining and using SDSS stars for self-calibration of the GALEX photon stream. That's going to be a good idea. We also discussed sources of the sky-to-focal-plane residuals we are seeing; we very much hope they are purely from the focal-plane distortions and not from issues with the spacecraft attitude.


star-galaxy separation and cash

Today is not over but it has already been pretty busy. Beth Willman is in town, with her research assistant Annie Preston (Haverford) and postdoc John Bochanski (Haverford) to discuss next-generation ideas on hierarchical Bayesian star–galaxy separation. The idea is to apply our method to SDSS Stripe 82 data to get better star–galaxy separation than what is currently available. We discussed scope of the project and methods for verification of our success (or failure).

In parallel, Rob Fergus and I are in one-hundred-percent-effort mode on a submission for the Google Faculty Research Awards program. It is for small amounts of money but it is a very streamlined (thanks, Google overlords!) process, so I think we can get in a credible proposal. We are proposing to bring self-calibration down to the pixel level and out to the masses.


Miller and machine learning

In the morning Adam Miller (Berkeley) gave a beautiful talk about the data, features, and decision trees of the Bloom-group variable-star classification based on the ASAS data. I know this project well from working with Richards and Long (hey, people, we have a paper to write!), but it was nice to see the full system described by Miller. The technology and results are impressive. The audience (including me) was most interested in whether the automated methods could discover new kinds of variables. Miller had to admit that they didn't have any new classes of variables—indicating that the sky is very well understood down to 13th magnitude—but he did show some examples of individual stars that are very hard to understand physically. So, on follow-up, they might have some great discoveries.

I have criticisms of the Bloom-group approaches (and they know them); they relate to the creation of irreversible features from the data: The models they learn from the data (in their random forest) are generative in the feature space, but not in the original data space. This limits their usefulness in prediction and subsequent analysis. But their performance is good, so I shouldn't be complaining!

In the afternoon, Fadely and I figured out a degeneracy in factor analysis (and also mixtures of factor analyzers). We discussed it, but we see no serious discussion of it on the web or in the foundational papers. We certainly have useful things to contribute about this method in practice.


speeding up code

In a low-research day the most fun I had was working in pair-code mode to speed up Fadely's mixture of factor analyzers code, now published on PyPI. Most of our speed-up came from realizing that (somewhere inside a pretty complicated expression)
c = numpy.diag(a[:, None] * b[None, :])
(where a and b are one-dimensional numpy arrays of the same length) is exactly equivalent to
c = a * b
Duh! That embarrassment aside, I think we have some pretty useful code now. Time to test it out. Please help!
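A quick check of that equivalence, as a sketch with made-up random inputs (the array length and the names slow and fast are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(42)
a = rng.normal(size=1000)
b = rng.normal(size=1000)

# Builds the full 1000 x 1000 outer product, then keeps only its diagonal.
slow = np.diag(a[:, None] * b[None, :])

# Elementwise product: the same 1000 numbers, with no N x N intermediate.
fast = a * b

print(slow.shape, fast.shape)
print(np.allclose(slow, fast))
```

The diagonal of the outer product has entries a[i] * b[i], which is all the elementwise product computes, so the saving is a factor of N in both memory and multiplications.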


time for Astrometry.net to rise again

I had two conversations about Astrometry.net today. The first was with Oded Nov (NYU Poly), with whom I plan to put in an engineering proposal to work on citizen science and distributed, untrusted sensors, using Astrometry.net and something about urban sensing as examples.

The second conversation was with Kilian Walsh (NYU), with whom I am trying to get up and running a system to synthesize every submitted image right after we calibrate it. The idea is that the image synthesis (plus a healthy dose of optimization) will tell us, for each submitted image, the sky, bandpass, zeropoint, large-scale flat, and point-spread function. If we can get all that to work (a) we will be Gods on this Earth, and (b) we will be able to search the data for variable stars, transients, and other anomalies.


mixture of factor analyzers!

At computer-vision group meeting this morning, Fadely, Foreman-Mackey, and I tri-coded Fadely's mixture of factor analyzers code and got it working. This is no mean feat, if you look at the linear algebra involved. And all that before noon!

The MFA model is like a mixture-of-Gaussians model, but each Gaussian has a variance tensor with reduced degrees of freedom: Each variance tensor is a sum of a diagonal (noise) tensor plus a low-rank general tensor. It is like a generalization of PCA to build a probabilistic distribution model for the data, then generalized to a mixture. It is a very powerful tool, because in large numbers of dimensions (for the data space) you get almost all the power of mixture-of-Gaussian modeling without the immense numbers of parameters.
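To make the parameter saving concrete, here is a sketch of one component's variance tensor built as diagonal plus low-rank, with the parameter counts compared against a general symmetric matrix. The dimensions D and M, the random loadings and noise variances, and the variable names are illustrative assumptions, not values from Fadely's code.

```python
import numpy as np

rng = np.random.default_rng(1)
D, M = 50, 3   # data dimension and number of latent factors (illustrative values)

lam = rng.normal(size=(D, M))         # factor loadings: the low-rank part
psi = rng.uniform(0.5, 1.5, size=D)   # per-dimension noise variances: the diagonal part

cov = lam @ lam.T + np.diag(psi)      # one component's D x D variance tensor

n_mfa = D * M + D                     # numbers needed to specify lam and psi
n_full = D * (D + 1) // 2             # numbers needed for a general symmetric matrix

print(n_mfa, n_full)
print(np.linalg.eigvalsh(cov).min() > 0)   # still a valid (positive-definite) covariance
```

For these values the structured form needs 200 numbers per component against 1275 for a general covariance, and the gap grows quadratically with D.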


the PSF and its variations

With Hirsch (UCL & MPI-IS) and Foreman-Mackey, I spent the whole morning discussing next-generation systems to estimate the point-spread function in heterogeneous data. Hirsch has developed a beautiful model for spatially varying PSFs, using data we took (well really he and Schölkopf took) in Valchava as the test data. We discussed various possible directions to go: In some, he works towards making a model of the PSF and its variations over collections of images from the same hardware. In others, he works towards setting PSF model complexity parameters using the data themselves. In others he models departures from smooth, parametric forms for the PSF to increase accuracy and precision. We concluded that if we can make some useful software, it almost certainly would get adopted. We also discussed integration with Astrometry.net, where we want to move towards image modeling for final calibration (and anomaly discovery!).

In the afternoon, Matias Zaldarriaga (IAS) gave a talk about measuring and understanding fluctuations in the Universe better than we can at present. In one project, he showed that we can use certain kinds of distortions away from black-body in the CMB to measure the amplitude of fluctuations on extremely small scales—scales too small to observe any other way. In another, he showed that you can do fast, precise numerical simulations by simulating not the full universe, but the departure of the universe away from a linear or second-order prediction for the growth of structure. That made me say "duh" but is really a great idea. It also gave me some ideas for machine learning in support of precise simulations, which perhaps I will post on the ideas blog.


non-negative again, pixel-level self-calibration

Michael Hirsch (UCL & MPI-IS) arrived for two days at NYU. We talked about self-calibration and blind deconvolution. On the latter, I was arguing that many things people usually do in computer vision might not work for astronomy, because astronomers expect to be able to make measurements (especially flux measurements) on their processed images. Some computer vision methods break that, or make measurements highly biased. On that point, I did my usual disparagement of non-negativity. Like Schölkopf, Fergus disagreed: If we think the fundamental image-formation mechanism is non-negative, then non-negative is the way to go methodologically. I think there might be a problem if you impose non-negative but not, at the same time, other things that are similarly informative that you know about the imaging. Anyway, we left it that I would make a fake data set that obeys exactly the image formation model but still leads to badly biased results when standard blind deconvolution is applied to it. That would be a service to this endless argument.
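The simplest version of the worry can be sketched in a few lines. This is not blind deconvolution and it is not the fake data set promised above. It is only an illustration that forcing individual flux estimates to be non-negative biases them when the source is faint compared to the noise. The true flux, the unit noise level, and the number of trials are assumed values.

```python
import numpy as np

rng = np.random.default_rng(7)
true_flux = 0.5     # faint source, in units where the noise sigma is 1
n_trials = 100_000

measured = true_flux + rng.normal(size=n_trials)   # unbiased, but sometimes negative
clipped = np.maximum(measured, 0.0)                # force every estimate to be non-negative

bias_measured = measured.mean() - true_flux
bias_clipped = clipped.mean() - true_flux

print(bias_measured)   # consistent with zero
print(bias_clipped)    # clearly positive
```

The true flux is non-negative here, so the constraint is correct as a statement about the source, and the estimates still come out biased high. Averaging many such constrained measurements does not remove the offset.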

We also thought more and argued more about the idea that Fadely's brain-dead model of tiny patches of SDSS imaging data could be used for self-calibration purposes. We have a rough plan, but we are still contemplating whether the calibration and the data model could be learned simultaneously.


hierarchical models for GALEX and lensing

The graphical model I posted yesterday for the GALEX Photon Catalog is, in fact, wrong. I worked on it more today, and will post a better version as soon as I start to really understand Schiminovich's comments about the part of the PSF that comes from the electronics: The reason that GALEX has a broad PSF is in part electronic!

In other but conceptually related news, NYU graduate student Yike Tang (who started working with me recently) has been able to show that building a joint hierarchical model of galaxy shapes and the weak-lensing shear map permits much higher precision (or much higher spatial resolution) shear maps than what you get by doing averaging of galaxy shapes. This idea has been kicking around for a while but I think (and, despite my open-science polemics, hope) that we are the first to actually do it. Much debugging remains, and we are very far from having a paper, but we may just have made LSST and Euclid and DES much more powerful at fixed cost!


exoplanet spectra; graphical model for GALEX

In the morning at our computer-vision group meeting, Fergus told us about his detection of and spectral extraction for three of four companions of HR8799, the poster child for young planetary systems. This is a huge success for his methods (which are computer-vision based, flexible linear models) and for the P1640 spectrographic coronagraph (Oppenheimer PI). He is pretty stoked about the success (his software has been game-changing for the P1640 project), as is Oppenheimer, who wants to publish asap.

In the afternoon, at the insanity that is the Leroy St New York Public Library for children, I made the graphical model (below) using Daft. Based on a two-hour conversation with Schiminovich, it is supposed to represent the data-generation process (or, perhaps, causal model) for the photon stream we have from the GALEX spacecraft. I think it still has some errors.


computer vision and self-calibration

I spoke in the CCPP Brown Bag today, about computer vision and its connections to cosmology. I realized only during the talk (and therefore way too late to change course) that really the best connection between computer vision and cosmology to emphasize right now is in the area of self-calibration. In computer vision, it is often the case that you have to determine the distortion or noise or blurring kernel of the image using that image alone; that is, you have no external calibrating information. This is also true for precise astronomical data sets: The best information about how the instrument works comes from the science data themselves. Oh well, I will do a better job next time! I can also then connect the stuff I am doing with Fergus and Fadely with the things I am doing with Foreman-Mackey and Holmes.
Inspired by all this, I spent part of the afternoon making a huge ginormous graphical model with Daft for all of astronomy.