optics and data

I spent the afternoon at Leiden chatting with the locals. Among other things, I learned that Kenworthy (my host) has figured out how to use the real-time output from an adaptive-optics wavefront sensor to predict the point-spread function in the science data. This is a big deal. It means we could potentially have a model of the PSF before we even read out the data. The wavefront-sensing data are used to control the adaptive-optics system, but of course they also contain scientific information of great value. That point is simultaneously obvious and brilliant.

Labbé and I discussed alternatives to stacking—simple adding or averaging of data—when there are signals too weak to see in any one datum. He has written some nice things about this issue but also has the pragmatic view that doing a better job only makes sense if you get a better answer. We discussed contexts for making that point.

On the way to dinner, Kuijken described his approaches to weak lensing problems, which involve finding the shear map that distorts the image into one in which there is no net ellipticity. He made the nice point that it is much easier to show that a group of galaxies shows no net ellipticity than it is to quantitatively measure the departure from zero ellipticity. That's an interesting point and worth contemplating. He still has to deal with the point-spread function, which has not been lensed, so his methods are not trivial; we discussed the problem of the PSF because we have some disagreements there.


Gaia attitude and residuals

I had long conversations with Anthony Brown (Leiden), Daniel Risquez (Leiden), and Giorgia Busso (Leiden) about Gaia data processing and catalog output. I learned a lot, not limited to: Risquez is looking at the changes in attitude modeling precision as the model complexity is changed, where the model complexity is set by the time spacing of the knots in a B-spline model. The Gaia attitude model is completely data-driven (it has no sense of torques or moments of inertia, it just knows about transit times of stars). It is also not trying to know the true attitude, but the effective attitude averaged over 4.4-second intervals (because that is the CCD drift-scan time). That leads to interesting subtleties and constraints, one of which is that knot spacings much smaller than 4.4 seconds (or maybe half that) can never be useful.
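The fitting machinery here is just least-squares B-splines. Purely to fix ideas (this is a made-up one-dimensional toy with invented numbers, not anything from the Gaia pipeline), here is a sketch with scipy, where `dt_knot` is the knot spacing being varied:

```python
import numpy as np
from scipy.interpolate import make_lsq_spline

rng = np.random.default_rng(42)

# toy "attitude": a smooth wobble sampled at irregular transit times
t = np.sort(rng.uniform(0.0, 400.0, 2000))        # seconds
attitude = 1e-3 * np.sin(2 * np.pi * t / 60.0)    # radians, say
obs = attitude + 1e-5 * rng.normal(size=t.size)   # noisy transit data

def fit_attitude(t, y, dt_knot, k=3):
    """Least-squares cubic B-spline with uniform interior knot spacing."""
    interior = np.arange(t[0] + dt_knot, t[-1] - dt_knot, dt_knot)
    knots = np.r_[(t[0],) * (k + 1), interior, (t[-1],) * (k + 1)]
    return make_lsq_spline(t, y, knots, k=k)

# vary the model complexity through the knot spacing; in the real
# problem the data only constrain the 4.4-second-averaged attitude,
# so spacings well below that cannot help
for dt_knot in (30.0, 10.0, 4.4):
    spl = fit_attitude(t, obs, dt_knot)
    print(dt_knot, np.sqrt(np.mean((spl(t) - attitude) ** 2)))
```

In this toy the knot spacing trades flexibility against noise-fitting, which is (I think) the axis Risquez is exploring.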

Busso is working on the charge-transfer-inefficiency model for the CCDs. As the CCDs get damaged, they develop traps which delay a small fraction of the charge in the CCD drift-scan. This slightly moves the stellar centers, by much less than a CCD pixel but much more than the required precision (and in a stellar-brightness-dependent way)! It sounds impossible, but because Gaia cuts through every star at 80 or more different angles at 80 or more different times, the magnitude-dependent and time-dependent CTI effects can in fact be modeled and fit to restore the precision of the instrument. The nice thing is that the collaboration hopes to be able to make an empirical model of the CTI and its evolution; that reassures me because the theoretical models of CTI are both young and simplistic.
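A cartoon of why the shift is brightness-dependent: if a trap delays a roughly fixed amount of charge by one pixel, the centroid moves by that charge divided by the total flux, so bright stars move less. A deliberately crude one-dimensional toy (not the collaboration's model; every number is invented):

```python
import numpy as np

def centroid(p):
    x = np.arange(p.size)
    return (x * p).sum() / p.sum()

def apply_cti(p, n_trap=50.0):
    """Toy trap: a fixed charge n_trap gets delayed by one pixel in the
    trailing (drift-scan) direction; real CTI is far more complicated."""
    out = p.astype(float)
    q = min(n_trap, out[4])      # grab charge at the profile peak...
    out[4] -= q
    out[5] += q                  # ...release it one pixel behind
    return out

x = np.arange(9)
psf = np.exp(-0.5 * (x - 4.0) ** 2)
psf /= psf.sum()
for flux in (1e3, 1e5):          # a faint star and a bright star
    star = flux * psf
    print(flux, centroid(apply_cti(star)) - centroid(star))
```

The fixed trapped charge moves the faint star's centroid a hundred times more than the bright star's, which is the flavor of brightness dependence that has to be modeled.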

Brown, who I think (looking from the outside) has had lots of great influence on the Gaia collaboration, told me that early data releases from Gaia are now more-or-less promised. That's a big and important thing; if you think about how much SDSS learned from its early (and often wrong) data releases and how much Hipparcos benefitted from its complete post-final-release reanalysis, early data release is essential to the production of the best possible catalog. Brown also intrigued me by saying that it was likely that the releases would include all the timing residuals for every star at every transit. This is exciting to me because these residuals could be used to create an approximate Gaia likelihood function as I have been trying to imagine for some time now.


information theory and phase space

At Groningen (my first stop on my trip), I chatted with Robyn Sanderson (Groningen) about finding substructure in current SDSS-II and future Gaia stellar kinematic data. The idea we quickly worked out—which maybe is in the literature already—is to ask a general information-theory-like question about the phase-space distribution function in something like action space, and then tune potential parameters to make the answer to that question as informative as possible. As a straw man idea, I suggested the K–L divergence between the action distribution and a maximum entropy distribution with the same low-order moments. The thing I like about this plan is that it might not be very sensitive to the observational noise. That is, the scalar will be sensitive to it, but the optimum in parameter space might not. I said might. It's worth a try.
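To make the straw man concrete in one dimension: with only the first two moments constrained, the maximum-entropy reference is a Gaussian, and the K–L divergence is just the negentropy. A rough histogram-based sketch (the function name, bin count, and data are all invented for illustration):

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

def moment_matched_kl(samples, bins=60):
    """Crude histogram estimate of KL(p || q), with q the Gaussian
    (the max-entropy density) matching the sample mean and variance."""
    mu, sigma = samples.mean(), samples.std()
    hist, edges = np.histogram(samples, bins=bins, density=True)
    centers = 0.5 * (edges[:-1] + edges[1:])
    width = edges[1] - edges[0]
    q = norm.pdf(centers, mu, sigma)
    good = hist > 0
    return np.sum(hist[good] * np.log(hist[good] / q[good]) * width)

smooth = rng.normal(0.0, 1.0, 20000)                    # featureless "actions"
clumpy = np.concatenate([rng.normal(-2.0, 0.2, 10000),  # two cold streams
                         rng.normal(+2.0, 0.2, 10000)])
print(moment_matched_kl(smooth), moment_matched_kl(clumpy))
```

Clumpy, stream-like distributions give a much larger scalar than smooth ones, which is what tuning the potential parameters would exploit.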


toy Gaia catalog

On my way to the Netherlands, and in preparation for chatting with the Leiden Gaia group, I brushed off and got working again my toy Gaia mission, in which a zero-dimensional spacecraft surveys a one-dimensional sky. The spacecraft is battered by angular-momentum-changing collisions and scans the one-d sky clockwise and counter-clockwise. The toy is designed to test ideas about fitting charge-transfer inefficiency models and spacecraft attitude models. Still not sure why I am doing either of those things; I suppose it relates at least slightly to my GALEX photon projects with Schiminovich.


stretching images for human viewing

What I call the "stretch"—the relationship between the data numbers in a data image and the byte values in a human-viewable JPEG or PNG file—is a constant problem for us, in The Thresher, in the Atlas, in the Tractor, and in Marshall's new Lens Zoo project. I spent time today with Foreman-Mackey writing hacky but very robust code for doing this on the fly on the in-progress status plots in The Thresher. The code is ugly! The problem is that you want to see the range of the data but you also want to see the noise level; at the same time, you don't know the noise level in advance and you can hit images (think clipped or sparse images) where it is quite hard to even determine the noise level. Foreman-Mackey and I solved this problem by making multiple noise estimates and using relationships among them to find the best one.
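This is not the actual Thresher code, but the multiple-estimates idea can be sketched with two standard robust estimators and a consistency check between them (the fallback threshold of 2 is an arbitrary stand-in):

```python
import numpy as np

def clipped_std(img, n_iter=5):
    """Sigma-clipped standard deviation, so sources don't inflate it."""
    pix = img.ravel().astype(float)
    for _ in range(n_iter):
        med, sig = np.median(pix), pix.std()
        pix = pix[np.abs(pix - med) < 3.0 * sig]
    return pix.std()

def mad_std(img):
    """1.4826 * median absolute deviation, the Gaussian-equivalent sigma."""
    pix = img.ravel()
    return 1.4826 * np.median(np.abs(pix - np.median(pix)))

def best_noise(img):
    """Compare the two estimates; when they disagree badly (as on
    clipped or very sparse images, where the MAD can collapse to
    zero), fall back on the clipped standard deviation."""
    a, b = clipped_std(img), mad_std(img)
    if b == 0.0 or a > 2.0 * b:
        return a
    return b

rng = np.random.default_rng(1)
img = rng.normal(0.0, 2.0, (64, 64))
img[30:34, 30:34] += 100.0            # drop in a bright source
print(best_noise(img))                # close to the true sigma of 2
```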


The Thresher

We had a big push on our post-lucky imaging pipeline. As part of this, we renamed it The Thresher and Foreman-Mackey made a sweet github-hosted web page for it. We are very close to release, and over lunch we discussed last things to do before writing paper zero. Paper zero will compare The Thresher to traditional lucky imaging, showing that we win in signal-to-noise at fixed resolution, and win in resolution at fixed signal-to-noise. No smoothing applied! We fit the best band-limited scene to all the data. At Camp Hogg, We Don't Throw Away Data (tm).

[By the way, in case you think a thresher throws away husks, you might be confusing a thresher with a winnower. Indeed, if I ruled the world, I would rename traditional lucky imaging pipelines winnowers.]


The Atlas

I spent a full day on the Sloan Atlas of Galaxies, working on galaxy surface-photometry measurements, sample selection, and plate layout. Damn, those galaxies are beautiful!


Gelman, photons

While the GALEX photons are winging their way to NYC by FedEx (yes, don't underestimate the bandwidth of a station wagon full of hard drives), Schiminovich, Foreman-Mackey, and I took advantage of an opportunity to spend some quality time with Andrew Gelman, stats blogger extraordinaire. We discussed, among other things, the problem of sampling in spaces of high degeneracy. The conversation ended with Gelman requesting that we get specific by writing a density function in C++ so he can fire it into Stan. This relates to our MCMC High Society challenge, though we will start with an easier problem. It also relates to the plan (hatched at CMML at NIPS) to create testbeds or challenge problems that are simple enough, and well enough documented, that they can be shared across disciplines.

In the rest of the day, we did our usual (that is, discuss everything under the Sun, including cosmological tests with Lam Hui, who we found at the coffee shop) and noted that brilliant work by the GALEX team has got us a lot further along our project of modeling the GALEX instrument response, spacecraft pointing model, and astronomical source variability using all the individual (that is, not co-added) photons.


plates, all-sky maps

In the morning, Malyshev (Stanford) and I discussed next-generation projects for modeling all-sky maps from WMAP, Fermi, and so on using data-driven models like HMF. We also talked about priors as regularizers for problems with large numbers of parameters, and the point that if we could model the point sources better, we would do a better job on the diffuse emission. That is, right now Malyshev cuts out the point sources, but those are the detected point sources; his model doesn't take into account the fainter sources too low in S/N to be confidently identified. In the afternoon, I worked on making a first set of trial plates for my Atlas.


SVD, factorization

In unrelated but oddly coincident operations, I spent the day pair-coding a PCA code (yes, I know, I hate PCA; but it sure works well) with Perez-Giz (making use of the scipy SVD) and then listened to a nice seminar by Dmitry Malyshev (Stanford) on a data-driven factorization of the Fermi and WMAP data into spectral and spatial components.
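For the record, PCA via the scipy SVD really is only a few lines; a sketch on fake data (everything here is invented for illustration, not the code we wrote):

```python
import numpy as np
from scipy.linalg import svd

rng = np.random.default_rng(3)

def pca(X, n_components):
    """PCA of the rows of X via the SVD of the mean-subtracted data."""
    mean = X.mean(axis=0)
    U, s, Vt = svd(X - mean, full_matrices=False)
    coeffs = U[:, :n_components] * s[:n_components]   # per-row projections
    return mean, Vt[:n_components], coeffs

# fake data: 200 "spectra", each a noisy mix of two underlying shapes
grid = np.linspace(0.0, 1.0, 50)
shapes = np.vstack([np.sin(2 * np.pi * grid), grid])
X = rng.normal(size=(200, 2)) @ shapes + 0.01 * rng.normal(size=(200, 50))

mean, comps, coeffs = pca(X, 2)
recon = mean + coeffs @ comps
print(np.abs(X - recon).max())        # two components nearly suffice
```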


PanSTARRS and the Tractor

In a final full-day sprint—and it was a full day—Marshall and I got the Tractor running on a set of PanSTARRS cutouts. I am pretty stoked; it is the first third-party full implementation of the Tractor and everything pretty much worked out of the box. It wasn't trivial though. One of the big issues with the Tractor is that (fundamentally) it is an optimization code, and astronomers (even great ones like Lang) shouldn't be writing optimization codes; there are experts who do that. We probably should ingest Ceres. But as Lang points out, there aren't many options for optimization that capitalize on sparsity. In the background, Foreman-Mackey worked on running our post-Lucky stuff on the same PanSTARRS data, so we have a full suite of potential PanSTARRS tools coming along.
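On the sparsity point: for linearized least-squares steps of the kind a fitter like this takes, scipy does ship a sparsity-aware solver. A generic sketch, with made-up dimensions and nothing to do with the actual Tractor internals:

```python
import numpy as np
from scipy.sparse import random as sparse_random
from scipy.sparse.linalg import lsqr

rng = np.random.default_rng(5)

# a sparse design matrix: each "pixel" responds to only a few
# "parameters", as when a pixel sees only a handful of nearby sources
A = sparse_random(2000, 300, density=0.05, format="csr", random_state=5)
x_true = rng.normal(size=300)
b = A @ x_true + 1e-6 * rng.normal(size=2000)

x_hat = lsqr(A, b, atol=1e-10, btol=1e-10)[0]
print(np.abs(x_hat - x_true).max())   # recovered to high accuracy
```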


light deconvolution

The post-Lucky imaging pipeline that Foreman-Mackey and I are writing can be thought of as light deconvolution because it builds a deconvolved model of everything, but not all the way to infinite resolution. Today, the two of us spoke with Marshall and Cato Sandford (NYU) about doing the same on PanSTARRS and CS82 data to make constant-PSF human-viewable visualizations of single and multi-epoch multi-filter data sets. This might be Sandford's first project for this summer. We also worked on code and plotting improvements for our post-Lucky software, or really on the design thereof.


lens finder

Marshall and I spent the day in Princeton with Lang, extending the Tractor to handle gravitationally lensed quasars. We learned a lot about inheritance in object-oriented programming! I also spent a few minutes talking about the Atlas with Jim Gunn (Princeton) and saw a nice colloquium by John Johnson (Caltech). Johnson talked about exoplanet studies, and in particular convinced us that 0.7-m telescopes are cheap and ready to use.


evidence integrals, compiling?

In our weekly meeting, Hou, Goodman, Foreman-Mackey, and I discussed Hou's efforts to evaluate the marginalized likelihood integral. Also in the room was Marshall, visiting for the week from Oxford. Hou's early attempts using a Gaussian trial function and importance sampling are not working very well, so I suggested we make a set of simple functional tests on easier problems, and Goodman suggested that we plug the emcee ensemble sampling operations into some kind of nested sampling code, including possibly Brewer's DNest.
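For reference, the Gaussian-trial importance-sampling estimator is easy to write down, and on a one-dimensional, unimodal toy it behaves; the trouble presumably comes in the high-dimensional, degenerate problems Hou actually faces. A sketch with an analytically checkable answer (all the numbers are invented):

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(7)

# one-parameter toy: prior N(0, 1), likelihood N(d | theta, sigma^2),
# so the evidence has a closed form we can check against
d, sigma = 1.0, 0.5
Z_true = norm.pdf(d, loc=0.0, scale=np.sqrt(1.0 + sigma ** 2))

# Gaussian trial density roughly matched to the posterior
q_mu, q_sd = 0.8, 0.6
theta = rng.normal(q_mu, q_sd, 200000)
w = (norm.pdf(d, theta, sigma) * norm.pdf(theta) /
     norm.pdf(theta, q_mu, q_sd))
Z_hat = w.mean()
print(Z_true, Z_hat)
```

When the posterior is multimodal or badly mismatched to the trial function, the weights develop enormous variance, which is the usual failure mode of this estimator.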

For the rest of the day we worked on getting started on putting real gravitational lens models into the Tractor, but we got stuck on scipy and Astrometry.net compilation issues on Marshall's laptop. Argh! Open-source code is not always easy to use.


gravitational telescopes

Ken Wong (Arizona) came through and we talked about using the SDSS photometrically selected luminous red galaxies to identify lines of sight along which there might be large gravitational magnification, to use as fixed, cosmological telescopes. The default plan is to essentially count LRGs on lines of sight; indeed early tests suggest that will be very successful. Jeremy Tinker (NYU) had some great suggestions, not the least of which was this: The most massive clusters actually have a low fraction of their masses in stars, so you should look not for galaxies, but for collections of galaxies that are cluster-like. I think he is probably right; we want to find lines of sight with multiple overlapping clusters. A few clusters might be more valuable than dozens or even hundreds of LRGs.


galaxy profiles

I worked a bit on my (very technical) paper on galaxy profiles. I greatly improved my code and its output, worked on the text and citations, and made use of the exceedingly useful information in Ciotti & Bertin (1999). One issue that I am avoiding relates to the Sersic profiles: I have mixture-of-Gaussian expansions for several kinds of profiles, but I have fit them all independently. Should I also try to find continuous transformations of the Gaussian amplitudes and variances as a function of Sersic index?


mini-LSST pipeline

At lunch Kilian Walsh (NYU), Foreman-Mackey, and I discussed construction of a mini-LSST pipeline that would work on all the data ever submitted to Astrometry.net. To start, we are going to try to extract information from a small set of images overlapping a small patch of sky. But the long-term plan is to have Astrometry.net base its work not on the USNO-B1.0 catalog but on its own catalog, built from its own submitted data. Each incoming image will be calibrated with the current best information, and then used to adjust that information (based on discrepancies). And, of course, we want the system to make novel discoveries along the way.
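A deliberately silly one-dimensional toy of that calibrate-then-update loop (positions on a line, a single unknown offset per image; every number here is invented):

```python
import numpy as np

rng = np.random.default_rng(11)

# twenty stars on a one-dimensional "sky"; the seed catalog (think
# USNO-B-like) has 0.5-unit errors, and each incoming image has an
# unknown global offset plus small per-star measurement noise
true_pos = rng.uniform(0.0, 100.0, 20)
catalog = true_pos + rng.normal(0.0, 0.5, 20)
n_obs = 1.0

for _ in range(200):                          # the stream of images
    offset = rng.normal(0.0, 5.0)             # unknown image calibration
    meas = true_pos + offset + rng.normal(0.0, 0.1, 20)
    offset_hat = np.mean(meas - catalog)      # calibrate against catalog
    n_obs += 1.0                              # ...then fold back in
    catalog += (meas - offset_hat - catalog) / n_obs

print((catalog - true_pos).std())   # way below the 0.5-unit seed errors
```

The scatter of the catalog shrinks as images flow through, even though no image ever knew its own calibration; an overall zero-point (gauge) freedom remains, as it must.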


writing day

Tuesday is my disappear-and-work day, so I disappeared and worked on my writing projects, including the Atlas, our lucky-imaging replacement paper, and one of the side papers for the Tractor. I also did a little math related to Brewer's sampling ideas of yesterday.


magnetic field, catalog sampling

At brown-bag today, Ronnie Jansson (NYU) gave a very nice talk about his work with Farrar (NYU) on the Milky Way magnetic field. They have made a multi-component model that obeys Maxwell's Equations (yes, that's a good thing) but also is maximum-likelihood against a set of rotation-measure and Stokes Q and U data. They find an X-shaped field for the Milky Way viewed in projection by an external observer. This prediction is pretty hard to test directly (!) but it is consistent with observations of other nearby spirals. The biggest limitation of their work is the understanding of the Galactic high-energy electron density; their methods could be used to constrain both this and the magnetic field simultaneously; it sounds like that is in the plans. Great work.

In the afternoon, Foreman-Mackey and I teleconned with Brewer on our sampling-over-catalogs project. I argued for a descope to a minimal paper that gives a method that works, and leaves all but the simplest applications to subsequent papers. I made this suggestion in part because I want to get things done and in part because it would be hard to describe the sampling over models of varying complexity and non-trivial scientific conclusions in the same abstract. During the call, we also discussed a novel way to do the sampling over varying complexity, and I noticed Brewer fork the code late in the day. The nice thing is that this project could generate some good ideas in statistics (Brewer's home department when he starts his new job this year) as well as astronomy.


precise supernova measurements

During a mental-health day (that is, no research or indeed any work at all), I did come in to see Saurabh Jha (Rutgers) give an excellent seminar on type Ia supernovae as tools for precise cosmological tests. He spoke about this year's Nobel Prize but also lots of technical details about the precision of supernova observations and how that can be verified empirically—that is, without any working theory other than the cosmological principle (which can also be tested, of course!). It is a rare talk that can be so technical and absolutely engrossing at the same time. One of the most interesting ideas in the talk is that observation of even a single type Ia supernova behind a massive galaxy cluster can put a strong and unique constraint on the mass distribution because it directly tests the gravitational magnification. It breaks the mass-sheet degeneracy that remains after shear fitting. That's sweet.


data-driven SNe modeling, plucky imaging

Or Graur (AMNH, Tel Aviv) showed up and he, Perez-Giz, and I discussed possible projects to reformulate and improve the models of supernovae that Graur is using to make discoveries in the SDSS and BOSS spectroscopy. I gave them copies of the HMF paper with Tsalmantza and encouraged them to think about extending the wavelength domain of the models as well as looking for coherent or consistent residuals. This is a nice sandbox for thinking about models that are data-driven but have lots of informative prior information.

On the lucky imaging front—and we need a new name for this project (since we aren't doing lucky imaging; are we doing plucky imaging?)—Bianco sent us all 30,000 frames of data she has on the difficult triple-star system, and we are running them all. The results look nice, but it looks to my eye like the signal-to-noise in the data is dominated by the best-seeing images. That is, traditional lucky imaging (which throws out the vast majority of the data) may not be throwing out as much signal-to-noise as I originally had imagined.


imaging, imaging, imaging, Atlas

After contemplating the wisdom of a Brewer comment on my post of two days ago, I reformulated my issue with resolution vs signal-to-noise: Deconvolution with the true PSF leads to very noisy true scenes (for lots of reasons not limited to the scene being much more informative than the data on which it is based, and small issues with scene representation being corrected by the generation of large, nearly canceling positive and negative fluctuations in the scene). I want to deconvolve with something narrower than the PSF, but which captures its speckly (or multiple-imaging) structure.

I succeeded in formulating that desire in code and it works. The idea is that I fit the PSF for each data image with a mixture of fixed-width Gaussians, but when I use the PSF to deconvolve the image, I use not the mixture of Gaussians but a mixture of delta-functions with the same positions and amplitudes but no widths. That is, I (in some sense) deconvolve the PSF before using the deconvolved PSF to deconvolve the scene. This prevents the code from deconvolving fully, and/or leaves a band-limited scene, and/or leaves the scene well sampled. Not sure if I can justify any of this, but it sure does work well in the (very hard) test case Bianco gave us.
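In case that is cryptic, here is a one-dimensional sketch of the trick, with a fake doubled-speckle PSF and a nonnegative least-squares fit standing in for whatever fitter we actually use (all the parameters are invented):

```python
import numpy as np
from scipy.optimize import nnls

# a fake doubled-speckle PSF on a 1-d pixel grid
x = np.arange(-16.0, 17.0)
psf = (0.7 * np.exp(-0.5 * ((x - 1.5) / 1.2) ** 2) +
       0.3 * np.exp(-0.5 * ((x + 3.0) / 1.2) ** 2))
psf /= psf.sum()

# step 1: fit the PSF as a mixture of FIXED-width Gaussians on a grid
width = 1.0
centers = np.arange(-12.0, 13.0)
basis = np.exp(-0.5 * ((x[:, None] - centers[None, :]) / width) ** 2)
basis /= basis.sum(axis=0)
amps, resid = nnls(basis, psf)        # nonnegative amplitudes

# step 2: keep the positions and amplitudes but drop the widths,
# turning the fitted mixture into a mixture of delta-functions
delta_psf = np.zeros_like(x)
for c, a in zip(centers, amps):
    delta_psf[x == c] += a
print(resid, delta_psf.sum())         # good fit; amplitudes sum to ~1
```

Deconvolving the data with `delta_psf` instead of the full Gaussian mixture leaves the scene smoothed by a width-1 Gaussian, which is exactly the band-limiting described above.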

My despair of yesterday lifted, the signal-to-noise appeared to increase with the amount of data, while the angular resolution of the scene held constant, and I conjectured that when we run on the full set of thousands of images we will get even more signal-to-noise without loss of angular resolution. This is the point: With traditional lucky imaging (or TLI), you shift-and-add the best images. Where you set that cut (best vs non-best) sets the angular resolution and signal-to-noise of your stack; they are inversely related. With the code we now have, I conjecture that we will get the signal-to-noise of the full imaging set but the angular resolution of the best. I hope I am right.

On a related note, Fergus and I talked about one important difference between computer vision and astronomy: In computer vision, the image produced by some method is the result. In astronomy, an image produced by some pipeline is not the result, it is something that is measured to produce the result. This puts very strong constraints on the astronomers: They have to produce images that can be used quantitatively.

I also did some work on the Atlas, both writing (one paragraph a day is the modest goal) and talking with Patel and Mykytyn, my two undergraduate research assistants at NYU.


going backwards

Today was negative progress in all matters. I took good code and borked it, and took good text and worsened it. The only positive moments were spent pair-outlining a Tractor paper with Lang.