More Atlas writing. That is all.
One piece of advice I give to thesis-writers is to write the acknowledgements early. If you don't, you will forget people; also, precise wording is valuable, so it is good to let it get re-read many times. Taking my own advice, I started writing the acknowledgements for the Sloan Atlas of Galaxies today. I also had a long conversation with Fergus about next steps on all our projects.
On a packing-and-travel-saturated day, research accomplishment was thin, although on the plane I took the first steps towards putting analytic derivatives into our high-resolution Herschel dust emissivity model. The derivatives should enormously speed optimization, if Fergus and Krishnan are to be believed.
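As a rough illustration of what analytic derivatives mean here, the sketch below writes down a single-pixel modified-blackbody dust model and its gradient, then checks the gradient against finite differences. The parameterization (amplitude, temperature, emissivity index beta), the 250-micron reference frequency, and all function names are assumptions for illustration; this is not the actual Herschel model code.

```python
import math

H = 6.62607015e-34   # Planck constant, J s
K = 1.380649e-23     # Boltzmann constant, J / K
C = 2.99792458e8     # speed of light, m / s
NU0 = C / 250e-6     # reference frequency (250 micron); an arbitrary, assumed choice


def planck(nu, temp):
    """Planck function B_nu(T) in SI units."""
    x = H * nu / (K * temp)
    return 2.0 * H * nu ** 3 / C ** 2 / math.expm1(x)


def model(nu, amp, temp, beta):
    """Modified blackbody: amp * (nu / NU0)**beta * B_nu(temp)."""
    return amp * (nu / NU0) ** beta * planck(nu, temp)


def model_gradient(nu, amp, temp, beta):
    """Analytic derivatives of the model with respect to (amp, temp, beta)."""
    x = H * nu / (K * temp)
    s = model(nu, amp, temp, beta)
    d_amp = s / amp                                       # model is linear in amp
    d_temp = s * x * math.exp(x) / (math.expm1(x) * temp)  # from dB/dT
    d_beta = s * math.log(nu / NU0)                        # from the power law
    return d_amp, d_temp, d_beta


def finite_difference(nu, amp, temp, beta, eps=1e-6):
    """Central finite differences, used only to check the analytic gradient."""
    params = [amp, temp, beta]
    grads = []
    for i in range(3):
        step = eps * abs(params[i])
        hi = list(params)
        lo = list(params)
        hi[i] += step
        lo[i] -= step
        grads.append((model(nu, *hi) - model(nu, *lo)) / (2.0 * step))
    return tuple(grads)
```

The analytic gradient costs about one model evaluation, whereas the finite-difference version costs two evaluations per parameter; that difference is presumably where the speed-up in optimization would come from.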
Today was my last day at MPIA for 2012. I spoke at length with Karin Sandstrom (MPIA) about fitting combined HI, CO, and dust-emission data, and with Sandstrom and Kapala about Kapala's comparison of ultraviolet stars with Herschel dust. I also had a final conversation with Tsalmantza about all the lensing-related work she has been doing; she is still finding more candidate gravitational lenses and has prepared a long document for publication or follow-up, filled with candidates. Expect them to hit my ideas blog any day now! I am not getting any traction on my Atlas, which is what I have to finish in the next two months! Don't distract me, people!
After a long hiatus, Lang and I resumed our quasi-regular Skype calls to discuss The Tractor and publications thereof. We sure are taking a long time to deliver! Lang proposed a brilliant de-scope that permits a paper to be written in short order without much additional engineering beyond what we have already done: Let's write a paper in which we synthesize (make model images for) every SDSS field imaged to date, in all five bands, and do investigations on the residuals. Investigations can include goodness-of-fit-like statistics but also overlap of the residuals with derivatives of the model with respect to various parameters (like object positions, fluxes, and shapes). The idea would be to use the official SDSS catalog and calibration parameters (astrometry, PSF, photometric zeropoints) without analysis, so that the synthesis would be a full-survey vet of the full system. The fundamental assumption is that the catalog is a set of model parameters that quantitatively describe the data. That would be fun, and show that we can do something, at least, "at scale". I also had interesting conversations with Tom Robitaille (MPIA) about astropy and the future of data analysis, and Kapala about ultraviolet HST data and dust.
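To make the residual investigations a bit more concrete, here is a minimal sketch on a one-dimensional toy image. It computes a chi-squared statistic and the inverse-variance-weighted overlap of the residual with a derivative of the model. The Gaussian profile, the function names, and the toy fluxes are all assumptions for illustration; nothing here is The Tractor's API.

```python
import math


def gaussian_profile(n, center, sigma):
    """Unit-flux 1-D Gaussian sampled at integer pixels (a stand-in for a PSF-convolved source)."""
    norm = sigma * math.sqrt(2.0 * math.pi)
    return [math.exp(-0.5 * ((i - center) / sigma) ** 2) / norm for i in range(n)]


def chi_squared(data, model, ivar):
    """Goodness-of-fit-like statistic: inverse-variance-weighted squared residual."""
    return sum(w * (d - m) ** 2 for d, m, w in zip(data, model, ivar))


def derivative_overlap(data, model, dmodel, ivar):
    """Overlap of the residual with a model derivative, and the implied linearized update.

    dmodel is the derivative of the model image with respect to one parameter.
    The second return value is the least-squares change in that parameter
    that the residual is asking for, to first order.
    """
    overlap = sum(w * (d - m) * g for d, m, g, w in zip(data, model, dmodel, ivar))
    norm = sum(w * g * g for g, w in zip(dmodel, ivar))
    return overlap, overlap / norm
```

For a parameter in which the model is linear (a flux, say), the normalized overlap is exactly the flux error in the catalog entry; for positions and shapes it is only a first-order estimate.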
So many papers are so close to being done and yet...! Today Holmes and I gave a final push on responding to the referee on our self-cal paper. We just have to let all the code run through one more time, update the final numbers, and submit (I very much hope). The flip side of all the work is that the paper is very much improved, thanks to a good referee and many comments from the community.
On a low-research day I prepared and gave a short coffee talk at MPIA about our work on the insane-robot censored-data project (with Richards, Long, Foreman-Mackey, and Bloom). After me Ronald Läsker (MPIA) spoke about bulge-disk fitting of two-dimensional galaxy images and the insanity of it all; what you get for the bulge mass is a very strong function of how many components you include and which ones you think of as being "bulge". He finds that adding more galaxy components can increase or decrease the inferred bulge mass by factors of a few in typical cases. So unless someone has a full kinematic decomposition, don't believe a bulge-mass uncertainty smaller than a factor of two! In general, the problem with bulge measuring is in the interpretation of fits that are otherwise extremely valuable for photometry and other purposes; if we just think of the amplitudes of the components as uninterpretable latent parameters, they don't cause any trouble at all, and you still get good colors and magnitudes.
I had a long conversation with Joe Hennawi and Beta Lusso (MPIA) about Lusso's very nice fitting of multi-wavelength photometry of AGN with a combined stars plus disk plus torus plus cold dust model. She has a few to hundreds of templates for each of the four components, so she is fitting tens of millions of qualitatively different models. I think the performance of her fitting could be improved by learning priors on the templates much as we did for our hierarchical Bayesian star–galaxy classification project. Of course it would be very computationally expensive and it might not help with her core goals, so I wasn't advocating it strongly.
However, I do believe that if you have enormous numbers of templates or archetypes or models for some phenomenon and you have many data points or real examples, you really have to use hierarchical methods to control the complexity of the problem: There is no way that all of your models are equally plausible a priori, and the best way to set their relative plausibilities is to use the data you have.
This also resolves an age-old problem: How do you decide how many templates or archetypes to include? The answer is: Include them all. The hierarchical inference will take care of down-weighting (even zeroing-out, in our experience) the less useful ones and up-weighting the more useful ones.
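One simple way to see the down-weighting in action is to learn the prior weights on the templates by maximizing the marginalized likelihood with expectation-maximization. The sketch below is a toy version under stated assumptions: it presumes you have already computed like[n][k], the likelihood of object n under template k (marginalized over any per-object nuisance parameters), and the function name is hypothetical. It is not the code from the star–galaxy project.

```python
def learn_template_weights(like, n_iter=200):
    """Maximum-marginalized-likelihood prior weights for a set of templates.

    like[n][k] is the (assumed precomputed) likelihood of object n under
    template k. Returns weights w_k that maximize prod_n sum_k w_k like[n][k],
    found by expectation-maximization.
    """
    n_obj = len(like)
    n_tmpl = len(like[0])
    weights = [1.0 / n_tmpl] * n_tmpl  # start with all templates equally plausible
    for _ in range(n_iter):
        totals = [0.0] * n_tmpl
        for row in like:
            # E-step: posterior responsibility of each template for this object
            evidence = sum(w * l for w, l in zip(weights, row))
            for k in range(n_tmpl):
                totals[k] += weights[k] * row[k] / evidence
        # M-step: the new weight is the mean responsibility over all objects
        weights = [t / n_obj for t in totals]
    return weights
```

A template that never explains any object well sees its weight shrink geometrically toward zero, which is the "zeroing-out" behavior described above.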
This whole thing is pretty deep and I can write or talk about it for hours: In real fitting situations with real data, it is always the case that your model is both too flexible and not flexible enough. It is too flexible because your model space permits fitting data that you could never possibly see for lots of reasons fundamental and incidental. It is not flexible enough because in fact the models are always wrong in subtle and not-so-subtle ways. I have a growing idea that we can somehow solve both of these problems at once with a "flood with archetypes, mop up with hierarchical inference" approach. More on this over the next few years!
I wrote some code to re-compute the GALEX spacecraft orientation given a first guess. My code just looks for the linear combination of attitude components that makes a chosen compact astronomical source compact on the celestial sphere. It seems to work on fake data; the plan for this week is to incorporate it into our code that plots the full photon time history of any GALEX source. Then it will permit that code to output probabilities (or odds), one for each photon, that the photon was generated by the smooth sky background as opposed to the chosen compact source, under assumptions about source variability and other things.
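The per-photon probabilities at the end of that plan can be sketched as a two-component mixture. The version below assumes a spatially uniform background rate, a non-variable source, and a circular Gaussian point-spread function; those assumptions and all the names are illustrative, not the actual GALEX photon code.

```python
import math


def gaussian_psf(dx, dy, sigma):
    """Circular Gaussian PSF (per unit area), evaluated at offset (dx, dy) from the source."""
    return math.exp(-0.5 * (dx * dx + dy * dy) / sigma ** 2) / (2.0 * math.pi * sigma ** 2)


def background_probability(dx, dy, source_rate, background_rate, sigma):
    """Probability that a photon detected at offset (dx, dy) came from the background.

    source_rate is the total photon rate of the compact source;
    background_rate is the photon rate per unit area of the smooth sky.
    """
    source_density = source_rate * gaussian_psf(dx, dy, sigma)
    return background_rate / (background_rate + source_density)


def background_odds(dx, dy, source_rate, background_rate, sigma):
    """Odds (background : source) for a single photon."""
    p = background_probability(dx, dy, source_rate, background_rate, sigma)
    return p / (1.0 - p)
```

A variable source would make source_rate a function of photon arrival time, which is where the assumptions about variability enter.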
After great conversations in Valchava, visions of sugar-plums and spike-slab priors have been dancing in my head. So I spent today (and much of the past weekend) writing furiously a couple of documents. The first is a white-paper regarding a justified probabilistic approach to radio interferometry data. This captures things my loyal reader has been reading about for the last few weeks, plus many new ideas coming from Valchava.
The second document is a white-paper regarding the possibility of calibrating the pixel-level flat-field map (that is, many millions of parameters) for an imaging camera using the science data alone (that is, no sky or dome flats). This latter project would work by looking at consistency among neighboring pixels given (and this requires assumptions) an observing program that observes without trying to put particular science objects in particular locations. One of the things I love about this project is that sky flats can be very misleading: The smooth sky illuminates the detector differently than a star does, because there are often unbaffled stray light paths and reflection paths. I have been thinking about these issues because they might be applicable to the Euclid mission, which currently plans to rely on internal LED illuminators for calibration information.
I am not sure why I write these white papers. They are a huge amount of work, and although they are great for organizing and recording my ideas, almost none of them ever results in a real publication.
Today was radio-interferometry day. Rix and I explained our issues with CLEAN, and we discussed with the computer vision types. Fergus described CLEAN as a "conservative greedy algorithm", which is very appropriate. In the end we had some good ideas about how to attack the problem, with agreement from everyone. The only point of concern was between those (Harmeling representing) who thought we might be able to proceed without strong priors and those (Fergus and Schuler representing) who thought that strong priors would be necessary to deal with the spatial frequencies at which there is no u-v coverage. At the end of the day I was assigned the task of writing down what we learned and agreed upon. I started on that.
One interesting thing about CLEAN is that it averages visibilities on a grid in u-v so that it can use the fast Fourier transform. This is a great idea, but makes forward modeling harder. We are hoping we can stay fast without this trick. In our vision (codenamed NAELC) we never transform in that direction anyway.
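Forward modeling without the gridding trick just means evaluating the Fourier sum directly at each measured (u, v) point. Here is a minimal sketch for a scene made of point sources; the sign convention, the units (u, v in wavelengths; direction cosines l, m in radians), and the function name are assumptions for illustration, and nothing here is NAELC code.

```python
import cmath
import math


def predict_visibilities(sources, baselines):
    """Directly evaluate model visibilities for a point-source scene.

    sources: list of (flux, l, m), with l and m small offsets from the phase center.
    baselines: list of (u, v) in wavelengths, at their true ungridded locations.
    Cost is O(n_sources * n_baselines); no FFT and no u-v binning.
    """
    visibilities = []
    for u, v in baselines:
        total = 0j
        for flux, l, m in sources:
            total += flux * cmath.exp(-2j * math.pi * (u * l + v * m))
        visibilities.append(total)
    return visibilities
```

Because the model is evaluated exactly where the data were taken, the residuals can be formed in visibility space with the native noise properties of the data.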
Rix arrived in Valchava today, and (after a morning hike) proceeded to talk about making high-resolution models of multi-resolution (or heterogeneous) multi-spectral data, with the example being our dust maps made from Herschel data. One thing I learned in the discussion is that we maybe should be using L-BFGS-B optimization instead of the hacky Levenberg-Marquardt-like thing we are doing now. I was surprised, since Lev-Mar-like methods know that the problem is least-squares-like, whereas L-BFGS-B won't, but the experts (especially Dilip Krishnan) advised strongly.
After another morning hack session where the only real progress was getting spectral classification information for Muandet, Rob Fergus and I talked about astronomy challenges relevant to the computer vision types. Fergus talked about our project on high dynamic-range imaging, where the problem is to make a data-driven model of an extremely complicated and time-variable point-spread function when we have very few samples. This wouldn't be hard if we didn't care about finding exoplanets that are much much fainter than the speckley dots in the PSF. The advantage we have with the P1640 data we have been using is that the speckles (roughly) expand away from the optical axis with wavelength, whereas faint companions are fixed in position.
I talked about what I see as three themes in astronomy where astronomers need help: (1) In large Bayesian inferences, we need methods for finding, describing, and marginalizing probability distributions. (2) We have many problems where we need to eventually combine supervised and unsupervised methods, because we have good theories but we can also see that they fail in some circumstances or respects (think: stellar spectral models). (3) The training data we use to learn about some kind of objects (think: quasars) is never the same in signal-to-noise, distance, brightness, redshift, spectral coverage, or angular resolution as the test data on which we want to run our classifications or models next. That is, we can't use purely discriminative models; we need generative models. That's what XDQSO did so well.
On the last point, Schölkopf corrected me: He thinks that what I need are causal models, which are often—but not always—also generative. I am not sure I agree but the point is interesting.
In the morning, Schuler, Harmeling, Hirsch, Schölkopf, Hormuth, and I started the necessary work to convert some of Schuler and Hirsch's blind deconvolution code over into a package that could take an arbitrary astronomical image and return a spatially varying point-spread function. We also, on the side, struggled with downloading and compiling Astrometry.net code. Schölkopf was surprised to hear that we don't have funding for that project!
In the afternoon, Dilip Krishnan (NYU) talked about his work with Fergus to remove rain and other kinds of complex, heterogeneous occlusions from photographs. They are using a discriminative (rather than a generative) approach, which in this case means that they train with data as close as possible to the data that they want to fix. That sets up some ideas for my talk tomorrow. There was also discussion of his and Fergus's very clever priors on image gradients in natural images: There are some 0.8 exponents that amuse me.
Felix Hormuth (MPIA) talked about the lucky imaging camera AstraLux, which he built; it has been very successful and was incredibly cheap to build. He showed the data and gave the crowd some ideas about how astronomical seeing arises. He also showed some data sets that do not properly reduce under the standard, vanilla lucky-imaging-style pipelines. He distributed data to the interested parties.
Sam Hasinoff (Google) continued the craziness of yesterday, talking about inferring a scene from the shadows and reflections it creates in illuminated objects. Lambertian objects (objects that don't produce specular reflections) are not very informative when they are convex, but become valuable when their shapes get more complex. He showed some very nice results reconstructing Mars from single-pixel photometry data, which are very related to recent exoplanet studies with Spitzer.
Krik Muandet (Tübingen) showed "support measure machines", which are a generalization of SVMs attempting to deal with the idea that the data might in fact be probability distributions rather than samples. He is trying to improve upon XDQSO target selection. Tomorrow I am going to show him how to test his results using the just-released SDSS-III DR9 data.
Today was the start of a computer-vision-meets-astronomy workshop convened by Michael Hirsch and Bernhard Schölkopf in Valchava, Switzerland. The talks are informal, discussion ample, and we plan to spend the mornings hacking on problems of mutual interest. I came down from Heidelberg with Felix Hormuth (MPIA), who brought a few hard drives of lucky-imaging data for us to play with.
Bill Freeman (MIT) kicked things off by talking about his project to make an image of the Earth, using Earthshine on the Moon as his data source. The idea is that the details of the shadows on craters should encode (low-resolution) information about the image of the Earth. Crazy! It is like an inverted or messed-up pinhole camera, and he showed some outrageously high-performance prior work using occulters in otherwise open spaces to do imaging (like video of people walking around in rooms compared with reference images telling you what scene is outside the window of the room!). He also showed some work exaggerating or amplifying tiny differences in images used to extrapolate slow processes in time; I recommended he look at HST imaging of things like V838 Monocerotis to make data-driven predictions of the pre-explosion and future state.
Christian Schuler (Tübingen) showed incredible results on blind and non-blind deconvolution of images with nasty or complicated point-spread functions. Schölkopf and I got interested in applying his methods to astronomical data, particularly his methods for determining the PSF blind; we plan to hack on this tomorrow morning.
Mykytyn continued to work on our measurements of huge galaxies, after a lot of fiddling around with various kinds of galaxy radii. He also figured out that we can measure M51a no problem, and M51b will probably work too, so we can do some pretty big galaxies with The Tractor.
While he was shepherding fits, I was building a data-driven model of the PSF from the Fizeau interferometer called LMIRcam on LBT (thanks to Andy Skemer of Arizona). Here's an example (below). The left is the data, and the right is my model. It looks just like a textbook example of an optical interferometry PSF (which up to now have only appeared in textbooks, never in the readout data from real instruments)!
In between talks by KG Lee (MPIA), Bovy, and Farrar (NYU), Mykytyn and I continued to work on the Atlas measurements, and Kapala and I looked some more into the relationship between Herschel dust maps and PHAT extinction measurements on luminous stars. The latter is quite a mess and we can't figure out what parts of the mess are problems with the star fitting or real variance in the extinction, and what parts of the mess are problems with the star fitting or real correlations between star and dust properties. We discussed with Groves.
The radio world uses CLEAN (and an infinite variety of alternatives and modifications thereto) for building radio interferometry maps. Today I met with Fabian Walter (MPIA), Frank Bigiel (Heidelberg), Tom Herbst (MPIA), and Rix to discuss probabilistic replacements. While we don't have a plan, we were able to come up with the following reasons that CLEAN needs improvement (these all apply to vanilla CLEAN; there are non-vanilla versions that fix some of these issues but not all, to my knowledge): (1) It is some kind of optimizer, but a heuristic optimizer (that is, there are optimizers out there with better properties) and it is not optimizing a well-specified (let alone justified) scalar objective. (2) It requires heuristic decision-making about stopping. (3) The noise in the final map it makes can only be estimated by taking the rms in "empty" regions; it has no error or noise model and doesn't propagate uncertainties from the fundamental visibilities (let alone calibration uncertainties and so on). (4) The final map it makes has some flux convolved with the clean beam, and some fainter flux convolved with the dirty beam; this means that there is no well-defined point-spread function and the absolute calibration of the map is ill-defined. (5) It provides no mechanism for producing or quantitatively comparing alternative maps; that is, it produces only a point estimate, no judgement of that estimate or sampling around that estimate. (6) It requires binning of the visibility data in the u-v plane. (7) There is no mechanism by which the map it produces is standardly or by convention projected back into the space of visibilities and compared to the raw data to vet or confirm any fundamental or even heuristic noise model.
In addition to all this I could add things about the scene model and priors on the scene model and so on, but these aren't really the problem of CLEAN itself. In other news, Mykytyn continued working on getting large numbers of RC3 galaxies measured by The Tractor and even started writing up some of his work.
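For readers from outside the radio world, vanilla (Högbom-style) CLEAN can be sketched in one dimension as follows. This is a toy under stated assumptions (a dirty beam with unit peak at its central element, a fixed loop gain, a hand-chosen stopping threshold) and not any production implementation; the function name and default values are hypothetical.

```python
def hogbom_clean(dirty, beam, gain=0.1, threshold=1e-3, max_iter=10000):
    """Toy 1-D Hogbom-style CLEAN.

    dirty: the dirty image (a list of floats).
    beam: the dirty beam, with its unit peak at index len(beam) // 2.
    Returns (components, residual): the point-source model and what is left over.
    """
    n = len(dirty)
    center = len(beam) // 2
    residual = list(dirty)
    components = [0.0] * n
    for _ in range(max_iter):
        # Greedy step: locate the brightest pixel in the current residual.
        peak = max(range(n), key=lambda i: abs(residual[i]))
        # Heuristic stopping rule: quit when the peak falls below the threshold.
        if abs(residual[peak]) < threshold:
            break
        # Conservative step: move only a fraction (gain) of the peak into the model.
        step = gain * residual[peak]
        components[peak] += step
        # Subtract the dirty beam, shifted to the peak and scaled by the step.
        for i in range(n):
            j = i - peak + center
            if 0 <= j < len(beam):
                residual[i] -= step * beam[j]
    return components, residual
```

The gain and threshold arguments are exactly the heuristic knobs in complaints (1) and (2) of the list, and the return value is a single point estimate with no uncertainty attached, as in complaint (5).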
As much of a critic as I am of made-up statistics, Mykytyn and I spent part of the day measuring concentration parameters on our models of (angularly) large galaxies in the SDSS. We needed to do something like this because our galaxy models are not simple enough for their radial profiles to be described by any simple combination of fundamental parameters. We are working on fitting models to every galaxy (from any catalog) bigger than 2 arcminutes.
I worked out a graphical model for the GALEX photon project, and it reminded me that in order to operate on the photons, we need a spacecraft attitude model, a focal-plane mapping model (WCS), a point-spread function model, a model of the focal-plane sensitivity map, and a background photon rate model. With all these in place we can do any time-domain projects. I also thought a tiny bit about how we might release these functions to the public when we release the data. In the meantime, Greenberg and Gertler are working on approximations, and re-discovering our transiting sources (which we still have to write up, Schiminovich!). In other news, David Mykytyn arrived in Heidelberg today for a week-long sprint on the Atlas.
It's true, folks: The dust that radiates in the infrared is the same as the dust that absorbs in the visible and ultraviolet! Some may say this has been known for decades, but I successfully confirmed it today by comparing the dust models I can make for multi-band Herschel data on the M31 disk with extinction maps made from HST-imaging visible and UV stellar colors by Karl Gordon and Julianne Dalcanton of the PHAT Collaboration. My comparison is purely qualitative at present but I hope in the next week or two to make it quantitative. I want to find the combination of emission-inferred dust column, temperature, and emissivity parameter that best predicts the visible extinction; I then hope to find that it is only the column that is heavily involved in the prediction. That would be a great success of my (completely trivial, null) dust absorber and emitter model. I discussed all this today in one of the MPIA lounges with Ben Weiner (Arizona), Brent Groves (MPIA), Karin Sandstrom (MPIA), and Rix, with cameos by Thomas Henning (MPIA) and Tom Herbst (MPIA).
In MPIA Galaxy Coffee, I talked about our project to liberate all the photons GALEX has ever seen. I got lots of good comments and questions. After me, Decarli (MPIA) talked about one of the odder binary-quasar candidates that Tsalmantza and I found. The system is so odd that Rix commented that we should send it to Halton Arp: It looks like a galaxy ejecting a quasar or vice versa! As I have commented here before, our search for binary quasars has shown how rare they are: There really are no clear examples of two quasars orbiting each other at thousands (rather than hundreds) of km/s or faster. Maybe there shouldn't be such objects, but it is amazing that there really are none among the tens of thousands we have searched. In the afternoon, Weisz and I worked on some of his PDMF-fitting code.
In a low-research day, Ben Weiner (Arizona) showed me his impressive redshift survey using HST in objective-grism mode. The data are beautiful and the extracted spectra provide redshifts far fainter than anything I have ever seen.