linear algebra

I had the great privilege of a long conversation today with applied mathematician Mark Tygert (NYU), who works on, among many other things, linear algebra and statistics. I wanted to discuss with him the various things Lang and I are doing to optimize our billion-parameter model of the sky (or, really, our thousand-parameter model of a tiny sandbox part of the sky). He gave us various pieces of very valuable advice, including:

  • The Python scipy.sparse module for sparse-matrix manipulation is probably using ARPACK, which is also (not coincidentally) the right thing to be using for big problems like ours.
  • When your matrix is sparse, and you want to control its dynamic range by re-scaling rows (or columns) to help with numerics, there is no difference between using the L1-norm, L2-norm, or L-infinity norm. Sparseness makes these (close to) equivalent.
  • If, after normalizing rows, we get the SVD of our derivative matrix, the eigenvalues should be comparable; if this is the case, conjugate-gradient should work well.
  • We probably shouldn't be using anything like Gauss–Newton method; it relies too heavily on the objective function being very very close to parabolic in the parameters. Conjugate gradient is almost certainly better. It is also provably optimal among methods that aren't permitted to invert the matrix of relevance.
  • If we switch to conjugate gradient, we get a new knob, which is the trade-off between taking CJ steps and recomputing derivatives. That is both good and bad.
  • The only other good alternative to CJ is quasi-Newton. If we want to give that a shot, we should use the BFGS method and nothing else.
Tygert also pleased me by saying that if we succeed on our problem with many 1012 pixels and many 109 parameters, the results ought to be publishable in the machine-learning world, not just the astronomical world. I hope that's true. And I hope we succeed.


random beats grid

Today, Holmes and I closed the loop and showed that a multi-epoch survey with random image centers and orientations crushes a multi-epoch survey done on a regular grid for precision of any self-calibration. Next week we plan to make the simulated data more realistic, add imperfections to the model (in real life the model can never exactly capture reality), and find out the scalings of photometric accuracy with density of standard stars, number of epochs, and survey strategy decisions.


pair-coding successes

In a pair-coding session with Lang, he got the Tractor to fit galaxies to multiple overlapping images simultaneously. This problem is a hard one just because the (high-resolution) galaxy properties are fixed in celestial coordinates, while the point-spread function is different for every image and defined in image-pixel coordinates.

In another session with Holmes, we got ubercalibration completely working for the fake Euclid data Holmes has generated in the last few days. This is not a trivial feat, as it involves a thousands-of-parameters non-linear (bi-linear) least-squares fit. Our goal for tomorrow is to compare different deep-field observing strategies in terms of the quality of the inferred flat-field and photometric measurements. If we can get that done by tomorrow, we will be on our way to influencing survey strategy for Euclid and any number of other upcoming imaging projects.


bilinear ubercalibration

Holmes and I worked out the full computational formalism for ubercalibration, working with observed stellar fluxes and an unknown (but smooth) flat-field map. We are not taking the log and linearizing around that; we are doing the bilinear problem, where we estimate true fluxes at fixed flat-field map, and then the flat-field map at fixed true fluxes, iterate to convergence. Holmes is writing documentation (the paper) in parallel with the code so that we have a clear correspondence between the one and the other.


Euclid simulator

Holmes has written a good deal of code in these two days; he has from scratch code that makes a fake sky of infrared stars, and code that takes a fake (Euclid-like) image of that sky, and applies a fake flat-field and applies fake noise. Tomorrow we work on the optimization part: Can we recover the properties of the camera and the sky given the data, and how does this depend on survey strategy?



Rory Holmes (MPIA) showed up today for two weeks of coding and writing. We are asking—for Euclid and in general—how one should design an imaging strategy such that the data will be possible to self-calibrate or uber-calibrate. That is, how should you set the pointings so as to get good field coverage and good measurements (internally) of your system variations with space and time.

The big picture, for me, is this: Astronomers think in terms of taking science data and separately calibration data. The former is used for science and the latter is used to set instrument parameters like flat-field and bias and so-on. But typically you have far more photons in your science data; aren't they incredibly constraining on calibration themselves? Indeed, in the SDSS, we got much more calibration information out of the science data than the calibration data. But, of course, we had to adjust the SDSS imaging strategy to make it work: You need to have good redundancy in the data stream, and redundancy of a very specific kind. That's what we are setting out these two weeks.


photometric redshifts

Back in Seattle, Bovy, Hennawi, Myers, and I discussed the most general extreme deconvolution approach to photometric redshifts, with quasars as the sources of interest. Since it has been a whole week, Bovy has, of course, implemented fully what we had in mind! It performs incredibly well, naturally incorporates GALEX and UKIDSS data where we have them, can be marginalized along directions other than the photometric directions to give fully empirical (hierarchical) prior PDFs over quasar spectral energy distributions, and improves quasar target selection. It is pretty sweet! And XD is so fast now, he was able to optimize the (more than) ten-thousand parameters in an hour or two. Have I mentioned that I love my job?


fitting on the celestial sphere

Lang and I got the Tractor working today in celestial coordinates. Usually source detection, photometry, and morphological fitting is done in images in pixel space (x-y position on the CCD). But if you want to properly combine data from multiple images (and especially images in different bands or with different PSFs), you have to do your fitting simultaneously in both images, and use some world coordinate system to convert positions in one image to positions in another. We gave the Tractor this capability today, in a very one-sided pair-coding session (Lang worked, I chatted).


proposal, Tractor

Rob Fergus and I finished the proposal today. In the process, I think I got him psyched about (some aspects of) astronomy. We vowed to meet next week to talk about cosmic-ray modeling, which is absurdly naive right now, and key to getting good image models. Last night, after I posted here (but for a break from all this proposal-writing), Lang and I discussed the Tractor, our image-modeling vaporware. We figured out a minimal-scope paper to write, and code to release. Tomorrow we hope to get some traction (as it were).



proposal writing

Friday and all weekend was taken up with proposal writing, which is still on-going. A lot of the proposal is about replacing astronomical single-point estimates (that is, arithmetic operations on the data that produce a single estimate of a quantity, like a maximum-likelihood estimate of an object's position) with a probabilistic treatment (that is, a full likelihood function or a posterior PDF over the position).



On the plane home, I tried to sharpen up what the point is for Rob Fergus (NYU) and my attempt to get some funding for research bridging computer vision, computational photography, and astronomy. Interdisciplinarity (is that a word?) is hard: The languages are different, and the assumptions are even more different.

I also read and critiqued this amusing paper but my critique doesn't meet the criteria for inclusion in this exclusive blog.


AAS 217, day three

Over lunch, Bovy, Hennawi, Myers, and I discussed the next projects for the XDQSO project. We figured out how to do XD photometric redshifts, which is an amusing problem on which we might get excellent performance (by my definition of performance). We also discussed variability and the production, from our probabilistic results, a hard-cut catalog.

I went to the Milky Way session in the afternoon, which included very nice talks by Kevin Schlaufmann (UCSC) and, of course, Zolotov. Schlaufmann showed that there is significant substructure in radial-velocity space, and that the overdensities in radial velocity also show chemical-abundance differences from the smooth component. Some of his ECHOS (what he calls the substructures) did not look to me at first glance to be statistically significant on their face, but then they do show chemical-abundance offsets from the field that do, taken as a whole, look highly significant. So I guess Schlaufmann is right and I am wrong!

Zolotov followed with a highly relevant theory talk, showing that you expect chemical-abundance differences between halo stars that fell in and halo stars that were kicked out of an early central bulge or disk. In the question period, it came up that the in-situ stars (as she calls them) should be on nearly-radial orbits. Furthermore, if they are kicked out in a set of short-duration events, there could be very strong non-trivial structure in phase-space for these stars. Gotta think about that, and then convince Zolotov and her team to find the strongest signatures in the simulations.


AAS 217, day two

I gave my talk in a session that—being built around the NASA ADAP funding program—contained lots of nice discussion of engineering and code matters. John Moustakas (UCSD) spoke about PRIMUS, which is so impressive I am very proud to have played the tiny, tiny part I did. Gordon Richards (Drexel) showed that you can't reliably measure the accretion rate of a quasar without a lot of multi-band data; every quasar has oddities that make its optical accretion-rate estimate unreliable at the factor-of-few level. He also showed some nice systematics on the broad emission lines, which might be relevant to (or re-discovered by) my work with Hennawi and Tsalmantza on quasar redshifts.

In the afternoon, I met up with Christopher Stumm (Microsoft), in part to catch up and in part to lobby for writing up (for publication in the astrophysics literature) his integration of Astrometry.net into flickr. By performing this integration, he has built one of the most remarkable citizen-science platforms in existence. And he is a team of one!


talk preparation

On the airplane to Seattle for AAS 217, I prepared my talk slides. It is a 15-minute talk (number 206.04 if you are at the meeting), with the following summary:

  • Catalog matching is not a good way to combine imaging data sets
    • (the Right Thing To DoTM is build a simultaneous model of everything).
  • Forced photometry is a great approximation when one data set is better
    • (one data set is deeper and higher in angular resolution).
    • SDSS has this property relative to GALEX.
  • A typical quasar in SDSS is not clearly detected in GALEX
    • but nonetheless GALEX helps us find them,
    • GALEX photometry at < 5 σ can provide photometric redshifts,
    • and GALEX measures at high signal-to-noise their average properties.
  • We use this framework to measure the extragalactic UV background.
Now will the speaker ready room accept my home-built PDF presentation compiled from LaTeX?


ultraviolet interpolation

Schiminovich came down again and we got a bit more done on the ultraviolet background, working on the distribution of intensity as a function of quasar (source object) redshift. We haven't yet faced the need to K-correct the intensity information we have collected, but we will have to face that soon. We confirm, disturbingly well in some respects, models of the intergalactic ultraviolet intensity as a function of redshift.


UV radiation from quasars

Schiminovich came down for the day and we worked on the ionizing radiation in the Universe from quasars, as a function of redshift. We are making a direct measurement using GALEX, but also using SDSS to tell us where to photometer the GALEX data, which is too shallow to detect many of the quasars of interest. We discussed—and Schiminovich made some progress on incorporating—incompleteness information from (imperfect) quasar selection.


exoplanet radial velocity data

Debra Fischer (Yale) and Matt Giguere (Yale) came down to NYU today to discuss radial velocity data with Hou and me. We discussed many things, but the most interesting to me was the possibility that we could insert improvements into the spectral extraction and fitting pipelines. Even small improvements in these pipelines could lead to large scientific advances. Fischer handed us some data in different states of rawness.


a new Atlas of Galaxies?

I spent the day at Princeton, working with Lang on PHAT and the Tractor. In the afternoon I spent a short period discussing with Dave Spergel (Princeton) and Robert Lupton (Princeton) the possibility of producing a published atlas of galaxies from the SDSS. They were filled with good ideas for making such a book, and making it the de facto replacement of all previous atlases. They also suggested ways to connect to citizen science, which seems like a great opportunity. This all in preparation for AAS, where I hope to make a decision on this: Go or no go?


hierarchical star–galaxy classification

Ross Fadely and I co-wrote a document today laying out the basic hierarchical framework for performing star–galaxy classification in multi-band imaging. It involves a large set of stellar models and a large set of galaxy models. Then a hierarchical approach puts properly normalized and properly informative prior probabilities on all the models given the total data set. Each object then is classified using the prior information inferred from everything, and prior information and classifications are inferred simultaneously. Will it work? I have faith in the Church of Bayes!


the Tractor

Lang and I worked today a bit on generalizing the Tractor, our probabilistically justified source detection system. It runs on SDSS imaging, but we want to make it work on any imaging. We are not close.