jackknife the sky?

On the weekend, Kate Storey-Fisher (NYU) and I implemented jackknife uncertainty estimation for KSF's cosmology-with-Gaia-quasars project. We jackknifed by cutting the sky into RA slices. This is standard practice but I don't love it! It assumes that you know that your main source of error is calibration or sample consistency over the sky. It might be something way more insidious. In principle I guess you should jackknife over many things, and also randomly.
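
For concreteness, here is a minimal sketch of a delete-one-slice jackknife over RA slices. It is not the code KSF and I wrote; the function name `jackknife_ra`, the number of slices, and the fake data are all assumptions made for illustration, and the statistic here is just a mean rather than a clustering measurement.

```python
import numpy as np

def jackknife_ra(ra_deg, values, statistic=np.mean, n_slices=12):
    """Delete-one-slice jackknife over equal-width RA slices (illustrative)."""
    ra_deg = np.asarray(ra_deg) % 360.0
    values = np.asarray(values)
    # assign every object to one of n_slices equal-width slices in RA
    slice_id = np.minimum((ra_deg / (360.0 / n_slices)).astype(int), n_slices - 1)
    # recompute the statistic with each slice left out in turn
    estimates = np.array([statistic(values[slice_id != k]) for k in range(n_slices)])
    # standard jackknife variance of the leave-one-out estimates
    variance = (n_slices - 1) / n_slices * np.sum((estimates - estimates.mean()) ** 2)
    return statistic(values), np.sqrt(variance)

# fake data: uniform in RA, Gaussian "measurements"
rng = np.random.default_rng(17)
ra = rng.uniform(0.0, 360.0, size=6000)
x = rng.normal(loc=1.0, scale=2.0, size=6000)
estimate, sigma = jackknife_ra(ra, x)
```

Swapping the slice assignment for some other partition (scan-history bins, magnitude bins, random subsets) is the "jackknife over many things" idea; only the `slice_id` line would change.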


uncertainty propagation for a neural network

Today Matthias Samland (Stockholm) gave a nice Königstuhl Colloquium at MPIA about direct imaging of exoplanets with high-contrast imaging. He showed some beautiful results from ESO Gravity and from NASA JWST. One of his main take-away points is that the situation is changing fast, and we might achieve very much higher contrast ratios in the near future than we've ever had, and thus get many more planets.

I spent some time late in the day looking at uncertainty propagation for neural networks: Given that you can optimize a NN, and given that it makes good predictions for held-out data, and given that you can take all derivatives of everything with respect to everything, does that mean you can propagate errors or noise from the data to the results? I think the answer is yes in a limited sense: You can see how the output depends on the input at the training step. But what you can't do—and probably will never be able to do—is propagate the uncertainties that come from your training set (the uncertainties in your weights, as it were). And these uncertainties can be very large, especially since the models tend to be enormously over-parameterized, and also contain combinatorially large exact and near-exact degeneracies. (I think maybe the near-exact degeneracies are worse than the exact ones.) I vaguely recall Tom Charnock making strong statements about all these things at Ringberg.
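
The limited sense in which propagation does work can be sketched with linear error propagation through a frozen network. Everything below is a toy under stated assumptions: the weights are random stand-ins for a trained model, the network is a tiny one-hidden-layer tanh network, and the input noise is Gaussian and small. It says nothing about the uncertainty in the weights themselves.

```python
import numpy as np

rng = np.random.default_rng(0)
# hypothetical fixed ("trained") weights for a tiny one-hidden-layer network
W1, b1 = rng.normal(size=(8, 3)), rng.normal(size=8)
W2, b2 = rng.normal(size=(2, 8)), rng.normal(size=2)

def network(x):
    """Map a 3-vector of inputs to a 2-vector of outputs."""
    return W2 @ np.tanh(W1 @ x + b1) + b2

def jacobian(x):
    """Analytic derivative of the outputs with respect to the inputs."""
    h = np.tanh(W1 @ x + b1)
    # chain rule: W2 @ diag(1 - h^2) @ W1
    return W2 @ (W1 * (1.0 - h ** 2)[:, None])

x0 = np.array([0.3, -1.2, 0.8])                   # one input datum
C_in = np.diag(np.array([0.01, 0.02, 0.005]) ** 2)  # its (assumed) noise covariance

J = jacobian(x0)
C_out = J @ C_in @ J.T  # first-order covariance of the network output
```

In practice the Jacobian would come from automatic differentiation rather than a hand-written chain rule; the propagation step `J @ C_in @ J.T` is the same either way.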


do we have the baryon acoustic feature?

Today I posted this tweet (below), which I think explains what happened today! I also gave a talk at MPIA Galaxy Coffee, with Adrian Price-Whelan (Flatiron), about the appearance of stellar parameters in the ESA Gaia XP spectral coefficients.


definition of the pseudoscalar, pseudovector, and pseudotensor

I am fully obsessed with geometry these days. In particular, I am obsessed with the point that scalars aren't just numbers, but rather numbers that don't depend on your choice of coordinate system. Similarly, vectors aren't just things with a magnitude and a direction: They are things with a magnitude and a direction which are coordinate free, or which have a stable direction and magnitude no matter what coordinate system you choose. Thus, for example, the unit vectors defining the x, y, and z directions of your coordinate system are not really vectors at all. But the acceleration due to gravity right here is a vector.

But there are pseudo- quantities. For example, the angular momentum isn't exactly a vector; it is a pseudo-vector: Its magnitude and direction don't depend on the orientation (or translation) of the coordinate system, but its direction does depend on the handedness of the coordinate system. Thus there are pseudoscalars, pseudovectors, and pseudotensors in addition to scalars, vectors, and tensors. Today Soledad Villar (JHU) wrote definitions for these in the paper we are drafting. It isn't trivial, because we want a notation that is agnostic about the group operator and the thing it is operating on.
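
A quick numerical check of the handedness point, as a sketch: this is not the notation from our draft, just the standard identity that for an orthogonal matrix Q, (Qa) x (Qb) = det(Q) Q (a x b). The particular numbers for position and momentum are made up.

```python
import numpy as np

r = np.array([1.0, 2.0, 3.0])    # position (a vector)
p = np.array([-0.5, 0.4, 2.0])   # momentum (a vector)
L = np.cross(r, p)               # angular momentum (a pseudovector)

theta = 0.7
rotation = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                     [np.sin(theta),  np.cos(theta), 0.0],
                     [0.0, 0.0, 1.0]])
reflection = np.diag([1.0, 1.0, -1.0])  # flips the handedness

def transformed_L(Q):
    """Angular momentum computed from the transformed position and momentum."""
    return np.cross(Q @ r, Q @ p)

# under a proper rotation, L transforms like any vector
rot_ok = np.allclose(transformed_L(rotation), rotation @ L)
# under a reflection, the plain vector rule fails...
refl_as_vector = np.allclose(transformed_L(reflection), reflection @ L)
# ...but the rule with an extra factor of det(Q) = -1 holds
refl_as_pseudovector = np.allclose(
    transformed_L(reflection), np.linalg.det(reflection) * reflection @ L
)
```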


quasars and stellar density

Kate Storey-Fisher (NYU) has made really nice random catalogs that look very very similar, in sky coordinates, to the quasars we have. However, there is obviously more exclusion of quasars from the Galactic plane region than we can explain with any reasonable model of how dust is affecting things. It's the stellar density of course: ESA Gaia selection is very sensitive to stellar density, especially (as now) when you are using the XP spectra. Today she included the stellar density in the random-catalog regression and boom: Excellent random! Our model is not mechanistic, it is effective and data-driven.
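
To illustrate what an effective, data-driven regression of this kind might look like, here is a toy sketch. It is not KSF's actual model: I assume a log-linear Poisson model for quasar counts per sky pixel, with standardized dust and stellar-density features, fit by Newton's method. The function name, the features, and the coefficients are all invented for the example.

```python
import numpy as np

def fit_poisson_loglinear(X, counts, n_iter=25):
    """Fit counts ~ Poisson(exp(X @ beta)) by Newton's method.

    Assumes the first column of X is a constant (intercept) column.
    """
    beta = np.zeros(X.shape[1])
    beta[0] = np.log(counts.mean())  # start from a constant density
    for _ in range(n_iter):
        mu = np.exp(X @ beta)
        gradient = X.T @ (counts - mu)
        hessian = X.T @ (X * mu[:, None])
        beta = beta + np.linalg.solve(hessian, gradient)
    return beta

rng = np.random.default_rng(8)
n_pix = 5000
dust = rng.normal(size=n_pix)    # toy standardized extinction per pixel
stars = rng.normal(size=n_pix)   # toy standardized log stellar density per pixel
X = np.column_stack([np.ones(n_pix), dust, stars])

beta_true = np.array([3.0, -0.5, -0.3])  # made-up truth for the fake data
quasar_counts = rng.poisson(np.exp(X @ beta_true))

beta_fit = fit_poisson_loglinear(X, quasar_counts)
expected_density = np.exp(X @ beta_fit)  # per-pixel rate from which to draw randoms
```

Leaving the `stars` column out of `X` is the analog of the dust-only model: the fit still runs, but the residuals correlate with stellar density.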


writing in the discussion section

I spent time in an undisclosed location on the weekend writing in the discussion section of my draft with Megan Bedell (Flatiron) on information theory and extreme precision radial-velocity measurement. I think my language is a bit loose when I write on vacation!


non-separable and generative random catalogs

Standard practice in large-scale structure is to make large-scale structure and cross-correlation measurements using a catalog of tracers (quasars in our case now) with random catalogs taking the role of tracking the selection function. In most cases this random catalog is made by sampling from a model for the angular selection function and, separately, for each object, sampling from a model for the radial selection function (redshift distribution in our case now). But of course the redshift distribution depends, in detail, on the angular selection function (because, for example, some of the angular selection is set by dust extinction). Kate Storey-Fisher (NYU) and I discussed how to capture these issues in the randoms we are building for the ESA Gaia quasar sample we are using. One idea is to give the randoms quasar-like luminosities and to build the random catalog using our causal ideas about how things make it into the catalog.
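
A toy sketch of the generative idea, to show why the result is non-separable. Every ingredient is a made-up stand-in: the luminosity and redshift distributions, the magnitude limit, the two-valued "dust map", and especially `toy_distance_modulus`, which is not a real cosmological calculation.

```python
import numpy as np

rng = np.random.default_rng(5)
n_random = 200_000
mag_limit = 20.0  # hypothetical survey magnitude limit

# toy ingredients; every distribution here is invented for illustration
redshift = rng.uniform(0.5, 3.0, size=n_random)
abs_mag = rng.normal(-25.0, 1.0, size=n_random)  # quasar-like luminosities
dusty = rng.random(n_random) < 0.5               # is this random behind dust?
extinction = np.where(dusty, 1.0, 0.0)           # magnitudes of extinction

def toy_distance_modulus(z):
    """Stand-in for a real cosmological distance modulus."""
    return 5.0 * np.log10(z) + 44.0

# causal selection: a random makes the catalog only if it is bright enough
apparent_mag = abs_mag + toy_distance_modulus(redshift) + extinction
selected = apparent_mag < mag_limit

# the redshift distribution of the surviving randoms depends on sky position
mean_z_clear = redshift[selected & ~dusty].mean()
mean_z_dusty = redshift[selected & dusty].mean()
```

Behind dust, the surviving randoms sit at lower redshift on average, which is exactly the coupling between angular and radial selection that a separable random catalog cannot represent.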


reverse-engineering the quasar sample

Kate Storey-Fisher (NYU) (with some consulting by me and others) has been trying to model the selection function (in the form of an accurate random catalog) for the ESA Gaia DR3 quasar sample (or a cleaned-up version of it). We currently believe that the selection function should depend most on the Gaia scan history, the local stellar crowding, and the interstellar dust. We are finding that scan history is a very subtle (maybe ignorable) effect, and that dust is big. But when we apply the dust correction, the random-catalog features don't look quite like the features in the real data. Today KSF showed (anti-) correlations between the observed quasar density and the stellar density on the sky. Will correcting for this fix our issues? I sure hope so.


refactoring geometry code

I spent the day refactoring the code I wrote (months ago) on geometric (scalar, vector, and tensor) convolution filters for convolutions on geometric (scalar, vector, and tensor) images. I refactored so that all kinds of geometric objects can be operated on transparently. The good thing is that we have provable tests that test our code for correctness (yay properties of groups!), but the bad thing is that the operations aren't trivial to implement correctly for arbitrary dimensions. I learned a lot about numpy.einsum().
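
A small sketch of the flavor of test I mean, using `numpy.einsum()`. This is not our actual code; the function names are invented here, and the check is just the simplest group-property test I can think of: contracting a rotated tensor with a rotated vector should equal rotating the contraction.

```python
import numpy as np

def rotate_vector(Q, v):
    """Apply an orthogonal matrix Q to a vector."""
    return np.einsum("ij,j->i", Q, v)

def rotate_tensor(Q, T):
    """Apply Q to both indices of a rank-2 tensor (equivalent to Q T Q^T)."""
    return np.einsum("ij,kl,jl->ik", Q, Q, T)

def contract(T, v):
    """Contract the second index of T with v."""
    return np.einsum("ij,j->i", T, v)

rng = np.random.default_rng(42)
d = 3  # the same einsum strings work in any dimension
Q, _ = np.linalg.qr(rng.normal(size=(d, d)))  # random orthogonal matrix
T = rng.normal(size=(d, d))
v = rng.normal(size=d)

# equivariance test: transform-then-contract equals contract-then-transform
lhs = contract(rotate_tensor(Q, T), rotate_vector(Q, v))
rhs = rotate_vector(Q, contract(T, v))
equivariant = np.allclose(lhs, rhs)
```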


scope for a paper on convolutional functions

Today Soledad Villar (JHU) and I met to discuss the status of our project generalizing convolutional neural networks to images (or lattices) that contain vectors and tensors, respecting group-action properties. We realized that we have absolutely everything in place and we are only writing (and figure-making) away from having a paper on this. Our big issue is that we don't have an implemented application (we have lots of applications, none implemented). We decided that we would put the idea on the arXiv and then see who wants to implement!


latent-variable model

On the weekend Adrian Price-Whelan and I decided that we have to implement the linear latent-variable model (our name!), which is a generative model for both features and labels, if we are going to do well labeling stars with ESA Gaia XP spectra at low signal-to-noise (faint magnitudes). The reason is: The latent-variable model generates the data, so it deals naturally with the different noise in stars of different brightnesses (or different signal-to-noise). We think this is important! We'll see this coming week.


generative model for XP

Adrian Price-Whelan (Flatiron) and I worked out a form for a possible generative model for the ESA Gaia XP spectra and stellar parameters. The idea of going generative is that it should degrade at low signal-to-noise ratios more sensibly than discriminative models degrade. Discriminative models, as a reminder, find a function of the features that predicts the labels. Generative models find a function of latents that predicts both the features and the labels; labeling becomes an inverse or inference problem.
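
Here is a minimal sketch of labeling as inference, assuming the simplest possible form: features and labels are both linear in the latents, the feature noise is Gaussian with known per-feature variance, and the latents have a unit Gaussian prior. The matrices `A` and `B` are random stand-ins for a trained model, and nothing here is the form APW and I actually wrote down.

```python
import numpy as np

rng = np.random.default_rng(3)
n_latent, n_feat, n_label = 4, 30, 2
A = rng.normal(size=(n_feat, n_latent))   # hypothetical map: latents -> features
B = rng.normal(size=(n_label, n_latent))  # hypothetical map: latents -> labels

def infer_labels(x, sigma, A, B):
    """Posterior-mean labels given noisy features x with per-feature noise sigma.

    Assumes x = A z + noise, y = B z, and a unit Gaussian prior on z.
    """
    Aw = A / sigma[:, None] ** 2                # C^{-1} A for diagonal C
    precision = A.T @ Aw + np.eye(A.shape[1])   # A^T C^{-1} A + I
    z_hat = np.linalg.solve(precision, Aw.T @ x)
    return B @ z_hat

z_true = rng.normal(size=n_latent)
y_true = B @ z_true

# a bright star: small (and heteroskedastic) noise on the features
sigma_bright = 0.01 * rng.uniform(0.5, 2.0, size=n_feat)
x_bright = A @ z_true + sigma_bright * rng.normal(size=n_feat)
y_bright = infer_labels(x_bright, sigma_bright, A, B)

# a very faint star: the features carry almost no information
sigma_faint = np.full(n_feat, 1.0e3)
x_faint = A @ z_true + sigma_faint * rng.normal(size=n_feat)
y_faint = infer_labels(x_faint, sigma_faint, A, B)
```

The bright star recovers its labels; the faint star's labels fall back to the prior mean (zero here) instead of being set by the noise, which is the sensible degradation we are after.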


distances from proper motions

Long, long ago, people used “reduced proper motion” to separate (for example) white dwarfs from main-sequence stars. Good! Now, in the age of ESA Gaia, it is time to do better. Today in Milky Way Group Meeting at MPIA, both Coryn Bailer-Jones (MPIA) and Eleonora Zari (MPIA) told us about projects to use the proper motions (and velocity structure models for the Milky Way) to infer distances. Bailer-Jones is trying to do something very general. Zari is doing something very specific: She cares about OB stars, which are primarily a thin-disk population, so she has a very specific kinematic model for their velocity distribution. She gets really nice results (see Appendix B).
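
As a reminder of what the old trick does, here is a sketch of the classical reduced proper motion, H = m + 5 log10(mu) + 5 with mu in arcsec per year. For fixed absolute magnitude and tangential velocity, H does not depend on distance, which is why it works as a crude luminosity proxy. The example star is made up, and none of this is the method from either talk.

```python
import numpy as np

KMS_PER_ARCSEC_PC_PER_YR = 4.74047  # km/s for 1 arcsec/yr at 1 pc

def apparent_magnitude(abs_mag, distance_pc):
    """Apparent magnitude from absolute magnitude, ignoring extinction."""
    return abs_mag + 5.0 * np.log10(distance_pc) - 5.0

def proper_motion(v_tan_kms, distance_pc):
    """Proper motion in arcsec/yr for a tangential velocity in km/s."""
    return v_tan_kms / (KMS_PER_ARCSEC_PC_PER_YR * distance_pc)

def reduced_proper_motion(app_mag, mu_arcsec_per_yr):
    """Classical reduced proper motion H = m + 5 log10(mu) + 5."""
    return app_mag + 5.0 * np.log10(mu_arcsec_per_yr) + 5.0

# the same (made-up) star placed at three different distances
distances = np.array([50.0, 200.0, 800.0])
abs_mag, v_tan = 12.0, 40.0
H = reduced_proper_motion(
    apparent_magnitude(abs_mag, distances), proper_motion(v_tan, distances)
)
# H depends only on absolute magnitude and tangential velocity
H_expected = abs_mag + 5.0 * np.log10(v_tan / KMS_PER_ARCSEC_PC_PER_YR)
```

Doing better means replacing the implicit "typical velocity" with an explicit model for the velocity distribution, which then turns a proper motion into a probabilistic statement about distance.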


so much Gaia; #renameJWST

I spent the day working on ESA Gaia data, parallel to Kate Storey-Fisher (NYU). She was working on the quasar catalog and the correlation with the CMB convergence maps; I was working on estimating stellar luminosities from low-resolution spectral coefficients. We are too much in the thick of it to report how it's going yet. But stellar luminosities are hard to predict from spectra!

Late in the day I sent this letter to NASA:

I served on the US Astronomy and Astrophysics Advisory Committee (AAAC) for several years; I served on the NASA Spitzer Space Telescope Oversight Committee for many years; I served on the NASA Extragalactic Database User Committee for several years; and my research has been funded by NASA for my entire career (since I was a PhD student in the 1990s). I currently do research with HST, Kepler, TESS, 2MASS, WISE, and WMAP data, and now I'm getting ready for JWST and SPHEREx.

I am writing to say that I think it would serve NASA's interests, and the interests of NASA science especially, to rename JWST. There have been plenty of discussions of the name; it is clear that many scientists (and especially those who are part of the LGBT community or who have concerns for the LGBT community) feel disrespected by the name. I also personally think that the evidence is clear that some of the career activities of James Webb did direct harm to patriotic Americans who were gay.

I want to emphasize, however, that I think the important argument about the name goes beyond the question of any individual historical facts: The LGBT communities are of great importance to all of us. These voices must be heard, and the legitimate concerns must be addressed.

Because NASA is a forward-looking agency, and working towards a more equitable, better world, especially for people working in science and engineering in the US, I think it is time that the spacecraft be renamed. I think this could be done easily and without any trouble; many spacecraft have either officially or effectively changed names at or around first light, two examples that come immediately to mind are WMAP and Spitzer.


Gaia quasar redshift distributions

Today I worked with Kate Storey-Fisher on the ESA Gaia quasar sample. We looked at the redshift distribution as a function of magnitude and as a function of color cuts or selection. Right now we are getting an odd angular correlation function for the sample, which we think is because our random catalog (or completeness model) is wrong in important but subtle ways. Interestingly, there are probably terms coming from both the dust map (extinction) and from the Gaia internal completeness (scanning law) and maybe both contribute enough to change the answer? But it sure looks like dust dominates.


Doppler shifts and radial velocities

I am a big believer that there is a difference between a Doppler shift and a radial velocity. For one, they have different units! For another, the former is measurable (sometimes) and the latter is not (or rarely). But today I agreed with Megan Bedell (Flatiron) that we should write our paper on the subject in terms of the words “radial velocity” and not “Doppler shift”. After all, we are talking to a community with a common language! I spent some time on the train from Vienna to Heidelberg editing the figures for our paper on this subject.


is it ever scientifically conservative to use machine learning?

I gave a talk, “Is machine learning good or bad for science?”, in Vienna today (slides here). I spent a lot of time on the ontology and epistemology of it all. One thing that led to some debate afterwards is my claim (at the end of the talk) that using extremely flexible machine learning methods can be extremely conservative in some cases: If you are modeling a nuisance that possibly interferes with your signal of interest, and you used a very flexible model, you have a strong argument that you tried as hard as you could (in some sense) to dilute your signal of interest with that nuisance. My talk was followed by interesting discussion with many, and a lovely dinner with Viennese (not just Austrian, but Viennese) wine.


Dr Ratzenböck

It was my pleasure to be a part of the PhD committee for Sebastian Ratzenböck (Vienna), who wrote a dissertation in computer science but as applied to astrophysics. He had three advisors, in statistics, in computer science, and in astronomy, and he beautifully bridged the three worlds. His research was on finding members of stellar clusters, and on finding new stellar clusters. He showed (pretty convincingly, I think) that star-forming regions break up into many individual star-forming events with different ages and different kinematics. One of his conclusions is that all star formation happens in clusters or groups! He also made a nice technical advance, which was to build a tool to select clustering hyper-parameters in the space of physical quantities one cares about, instead of in the space of arbitrarily-defined clustering-method parameters. It was a great thesis, a beautiful defense, and a fun time drinking afterwards. Congratulations Dr Ratzenböck!


is machine learning good or bad for science?

I took the train to Vienna today, for a PhD defense and to give a talk about machine learning. My talk is interdisciplinary so I looked at how to generalize my arguments about astrophysics to all of the natural sciences. It turns out that this isn't as easy as I'd like, since it is hard to be specific outside of astrophysics! I'm going to learn a lot getting this talk ready.


Gaia quasar redshifts

Today Kate Storey-Fisher (NYU) arrived in Heidelberg and we worked on improving the ESA Gaia DR3 quasar redshifts, using NASA WISE and IRAS information. We were trying to reproduce experiments performed by Hans-Walter Rix (MPIA) last week. It looks like we can get clean-ish samples, where more than 90 percent of the redshifts are correct (as compared to ground-based spectroscopic surveys), provided that we stick to bright quasars. The question is: Is this good enough for our cosmological goals? We discussed how to empirically evaluate the selection function, which is our next task.


bouncing ball

I worked on the weekend on making a toy problem for testing physics-related machine-learning methods: A ball bouncing off an elastic surface, under gravity. Both the surface and the gravity vector break the symmetry; this problem is not at all invariant with respect to rotation, translation, or boost. And yet the laws of physics can be written in a coordinate-free form. I am trying to figure out whether we can make this distinction usefully in the literature: The distinction between coordinate-free and equivariant. I think they are different concepts, even though the mathematics of them are identical.
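
Here is a sketch of what I mean, not the toy problem code itself. The simulator below is written only in terms of vectors and dot products (gravity vector, surface normal, a point on the surface), so if you transform the initial conditions and the environment vectors together, the trajectory transforms with them. The integrator (symplectic Euler with a mirror bounce) and all the numbers are assumptions chosen for brevity, not accuracy.

```python
import numpy as np

def simulate(x0, v0, g_vec, normal, point, dt=1e-3, n_steps=3000):
    """Ball under gravity g_vec bouncing elastically off the plane through
    `point` with normal `normal`; written using dot products only."""
    x = np.array(x0, dtype=float)
    v = np.array(v0, dtype=float)
    n = normal / np.linalg.norm(normal)
    path = np.empty((n_steps, len(x)))
    for i in range(n_steps):
        v = v + g_vec * dt
        x = x + v * dt
        height = n @ (x - point)
        if height < 0.0:
            # elastic bounce: mirror position and velocity in the surface
            x = x - 2.0 * height * n
            v = v - 2.0 * (v @ n) * n
        path[i] = x
    return path

x0 = np.array([0.0, 0.0, 1.0])
v0 = np.array([0.3, 0.1, 0.0])
g_vec = np.array([0.0, 0.0, -9.8])
normal = np.array([0.0, 0.0, 1.0])
point = np.zeros(3)
path = simulate(x0, v0, g_vec, normal, point)

# transform everything, including gravity and the surface, by an orthogonal Q
rng = np.random.default_rng(11)
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
path_Q = simulate(Q @ x0, Q @ v0, Q @ g_vec, Q @ normal, Q @ point)
same_physics = np.allclose(path_Q, path @ Q.T, atol=1e-8)
```

If you rotate only the ball's initial conditions and leave `g_vec` and `normal` alone, the check fails: the problem is not invariant, even though the law is coordinate-free.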