2022-07-21

non-separable and generative random catalogs

Standard practice in large-scale structure is to make clustering and cross-correlation measurements using a catalog of tracers (quasars in our case now), with random catalogs taking the role of tracking the selection function. In most cases this random catalog is made by sampling from a model for the angular selection function and, separately, for each object, sampling from a model for the radial selection function (the redshift distribution in our case now). But of course the redshift distribution depends, in detail, on the angular selection function (because, for example, some of the angular selection is set by dust extinction). Kate Storey-Fisher (NYU) and I discussed how to capture these issues in the random catalog we are building for the ESA Gaia quasar sample we are using. One idea is to give the randoms quasar-like luminosities and build the random catalog using our causal ideas about how things make it into the catalog.
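
A minimal toy sketch of that last idea, not the catalog code itself: each random gets a sky pixel, a redshift, and a magnitude, is dimmed by the dust in its pixel, and is kept only if it passes a magnitude cut. The function name, the parent redshift distribution, the magnitude-redshift relation, and the limit of 20 mag are all made-up assumptions for illustration.

```python
import numpy as np

def generative_randoms(n_draw, extinction, mag_limit=20.0, rng=None):
    """Toy generative random catalog (all numbers are illustrative assumptions).

    extinction : 1-D array of per-pixel extinction in magnitudes (hypothetical map).
    Returns the pixel indices and redshifts of the randoms that pass the cut.
    """
    rng = np.random.default_rng(rng)
    extinction = np.asarray(extinction, dtype=float)
    # Draw sky positions uniformly over pixels (assumes equal-area pixels).
    pix = rng.integers(0, len(extinction), size=n_draw)
    # Draw redshifts from a made-up parent distribution.
    z = rng.gamma(shape=3.0, scale=0.5, size=n_draw)
    # Made-up apparent magnitudes: fainter at higher redshift, with scatter.
    mag = 18.0 + 1.0 * z + rng.normal(0.0, 0.7, size=n_draw)
    # Dust dims each random according to where it landed on the sky.
    observed = mag + extinction[pix]
    keep = observed < mag_limit
    return pix[keep], z[keep]
```

In this toy, a dusty pixel ends up with fewer randoms and with a redshift distribution skewed towards low redshift, so the angular and radial parts of the selection are not separable by construction.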

2022-07-20

reverse-engineering the quasar sample

Kate Storey-Fisher (NYU) (with some consulting by me and others) has been trying to model the selection function (in the form of an accurate random catalog) for the ESA Gaia DR3 quasar sample (or a cleaned-up version of it). We currently believe that the selection function should depend most on the Gaia scan history, the local stellar crowding, and the interstellar dust. We are finding that scan history is a very subtle (maybe ignorable) effect, and that dust is big. But when we apply the dust correction, the random-catalog features don't look quite like the features in the real data. Today KSF showed (anti-) correlations between the observed quasar density and the stellar density on the sky. Will correcting for this fix our issues? I sure hope so.

2022-07-19

refactoring geometry code

I spent the day refactoring the code I wrote (months ago) on geometric (scalar, vector, and tensor) convolution filters for convolutions on geometric (scalar, vector, and tensor) images. I refactored so that all kinds of geometric objects can be operated on transparently. The good thing is that we have provable tests that test our code for correctness (yay properties of groups!), but the bad thing is that the operations aren't trivial to implement correctly for arbitrary dimensions. I learned a lot about numpy.einsum().
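
For a flavor of what such group-property tests can look like, here is a small sketch (not the refactored code itself; the helper names rotate, contract, and outer are hypothetical). It uses numpy.einsum() for a tensor-vector contraction and an outer product, plus a rank-agnostic rotation built from numpy.tensordot() and numpy.moveaxis().

```python
import numpy as np

def rotate(obj, R):
    """Apply the matrix R to every index of a rank-k geometric object."""
    obj = np.asarray(obj, dtype=float)
    for axis in range(obj.ndim):
        # Contract R with one axis, then move the new axis back into place.
        obj = np.moveaxis(np.tensordot(R, obj, axes=(1, axis)), 0, axis)
    return obj

def contract(tensor, vector):
    """Contract a rank-2 tensor with a vector, giving a vector."""
    return np.einsum("ij,j->i", tensor, vector)

def outer(a, b):
    """Outer product of two vectors, giving a rank-2 tensor."""
    return np.einsum("i,j->ij", a, b)
```

The property to check is equivariance: rotating the inputs and then applying the operation should give the same answer as applying the operation and then rotating the output, for example contract(rotate(T, R), rotate(v, R)) against rotate(contract(T, v), R).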

2022-07-18

scope for a paper on convolutional functions

Today Soledad Villar (JHU) and I met to discuss the status of our project generalizing convolutional neural networks to images (or lattices) that contain vectors and tensors, respecting group-action properties. We realized that we have absolutely everything in place and we are only writing (and figure-making) away from having a paper on this. Our big issue is that we don't have an implemented application (we have lots of applications, none implemented). We decided that we would put the idea on the arXiv and then see who wants to implement!

2022-07-17

latent-variable model

On the weekend Adrian Price-Whelan and I decided that we have to implement the linear latent-variable model (our name!), which is a generative model for both features and labels, if we are going to do well labeling stars with ESA Gaia XP spectra at low signal-to-noise (faint magnitudes). The reason is: The latent-variable model generates the data, so it deals naturally with the different noise in stars of different brightnesses (or different signal-to-noise). We think this is important! We'll see this coming week.

2022-07-13

generative model for XP

Adrian Price-Whelan (Flatiron) and I worked out a form for a possible generative model for the ESA Gaia XP spectra and stellar parameters. The idea of going generative is that it should degrade at low signal-to-noise ratios more sensibly than discriminative models degrade. Discriminative models, as a reminder, find a function of the features that predicts the labels. Generative models find a function of latents that predicts both the features and the labels; labeling becomes an inverse or inference problem.
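
To make "labeling becomes an inference problem" concrete, here is a minimal sketch under strong assumptions that are not necessarily the form worked out above: the model is linear (features = A z + noise, labels = B z), the latents have a unit Gaussian prior, the per-feature noise is Gaussian with known standard deviation, and the matrices A and B have already been learned. The function name infer_labels is hypothetical.

```python
import numpy as np

def infer_labels(x, sigma, A, B):
    """Posterior-mean labels for one star under an assumed linear Gaussian model.

    x     : (n_features,) observed features
    sigma : (n_features,) noise standard deviations for this star
    A     : (n_features, n_latents) latents-to-features matrix (assumed known)
    B     : (n_labels, n_latents) latents-to-labels matrix (assumed known)
    """
    inv_var = 1.0 / np.asarray(sigma, dtype=float) ** 2
    # Posterior precision of the latents: prior (identity) plus data term.
    precision = np.eye(A.shape[1]) + A.T @ (inv_var[:, None] * A)
    # Posterior mean of the latents given the noisy features.
    z_mean = np.linalg.solve(precision, A.T @ (inv_var * x))
    return B @ z_mean
```

In this sketch the noise enters per star: with tiny sigma the labels are set by the data, and with huge sigma they relax towards the prior rather than towards something arbitrary.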

2022-07-12

distances from proper motions

Long, long ago, people used “reduced proper motion” to separate (for example) white dwarfs from main-sequence stars. Good! Now, in the age of ESA Gaia, it is time to do better. Today in Milky Way Group Meeting at MPIA, both Coryn Bailer-Jones (MPIA) and Eleonora Zari (MPIA) told us about projects to use the proper motions (and velocity structure models for the Milky Way) to infer distances. Bailer-Jones is trying to do something very general. Zari is doing something very specific: She cares about OB stars, which are primarily a thin-disk population, so she has a very specific kinematic model for their velocity distribution. She gets really nice results (see Appendix B).
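
As a reminder of the old trick only (this is not either of the projects above): the reduced proper motion is H = m + 5 log10(mu) + 5 with mu in arcsec per year, which equals the absolute magnitude plus 5 log10 of the tangential velocity in km/s divided by 4.74047, so the distance drops out. The function name and the choice of mas per year as input units are assumptions for illustration.

```python
import numpy as np

def reduced_proper_motion(app_mag, pm_mas_per_yr):
    """Reduced proper motion H = m + 5 log10(mu / arcsec/yr) + 5.

    Takes the total proper motion in mas/yr, so the +5 and the
    unit conversion (factor of 1000) combine into a constant of -10.
    """
    return app_mag + 5.0 * np.log10(pm_mas_per_yr) - 10.0
```

Because H depends on the tangential velocity rather than the distance, it separates populations only statistically, which is part of why a real kinematic model can do better.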

2022-07-11

so much Gaia; #renameJWST

I spent the day working on ESA Gaia data, parallel to Kate Storey-Fisher (NYU). She was working on the quasar catalog and the correlation with the CMB convergence maps; I was working on estimating stellar luminosities from low-resolution spectral coefficients. We are too much in the thick of it to report how it's going yet. But stellar luminosities are hard to predict from spectra!

Late in the day I sent this letter to NASA:

I served on the US Astronomy and Astrophysics Advisory Committee (AAAC) for several years; I served on the NASA Spitzer Space Telescope Oversight Committee for many years; I served on the NASA Extragalactic Database User Committee for several years; and my research has been funded by NASA for my entire career (since I was a PhD student in the 1990s). I currently do research with HST, Kepler, TESS, 2MASS, WISE, and WMAP data, and now I'm getting ready for JWST and SPHEREx.

I am writing to say that I think it would serve NASA's interests, and the interests of NASA science especially, to rename JWST. There have been plenty of discussions of the name; it is clear that many scientists (and especially those who are part of the LGBT community or who have concerns for the LGBT community) feel disrespected by the name. I also personally think that the evidence is clear that some of the career activities of James Webb did direct harm to patriotic Americans who were gay.

I want to emphasize, however, that I think the important argument about the name goes beyond the question of any individual historical facts: The LGBT communities are of great importance to all of us. These voices must be heard, and the legitimate concerns must be addressed.

Because NASA is a forward-looking agency, and working towards a more equitable, better world, especially for people working in science and engineering in the US, I think it is time that the spacecraft be renamed. I think this could be done easily and without any trouble; many spacecraft have either officially or effectively changed names at or around first light. Two examples that come immediately to mind are WMAP and Spitzer.

2022-07-09

Gaia quasar redshift distributions

Today I worked with Kate Storey-Fisher on the ESA Gaia quasar sample. We looked at the redshift distribution as a function of magnitude and as a function of color cuts or selection. Right now we are getting an odd angular correlation function for the sample, which we think is because our random catalog (or completeness model) is wrong in important but subtle ways. Interestingly, there are probably terms coming from both the dust map (extinction) and from the Gaia internal completeness (scanning law) and maybe both contribute enough to change the answer? But it sure looks like dust dominates.

2022-07-08

Doppler shifts and radial velocities

I am a big believer that there is a difference between a Doppler shift and a radial velocity. For one, they have different units! For another, the former is measurable (sometimes) and the latter is not (or rarely). But today I agreed with Megan Bedell (Flatiron) that we should write our paper on the subject in terms of the words “radial velocity” and not “Doppler shift”. After all, we are talking to a community with a common language! I spent some time on the train from Vienna to Heidelberg editing the figures for our paper on this subject.
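
The units point can be made concrete with a small sketch. The Doppler shift z is dimensionless and is what the spectrum gives you; turning it into a velocity requires a model. The functions below (hypothetical names) assume the simplest such model: special relativity, purely radial motion, and no gravitational redshift or other effects.

```python
import numpy as np

C_KM_S = 299792.458  # speed of light in km/s

def doppler_from_beta(beta):
    """Dimensionless Doppler shift z for radial speed beta = v/c (receding positive)."""
    return np.sqrt((1.0 + beta) / (1.0 - beta)) - 1.0

def beta_from_doppler(z):
    """Inverse of doppler_from_beta, under the same purely radial assumption."""
    ratio = (1.0 + z) ** 2
    return (ratio - 1.0) / (ratio + 1.0)

def radial_velocity_from_doppler(z):
    """Radial velocity in km/s implied by z, given the assumptions above."""
    return C_KM_S * beta_from_doppler(z)
```

Every assumption in that docstring is a place where the measured quantity (the shift) and the inferred quantity (the velocity) can come apart.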

2022-07-07

is it ever scientifically conservative to use machine learning?

I gave a talk “Is machine learning good or bad for science?” in Vienna today (slides here). I spent a lot of time on the ontology and epistemology of it all. One thing that led to some debate afterwards is my claim (at the end of the talk) that using extremely flexible machine learning methods can be extremely conservative in some cases: If you are modeling a nuisance that possibly interferes with your signal of interest, and you use a very flexible model, you have a strong argument that you tried as hard as you could (in some sense) to dilute your signal of interest with that nuisance. My talk was followed by interesting discussion with many, and a lovely dinner with Viennese (not just Austrian, but Viennese) wine.

2022-07-06

Dr Ratzenböck

It was my pleasure to be a part of the PhD committee for Sebastian Ratzenböck (Vienna), who wrote a dissertation in computer science but as applied to astrophysics. He had three advisors, in statistics, in computer science, and in astronomy, and he beautifully bridged the three worlds. His research was on finding members of stellar clusters, and on finding new stellar clusters. He showed (pretty convincingly, I think) that star-forming regions break up into many individual star-forming events with different ages and different kinematics. One of his conclusions is that all star formation happens in clusters or groups! He also made a nice technical advance, which was to build a tool to select clustering hyper-parameters in the space of physical quantities one cares about, instead of in the space of arbitrarily-defined clustering-method parameters. It was a great thesis, a beautiful defense, and a fun time drinking afterwards. Congratulations Dr Ratzenböck!

2022-07-05

is machine learning good or bad for science?

I took the train to Vienna today, for a PhD defense and to give a talk about machine learning. My talk is interdisciplinary so I looked at how to generalize my arguments about astrophysics to all of the natural sciences. It turns out that this isn't as easy as I'd like, since it is hard to be specific outside of astrophysics! I'm going to learn a lot getting this talk ready.

2022-07-04

Gaia quasar redshifts

Today Kate Storey-Fisher (NYU) arrived in Heidelberg and we worked on improving the ESA Gaia DR3 quasar redshifts, using NASA WISE and IRAS information. We were trying to reproduce experiments performed by Hans-Walter Rix (MPIA) last week. It looks like we can get clean-ish samples, where more than 90 percent of the redshifts are correct (as compared to ground-based spectroscopic surveys), provided that we stick to bright quasars. The question is: Is this good enough for our cosmological goals? We discussed how to empirically evaluate the selection function, which is our next task.

2022-07-03

bouncing ball

I worked on the weekend on making a toy problem for testing physics-related machine-learning methods: A ball bouncing off an elastic surface, under gravity. Both the surface and the gravity vector break the symmetry; this problem is not at all invariant with respect to rotation, translation, or boost. And yet the laws of physics can be written in a coordinate-free form. I am trying to figure out whether we can make this distinction usefully in the literature: The distinction between coordinate-free and equivariant. I think they are different concepts, even though the mathematics of them are identical.
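
One way to see the distinction in code is the following sketch, which is an illustration under assumptions and not necessarily the toy problem as built: the surface is a flat plane through the origin, the bounce is perfectly elastic, and the integrator is a simple semi-implicit Euler step. The update is written using only vector sums and dot products, so it never refers to coordinate axes; the function name simulate and its defaults are hypothetical.

```python
import numpy as np

def simulate(x0, v0, g, normal, n_steps=1000, dt=0.001):
    """Ball under gravity g bouncing elastically off the plane through the
    origin with the given normal. Returns an (n_steps, dim) trajectory."""
    x = np.array(x0, dtype=float)
    v = np.array(v0, dtype=float)
    g = np.asarray(g, dtype=float)
    normal = np.asarray(normal, dtype=float)
    normal = normal / np.linalg.norm(normal)
    traj = np.empty((n_steps, len(x)))
    for i in range(n_steps):
        # Semi-implicit Euler: update the velocity, then the position.
        v = v + g * dt
        x = x + v * dt
        # If the ball has crossed the surface while moving into it, reflect
        # both the velocity and the position about the plane.
        if x @ normal < 0 and v @ normal < 0:
            v = v - 2.0 * (v @ normal) * normal
            x = x - 2.0 * (x @ normal) * normal
        traj[i] = x
    return traj
```

If the initial conditions, the gravity vector, and the surface normal are all rotated together, the trajectory rotates with them; if only the initial position and velocity are rotated, it does not. The first statement is about how the law is written, the second is about the symmetry of the particular problem.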