writing an interpolator in jax

Today I pair-coded with Matt Daunt (NYU) a general kernel-based interpolator in jax. Actually, I am not certain that what we wrote can be handled by jax gracefully, because I don't understand the functional programming model. But I learned that interpolation is harder than I thought: If your interpolator is higher-order than linear, the interpolation involves solving an inverse problem. The fast interpolators (like spline) have such solutions coded with clever iteration schemes that require only one pass through the data, like the fast GPs we love. But if you want to write something general, you might have to suck up some (sparse) linear algebra. Daunt and I discussed strategies for our spectroscopic model, given these realities.
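
To make the inverse-problem point concrete, here is a minimal sketch of a cubic B-spline kernel interpolator on a unit-spaced grid. It is an illustration, not the code from the pair-coding session: it is written in plain NumPy rather than jax (the array calls used here all have jax.numpy counterparts), the function names are made up, and it uses a dense solve where a real implementation would exploit the banded structure. The thing to notice is that the kernel weights are not the data values; you have to solve for them.

```python
import numpy as np

def cubic_bspline(u):
    """Cardinal cubic B-spline kernel; nonzero only for |u| < 2."""
    a = np.abs(u)
    inner = 2.0 / 3.0 - a**2 + 0.5 * a**3      # branch for |u| < 1
    outer = (2.0 - a) ** 3 / 6.0               # branch for 1 <= |u| < 2
    return np.where(a < 1.0, inner, np.where(a < 2.0, outer, 0.0))

def fit_weights(y):
    """Solve K w = y so that the kernel sum passes through every sample."""
    idx = np.arange(len(y))
    K = cubic_bspline(idx[:, None] - idx[None, :])   # tridiagonal: 2/3 and 1/6
    return np.linalg.solve(K, y)                     # dense solve, for simplicity

def evaluate(x_new, w):
    """Evaluate the continuous interpolant sum_j w_j B(x - j) at x_new."""
    idx = np.arange(len(w))
    return cubic_bspline(np.asarray(x_new)[:, None] - idx[None, :]) @ w

idx = np.arange(20)
y = np.sin(0.3 * idx)
w = fit_weights(y)
at_nodes = evaluate(idx.astype(float), w)   # passes through the samples
naive = evaluate(idx.astype(float), y)      # data used as weights: smooths instead
```

With the solved weights the interpolant reproduces the samples at the grid points; if you skip the solve and use the data as weights, you get a smoothed version of the data, which is not interpolation.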


never shift-and-coadd your data!

I had a great (weekly) call with Andy Casey (Monash) today. We discussed many things, including a hilarious idea for April Fools' that Matt Daunt (NYU) and I conceived. But one of my action items out of the meeting is to start a short note on how to avoid ever shifting and stacking your data (think spectra taken at different times of year, so at different Doppler shifts relative to the Earth). How can you combine data without doing any interpolation of the data? The answer is simple: You forward-model your data with a model that (by optimization) becomes the average of those data. (And the optimization is often closed-form.) Then you only ever shift the model, and you don't have to deal with shifting noise arrays or mask arrays, or anything else. I'll try to write a stub this weekend.
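
Here is a rough sketch of what I mean, under some loud assumptions: the coordinate x is something like log wavelength (so a Doppler shift is an additive offset), the shifts are known, the model is a set of values on a uniform grid with linear interpolation between them, and the noise is Gaussian with known inverse variances. All function and variable names are made up for illustration. The data, the inverse variances, and the masks are never moved; only the model is.

```python
import numpy as np

def design_matrix(x_obs, shift, grid):
    """Hat-function (linear interpolation) basis on a uniform `grid`, evaluated
    at the observed positions mapped into the model frame."""
    dx = grid[1] - grid[0]
    u = (x_obs - shift - grid[0]) / dx                 # fractional grid index
    nodes = np.arange(len(grid))
    return np.clip(1.0 - np.abs(u[:, None] - nodes[None, :]), 0.0, None)

def combine(xs, ys, ivars, shifts, grid):
    """Closed-form weighted least squares for the model values on `grid`."""
    A = np.vstack([design_matrix(x, s, grid) for x, s in zip(xs, shifts)])
    y = np.concatenate(ys)
    w = np.concatenate(ivars)                          # masked pixels carry ivar = 0
    return np.linalg.solve(A.T @ (w[:, None] * A), A.T @ (w * y))

# Fake data: three epochs of one absorption line at three different shifts.
rng = np.random.default_rng(17)
truth = lambda x: 1.0 - 0.5 * np.exp(-0.5 * ((x - 5.0) / 0.6) ** 2)
grid = np.linspace(0.0, 10.0, 41)
shifts = [-0.3, 0.0, 0.4]
sigma = 0.01
xs = [np.linspace(0.5, 9.5, 150) for _ in shifts]
ys = [truth(x - s) + sigma * rng.normal(size=x.size) for x, s in zip(xs, shifts)]
ivars = [np.full(x.size, sigma**-2) for x in xs]
ivars[1][70:75] = 0.0                                  # a masked region in one epoch
combined = combine(xs, ys, ivars, shifts, grid)
```

The masked pixels just get zero weight in the sums. The solve requires that every grid node is touched by at least some unmasked data, which is true in this toy setup but would need checking (or regularization) in real data.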


can a (say) 5-sigma result be interpreted as a p-value?

In preparing for class today (I am teaching NYU #data4physics), I worked through the relationship between a p-value (like what's used in medical research) and a physicist's n-sigma measurement. They are related in some very special cases, like in particular when the value being measured is a linear parameter (like an amplitude) and the noise is Gaussian. But those cases are special. And also: Converting n-sigma to p-value depends very critically on the noise model. So I don't like thinking of it as a p-value. That said, maybe there is no difference?
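
The dependence on the noise model is easy to see numerically. The sketch below converts an n-sigma deviation into a tail probability under two different noise models with the same variance: Gaussian, and (as an arbitrary heavier-tailed alternative that I chose only for illustration) Laplace. The function names are made up.

```python
import math

def gaussian_p_value(n_sigma, two_sided=True):
    """Tail probability of an n-sigma deviation under unit-variance Gaussian noise."""
    p = math.erfc(n_sigma / math.sqrt(2.0))      # two-sided: P(|x| > n)
    return p if two_sided else 0.5 * p

def laplace_p_value(n_sigma, two_sided=True):
    """Same deviation under unit-variance Laplace noise (scale b = 1 / sqrt(2))."""
    p = math.exp(-math.sqrt(2.0) * n_sigma)      # two-sided: P(|x| > n)
    return p if two_sided else 0.5 * p

p_gauss = gaussian_p_value(5.0)      # about 5.7e-7
p_laplace = laplace_p_value(5.0)     # about 8.5e-4
```

Same 5 sigma, same variance, and the two p-values differ by more than a factor of a thousand. The one-sided Gaussian number (about 2.9e-7) is the one physicists usually have in mind when they say 5-sigma.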


patches of imaging

I am discussing with Sean Ku (NYU) and Victor Kuang (NYU) the NASA Cassini imaging of Saturn. We want to make (from the data) a high-quality face-on picture of the rings. This is a problem (from my perspective) in computer vision, so we need a camera model (and a lot of other things). One thing I hypothesized about this problem today is the following (am I right?):

Any sufficiently small patch of a camera image can be modeled with a pinhole-camera-like camera model, provided that we give the camera model the freedom to make the image plane not perpendicular to the line from the pinhole to the patch of the image plane. Is this correct? We are about to find out, the hard way.


writing for deadline

I don't like the way computer science works, with conferences and deadlines. I could complain about it for hours. Journals are a thing, people! But anyway, at Dagstuhl last week, Villar, Schölkopf, and I decided to sprint out a paper for the physics-and-ML workshop at NeurIPS this year. Deadline is in two days. Guess what I did with all my research time today?


Dagstuhl, day 4

Today was day 4 of Machine Learning for Science: Bridging Data-driven and Mechanistic Modeling at Schloss Dagstuhl.

I spoke today, about passive symmetries and the constraints on machine-learning models they imply. My talk was totally new for me, and based on conversations between Villar, Schölkopf, and me during the meeting. That was fun. So now I have a new way of talking about all this stuff, and the three of us are trying to write a short paper about it.

Among the talks today, one idea I really liked, from Carl Henrik Ek, is that trust and interpretability might be strongly related. Indeed, when I talk about interpretability, it is often in the same context that I am talking about models that make sense to a physicist, which are, in turn, models that I would trust. And that is also very related to what I myself talked about today: If models look more like physical law, then they are much more trustworthy. And maybe also more interpretable.


Dagstuhl, day 3

Today was day 3 of Machine Learning for Science: Bridging Data-driven and Mechanistic Modeling at Schloss Dagstuhl.

We had an open discussion about goals for ML in science today. The idea of explainability came up. I liked the comment that explainability (or what counts as explainability) might depend incredibly strongly on field or context. Like it is different in medicine and in astronomy. And, related, the idea of how models are communicated is very context dependent. And maybe very dependent on history. For example, in the future, might models be communicated through APIs rather than scientific papers?

Causation and causal inference was a big theme of the day with Bernhard Schölkopf, Jonas Peters, Bubacar Bah, and Niki Kilbertus all talking about overlapping ideas in causal inference, mechanism inference, differential equation inference, and symbolic regression. Is causation the new framework for machine learning? Many in the room think so.


Dagstuhl, day 2

Today was day 2 of Machine Learning for Science: Bridging Data-driven and Mechanistic Modeling at Schloss Dagstuhl. Many great things happened. Here are two highlights:

Bernhard Schölkopf (MPI-IS), in a discussion session, asked what the key questions were for machine learning as a field. I love this question! Astronomy and physics do, I think, have key questions, which guide research and contextualize choices. Machine learning does not really, or if it does, the questions are implicit. I want to work on this.

Philipp Hennig (Tübingen) gave an energizing talk about the relationship between simulations of the world and observations of (or data about) the world. He argued (convincingly!) that we should not think of these as totally different things, and that learning from data and simulating a process could or even should always be integrated and done together. He demonstrated this with a simple model of infectious disease, but the point is extremely general.


Dagstuhl, day 1

Today was day 1 of Machine Learning for Science: Bridging Data-driven and Mechanistic Modeling at Schloss Dagstuhl. The first day was mainly about applications of machine learning, in Earth science, livestock management, astrophysics (dark matter), cells, and mechanical engineering. I had many thoughts and realizations. Here are a few random ones:

The problems that appear in Earth science, and the data types, are very similar to those that appear in astrophysics! But in Earth science, biology is a big driver of global processes, and there is no good mechanistic model for (say) how plants grow and take up carbon. The world is filled with mobile phones, with good cameras, and the methods we could be employing to do science in a distributed way are way, way under-used. Cells are incredibly complicated. The mechanistic model involves literally thousands of individual processes. Like our model for the cell is as complicated as our model for the entire Earth system (which, by the way, depends on cells!), or even more complicated.

In the areas of the cell and the Earth, a theme was that the investigators want to preserve the causal structure we believe, and just use the machine learning to replace one tiny piece, with a data-driven model. Related: You can think of the machine learning as an effective theory for something (a sub-part of the problem) that doesn't work well from first principles. That's a good idea!


signal processing vs forward modeling

Abby Shaum (CUNY) and I are trying to write up a paper about our work treating oscillating stars as something like FM radios: We use the oscillation modes as carrier frequencies and find any orbital companions through phase or frequency variations of that carrier signal. Today we discussed the difference between doing that and forward modeling the signal. The former is signal processing. The latter is a generative model. Very different! And in many senses forward modeling is more principled. But I still think (and hope) that signal processing has a place in astronomy.
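
For concreteness, here is a toy version of the signal-processing approach. It is a sketch under strong assumptions, and not the code from our project: one oscillation mode with a known frequency, evenly sampled noise-free data, and a sinusoidal phase modulation standing in for the orbit. All the numbers are made up.

```python
import numpy as np

f0 = 10.0                      # carrier (mode) frequency, cycles per day
dt = 0.01                      # sampling interval, days
t = np.arange(20000) * dt      # 200 days of evenly sampled data
period, amp = 50.0, 0.5        # orbital period (days) and phase amplitude (radians)
phase_true = amp * np.sin(2.0 * np.pi * t / period)
signal = np.cos(2.0 * np.pi * f0 * t + phase_true)

# Demodulate: mix the signal down by the carrier, then average in windows that
# are long compared to the carrier period and short compared to the orbit.
window = 500                                        # 5 days = 50 carrier cycles
mixed = signal * np.exp(-2j * np.pi * f0 * t)
z = mixed.reshape(-1, window).mean(axis=1)          # roughly 0.5 * exp(i * phase)
phase_est = np.angle(z)                             # recovered phase per window
t_mid = t.reshape(-1, window).mean(axis=1)
phase_mid = amp * np.sin(2.0 * np.pi * t_mid / period)
```

The forward-modeling alternative would instead write down the likelihood for the raw time series given the mode and orbit parameters, and fit those parameters directly. The demodulation never writes down a noise model at all, which is part of why it is signal processing and not inference.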


the chevron

A research highlight today was the Flatiron Galactic dynamics internal group meeting. We discussed kinematic features in the Milky Way halo that have appeared in ESA Gaia DR3 in maybe this paper. We looked at data and (toy) simulations. I'm interested in whether the features appear in metallicity or abundances. The arguments that Neige Frankel (CITA) and I worked out this summer for The Snail look like they might work for all phase-space overdensities caused by perturbations?


stellar noise as a physical process

Today I was privileged to be part of a great and productive meeting between Jesse Cisewski (Wisconsin), Megan Bedell (Flatiron), and Lily Zhao (Flatiron) about noise sources in extreme-precision radial-velocity measurements. The conversation was inspired by the realization (obvious, really) that any physical effect on the surface of stars (spots, plages, convection pattern, p-modes, flares) that affects the radial-velocity measurement must (unless the Universe is truly adversarial) leave other imprints on the spectrum at the same signal-to-noise or even higher signal-to-noise. This means that any claim that RV measurements are affected by spots (say) should be backed up by an observation in the spectrum that is orthogonal to the RV signal and that supports the claim. We discussed relevant research and decided to jointly read this paper before our next meeting!


simulating a patch of a spectrograph

In preparation for writing something (or proposing something, maybe?) about new methods for extracting spectra from spectrograph data, I wrote a tiny simulation code that makes fake spectroscopy data. The issue is that (except in rare circumstances) the spectral trace is not aligned perfectly with a CCD row (or column) and (except in rare circumstances) the cross-wavelength direction (the direction of constant wavelength) is not aligned perfectly with a CCD column (or row). How to adjust current methods to address this? I think I know! And I think it doesn't require a full instrument model.
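
A simulation of this kind can be very small. The sketch below is an illustrative stand-in, not my actual code: it assumes a straight trace with a small slope, a constant shear for the lines of constant wavelength, a Gaussian cross-dispersion profile, a single emission line on a flat continuum, and a "wavelength" measured in pixel units. Every parameter name and default value is made up.

```python
import numpy as np

def simulate_patch(nx=64, ny=32, trace_slope=0.05, line_tilt=0.1,
                   profile_sigma=2.0, line_center=32.0, line_sigma=1.5,
                   line_amp=5.0, noise_sigma=0.0, seed=0):
    """Image of one spectral trace with a tilted trace and tilted lines."""
    x, y = np.meshgrid(np.arange(nx, dtype=float), np.arange(ny, dtype=float))
    trace = ny / 2.0 + trace_slope * (x - nx / 2.0)   # trace center (row) per column
    dy = y - trace                                     # offset from the trace
    wave = x + line_tilt * dy                          # constant-wavelength lines are sheared
    spectrum = 1.0 + line_amp * np.exp(-0.5 * ((wave - line_center) / line_sigma) ** 2)
    profile = np.exp(-0.5 * (dy / profile_sigma) ** 2) / (np.sqrt(2.0 * np.pi) * profile_sigma)
    rng = np.random.default_rng(seed)
    return spectrum * profile + noise_sigma * rng.normal(size=(ny, nx))

image = simulate_patch(noise_sigma=0.01)               # array of shape (ny, nx)
```

Setting trace_slope and line_tilt both to zero gives the rare, perfectly aligned case in which summing along columns would be fine; anything else mixes wavelengths within a column.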


the symmetries of the observed universe are different from the symmetries of the latent universe

Kate Storey-Fisher and I spent a long time today talking about how to build a project that is about cosmological observables, built from the concepts in her projects on applying coordinate-free geometric forms to theoretical objects in cosmology. The idea could be: Find geometric scalars that exist in the theoretical (or latent) universe, find geometric scalars that exist in any observational survey of the observable universe, and learn the relationships between these; construct cosmological tests and tests of the dark-matter model. The big issue (from my perspective) is that the symmetries that apply to the 6-dimensional phase space of the Universe are different from the symmetries that apply to the observed 3-dimensional redshift-and-angle-space of the (galaxy or quasar, say) observations. Some might say that there are no symmetries in this observational space, since there are window functions and selection functions, but this is not correct: Coordinate symmetries still exist there, it is just that these other functions must also be tracked, in the same space. Anyway, it's a nice research program to figure all this out.


regularities of dark-matter halos

There is a regular dynamics meeting (maybe Galactic dynamics meeting?) at Flatiron. I went today and I learned a lot, from Ivana Escala (Princeton) and Danny Horta-Darrington (Flatiron). I briefly presented Kate Storey-Fisher's project of describing dark-matter halos with coordinate-free nonlinear geometric scalars, which isn't really a dynamics project but it could be, because these scalars could be part of a canonical transformation of the dark sector. Anyway, the crowd had interesting things to say. In particular, the idea came up that the subspace in which the dark-matter halos live (subspace of the space of these scalars) is likely to be very compact (or low-dimensional, or both) and that the subspace probably depends on the dark-matter model. That's a great idea, and suggests that maybe we can construct new tests of gravity.



continuous representations of stellar spectra

Matt Daunt (NYU) and I had lunch today and discussed various things. One is the idea that we need (for technical reasons of data analysis) to make and use continuous representations of spectra. That is, we need to represent our spectra of stars such that they can be exactly and losslessly interpolated to any (sufficiently fine) grid of points. The ESA Gaia Mission XP spectra have this property: They are represented as polynomial basis functions, which confuse and surprise everyone. They are hard! But there are many other choices. For example, if the spectra are represented with b-spline basis coefficients, the coefficients “look like” just a set of flux values at wavelengths (so traditionalists are happy), but in fact they are the parameters of a continuous model that can be interpolated losslessly to any grid.


argh new writing projects?

Oh no! I spent the weekend accidentally starting new writing projects. What's wrong with me? One of the things that's wrong with me is that I am about to start teaching a new PhD-level class, Statistics and Data Science for Physics, and I find that I don't have good reading materials for the students. Here's a lack: A good, sensible discussion of when to take a frequentist approach in your data analysis, and when to take a Bayesian approach, divorced from (or not emphasizing) the philosophical differences.


a gallery of tensor images

I spent some time working on a possible introduction figure for the paper I am writing with Soledad Villar on images and grids and lattices of geometric objects (like scalars, vectors, and tensors). This introduction figure would give a set of examples of different kinds of data that come up in natural-science contexts. This is all a great idea! But then I need to understand (and explain!) exactly what each image in the gallery shows, and also get permissions to republish. Worth it (I hope).


a reference implementation of The Cannon

I had a great conversation with Andy Casey (Monash) today about many things. Hopefully it is the start of a regular call. We discussed making a reference version of The Cannon, which would make use of jax, which I love, and which could be an affiliated package for astropy. I want this because (a) there is no completely simple, completely robust implementation out there, and (b) I want to transfer labels to all of the ESA Gaia RVS spectra from the SDSS-IV APOGEE data.
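
To show how small the core of such a thing could be, here is a stripped-down sketch of the two steps of The Cannon: a training step that fits, at each pixel, a quadratic function of the labels, and a test step that finds the labels that best explain a new spectrum. This is not a proposed reference implementation. It is plain NumPy rather than jax, it leaves out the per-pixel intrinsic scatter and the inverse-variance weighting, the test step uses a simple Gauss-Newton iteration with a finite-difference Jacobian (my choice for brevity), and the function names and fake data are all made up.

```python
import numpy as np

def vectorize(labels):
    """Quadratic feature vector [1, l_i, l_i * l_j for i <= j] for each star."""
    labels = np.atleast_2d(labels)
    n, k = labels.shape
    cols = [np.ones(n)] + [labels[:, i] for i in range(k)]
    cols += [labels[:, i] * labels[:, j] for i in range(k) for j in range(i, k)]
    return np.stack(cols, axis=1)

def train(labels, fluxes):
    """Training step: linear least squares for the coefficients at every pixel."""
    theta, *_ = np.linalg.lstsq(vectorize(labels), fluxes, rcond=None)
    return theta                                   # shape (n_terms, n_pixels)

def infer_labels(flux, theta, start, n_iter=20, eps=1e-6):
    """Test step: Gauss-Newton on the labels with a finite-difference Jacobian."""
    labels = np.array(start, dtype=float)
    for _ in range(n_iter):
        model = vectorize(labels)[0] @ theta
        jac = np.stack([(vectorize(labels + eps * e)[0] @ theta - model) / eps
                        for e in np.eye(labels.size)], axis=1)
        step, *_ = np.linalg.lstsq(jac, flux - model, rcond=None)
        labels = labels + step
    return labels

# Fake training set: 200 stars, 2 labels, 50 pixels.
rng = np.random.default_rng(42)
true_theta = rng.normal(size=(6, 50)) * np.array([1, 1, 1, 0.1, 0.1, 0.1])[:, None]
train_labels = rng.uniform(-1.0, 1.0, size=(200, 2))
train_flux = vectorize(train_labels) @ true_theta + 0.01 * rng.normal(size=(200, 50))
theta = train(train_labels, train_flux)

# Label transfer to a new star that was not in the training set.
new_labels = np.array([0.3, -0.5])
new_flux = vectorize(new_labels)[0] @ true_theta + 0.01 * rng.normal(size=50)
recovered = infer_labels(new_flux, theta, start=np.zeros(2))
```

In a jax version the finite-difference Jacobian would presumably be replaced by automatic differentiation, which is a big part of the appeal.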