Because of various confluences, I spent my entire day on teaching, no research.
With Suroor Gandhi (NYU) and Adrian Price-Whelan (Flatiron), we have been able to formulate (we think) some questions about unseen gravitational matter (dark matter and unmapped stars and gas) in the Milky Way into questions about transformations that map one set of points onto another set of points. How, you might ask? By thinking about dynamical processes that set up point distributions in phase space.
Being physicists, we figured that we can do this all ourselves! And being Bayesians, we reached for probabilistic methods. Like: Build a kernel density estimate on one set of points and maximize the likelihood given the other set of points and the transformation. That's great! But it has high computational complexity, and it is slow to compute. But for our purposes, we don't need this to be a likelihood, so we found out (through Soledad Villar, NYU) about optimal transport.
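To make the kernel-density idea concrete, here is a minimal sketch under loud assumptions: the transformation is a pure translation (a hypothetical stand-in for whatever the dynamics suggest), the kernels are isotropic Gaussians with a hand-picked bandwidth, and the optimizer is a brute-force grid scan. The pairwise-distance array is where the computational cost comes from.

```python
import numpy as np

def kde_log_likelihood(targets, sources, bandwidth):
    """Sum over `targets` of the log of a Gaussian KDE built on `sources`."""
    n, d = sources.shape
    # Pairwise squared distances, shape (n_targets, n_sources): the O(N * M) cost.
    d2 = np.sum((targets[:, None, :] - sources[None, :, :]) ** 2, axis=2)
    log_kernels = -0.5 * d2 / bandwidth**2 - 0.5 * d * np.log(2 * np.pi * bandwidth**2)
    # Log of the mean kernel value per target, computed stably (log-sum-exp).
    m = log_kernels.max(axis=1, keepdims=True)
    log_density = m[:, 0] + np.log(np.mean(np.exp(log_kernels - m), axis=1))
    return np.sum(log_density)

# Toy data (assumption): two point sets that differ by an unknown shift.
rng = np.random.default_rng(17)
x = rng.normal(size=(200, 2))
true_shift = np.array([0.7, -0.3])
y = rng.normal(size=(200, 2)) + true_shift

# Hypothetical transformation family: translations, scanned on a coarse grid.
shifts = np.linspace(-1.5, 1.5, 31)
best = max(
    ((sx, sy) for sx in shifts for sy in shifts),
    key=lambda s: kde_log_likelihood(y, x + np.array(s), bandwidth=0.3),
)
print("best-fit shift:", best)
```

Every likelihood evaluation touches every pair of points, so the cost scales as the product of the two set sizes times the number of transformations tried.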
Despite its name, optimal transport is about solving problems of this type (find transformations that match point sets) with fast, good algorithms. The optimal-transport setting brings a clever objective function (that looks like earth-mover distance) and a high-performance tailored algorithm to match (that looks like linear programming). I don't understand any of this yet, but Math may have just saved our day. I hope I have said here recently how valuable it is to talk out problems with applied mathematicians!
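A toy version of the point-set matching problem, sketched under assumptions: the two sets have equal size and equal weights, so the transport problem reduces to a one-to-one assignment, and the cost is squared Euclidean distance. It uses `scipy.optimize.linear_sum_assignment`, a general-purpose assignment solver and not necessarily the tailored algorithm the optimal-transport literature would recommend.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def transport_cost(a, b):
    """Mean squared distance under the optimal one-to-one matching of a to b."""
    cost = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=2)
    rows, cols = linear_sum_assignment(cost)
    return cost[rows, cols].mean(), cols

# Toy data (assumption): y is x shuffled and shifted by a known vector.
rng = np.random.default_rng(42)
x = rng.normal(size=(100, 2))
shift = np.array([0.5, 0.0])
y = x[rng.permutation(100)] + shift

cost_before, _ = transport_cost(x, y)          # no transformation applied
cost_after, match = transport_cost(x + shift, y)  # correct transformation applied
print(cost_before, cost_after)
```

The matching cost plays the role of the objective: a candidate transformation is good when it drives the cost toward zero, and no density estimate is needed along the way.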
I got in some great research time late today working with Adrian Price-Whelan (Flatiron) to understand the morphology of the distribution of stars in APOGEE–Gaia in elements-energy space. The element abundances we are looking at are [Fe/H] and [alpha/Fe]. The energy we are looking at is vertical energy (as in something like the vertical action in the Milky Way disk). We are trying to execute our project called Chemical Tangents, in which we use the element abundances to find the orbit structure of the Galaxy. We have arguments that this will be more informative than doing Jeans models or other equilibrium models. But we want to demonstrate that this semester.
There are many issues! The issue we worked on today is how to model the abundance space. In principle we can construct a model that uses any statistics we like of the abundances. But we want to choose our form and parameterization with the distribution (and its dependence on energy of course) in mind. We ended our session leaning towards some kind of mixture model, where the dominant information will come from the mixture amplitudes. But going against all this is that we would like to be doing a project that is simple! When Price-Whelan and I get together, things tend to get a little baroque if you know what I mean?
I spent my research time today writing notes on paper and then LaTeX in a document, making more specific plans for the projects we discussed yesterday with Zhao (Yale) and Bedell (Flatiron). Zhao also showed me issues with EXPRES wavelength calibration (at the small-fraction-of-a-pixel level). I opined that it might have to do with pixel-size issues. If this is true, then it should appear in the flat-field. We discussed how we might see it in the data.
Today I had a great conversation with Lily Zhao (Yale) and Megan Bedell (Flatiron) about Zhao's projects for the semester at Flatiron that she is starting this month. We have projects together in spectrograph calibration, radial-velocity measurement, and time-variability of stellar spectra. On that last part, we have various ideas about how to see the various kinds of variability we expect in the joint domain of wavelength and time. And since we have a data-driven model (wobble) for stellar spectra under the assumption that there is no time variability, we can look for the things we seek in the residuals (in the data space) away from that time-independent model. We talked about what might be the lowest hanging fruit and settled on p-mode oscillations, which induce radial-velocity variations but also brightness and temperature variations. I hope this works!
I spoke with Christina Eilers (MPIA) early yesterday about a possible self-calibration project, for stellar element abundance measurements. The idea is: We have noisy element-abundance measurements, and we think they may be contaminated by biases as a function of stellar brightness, temperature, surface gravity, dust extinction, and so on. That is, we don't think the abundance measurements are purely measurements of the relevant abundances. So we have formulated an approach to solve this problem in which we regress the abundances against things we think should predict abundances (like position in the Galaxy) and also against things we think should not predict abundances (like apparent magnitude). This should deliver the most precise maps of the abundance variations in the Galaxy but also deliver improved measurements, since we will know what spurious signals are contaminating the measurements. I wrote words in a LaTeX document about all this today, in preparation for launching a project.
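The regression idea can be sketched with a toy linear model. Everything specific here is an assumption for illustration: one "should predict" feature (a hypothetical Galactocentric radius), one "should not predict" feature (apparent magnitude), linear dependence on both, and made-up slopes and noise levels. A real project would presumably need a far more flexible model.

```python
import numpy as np

# Simulated catalog (all numbers are invented for illustration).
rng = np.random.default_rng(3)
n = 5000
radius = rng.uniform(4.0, 12.0, size=n)      # feature that should predict abundance
magnitude = rng.uniform(8.0, 14.0, size=n)   # feature that should NOT predict abundance
true_abundance = 0.4 - 0.06 * (radius - 8.0)
bias = 0.03 * (magnitude - 11.0)             # spurious trend contaminating the measurement
observed = true_abundance + bias + 0.05 * rng.normal(size=n)

# Regress on both kinds of features at once: constant, physical, nuisance.
A = np.column_stack([np.ones(n), radius - 8.0, magnitude - 11.0])
coeffs, *_ = np.linalg.lstsq(A, observed, rcond=None)

# Remove only the part explained by the nuisance feature.
corrected = observed - coeffs[2] * (magnitude - 11.0)
print("gradient:", coeffs[1], "spurious slope:", coeffs[2])
```

The fit returns both outputs at once: the abundance gradient (the map) and the spurious slope (the calibration). Note that the zero point of the nuisance correction (magnitude 11 here) is a choice, not something the data determine.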
Today I got in my first weekly meeting (of the new academic year) with Kate Storey-Fisher (NYU). We went through priorities and then spoke about the problem of performing some kind of comprehensive or complete search of the large-scale structure data for anomalies. One option (popular these days) is to train a machine-learning method to recognize what's ordinary and then ask it to classify non-ordinary structures as anomalies. This is a great idea! But it has the problem that, at the end of the day, you don't know how many hypotheses you have tested. If you find a few-sigma anomaly, that isn't surprising if you have looked in many thousands of possible “places”. It is surprising if you have only looked in a few. So I am looking for comprehensive approaches where we can pre-register an enumerated list of tests we are going to do, but have that list of tests be exceedingly long (like machine-generated). This is turning out to be a hard problem.
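The arithmetic behind the "how many places did you look" worry is short enough to write down. This sketch assumes the tests are independent, two-sided, and Gaussian under the null, which real anomaly searches generally are not; it is only meant to show why the count of tests matters.

```python
import math

def p_at_least_one(n_sigma, n_tests):
    """Chance that at least one of n_tests independent null tests exceeds n_sigma (two-sided)."""
    p_single = math.erfc(n_sigma / math.sqrt(2.0))
    # 1 - (1 - p)^n, computed stably for tiny p and large n.
    return -math.expm1(n_tests * math.log1p(-p_single))

for n_tests in (1, 3, 100, 5000):
    print(n_tests, p_at_least_one(3.0, n_tests))
```

With a handful of tests a 3-sigma result is rare under the null; with thousands it is close to guaranteed. That is why an enumerated, pre-registered list (however long) is valuable: it makes `n_tests` a known number.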
The New York City physics and astronomy departments (and this includes at least Columbia, NYU, CUNY, AMNH, and Flatiron) run a set of three Friday events in which everyone (well a large fraction of everyone) presents a brief talk about who they are and what they do. The first event was today.
I re-derived equation (11) in our paper on The Joker, in order to answer some of the questions I posed yesterday. I find that the paper does have a sign error, although I am pretty sure that the code (based on the paper) does not have a sign error. I also found that I could generalize the equation to apply to a wider range of cases, which makes me think that we should either write an updated paper or at least include the math, re-written, in our next paper (which will be on the SDSS-IV APOGEE2 DR16 data).
This morning, Adrian Price-Whelan proposed that we might have a sign error in equation (11) in our paper on The Joker. I think we do, on very general grounds. But we have to sit down and re-do some math to check it. This all came up in the context that we are surprised about some of the results of the orbit fitting that The Joker does. In a nutshell: Even when a stellar radial-velocity signal is consistent with no radial-velocity trends (no companions), The Joker doesn't permit or admit many solutions that are extremely long-period. We can't tell whether this is expected behavior, and we are just not smart enough to expect it correctly, or whether this is unexpected behavior because our code has a bug. Hilarious! And sad, in a way. Math is hard. And inference is hard.
One of my projects this Fall (with Soledad Villar) is to show that large classes of machine-learning methods used in astronomy are susceptible to adversarial attacks, while others are not. This relates to things like the over-fitting, generalizability, and interpretability of the different kinds of methods. Now what would constitute a good adversarial example for astronomy? One would be classification of galaxy images into elliptical and spiral, say. But I don't actually think that is a very good use of machine learning in astronomy! A better use of machine learning is converting stellar spectra into temperatures, surface gravities, and chemical abundances.
If we work in this domain, we have two challenges. The first is to re-write the concept of an adversarial attack in terms of a regression (most of the literature is about classification). And the second is to define large families of directions in the data space that are not possibly of physical importance, so that we have some kind of algorithmic definition of adversarial. The issue is: Most of these attacks in machine learning depend on a very heuristic idea of what's what: The authors look at the images and say “yikes”. But we want to find these attacks more-or-less algorithmically. I have ideas (like capitalizing on either the bandwidth of the spectrograph or else the continuum parts of the spectra), but I'd like to have more of a theory for this.
The self-calibration idea is extremely powerful. There are many ways to describe it, but one is that you can exploit your beliefs about causal structure to work out which trends in your data are real, and which are spurious from, say, calibration issues. For example, if you know that there is a set of stars that don't vary much over time, the differences you see in their magnitudes on repeat observations probably have more to do with throughput variations in your system than real changes to the stars. And your confidence is even greater if you can see the variation correlate with airmass! This was the basis of the photometric calibration (that I helped design and build) of the Sloan Digital Sky Survey imaging, and similar arguments have underpinned self-calibrations of cosmic microwave background data, radio-telescope atmospheric phase shifts, and Kepler light curves, among many other things.
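A toy version of the repeat-observation argument, under stated assumptions: stars are perfectly constant, each night has a single additive zero-point offset, and noise is Gaussian. This is an illustrative least-squares sketch, not the actual Sloan Digital Sky Survey calibration, and all the numbers are invented.

```python
import numpy as np

# Simulated repeat observations (all values are made up).
rng = np.random.default_rng(8)
n_stars, n_nights, n_obs = 50, 20, 2000
true_mag = rng.uniform(14.0, 18.0, size=n_stars)
true_zp = rng.normal(scale=0.05, size=n_nights)
true_zp -= true_zp.mean()                      # overall offset is unobservable; pin it
star = rng.integers(n_stars, size=n_obs)       # which star each observation targets
night = rng.integers(n_nights, size=n_obs)     # which night it was taken
obs = true_mag[star] + true_zp[night] + 0.01 * rng.normal(size=n_obs)

# Linear model: observation = star magnitude + nightly zero-point.
A = np.zeros((n_obs + 1, n_stars + n_nights))
A[np.arange(n_obs), star] = 1.0
A[np.arange(n_obs), n_stars + night] = 1.0
A[n_obs, n_stars:] = 1.0                       # extra row: zero-points sum to zero
b = np.append(obs, 0.0)
params, *_ = np.linalg.lstsq(A, b, rcond=None)
fit_mag, fit_zp = params[:n_stars], params[n_stars:]
print("max zero-point error:", np.max(np.abs(fit_zp - true_zp)))
```

The belief that the stars do not vary is what lets one fit recover both the stellar magnitudes and the throughput variations. The extra constraint row handles the one thing the data cannot determine, the overall zero point.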
The idea I worked on today relates to stellar abundance measurements. When we measure stars, we want to determine absolute abundances (or abundances relative to the Sun, say). We want these abundances to be consistent across stars, even when those stars have atmospheres at very different temperatures and surface gravities. Up to now, most calibration has been at the level of checking that clusters (particularly open clusters) show consistent abundances across the color–magnitude diagram. But we know that the abundance distribution in the Galaxy ought to depend strongly on actions, weakly on angles, and essentially not at all (with some interesting exceptions) on stellar temperature, nor surface gravity, nor which instrument or fiber took the spectrum. So we are all set to do a self-calibration! I wrote a few words about that today, in preparation for an attempt.
Mattias Samland (MPIA), as part of his PhD dissertation, adapted the CPM model we built to calibrate (and image-difference) Kepler and TESS imaging to operate on direct imaging of exoplanets. The idea is that the direct imaging is taken over time, and speckles move around. They move around continuously and coherently, so a data-driven model can capture them, and distinguish them from a planet signal. (The word "causal" is the C in CPM, because it is about the differences between how systematics and real signals present themselves in the data.) There is lots of work in this area (including my own), but it tends to make use of the spatial (and wavelength) rather than temporal coherence. The CPM is all about time. It turns out this works extremely well; Samland's adaptation of CPM looks like it outperforms spatial methods, especially at small “working angles” (near the nulled star; this is coronagraphy!).
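The temporal idea can be illustrated with a toy: predict one pixel's time series as a linear combination of other pixels' time series, train only on times away from the window of interest, and look for the signal in the residual. This is a sketch under invented assumptions (three sinusoidal systematics modes, a box-shaped signal, ridge regression with a hand-picked regularization), not Samland's method or the actual CPM code.

```python
import numpy as np

rng = np.random.default_rng(5)
n_times, n_predictors = 400, 30
t = np.arange(n_times)

# Hypothetical systematics: a few smooth temporal modes shared by all pixels.
modes = np.stack([np.sin(2 * np.pi * t / p) for p in (90.0, 37.0, 13.0)], axis=1)
predictors = modes @ rng.normal(size=(3, n_predictors))
predictors += 0.01 * rng.normal(size=(n_times, n_predictors))

# Target pixel: same systematics, plus a localized signal the predictors do not share.
signal = np.where((t >= 200) & (t < 210), 0.2, 0.0)
target = modes @ rng.normal(size=3) + signal + 0.01 * rng.normal(size=n_times)

def ridge_predict(X_train, y_train, X_test, lam):
    """Ridge regression: fit weights on training times, predict at test times."""
    XtX = X_train.T @ X_train + lam * np.eye(X_train.shape[1])
    w = np.linalg.solve(XtX, X_train.T @ y_train)
    return X_test @ w

# Train away from the window being predicted, so the model cannot absorb the signal.
window = (t >= 190) & (t < 220)
prediction = ridge_predict(predictors[~window], target[~window], predictors[window], lam=1e-3)
residual = target[window] - prediction
print("in-signal residual:", residual[10:20].mean())
```

The other pixels can predict the systematics because they share them, but they cannot predict the signal because they do not. The train-and-test split in time is what stops the regression from fitting the signal away.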
But of course a model that uses the temporal coherence but ignores the spatial and wavelength coherence of the speckles cannot be the best model! There is coherence in all four directions (time, two angles, and wavelength) and so a really good speckle model must be possible. That's a great thing to work on in the next few years, especially with the growing importance of coronagraphs at ground-based and space-based observatories, now and in the future. Samland and I discussed all this, and specifics of the paper he is nearly ready to submit.