degeneracies and optimization

With Emily Griffith (Colorado) I have been working on a purely data-driven nucleosynthetic model, trained on the abundances measured in stars by the APOGEE surveys. This model looks a lot like a non-negative matrix factorization, so it is a kind of model I've worked with many times in my life. We've figured out an optimization scheme and made it (exceedingly) fast with jax. Nonetheless, we have been having trouble with the optimization: it gets stuck in bad local minima, or even at pathological locations in parameter space.

Today I discussed this model with Soledad Villar (JHU), who warned me that the model has potential pathologies and strong degeneracies. I thought I was breaking these degeneracies with regularizations, but in fact the degeneracies are bigger than I thought. Villar's advice (which aligns with the machine learning zeitgeist) was to leave the degeneracies free and then rotate or transform the model to where I want it to be at the end. She also had useful advice about optimizing non-convex functions.
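A minimal sketch of the kind of degeneracy at issue, under the assumption that the model is treated as a plain matrix factorization (ignoring any nonlinearity or regularization in the real model). The names `A`, `Q`, and `M` are hypothetical, and the numbers are made up. The point is that an invertible mixing matrix can be slipped between the two factors without changing any prediction, and here it even preserves non-negativity:

```python
import numpy as np

rng = np.random.default_rng(17)
n_stars, n_processes, n_elements = 5, 2, 4

# Hypothetical non-negative factors: amplitudes (per star), yields (per element).
A = rng.uniform(0.5, 1.5, size=(n_stars, n_processes))
Q = rng.uniform(0.5, 1.5, size=(n_processes, n_elements))

# An invertible 2 x 2 mixing matrix, chosen (by assumption) close to the identity.
M = np.array([[1.0, 0.2],
              [0.1, 1.0]])
A_mixed = A @ M                  # new amplitudes
Q_mixed = np.linalg.solve(M, Q)  # new yields, M^{-1} Q

# The two factorizations make identical predictions for the data.
prediction = A @ Q
prediction_mixed = A_mixed @ Q_mixed
max_difference = np.max(np.abs(prediction - prediction_mixed))
```

In this picture, "transforming the model at the end" amounts to optimizing without pinning down `M`, and then choosing `M` afterwards by whatever convention makes the processes interpretable.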


Iron Snail (tm)

My day started (at 07:00) with a call with Neige Frankel (CITA) and Scott Tremaine (IAS) about our project to understand the phase-space spiral in the vertical kinematics of the disk in terms of metallicity, element abundances, stellar ages, and so on. Indeed, we have a general argument that any non-equilibrium perturbation of the Galaxy, winding up into a spiral, will show a metallicity (or other stellar-label) effect, provided that there were gradients in the metallicity (or other label) with respect to stellar density, or phase-space density, or orbital actions. The argument is exceedingly general; I want to write a paper with wide scope. Tremaine is careful with his conclusions; he wants to write a paper with narrow scope. We argued. The data (compiled and visualized very cleverly by Frankel) are beautiful.


halo mass assembly

On Fridays, Kate Storey-Fisher (NYU) organizes a small meeting to discuss her projects on dark-matter halos using equivariant scalar objects constructed from n-body simulation outputs. Today we included Yongseok Jo (Flatiron), who has worked on building tools to paint galaxies onto dark-matter-only n-body simulations. We discussed joint projects, and conceptual issues about mass-assembly histories. In particular, I am interested in how we can predict formation histories of dark-matter halos from the galaxy contents alone, or infer the dark matter distribution in phase space from the stellar distribution in phase space. I love these projects, because they combine growth of structure, gravitational dynamics, galaxy formation, and machine learning.


non-parametric model of the density of the Milky Way disk?

Danny Horta-Darrington (Flatiron) has been working with Adrian Price-Whelan (Flatiron) to measure things about abundances and dynamics of stars in the Milky Way disk. Horta is finding that there are way better abundance gradients, in way more directions in phase space, than have previously been (usefully) visualized. But along the way, he stumbled upon a plot that clearly shows the variation of the Milky Way thin disk density with radius. We discussed today how to make the simplest possible measurement of this, with a variation of Orbital Torus Imaging, or really a simplification of it. We realized today that there is enough data to just make this measurement in patches all over the (nearby) disk. The scale length looks short!


parameters and nuisance parameters

Long ago, Adrian Price-Whelan (Flatiron) and I and others built The Joker, which is a Monte Carlo method (but not an MCMC method) for dealing with the Kepler problem. It exploits the fact that some parameters are linear, and some are nonlinear. This week, Lawrence Peirson (Stanford) is visiting Flatiron to generalize this point. Peirson's point is that the trick we use for linear parameters can be used for any parameters that have smooth, unimodal-ish posteriors. We just have to add some linearization and some optimization. So we are working on writing that down. And coding it up.

Along the way, Peirson found another linear parameter in The Joker, so we can now make it way, way faster. That's awesome!
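To illustrate the linear-versus-nonlinear split, here is a toy sketch. It is not The Joker itself: it assumes a circular orbit, fake data, and a made-up log-uniform period prior, and it only finds a best fit rather than sampling a posterior. The period is the one nonlinear parameter and gets brute-force sampled; the offset and the two sinusoid amplitudes are linear and get solved exactly at each sampled period. All names here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(42)

# Fake radial-velocity data from a circular orbit (toy assumption).
true_period = 3.7
sigma = 0.1
t = np.sort(rng.uniform(0.0, 30.0, size=40))
true_phase = 2 * np.pi * t / true_period
v = (1.0 + 2.0 * np.cos(true_phase) + 0.5 * np.sin(true_phase)
     + sigma * rng.normal(size=t.size))

def fit_linear_at_period(period):
    """At a fixed period, solve for the linear parameters by least squares."""
    phase = 2 * np.pi * t / period
    design = np.stack([np.ones_like(t), np.cos(phase), np.sin(phase)], axis=1)
    coeffs, *_ = np.linalg.lstsq(design, v, rcond=None)
    resid = v - design @ coeffs
    return np.sum((resid / sigma) ** 2), coeffs

# Sample only the nonlinear parameter, from a log-uniform prior (an assumption).
periods = np.exp(rng.uniform(np.log(1.0), np.log(10.0), size=5000))
chi2 = np.array([fit_linear_at_period(p)[0] for p in periods])
best_period = periods[np.argmin(chi2)]
best_coeffs = fit_linear_at_period(best_period)[1]
```

The design choice is that the expensive random search happens in one dimension instead of four; every additional parameter that can be moved to the linear side shrinks the space that has to be sampled.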


dust and star formation

Julianne Dalcanton (Flatiron) gave a great talk at NYU today about star formation, interstellar medium, stellar ages, and dust in Local Group galaxies. She showed that the standard star-formation indicators from infrared emission from dust are way wrong. But she also showed lots of interesting detail in the interstellar medium and star-formation history in M33 and M31. M31 really does seem to have a ring which is not just over-dense in star formation; it's actually over-dense in stars. That's odd, and interesting.


Ising model and gauge

I understood cool things about gauge freedom today, during a beautiful blackboard talk by Himanshu Khanchandani (NYU), who was talking about the 2-d Ising model and how it relates to the continuum limit (which is a field theory, interestingly!). He showed that if you introduce certain kinds of linear defects into the lattice, the change to the Hamiltonian depends only on the locations of the endpoints of the line of linear defects. This is because there is a gauge freedom, which is that you can change the signs of the spin-spin interactions at a point, and also change the labeling of what constitutes the positive and negative local state. This leads to topological properties of defects. It's gorgeous! And maybe related to the problems we want to solve in machine learning with images and geometry.
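A small numerical check of the gauge freedom, as a sketch of my reading of it: on a periodic square lattice with Hamiltonian H = -sum over bonds of J_ij s_i s_j, relabeling up and down at one site while flipping the sign of the four couplings that touch that site leaves the energy unchanged. The lattice size, the array layout, and the function names are all hypothetical choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
L = 6  # lattice size, periodic boundaries

spins = rng.choice([-1, 1], size=(L, L))
# J_right[i, j] couples (i, j) to (i, j+1); J_down[i, j] couples (i, j) to (i+1, j).
J_right = np.ones((L, L))
J_down = np.ones((L, L))

def energy(spins, J_right, J_down):
    """Nearest-neighbor Ising energy with bond-dependent couplings."""
    right = np.roll(spins, -1, axis=1)  # right[i, j] = spins[i, j+1]
    down = np.roll(spins, -1, axis=0)   # down[i, j] = spins[i+1, j]
    return -np.sum(J_right * spins * right) - np.sum(J_down * spins * down)

def gauge_flip(spins, J_right, J_down, i, j):
    """Relabel up/down at site (i, j) and flip the sign of its four bonds."""
    s, Jr, Jd = spins.copy(), J_right.copy(), J_down.copy()
    s[i, j] *= -1
    Jr[i, j] *= -1            # bond to the right neighbor
    Jr[i, (j - 1) % L] *= -1  # bond from the left neighbor
    Jd[i, j] *= -1            # bond to the neighbor below
    Jd[(i - 1) % L, j] *= -1  # bond from the neighbor above
    return s, Jr, Jd

E_before = energy(spins, J_right, J_down)
E_after = energy(*gauge_flip(spins, J_right, J_down, 2, 3))
```

Each affected bond picks up two sign changes (one from the spin, one from the coupling), so its contribution to the energy is untouched.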



Today Andy Casey (Monash) joined a regular meeting I have with Megan Bedell (Flatiron) and Lily Zhao (Flatiron) about things related to precision spectroscopy. We discussed projects we can do with surface spectra of the Sun, one from the quiet part, and one from a spot. Casey is involved in the Korg project led by Adam Wheeler (OSU); we discussed fitting both spectra with Korg, and learning about the physical differences between the quiet and active regions in the Sun. We also discussed Zhao's projects to empirically correct for stellar activity in time-domain spectroscopy looking for planets.


JWST and open science

Today I hosted Sarah Kendrew (STScI) at NYU. She gave the Physics Colloquium, about the NASA JWST launch, commissioning, and early science. She has been the lead of a JWST instrument mode for something like 14 years; now she has data! She talked about how JWST works and showed some beautiful exoplanet results. One of the great things about her talk is that she explained a point on which they made some mistakes, and how interactions with the user community helped them to fix those mistakes. It was a great endorsement of the open model for science.


what is the scope of our image models?

There is a bit of a disagreement between Soledad Villar (JHU) and me on the scope of the methods that we are building to operate on images of scalars, vectors, and tensors. Villar's view is that they apply to physics problems, like fluids. Mine is that they apply to absolutely every image of every kind ever taken, like vacation snapshots. Today we had a meeting with Drummond Fielding (Flatiron) and Wilson Gregory (JHU) about making some training data from a small 2D fluids simulation. (That is, we were adopting, for today, Villar's position on our scope.) Apparently 2D fluids is a standard problem in machine learning these days? I can't imagine why. But anyway, on the call, Fielding promised to make us some toy data. And, tonight, he did. Awesome!


bi-linear-ish models

Before my day got ruined by a deadly SDSS-V Advisory Council meeting, I worked with Emily Griffith (Colorado) on a data-driven 2-process model for nucleosynthesis. This model is amplitudes (2 per star) times process yield vectors (2 per element). In this sense it is like a matrix factorization. But it involves a log-sum-exp (rather than just a matrix multiply), so it is mildly nonlinear. It can still be optimized the same way and it is well behaved. In some sense, I realized, it is very like The Cannon in form. But different! We failed to fully implement it before I turned into a (very unhappy) technocrat.
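A minimal sketch of what such a forward model could look like, assuming that both the amplitudes and the yields are parameterized by their natural logarithms and that the prediction is a log abundance per star and element. The function name, the argument names, and the array shapes are hypothetical; this is not the actual implementation.

```python
import numpy as np

def log_abundances(ln_amplitudes, ln_yields):
    """Log of sum_k A_ik q_kj, computed as a stable log-sum-exp over processes k.

    ln_amplitudes: shape (n_stars, 2), log amplitude of each process in each star.
    ln_yields: shape (2, n_elements), log yield of each process for each element.
    Returns an array of shape (n_stars, n_elements).
    """
    # terms[i, k, j] = ln A_ik + ln q_kj
    terms = ln_amplitudes[:, :, None] + ln_yields[None, :, :]
    # Subtract the per-(star, element) maximum before exponentiating, for stability.
    peak = np.max(terms, axis=1, keepdims=True)
    summed = np.sum(np.exp(terms - peak), axis=1, keepdims=True)
    return (peak + np.log(summed))[:, 0, :]

rng = np.random.default_rng(8)
ln_A = rng.normal(size=(6, 2))  # 6 fake stars
ln_q = rng.normal(size=(2, 5))  # 5 fake elements
model = log_abundances(ln_A, ln_q)
```

Exponentiating everything and doing an ordinary matrix multiply gives the same numbers; working in the log keeps the non-negativity automatic and the arithmetic stable.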


SDSS-V Science Festival, day 2

On day two of the Collaboration meeting, I talked to Emily Griffith (Colorado) about data-driven models of nucleosynthetic processes. We were inspired by this paper, on which Griffith is an author. The paper builds an empirical two-process enrichment model based on the observed morphology of the [Fe/Mg] vs [Mg/H] plane. We discussed how to make this model into a full (but constrained) latent-variable model. I am interested in moving it towards causal inference, but we could also look at third processes, anomalous stars, anomalous elements, calibration issues, and so on. We wrote down math and started to write code.


SDSS-V Science Festival, day 1

Today was the first day of the SDSS-V Collaboration Meeting in Toronto. We talked about the state of the survey and the survey mission, shared values, and operating principles. This was great; it is the first full in-person meeting since survey start. Much of the day was open working and break-out time.

Late in the day, Adam Wheeler (OSU) made a great plot comparing SDSS-IV velocities (Doppler shifts) to ESA Gaia velocities, as a function of APOGEE fiber. It looks like there are substantial differences, and that they are systematic with fiber. If this is real, fixing it will have a big impact on work I've done on spectroscopic binaries in the sample.



I had a great visit to UCLA Astronomy today. I learned a ton. I gave a messy, disorganized talk about machine learning. I learned, as I often do, that it is not a good idea to try to tweak an old talk into a new talk. The best move is to start from scratch and make new slides. It's fresher. And better! And more aligned with the new me. But I at least started some conversation (it is an engaged audience there). Giving talks is hard.