other kinds of sailing

In finishing up the first draft (yay!) of my paper on sailing, I thought about other kinds of sailing (for the discussion). One is solar sails: In principle, if a spacecraft has a sufficiently large solar sail, of which it can change the size and shape, the spacecraft can navigate in arbitrary directions and perform arbitrary three-axis attitude adjustments, by exploiting a combination of radiation pressure and gravity. It's really very flexible. It makes me want to design a spacecraft!

I also thought about ice boats. An ice boat is like a sailboat with an extremely large keel and almost no water drag. That is, it sails like a sailboat in the limit that the keel gets large but the water drag gets small. This should make ice boats extremely fast at upwind sailing. Maybe I'll try to find an opportunity to sail on ice this winter?



Right now, in cosmology, emulators are all the rage: Cosmological simulations are slow and expensive, and we need to speed them up! But I'm concerned: What if we use emulators in all our Euclid and LSST pipelines, and then we find something very surprising? What then? Do we check it all using full-blown simulations? But then, if so, why not do full-blown from the outset? Or do our standards of care depend on our results? They aren't supposed to, but maybe they do?

Worried about all this, I have been thinking about how to make emulators better, and give them provably better properties. It's why I've been working on building methods that exactly obey symmetries. But from an epistemological perspective, emulators might be the Worst Idea Ever™.

In detail, I'm working out how we might build emulators that are trained on one simulation suite but can be tested on another, made with a different cosmology and at a different mass resolution. I think work I've been doing with Villar (JHU) on the exact symmetries imposed by geometry and units might make this possible. I'm trying to figure out the scope of an achievable paper on this.


sailing upwind; lift ratios; sail-to-keel ratio

In writing up a description of some of my physics-of-sailing results, I realized some things about sailboat design. For deep reasons you want the ratio of the area of the sail to the area of the keel to be roughly the ratio of the density of water to the density of air, or 700. That flows from the point that the sail and the keel have symmetric roles in sailing. But if you set this ratio to 700, you can only sail upwind if the sail lift ratio (the ratio of useful sail force to drag force) is very high. Since it is hard to make this lift ratio high on a commercial boat, the alternative is to make the sail smaller (relative to the keel). Looking at the data I can find on real sailboats, most have okay lift ratios but small sail-to-keel ratios (smaller than 700 anyway), so that they can sail quickly upwind. The cost of these design choices is that you can't go downwind faster than the wind. If you want to be able to sail both downwind faster than the wind—and also upwind—you have to have amazing sail and keel lift ratios.
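To make the density-ratio argument concrete, here's a back-of-envelope sketch (the densities are standard sea-level values I'm assuming for illustration, not numbers taken from the paper):

```python
# Back-of-envelope check of the sail-to-keel area argument above.
# Densities are standard sea-level values (assumed for illustration).
rho_water = 1025.0  # kg / m^3, seawater
rho_air = 1.225     # kg / m^3, air at sea level

# If the sail and the keel play symmetric roles, their ram-pressure
# prefactors (density times effective area) should match, so:
area_ratio = rho_water / rho_air
print(f"sail area / keel area ~ {area_ratio:.0f}")
```

This comes out somewhat above 700; the exact number depends on which densities (fresh vs. salt water, air temperature) you adopt.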


steerable machine learning

Inspired by the hope of mashing up our paper on scalars and vectors in machine learning with work like this on steerable neural networks, I worked with Soledad Villar (JHU) today on writing down all possible linear convolutional kernels in a 3x3x3 block of a 3-d image that satisfy the geometric symmetries of scalar, vector, and second-order tensor forms. I feel like recent work on machine-learning in cosmology like this could be vastly improved by these geometric methods.
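Here's a toy sketch of the scalar (not vector or tensor) case, using the discrete 48-element symmetry group of the cube in place of full O(3); group-averaging any kernel projects it onto the invariant subspace. This is my illustration of the idea, not the construction in the paper:

```python
import itertools
import numpy as np

def cube_group_elements(k):
    """All 48 images of a 3x3x3 scalar kernel k under the symmetries of
    the cube: axis permutations composed with axis reflections."""
    out = []
    for perm in itertools.permutations(range(3)):
        kp = np.transpose(k, perm)
        for flips in itertools.product([False, True], repeat=3):
            kf = kp
            for ax, flip in enumerate(flips):
                if flip:
                    kf = np.flip(kf, axis=ax)
            out.append(kf)
    return out

rng = np.random.default_rng(17)
k = rng.normal(size=(3, 3, 3))

# Group-averaging (the Reynolds operator) projects any kernel onto the
# symmetric subspace.
k_inv = np.mean(cube_group_elements(k), axis=0)

# Check invariance: every group element maps k_inv back to itself.
for g in cube_group_elements(k_inv):
    assert np.allclose(g, k_inv)
```

A nice check one can do with this machinery: the invariant scalar kernels form a 4-dimensional space, one dimension per orbit of cells (center, face centers, edge centers, corners).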


modeling arid ecologies?

[This blog died for a few months. I apologize. I am back now, and hope to continue.]

Today I worked with Soledad Villar (JHU) on our project to execute regressions (and other machine-learning methods) that are constrained to be exactly symmetric (or equivariant) with respect to units and dimensions. Our target problem is a regression involving ecologies of arid regions. There are differential-equation models for this, and all the inputs to the equations have interesting units (such as water volume per area, and grams of vegetation, and so on). It was fun to do some real coding again, after days of grading final exams!


radial velocity with a gas cell

I had a great meeting today with Matt Daunt (NYU), Lily Zhao (Flatiron), and Megan Bedell (Flatiron), in which we described what we need to do to fit a data-driven model to extreme-precision radial-velocity spectra that are taken with a gas cell in the light path. The gas cell imprints lines from known atomic transitions and secures the wavelength calibration of the device. The question is how to use this in a data-driven model. We started by talking about line-spread functions and wavelength solutions and ended up asking some deep questions, like: Do you really need to know the wavelength solution in order to measure a change in radial velocity? I am not sure you do!


Dr Steven Mohammed

Today I had the great pleasure to serve on the PhD defense committee for Steven Mohammed (Columbia). Mohammed worked on the post-GALEX-mission GALEX data in the Milky Way plane. We took the data specially as part of the Caltech/Columbia takeover of the mission at the end of the NASA mission lifetime. Mohammed and my former student Dun Wang built a pipeline to reduce the data and produce catalogs. And Mohammed has done work on photometric metallicities and the abundances in the Milky Way disk using the data. It is a beautiful data set and a beautiful set of results.


does the Milky Way bar have distinct chemistry?

The answer to the post title question is NO, it doesn't. Eilers (MIT), Rix (MPIA), and I discussed Eilers's project on this subject today. If you use the APOGEE data naively, you can conclude that the bar has different abundances. But your conclusion is wrong, because the abundances have surface-gravity-related systematic issues, and the Galactic center is only visible in the most luminous (lowest surface-gravity) stars. So we have corrected that and have a paper to write. Even after correcting for this systematic, there are still interesting systematics, and we don't know what they are, yet.


making age estimates out of abundances

Today Trevor David (Flatiron) showed me some amazingly precise and detailed dependences of abundance ratios on stellar age. The idea is: Different stars formed in different moments in the chemical-enrichment history of the Milky Way, and so the abundance ratios give the stellar ages in detail. The abundances he has are from Brewer and Fischer, and the ages he has are from various sources, including especially isochrone ages. We discussed the following problem:

Given that you don't believe any age estimates in detail, and given that any abundance measurements and age estimates are noisy and biased, what is the best way to build usable abundance–age relationships that can be used as alternative clocks (alternative to isochrones, stellar rotation, asteroseismology, and C–N dredge-up, for examples)? We settled on a few ideas, most of which involve building a low-dimensional hypersurface in the space of abundances and age, and then fitting for adjustments or corrections to different age systems.


specification of a possible cosmology project in terms of spherical harmonics

Today Kate Storey-Fisher (NYU) and I asked ourselves: How can we do machine learning on cosmological simulations? If we want to use permutation-invariant methods, we need to use something like graph structure, and you can't currently execute graph neural networks at the scale of millions or billions of particles. So we need permutation-invariant scalars and vectors produced from point clouds. We discussed a set of options in terms of taking spherical-harmonic transforms in shells, and then combining those into scalars and vectors. There's a lot of fun geometry involved! Not sure if we have a plan, yet.
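One concrete version of the scalar part: the per-shell spherical-harmonic power spectrum is a rotation-invariant set of scalars. A toy sketch (my construction, for illustration; the actual project may combine amplitudes differently, and would need vectors too):

```python
import numpy as np
from scipy.special import sph_harm
from scipy.spatial.transform import Rotation

def shell_power_spectrum(xyz, lmax):
    """Rotation-invariant scalars from points on a shell: spherical-harmonic
    amplitudes a_lm combined into the power C_l = sum_m |a_lm|^2."""
    x, y, z = xyz.T
    r = np.sqrt(x ** 2 + y ** 2 + z ** 2)
    theta = np.arctan2(y, x)   # azimuth (scipy's "theta")
    phi = np.arccos(z / r)     # polar angle (scipy's "phi")
    C = []
    for ell in range(lmax + 1):
        a_lm = np.array([np.sum(np.conj(sph_harm(m, ell, theta, phi)))
                         for m in range(-ell, ell + 1)])
        C.append(np.sum(np.abs(a_lm) ** 2))
    return np.array(C)

rng = np.random.default_rng(42)
pts = rng.normal(size=(300, 3))
pts /= np.linalg.norm(pts, axis=1, keepdims=True)  # project onto unit shell

R = Rotation.random(random_state=1).as_matrix()
C1 = shell_power_spectrum(pts, lmax=4)
C2 = shell_power_spectrum(pts @ R.T, lmax=4)
assert np.allclose(C1, C2)  # the C_l are unchanged by a rotation
```

The invariance works because a rotation mixes the a_lm within each ell (a unitary Wigner matrix), leaving the sum of squared moduli alone.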


how to run community-building meetings

Rodrigo Luger (Flatiron) convened a lunch meeting today to discuss the big weekly community meetings (themed around stars and exoplanets) that we run here in New York City: How to manage them so we have a culture of informal scientific interaction, community building, learning, and constructive contribution to each other's research programs. We had a wide-ranging conversation and came up with some ideas for this new semester. We like to learn about (and contribute to) ideas, concepts, and methods. We are less interested in finished, polished talks or presentations. So we might move to a mode where you speak a bit about your project, but you aren't permitted to give your results! That might make it a meeting of scientific introductions and reviews? Worth a try. One of the big themes, of course: How to adapt to the hybrid world we live in now, where things are neither fully in-person nor fully remote.



Today was a teaching-and-life day. Not much research got done!


a conditional form of domain adaptation?

I have been discussing with Soledad Villar (JHU) projects related to domain adaptation (also with Thabo Samakhoana and Katherine Alsfelder), in which you have two different instruments (say) taking data and you want to find the transformation between the instruments such that the data are the same from the two sources. The modification we want (or need) to make is to create a conditional version of this: In SDSS-V, there are two observatories; they take very similar (but not identical) data. What are the transformations that make the data identical? The problem is: The two observatories also observe different stars on average (because they see different parts of the Galaxy). So we need to find the transformations that make the data identical, conditional on other data (like the ESA Gaia data) that we have for the stars. Great problem, and we came up with some non-elegant solutions. Are there also elegant solutions?


awakening old projects

Megan Bedell (Flatiron) and I discussed what it would take to resurrect and finish one of our unfinished papers today. Not much, we think! But it's noticeable that at the end of this pandemic I have at least six papers that are 75-percent done.


new cosmological tests with LIGO

Kate Storey-Fisher (NYU) and I had a wide-ranging conversation with Will Farr (Flatiron) about uses of LIGO sources for cosmological studies. We all have the intuition that there is lots of space in the literature for new approaches, but we can't quite figure out where to position ourselves. I like the idea of going fully frequentist, because most of the competition is purely Bayesian. The ideas I like best so far involve something like correlations between LIGO error boxes and large-scale structure surveys.


orbital torus imaging: next steps

Price-Whelan (Flatiron), Rix (MPIA), and I met this morning to discuss the next projects we might do with our orbital torus imaging method for measuring the force law in the Galaxy. The method has many information-theoretic advantages over classical methods like Jeans modeling, but it has one very big disadvantage: We need to be able to compute actions and angles in any of the gravitational potentials we consider, and that means integrable, and that means restricted potential families. We'd like to be doing something way more non-parametric. Rix is proposing (and we are accepting) that we do OTI in a restricted family, but then dice the data by position in the Milky Way, and see if the results are interpretable in terms of a more general mass distribution.


geometry of density derivatives

Gaby Contardo (Flatiron) and I discussed our project on voids and gaps in data. We have to write! That means that we have to put into words things we have been doing in code. One of our realizations is that there is a very useful statistic, which is the largest eigenvalue of the second derivative tensor of the density (estimate), projected onto the subspace orthogonal to the gradient (the first derivative). It's obvious that this is a good idea. But it isn't obvious why or how to explain to others why this is a good idea. Contardo assigned me this as homework this week.
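Here is how I'd sketch the statistic numerically, with finite differences on a KDE standing in for the real density estimate (toy data and bandwidths, not our actual code):

```python
import numpy as np
from scipy.stats import gaussian_kde

def gap_statistic(kde, x, h=0.1):
    """Largest eigenvalue of the Hessian of the density estimate,
    restricted to the subspace orthogonal to the density gradient."""
    d = len(x)
    f0 = kde(x)[0]
    grad = np.zeros(d)
    H = np.zeros((d, d))
    I = np.eye(d)
    for i in range(d):
        fp, fm = kde(x + h * I[i])[0], kde(x - h * I[i])[0]
        grad[i] = (fp - fm) / (2 * h)
        H[i, i] = (fp - 2 * f0 + fm) / h ** 2
        for j in range(i + 1, d):
            H[i, j] = H[j, i] = (kde(x + h * (I[i] + I[j]))[0]
                                 - kde(x + h * (I[i] - I[j]))[0]
                                 - kde(x - h * (I[i] - I[j]))[0]
                                 + kde(x - h * (I[i] + I[j]))[0]) / (4 * h ** 2)
    g = grad / np.linalg.norm(grad)
    # orthonormal basis for the subspace orthogonal to the gradient
    B = np.linalg.svd(np.eye(d) - np.outer(g, g))[0][:, :d - 1]
    return np.linalg.eigvalsh(B.T @ H @ B)[-1]

# Toy data: two blobs with a gap between them.
rng = np.random.default_rng(0)
pts = np.vstack([rng.normal([-2, 0], 1.0, size=(1000, 2)),
                 rng.normal([+2, 0], 1.0, size=(1000, 2))])
kde = gaussian_kde(pts.T)
assert gap_statistic(kde, np.array([0.0, 0.3])) > 0   # in the gap
assert gap_statistic(kde, np.array([2.0, 0.3])) < 0   # on a blob
```

The sign logic, in words: in a gap, the density curves upward in the directions along the valley floor (orthogonal to the downhill gradient), so the projected largest eigenvalue is positive there, and negative on a density peak.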


what is gauge symmetry?

After we posted this paper on gauge-equivariant methods for machine learning, we got referee comments that maybe what we are doing isn't gauge. So we spent a lot of time working on gauge! What we are doing can be gauge, but there are additional things to add to make that clear. We are clarifying, but maybe for a next contribution in which we really do full gauge invariance, and things like parallel transport.


finishing the response to referee and adjusting the paper

As is usual with Publications of the Astronomical Society of the Pacific (great journal!), Soledad Villar and I got a constructive and useful referee report on our fitting paper. We finished our comments and adjustments to the paper today. The referee made an excellent point, which is: Since there are fast Gaussian process codes out there, why ever do interpolation or flexible fitting any other way? Good question! We answered it in the new revision (because sometimes fast GPs don't exist, sometimes you don't want a stationary process, and sometimes you are in a weird geometry or space), which we will update on arXiv soon.


what are the assumptions underlying EPRV?

I reopened an old paper started (many years ago now) by Megan Bedell (Flatiron) and me, about the precision possible in extreme precision radial-velocity spectroscopy. Most of the results in the literature on how precisely you can measure a radial velocity (the information-theoretic question) depend on a very large number of assumptions, which we try to enumerate. The thing we'd like to do with this paper (and frankly, it will take many more after it) is to weaken or break those assumptions and see what gives. I have an intuition that if we understand all of that information theory, it will help us with observation planning and data analysis.


preparing presentations

Adrian Price-Whelan (Flatiron), Emily Cunningham (Flatiron), and I met with Micah Oeur (Merced) and Juan Guerra (Yale) to go through their ideas about how to present their results at the wrap-up of the summer school on dynamics next week. We discussed lots of good ideas about how to do short presentations, how to react to other presentations in the same session, how to read the audience, and so on. Someone should write a book about this!


every Keck/DEIMOS star, ever!

I met briefly this morning with Marla Geha (Yale). She is completing an impressive project, in which she has re-reduced, from raw data, every (or nearly every) Keck/DEIMOS spectrum of a star in the Milky Way or Local Group. These include ultra-faint dwarfs, classical dwarfs, globular clusters, halo, disk, and so on. It is an absolute goldmine of science. We spent time talking about the technicals, since she has done a lot of creative and statistically righteous things in this project (which is built on the exciting new open-source project PypeIt). But we also dreamed about a lot of science that we could be doing with these data. It will be of order 10^5 stars.


predicting wavelength calibration from housekeeping data

I did some of my favorite thing today, which is fitting flexible models. The context was my attempt to predict the wavelength solution for the SDSS-IV BOSS spectrographs using only the housekeeping data, like the state of the telescope, temperatures, and so on. It doesn't work accurately enough, or at least not with the housekeeping data I've tried so far. It looks like there might be a hysteresis or a clank or something like that. If this is right, it bodes poorly for reducing the number of arcs we need to take in SDSS-V, which is supposed to move fast and not break things.

But all that said, I still have one card left to play, which is to see if we can look at sky lines in science frames and learn enough from sky lines—plus the historical behavior of all arcs ever taken—such that we can lock everything else down without taking an arc for every visit.


scalings for different methods of inference

Excellent long check-in meeting with Micah Oeur (Merced) and Juan Guerra (Yale) about their summer-school projects with me and Adrian Price-Whelan (Flatiron). The projects are all performing inferences on toy data sets, where we have fake observations of a very simple dynamical system and we try to infer the parameters of that dynamical system. We are using the virial theorem, Jeans modeling, Schwarzschild modeling, full forward modeling of the kinematics, and orbital torus imaging. We have results for many of these methods already (go team!) and more to come. Today we discussed the problem of measuring scalings of the inferences (the uncertainties, say) as a function of the number of observations and the quality of the data. Do they all scale the same? We also want to check sensitivity to selection effects, and wrongnesses of the assumptions.


Dr Dou Liu

Today I had the pleasure of sitting on the PhD defense of Dou Liu (NYU), who has been working on AGN in the centers of galaxies, using MANGA data from SDSS-IV. The part of Liu's thesis that is most exciting to me (perhaps not surprisingly) is the technical chapter, in which he finds a new method for combining irregularly dithered integral-field-unit spectroscopy exposures into a full data cube, with sky coordinates on two axes, and wavelength on the third. In this final data cube, his method gets much better final resolution, and lower pixel-to-pixel covariances in the noise, relative to the standard pipelines. His trick? He has generalized spectro-perfectionism (invented for spectral extraction) to the multi-dimensional spectral domain. It's beautiful stuff, and has implications for all sorts of imaging and spectroscopy projects going forward. Congratulations Dr. Liu and thank you!


my own special FMM algorithm

I was in a location with no internet and no computing, so I spent an hour or so writing down what I think is the fast multipole method. It involves building an octree, balanced in volume (not necessarily point content), and computing recursively the multipole amplitudes in all nodes (starting with the points in the leaves). Once that is done, at evaluation time, you do different things at different levels, depending on the radius out to which things are computed exactly. One thing I'm interested in is: Can you simplify that evaluation if you, say, build multiple trees?
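The monopole-only version of this (closer to Barnes–Hut than to the full FMM, since it has no local expansions or translation operators) fits in a few lines and shows the tree structure; a sketch, with illustrative parameters:

```python
import numpy as np

class Node:
    """Octree node, balanced in volume; stores a monopole (mass, center of mass)."""
    def __init__(self, center, half, pts, masses, leaf_size=8):
        self.center, self.half = center, half
        self.mass = masses.sum()
        self.com = (pts * masses[:, None]).sum(axis=0) / self.mass
        self.pts, self.masses, self.children = pts, masses, []
        if len(pts) > leaf_size:
            for octant in range(8):
                sign = np.array([(octant >> k) & 1 for k in range(3)]) * 2 - 1
                c = center + sign * half / 2
                inside = np.all((pts >= c - half / 2) & (pts < c + half / 2), axis=1)
                if inside.any():
                    self.children.append(Node(c, half / 2, pts[inside],
                                              masses[inside], leaf_size))

def potential(node, x, theta=0.2):
    """1/r potential at x: direct sum in leaves, monopole for well-separated nodes."""
    if not node.children:
        r = np.linalg.norm(node.pts - x, axis=1)
        return np.sum(node.masses / r)
    d = np.linalg.norm(x - node.com)
    if 2 * node.half / d < theta:  # node is far away: use its monopole
        return node.mass / d
    return sum(potential(child, x, theta) for child in node.children)

rng = np.random.default_rng(3)
pts = rng.uniform(-1, 1, size=(2000, 3))
masses = rng.uniform(0.5, 1.5, size=2000)
root = Node(np.zeros(3), 1.0, pts, masses)

x = np.array([2.0, 0.0, 0.0])  # evaluation point outside the cloud
exact = np.sum(masses / np.linalg.norm(pts - x, axis=1))
approx = potential(root, x)
assert abs(approx - exact) / exact < 0.02
```

The real FMM adds higher multipole orders plus node-to-node translation operators, which is what makes it O(N) rather than O(N log N).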


exoplanet atmospheres projects

I had a nice conversation today with Laura Kreidberg (MPIA) and Jason Dittmann (MPIA) about projects in ultra-precise spectroscopy for exoplanet-atmosphere science. I was pushing for projects in which we get closer to the metal (moving spectroscopy to two dimensions), and in which we use theoretical ideas to improve extraction of very weak spectral signals. Kreidberg was pushing for projects in which we use planet kinematics to separate very weak planet signals from star signals. On the latter, I recommended starting with co-ads of residuals.


units symmetry and dimensionless scalar quantities

I got in some quality time with Bernhard Schölkopf (MPI-IS) this weekend, in which we discussed many things related to contemporary machine learning and its overlap with physics. In particular, we discussed the point that the nonlinearities in machine-learning methods (like deep learning) have to be interpretable as nonlinear functions of dimensionless scalars, if the methods are going to be like laws of physics. That is a strong constraint on what can be put into nonlinear functions. We understood part of that in our recent paper, but not all of it. In particular, thinking about units or dimensions in machine learning might be extremely valuable.


band-limited image models

I had my quasi-weekly call with Dustin Lang (Perimeter) today, in which we discussed how to represent the differences between images, or represent models of images. I am obsessing about how to model distortions of spectrograph calibration images (like arcs and flats), and Lang and I endlessly discuss difference imaging to discover time-variable phenomena and moving objects. One thing Lang has observed is that sometimes a parametric model of a reference image makes a better reference image for image differencing than the reference image itself. Presumably this is because the model de-noises the data? But if so, could we just use a band-limited non-parametric model say? We discussed many things related to all that.


Gaia BP-RP spectral fitting

Hans-Walter Rix and I discussed with Rene Andrae (MPIA) and Morgan Fouesneau (MPIA) how they are fitting the (not yet public) BP-RP spectra from the ESA Gaia mission. The rules are that they can only use certain kinds of models and they can only use Gaia data, nothing external to Gaia. It's a real challenge! We discussed methods and ideas for verifying and testing results.


more MPIA dust mapping

At Milky Way Group Meeting at MPIA, Thavisha Dharmawareda (MPIA) showed her first results on building pieces of a three-dimensional dust map from observed extinctions/attenuations to stars. As usual, the problem is to infer a three-dimensional map, preferably non-parametrically, from measurements of line-of-sight integrals through the map. She uses a Gaussian process, variational inference, and inducing points. She has some nice features in her maps (she started with star-formation regions with interesting morphologies). She sees extinction-law variations too; we discussed how those might be incorporated.


latent-variable model for black-hole masses

This morning Christina Eilers showed me that a Gaussian-process latent-variable model seems to be able to predict quasar spectra and black-hole masses, such that she can perform inferences to learn the black-hole masses just from the bolometric luminosities and the spectral shapes redward of Lyman alpha.


representing spectrograph calibration images

In a spectrograph, many of the raw calibration images are taken with something like arcs or flat-field lamps illuminating the slits or fibers, so that the spectral traces are lit up with calibration light. If the spectrograph flexes or changes, these traces move in two dimensions on the spectrograph image plane, or relative to the detector pixels. I am looking at whether we could have a generative model for these calibration images, so I spent the weekend toying with ways to represent the relevant kinds of small distortions of images. If the distortions are extremely small (as they are, say, for EXPRES) they can be represented with just a compact linear basis. But if they are a few pixels (as they appear to be for BOSS), I need a better representation.
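In the extremely-small-distortion regime, the compact linear basis can literally be the image plus its two spatial derivatives (first-order Taylor expansion of a shift). A toy check on a fake calibration frame (all parameters invented for illustration):

```python
import numpy as np
from scipy.ndimage import gaussian_filter, shift

# A smooth fake "calibration image": a grid of Gaussian-blurred spots.
img = np.zeros((64, 64))
img[8::16, 8::16] = 1.0
img = gaussian_filter(img, 2.0)

# First-order Taylor basis for tiny shifts: I(x - dx) ~ I - dx . grad(I)
dy_img, dx_img = np.gradient(img)
B = np.stack([img, dy_img, dx_img], axis=-1).reshape(-1, 3)

shifted = shift(img, (0.05, -0.03))  # a small sub-pixel distortion
coeffs, *_ = np.linalg.lstsq(B, shifted.ravel(), rcond=None)
resid = shifted.ravel() - B @ coeffs

before = np.linalg.norm(shifted - img)
after = np.linalg.norm(resid)
assert after < 0.1 * before  # the linear basis captures the small shift
```

Once the shifts reach a pixel or more, the second-order terms dominate and this basis stops working, which is consistent with needing something better for BOSS.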


talk on instrument calibration

I spent most of the day preparing this talk on instrument calibration, which I gave in the Königstuhl Colloquium this afternoon. I got excellent questions and I actually enjoyed it, despite the zoom/hybrid format, which doesn't always work well.


maps of Hessian eigenvalues for gap-finding

As my loyal reader knows, Gaby Contardo (Flatiron) and I have been looking for gaps (valleys, voids) in point clouds using geometric methods on density estimates. Today she just did the very simplest thing of estimating the largest eigenvalue of the second-derivative tensor (Hessian of density with respect to position), and visualizing it for different density estimates (different bandwidths) and different bootstrap resamplings of the data. It is obvious, looking at these plots, that we can combine these maps into good gap-finders! This is simpler than our previous approaches, and will generalize better to higher dimensions. It's also slow, but we don't see anything that can be fast, especially in “high” dimensions (high like 3 or 4!!).


a sandbox for Milky Way dynamics

Today I met with Micah Oeur (Merced) and Juan Guerra (Yale) to discuss the project they are doing as part of the Flatiron summer school on stellar dynamics. Their project is to make a sandbox for testing different methods for inferring force laws (or gravitational potentials or mass distributions) from stellar kinematic data. We are starting by building very simple potentials (like literally simple one-d potentials like the simple harmonic oscillator) and very simple distribution functions (like isothermal) and seeing how the different methods (virial, Jeans, Schwarzschild, and torus imaging) work. The medium-term goal is to figure out how these methods are sensitive to their assumptions, and robust to data that violate those assumptions. Also some information theory!


how much calibration data does a spectrograph need?

There was an SDSS-V operational telecon today, in which we discussed the plans for the first year of data, and how those plans should depend on, or be conditional on, what we learn in the commissioning phase. One of the most important things about SDSS-V is that it is robotic and fiber-fed, so we can move fast and do time domain things. But how fast? This depends, in turn, on how much calibration data we need as a function of time. I proposed that we could ask how well we can synthesize calibration data in one telescope configuration at one time, given calibration data from other telescope configurations at other times. This is not unlike the approach we took to calibrating EXPRES. So of course it was put back to me: Design the experiments that will answer this question! The main issue is that the BOSS spectrographs hang off the back of the pointing, tracking telescope.


optimal sizes of sails?

I spent some time in-between vacationing to look at the relative sizes of sails and keels on sailboats (yes, sailboats). I find that a boat sailing cross-wind sails fastest when the ram-pressure force prefactors (effective area times density) of the sail and the keel are comparable. That is, you want the effective area of the sail to be something like 800 times larger than the effective area of the keel! Strange, but maybe not false for the fastest competition sailboats?


nothing really

Just some writing in my physics-of-sailing project, and getting ready for a short vacation (gasp!).


PCA and convex optimization

Soledad Villar (JHU) and I talked about bi-linear problems today, in the context of instrument calibration and computer vision. We looked at the kinds of optimizations that are involved in these problems. She showed me that, if you think of PCA as delivering a projection operator (that is, not the specific eigenvectors and amplitudes, but just the projection operator you would construct from those eigenvectors), that projection operator can be derived from a convex optimization in which the objective is purely mean squared error. That was news to me!
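My gloss on why this works (my reconstruction of the argument, via the Ky Fan / Fantope view): the objective tr(P Σ) is linear in P, and over the convex set {0 ⪯ P ⪯ I, tr P = k} its maximum sits at an extreme point, which is exactly the rank-k PCA projector. A numerical check:

```python
import numpy as np

rng = np.random.default_rng(11)
X = rng.normal(size=(500, 8)) @ rng.normal(size=(8, 8))  # correlated data
X -= X.mean(axis=0)
S = X.T @ X / len(X)  # sample covariance

k = 3
w, V = np.linalg.eigh(S)      # eigenvalues in ascending order
P = V[:, -k:] @ V[:, -k:].T   # top-k PCA projection operator

# tr(P S) is linear in P; over the Fantope {0 <= P <= I, tr P = k} its
# maximum is the sum of the top-k eigenvalues (Ky Fan), attained at P.
assert np.isclose(np.trace(P @ S), w[-k:].sum())

# Equivalently, P minimizes mean squared reconstruction error among
# rank-k orthogonal projections: compare against random competitors.
mse = lambda Q: np.mean(np.sum((X - X @ Q) ** 2, axis=1))
for _ in range(20):
    Q_basis, _ = np.linalg.qr(rng.normal(size=(8, k)))
    Q = Q_basis @ Q_basis.T   # a random rank-k projection
    assert mse(P) <= mse(Q) + 1e-9
```

The key is that the optimization is over the projection operator, not over the individual eigenvectors (which are not identified and make the problem non-convex).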


physics-of-sailing literature

I sucked it up and read a bunch of the physics-of-sailing literature today (and on the weekend). Some of the books very correctly attribute the forces on sails and wings to momentum transport. Some of the books very incorrectly attribute them to differences of pressure calculable from Bernoulli effect alone. But in reading it all, I did come to the conclusion that no-one is working in precisely the space we want to work, so I do think there is a (correctly scoped) paper to write. Of course even if there weren't, I couldn't stop myself!


statistics translation project?

I had a wide-ranging conversation today with former NYU undergraduate Hilary Gao. One thing we discussed is the idea that physicists (and biologists, chemists, and so on) know a lot about statistics and evidence, and yet often find it hard to understand social-science research (like the epidemiology around Coronavirus and the data around race and policing, for two contemporary examples). This is (in my opinion) partly because the social-science literature involves aspects of causal inference that are fundamental, but don't appear in the same form in the natural sciences. We discussed what it would take to usefully write about or intervene into this quasi-translation project.


the geometry of gaps in point clouds

Gaby Contardo (Flatiron) and I had our weekly today on our geometric data-analysis methods for finding and characterizing gaps or valleys in point clouds. We are starting with 2D data, which is the simplest case, and it is interesting and we have nice results. But scaling up to more dimensions is hard. For one, there is the curse of dimensionality, such that anything that relies on, or approximates, density estimation gets hard fast. And for another, the kinds of geometric structures or options for gaps blows up combinatorially (or faster than linearly anyway) with the number of dimensions. Do we have to enumerate the possibilities and track them all? Or are there more clever things? We don't yet have answers, even for 3D, let alone 6D!


extracting spectra using models or filters

I had an interesting conversation with Matt Nixon (Cambridge) today, about new ways one might extract spectra of stars if the true goal is exoplanet atmosphere transmission spectra. Right now most methods bin down the data until there is enough signal-to-noise in every bin/super-pixel to extract a difference spectrum in every pixel. But there are other ways to bin down, which could be guided by the theoretical spectral models. In the most extreme case, you would do the atmospheric retrieval (as they call it in the business) in the un-extracted 2D image data from the spectrograph! We discussed what pieces of technology would be needed to make this happen. I love this idea, because it is at the intersection of sophisticated hardware (these spectrographs are amazing) and sophisticated theory (the exoplanet atmospheres codes are non-trivial).


the baryon-induced displacement field

Today Kate Storey-Fisher (NYU) showed me very nice visualizations of two matched simulations, one dark matter only, and one dark matter plus baryons. The simulations are matched in the sense that they have identical initial conditions, the only difference is that the latter simulation has baryon physics, such as cooling, star formation (approximately), AGN feedback (approximately), and so on. The simulations are from the IllustrisTNG project.

The thing that is interesting to us is whether we can model the differences between the simulations, and in particular whether building such a model will lead to insights about the fundamental physical mechanisms that lead to the differences. Large-scale gravity, after all, doesn't care about the small-scale composition or state of the matter; it only cares about the mass, so why are these simulations different at all? Of course there are lots of things about baryon physics that move matter, so it isn't a paradox, it's just interestingly non-trivial.

Now the question is: If we throw some gauge-invariant machine learning at this problem, will it lead to new insights about physical cosmology? That would be a real win.


SDSS-V MWM target selection

Today Jennifer Johnson (OSU) crashed the weekly Manhattan-area SDSS-V discussion meeting to give us the current state of Milky Way Mapper (a component of SDSS-V) target selection. It was a great discussion, because there are many, many target categories, and many of them are interesting to Manhattan-area locals. For me the most impressive thing about the meeting was that Johnson could answer almost any question from anyone on any of the literally dozens of target categories! It was a tour de force as they say. And we learned a lot. One of my goals with this meeting (which was started and is operated by Katie Breivik, Flatiron) is to increase excitement in Manhattan for SDSS-V and Johnson did that admirably.


review of methods for EPRV

Lily Zhao (Yale) walked Megan Bedell (Flatiron) and me through her summary or review of all the methods that have been submitted to her stellar activity challenge for extreme precision radial velocity spectroscopy. She has made the (very good) decision to organize the methods by the assumptions they make rather than the tools they use. For instance, several use principal components analysis, and several others use Gaussian processes. But if they use them in different places, they are effectively making different assumptions. But of course it isn't easy to take someone else's method and decide what assumptions it is making! So this review of all methods—which started out as just a small, necessary part of her paper about the challenge—is in the end one of the big intellectual achievements of this project. I'm excited about it. I'm rarely this excited about a paper that will have dozens and dozens of authors (although I guess I would make exceptions for this paper and this paper)!


k near n?

Teresa Huang (JHU) has a nice paper (with Villar and me) that works out the risk and the implicit regularization of linear regression involving PCA. We discussed it more today, in particular whether we can say more about the regime in which the PCA dimensionality reduction (to k dimensions) doesn't do much (because k is close to the number of data points n). We think we can, because the Marchenko-Pastur distribution of eigenvalues is so skewed: Cutting off even one small eigenvalue (k = n - 1) can be useful!
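
To see how skewed that eigenvalue distribution is, here is a minimal numpy sketch (with made-up dimensions, not the settings of the Huang et al. paper): draw a square random design matrix and look at the spectrum of its Gram matrix. The condition number shows why truncating even the single smallest eigenvalue can help.

```python
import numpy as np

rng = np.random.default_rng(0)
n = k = 100  # square case (k = n), where the spectrum is most skewed
X = rng.standard_normal((n, k)) / np.sqrt(n)

# eigenvalues of the Gram matrix follow (approximately) Marchenko-Pastur
evals = np.sort(np.linalg.eigvalsh(X.T @ X))
print(evals[0], np.median(evals), evals[-1])
print("condition number:", evals[-1] / evals[0])
```

The smallest eigenvalues sit extremely close to zero in the square case, so the variance amplification (one over the eigenvalue) of the last retained direction is enormous.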


issues with SDSS-IV APOGEE data

I'm proud of my new undergraduate researcher Katherine Alsfelder (NYU). She has been working with me to understand the differences between the North (Apache Point Observatory) and South (Las Campanas Observatory) spectrographs used in the APOGEE-2 survey in SDSS-IV. We chose one fiber in the North spectrograph to compare with one fiber in the South spectrograph (just to get started). She noticed some discrepancies in the data model: Some of the data have inconsistent telescope IDs in different files. We sent an example we couldn't figure out to Mike Blanton (NYU) and he pointed out that a star at a Dec of -70 deg can't be observed with the Apache Point Observatory 2.5m! Haha well I guess that one was easy. But anyway, I'm happy because it shows the value of being careful in the reading and vetting of data, especially housekeeping data. And we developed a work-around.


how to simulate a spectrum

I had a great conversation today with Matt Daunt (NYU), building on yesterday's discussion that also included Megan Bedell (Flatiron), about how to simulate data from an extreme-precision radial-velocity spectrograph. We decided to simulate the star, the atmosphere, and the (gasp!) gas cell all at very high resolution, then combine them physically, then reduce resolution to the spectrograph resolution (which is nonetheless very high) and then sample and noisify the resulting data. The idea is: Make the structure of the code like the structure of our physical beliefs, or causal beliefs. We decided to fork this data simulation into its own project.
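
Here is a minimal sketch of that causal structure (all numbers, line positions, and the Gaussian line-spread function are hypothetical placeholders, not our actual simulation): each component is generated at high resolution, the components are multiplied as transmissions, the product is convolved down to the spectrograph resolution, and only then is the result sampled and noisified.

```python
import numpy as np

def gaussian_lsf_convolve(flux, dx, resolution, lam0):
    # Convolve with a Gaussian line-spread function whose width is set
    # by the spectrograph resolution R = lambda / delta-lambda (FWHM).
    sigma = lam0 / resolution / 2.355  # FWHM -> sigma
    x = np.arange(-5 * sigma, 5 * sigma + dx, dx)
    kernel = np.exp(-0.5 * (x / sigma) ** 2)
    kernel /= kernel.sum()
    return np.convolve(flux, kernel, mode="same")

# high-resolution wavelength grid (hypothetical numbers, Angstroms)
dx = 0.001
lam = np.arange(5000.0, 5010.0, dx)

# simulate each physical component at high resolution (toy lines)
star = 1.0 - 0.8 * np.exp(-0.5 * ((lam - 5005.0) / 0.05) ** 2)
telluric = 1.0 - 0.3 * np.exp(-0.5 * ((lam - 5003.0) / 0.03) ** 2)
gascell = 1.0 - 0.5 * np.exp(-0.5 * ((lam - 5007.0) / 0.02) ** 2)

# combine physically (transmissions multiply), then degrade resolution
combined = star * telluric * gascell
observed = gaussian_lsf_convolve(combined, dx, 100000.0, 5005.0)

# sample onto detector pixels and add noise
pix = observed[::50]
rng = np.random.default_rng(17)
data = pix + 0.01 * rng.standard_normal(pix.size)
```

The point is that each stage of the code corresponds to one causal stage of the physical data-generating process.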


SDSS-V meeting

Katie Breivik (Flatiron) has started a meeting in New York for those interested in SDSS-V data and science. This has been fun; I have learned about a lot of different projects that I didn't know about. In today's meeting, Adrian Price-Whelan (Flatiron) showed some plots of the distribution of different abundances in the Milky Way disk, showing that we can probably see the Galaxy mid-plane way better in the abundances than in the kinematics. And kinematic evolution aligns the abundances with the kinematics! Nice result there. We vowed to have an expert come and walk us through SDSS-V target selection soon, since we were all soft on what, exactly, we would target!


more sailing

I spent the weekend in an undisclosed location working on my ram-pressure model for a sailboat. I realized that there are multiple models, even if you decide that it will be ram pressure! I coded up multiple models, and also worked on writing text. I made figures like this one!


talking about equivariant functions

I gave one of the internal/informal CCA seminars today. I spoke about our recent work on equivariant functions. I gave a pretty non-mathematical description of it, concentrating on things like Einstein summation notation and the symmetries of physical law, and the like. Afterwards, Ken Van Tilburg (NYU) commented that our result is so very simple that it must be known. I agree! But we couldn't find it in the literature anywhere clearly.


coarse-graining a point cloud with a kd-tree?

As my loyal reader knows, I am interested in the fast multipole method (FMM) and whether it could be used to improve or speed up machine-learning methods on graphs or spatial point clouds. Over the last months, I have learned about lots of limitations of FMMs, some of which we discuss here. I'm still interested! But when I last spoke with Leslie Greengard (Flatiron), he indicated that if you want to scale FMMs up to very clustered data in high dimensions, you may have to think about truly adaptive trees (not the fixed tree of an FMM), perhaps kd-trees. Today Soledad Villar (JHU) and I discussed this idea. The question is: What could be proved about such an approach, or are there versions of it that come with accuracy guarantees? The FMM has the beautiful property that you can compute the precision of your approximation, and dial up the order to get better precision.
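
As a toy version of the idea (my own illustration, not anything Greengard or Villar proposed in detail), here is a sketch of coarse-graining a point cloud with median kd-splits, summarizing each leaf by its center of mass and total mass. That is the monopole-only analog of an FMM expansion, but on a fully adaptive tree.

```python
import numpy as np

def coarse_grain(points, masses, leaf_size=8):
    """Adaptively coarse-grain a point cloud with median kd-splits.

    Returns one (center-of-mass, total-mass) pair per leaf; this is
    the monopole-only, fully adaptive analog of an FMM tree node.
    """
    if len(points) <= leaf_size:
        m = masses.sum()
        return [(points.T @ masses / m, m)]
    # split along the axis of largest extent, at the median point
    axis = np.argmax(points.max(axis=0) - points.min(axis=0))
    order = np.argsort(points[:, axis])
    half = len(points) // 2
    lo, hi = order[:half], order[half:]
    return (coarse_grain(points[lo], masses[lo], leaf_size)
            + coarse_grain(points[hi], masses[hi], leaf_size))

rng = np.random.default_rng(4)
pts = rng.standard_normal((1000, 3))
ms = rng.uniform(0.5, 1.5, 1000)
nodes = coarse_grain(pts, ms)
# total mass is exactly conserved by construction
print(len(nodes))
```

The hard part, of course, is exactly what we discussed: what accuracy guarantees (if any) can be proved for downstream computations on such a summary.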


abundance calibration and abundance gradients

Today Christina Eilers (MIT) updated Hans-Walter Rix (MPIA) and me on our project to self-calibrate the element-abundance measurements in APOGEE. We are looking at self-consistency of the abundance distribution as a function of actions; in a well-mixed Galaxy this could be used to calibrate the biases of the abundance measurements with surface gravity (a known effect in the data) and spectral resolution (a possible effect). Eilers has beautiful results: The abundances get better and the abundance gradients in the Galaxy (with radius or azimuthal action, and with vertical height or vertical action) become more clear and more sensible. So we have a paper to write!


machine-learning group meeting

Today Soledad Villar (JHU), Kate Storey-Fisher (NYU), Weichi Yao (NYU), and I crashed the machine-learning group meeting hosted by Shirley Ho (Flatiron) and Gaby Contardo (Flatiron). Villar presented our new paper on gauge-invariant functions and we started the conversation about what to do with it. We vowed to come back to the meeting to discuss that: What are the best applications of machine learning in cosmology and astrophysics right now?


a model for sailing (yes, sailing)

I've had a lifetime of conversations with Hans-Walter Rix (MPIA) about the point that you could in principle sail with a sailboat with flat sails: Nothing about the curvature of the sails is integral or required by sailing. The curvature helps, but isn't necessary. I have had another lifetime of conversations with Matt Kleban (NYU) about the point that sailing depends on the relative velocity between the air and the water, and this leads to some hilarious physics problems involving sailing on rivers in zero wind (it's possible because a flowing river is moving relative to the dead air).

These worlds collided this weekend because—inspired by a twitter conversation—I finally built a proper ram-pressure model of a flat-sail, flat-keel sailboat and got it all working. It's sweet! It sails beautifully. Much more to say, but the question is: Is there a paper to write?


counting repeat spectra in APOGEE

I worked today with Katherine Alsfelder (NYU) to develop statistics on APOGEE spectra: There are two spectrographs (one in the North and one in the South) and there are 300 fibers per spectrograph. How many stars have been observed in each of the 600 different options, and how many of the 600-choose-2 pairs of options have seen the same star? This is all in preparation for empirical cross-calibration of the spectrographs. There is a lot of data! But 600-choose-2 is a huge number.
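
The combinatorics are simple but sobering; a couple of lines of Python make the point:

```python
from math import comb

n_fibers = 2 * 300           # two spectrographs, 300 fibers each
n_pairs = comb(n_fibers, 2)  # fiber pairs that could share a star
print(n_fibers, n_pairs)     # 600 options, 179700 pairs
```

So even with many repeat visits, presumably most of the 179,700 fiber pairs will never have observed a common star, and any cross-calibration has to live with a sparse pair graph.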


information theory at Cambridge

Today I gave a colloquium at the University of Cambridge. My slides are here. I spoke about how to make precise measurements, how to design surveys, and how to exploit structure in noise. It's a rich set of things, and most of the writing about information theory in astronomy lives in the cosmology domain. Time to change that, maybe? It is also the case that the best book about information and inference ever written was written in Cambridge! So I was bringing coals to Newcastle, ish!


machine learning at #AAS238

Today I spoke at the “meeting-in-meeting” on machine learning at the summer AAS meeting. My slides are here. I started out a bit negative but I ended up saying very positive things about what machine learning can do for astrophysics. I got as much feedback (maybe more) on the twitters afterwards as I did in real time. Several of the other speakers in my session mentioned or discussed contrastive learning, which looks like it might be an interesting unsupervised technique.


making slides for AAS and Cambridge

I'm giving two talks this week, one at #AAS238 and one at the University of Cambridge. Because I am a masochist (?) I put in titles and abstracts for both talks that are totally unlike those for any talks I have given previously. So I have to make slides entirely from scratch! I spent every bit of time today not in meetings working on slides. I'm not at all ready!


vectors, bras, and kets

One of my PhD advisors—my official advisor—was Roger Blandford (now at Stanford). Blandford, being old-school, responded to a tweet thread I started by sending me email. I am trying to move over to always describing tensors and rotation operators and Lorentz transformations and the like in terms of unit vectors, and I realized that the most enlightened community along these lines are the quantum mechanics. Probably because they work in infinite-dimensional spaces often! Anyway, there are deep connections between vectors in a space and functions in a Hilbert space. I'm still learning; I think I will never fully get it.


objective functions and Nyquist sampling

Adrian Price-Whelan (Flatiron) and I discussed today some oddities that Matt Daunt (NYU) is finding while trying to measure radial velocities in extremely noisy, fast APOGEE sub-exposures. He finds that the objective function we are using is not obviously smooth on 10-ish km/s velocity scales. Why not? We don't know. But what we do know is that a spectrograph with resolution 22,500 cannot put sharp structures into a likelihood function on velocity scales smaller than about 13 km/s (the speed of light divided by the resolution).

There's a nice paradox here, in fact: The spectrograph can't see features on scales smaller than 13 km/s, and yet we can reliably measure radial velocities much better than this! How? The informal answer is that the radial-velocity precision is 13 km/s divided by a certain, particular signal-to-noise. The formal answer involves information theory—the Fisher information, to be precise.
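
The back-of-envelope version of both answers, in code (the signal-to-noise value is a made-up illustration):

```python
c = 299792.458   # speed of light (km/s)
R = 22500.0      # APOGEE spectral resolution

dv = c / R       # one resolution element in velocity units
print(dv)        # about 13.3 km/s

# informal answer: RV precision ~ one resolution element divided by
# an effective signal-to-noise (hypothetical value, for illustration)
snr = 100.0
print(dv / snr)  # about 0.13 km/s
```

The formal, Fisher-information version of this statement folds the line density and line depths of the spectrum into that effective signal-to-noise.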


Dr Lily Zhao

I had the great honor to be on the PhD committee of Lily Zhao (Yale), who defended her dissertation today. It was great and remarkable. She has worked on hardware, calibration, software, stellar astrophysics, and planets. Her seminar was wide-ranging, and the questions she fielded were legion in number and scope. She has already had a big impact on extreme precision radial-velocity projects, and she is poised to have even more impact in the future. One of the underlying ideas of her work is that EPRV projects are integrated hardware–software systems. This idea should inform everything we do, going forward. I asked a million technical questions, but I also asked questions about the search for life, and the astronomical community's management and interoperation of its large supply of diverse spectrographs. In typical Zhao fashion, she had interesting things to say about all these things.


orthogonalization in SR, continued

Soledad Villar (JHU) and I talked further about the problem of orthogonalization of vectors—or finding orthonormal basis vectors that span a subspace—in special (and general) relativity. She proposed a set of hacks that correct the generalization of Gram–Schmidt orthogonalization that I proposed a week or so ago. It's complicated, because although the straightforward generalization of GS works with probability one, there are cases you can construct that bork completely. The problem is that the method involves division by an inner product, and if a vector becomes light-like, that inner product vanishes.
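
A sketch of the failure mode, with the Minkowski metric in the (-,+,+,+) convention (my toy implementation of the naive generalization, not Villar's corrected method): the projection step divides by a self inner product, which vanishes for light-like vectors.

```python
import numpy as np

eta = np.diag([-1.0, 1.0, 1.0, 1.0])  # Minkowski metric, (-,+,+,+)

def mdot(u, v):
    return u @ eta @ v

def minkowski_gram_schmidt(vectors, tol=1e-12):
    """Naive Gram-Schmidt under the Minkowski inner product.

    Works with probability one on generic inputs, but raises when an
    intermediate vector becomes light-like (self inner product ~ 0),
    which is exactly the failure mode discussed above.
    """
    basis = []
    for v in vectors:
        w = v.astype(float).copy()
        for b in basis:
            w -= (mdot(w, b) / mdot(b, b)) * b  # division hazard here
        norm2 = mdot(w, w)
        if abs(norm2) < tol:
            raise ValueError("light-like intermediate vector; GS breaks")
        basis.append(w / np.sqrt(abs(norm2)))
    return basis

# a generic time-like + space-like pair works fine
basis = minkowski_gram_schmidt([np.array([2.0, 0.1, 0.0, 0.0]),
                                np.array([0.3, 1.0, 0.2, 0.0])])

# a light-like input breaks it immediately
try:
    minkowski_gram_schmidt([np.array([1.0, 1.0, 0.0, 0.0])])
except ValueError as e:
    print(e)
```

The resulting basis vectors are Minkowski-orthogonal and normalized to self inner products of plus or minus one, which is why any fix has to handle the light-like special cases explicitly.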



friday: NeurIPS submission

In a heroic final push, Soledad Villar (JHU) finished our paper for NeurIPS submission today. We showed that you can make gauge-invariant neural networks without using the irreducible representations of group theory, or any other heavy computational machinery, at least for large classes of groups. Indeed, for all the groups that appear in classical physics (orthogonal group, rotations, Euclidean, Lorentz, Poincaré, permutation). Our contribution is pure math! It is only about machine learning inasmuch as it suggests future methods and simplifications. We will post it to arXiv next week.



My only research today was conversations with Gaby Contardo (Flatiron) about the scope and experiments of our paper on methods to automatically discover and characterize gaps in point-cloud data.


what is permutation symmetry?

I spent a lot of time today trying to write down, very specifically, what it means for a function to be invariant with respect to permutation of its input arguments. It turns out that this is hard! Especially when the function is a vector function of vector inputs. This is all related to our nascent NeurIPS submission. This symmetry, by the way, is the symmetry enforced by graph neural networks. But it is also a symmetry of all of classical physics (if, say, the vectors are the properties of particles).


astrology: Yes, it's true

Today Paula Seraphim (NYU) and I extended our off-kilter research on the possibility that we live in a simulation to off-kilter research on whether astrology has some basis in empirical fact. It does! There are birth-season correlations with many things. The issue with astrology, oddly, is not the data! It is with the theory that it is all related to planets and constellations. And if you think about the causes of birth-season effects on personality and capability, most of them (but not all of them) would have been much stronger 2000 years ago than they are today!


linear subspaces in special relativity

Who knew that my love of special relativity would collide with my love of data analysis? In the ongoing conversation between Soledad Villar (JHU), Ben Blum-Smith (NYU), and myself about writing down universally approximating functions that are equivariant with respect to fundamental physics symmetries, a problem came up related to the orientation of sets of vectors: In what groups are there possible actions on d-dimensional vectors such that you can leave all but one of the d vectors unchanged, and change only the dth? It turns out that this is an important question. For example, in 3-space, the orthogonal group O(3) permits this but the rotation group SO(3) does not! This weekend, I showed that the Lorentz group permits this. I showed it constructively.

If you care, my notes are here. It helped me understand some things about the distinction between covariant and contravariant vectors. This project has been fun, because I have used this data-analysis project to learn some new physics, and my physics knowledge to inform a data analysis framework.


gaps: it works!

In our early meeting today, Gaby Contardo (Flatiron) showed me results from the various tweaks and adjustments she has been making to her method for finding gaps (valleys, holes, lacunae) in point-cloud data. When she applies it to the local velocity distribution in the Milky Way disk, it finds all the gaps we see there and traces them nicely. We have a paper to write! Her method involves critical points and multiple derivatives and a stately kind of gradient descent. It's sweet! We have to work on figuring out how to generalize to arbitrary numbers of dimensions.


field theories require high-order tensors

I started a blow-up on twitter about electromagnetism and pseudo-vectors. Why do we need to invoke the pseudo-vector magnetic field when we start with real vectors and end with real vectors? This is all related to my project with Soledad Villar (JHU) and Ben Blum-Smith (NYU) about universally approximating functions (machine learning) for physics. Kate Storey-Fisher (NYU) converted an electromagnetic expression (for that paper/project) that contains cross products and B field into one that requires no cross products and no B field. So why do we need the B field again?

I figured out the answer today: If we want electromagnetism to be a field theory in which charges create or propagate a field and a test particle obtains an electromagnetic force by interaction with that field, then the field has to be an order-2 tensor or contain a pseudo-vector. That is, you need tensor objects to encode the configuration and motions of the distant charges. If you don't need your theory to be a field theory, you can get away without the high-order or pseudo- objects. This should probably be on my teaching blog, not here!


gaps: clever and non-clever methods

Gaby Contardo (Flatiron) showed me beautiful plots that indicate that we can trace the valleys and gaps in a point set using the geometry and calculus things we've been exploring. But then she pointed out that maybe we could find the same features just by taking a ratio of two density estimates with different frequency bandpasses (bandwidths)! Hahaha I hope that isn't true, because we have spent time on this! Of course it isn't lost time; we have learned a lot.
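
The non-clever baseline is easy to sketch in one dimension (toy data; real velocity-space work would be in 2-d or 3-d): take the ratio of a narrow-bandwidth to a broad-bandwidth kernel density estimate, and the gaps appear as dips in the ratio.

```python
import numpy as np

def kde(x, grid, bw):
    # simple Gaussian kernel density estimate evaluated on a grid
    z = (grid[:, None] - x[None, :]) / bw
    return np.exp(-0.5 * z ** 2).sum(axis=1) / (x.size * bw * np.sqrt(2.0 * np.pi))

rng = np.random.default_rng(1)
x = rng.uniform(-3, 3, 4000)
x = x[np.abs(x) > 0.4]  # toy 1-d point set with a gap around zero

grid = np.linspace(-2, 2, 401)
ratio = (kde(x, grid, 0.05) + 1e-12) / (kde(x, grid, 0.5) + 1e-12)

# the ratio dips where the fine-bandwidth density falls below the
# smoothed density; its minimum should land inside the gap
gap_center = grid[np.argmin(ratio)]
print(gap_center)
```

Whether this baseline also traces the gaps (rather than just detecting them) is exactly the question.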


bad inputs on the RHS of a regression are bad

I discussed new results with Christina Eilers (MIT), who is trying to build a simple, quasi-linear causal model of the abundances of stars in our Galaxy. The idea is that some abundance trends are being set or adjusted by problems with the data, and we want to correct for that by a kind of self-calibration. It's all very clever (in my not-so-humble opinion). Today she showed that her results get much better (in terms of interpretability) if she trims out stars that get assigned very wrong dynamical actions by our action-computing code (thank you to Adrian Price-Whelan!). Distant, noisy stars can get bad actions because noise draws in the distance can make them effectively look unbound! And in general, the action-estimation code has to make some wrong assumptions.

It's a teachable moment, however, because when you are doing a discriminative regression (predicting labels using features), you can't (easily) incorporate a non-trivial noise model in your feature space. In this case, it is safer to drop bad or noisy features than to use them. The labels are a different matter: You can (and we do) use noisy labels! This asymmetry is not good of course, but pragmatic data analysis suggests that—for now—we should just drop from the training set the stars with overly noisy features and proceed.


regression as a tool for extreme-precision radial velocity

Lily Zhao (Yale) showed new regression results to Megan Bedell (Flatiron) and me today. She's asking whether shape properties of a stellar spectrum give you any information about the radial velocity of the star, beyond the Doppler shift. The reason there might be some signatures is that (for example) star spots and pulsations can distort radial-velocity measurements (at the m/s level) and they also (very slightly) change the shape of the stellar spectrum (line ratios and line shapes and so on). She has approaches that are discriminative (they try to predict the RV from the spectrum) and approaches that are generative (they try to predict the spectrum from the RV and other housekeeping data). Right now the discriminative approaches seem to be winning, and they seem to be delivering a substantial amount of RV information. If this is successful, it will be the culmination of a lot of hard work.


re-parameterizing Kepler orbits

As many exoplaneteers know, parameterizing eccentric gravitational two-body orbits (ellipses or Kepler orbits) for inferences (MCMC sampling or, alternatively, likelihood optimizations) is not trivial. One non-triviality is that there are combinations of parameters that are very-nearly degenerate for certain kinds of observations. Another is that when the eccentricity gets near zero (as it does for many real systems), some of the orientation parameters become unconstrained (or unidentifiable or really non-existent). Today Adrian Price-Whelan (Flatiron) was hacking on this with the thought that the time or phase of maximum radial velocity (with respect to the observer) and the time or phase of minimum radial velocity could be used as a pair of parameters that give stable, well-defined combinations of phase, eccentricity, and ellipse orientation (when that exists). We spent an inordinate amount of time in the company of trig identities.


streams in external galaxies

Sarah Pearson (NYU), Adrian Price-Whelan (Flatiron), and I met today to discuss fitting tidal streams (and especially cold stellar streams) discovered around external galaxies. We are starting with the concept of the stream track, and therefore we need to turn imaging we have of external galaxies into some description of the stream track in some coordinate system that makes sense. We spent time discussing that. We're going to start with some hacks. This isn't unrelated to the work I have been discussing on microscopy of robots: We want to make very precise measurements, but in very heterogeneous, complex imaging.


group theory and the laws of physics

I was very fortunate to be part of a meeting today between Soledad Villar (JHU) and Benjamin Blum-Smith (NYU) in which we spoke about the possible forms of universally approximating functions that are equivariant under rotations and reflections (the orthogonal group O(d)). From my perspective, this is a physics question: What are all the possible physical laws in classical physics? From Villar's perspective, this is a machine-learning question: How can we build universal or expressive machine-learning methods for the natural sciences? From Blum-Smith's perspective, this is a group-theory question: What are the properties of the orthogonal group and its neighbors? We discussed the possibility that we might be able to write an interdisciplinary paper on this subject.


azimuthal variations in the velocity distribution

In my weekly with Jason Hunt (Flatiron), we discussed the point that if the gaps in the local velocity distribution in the Milky Way are caused by the bar (or some of them are) and if the gaps are purely phase shifts and not things being thrown out by chaos, then the velocity distribution in a localized patch should vary with azimuth as an m=2 pattern, with maybe some m=4 and m=6 mixed in. So, that made us think we should look at simulations, and see if there are any features in the local velocity distribution that might be interpretable in such a model. For instance, could we measure the angle of the bar?


Dr Marco Stein Muzio

Today Marco Stein Muzio (NYU) defended his PhD dissertation on multi-messenger cosmic-ray astrophysics (and cosmic-ray physics). He gave credible arguments that the combination of hadron, neutrino, muon, and photon data imply the existence of new kinds of sources contributing at the very highest energies. He made predictions for new data. We (the audience) asked him about constraining hadronic physics, and searching for new physics. He argued that the best place to look for new physics is in the muon signals: They don't seem to fit with the rest. But overall, if I had to summarize, he was more optimistic that the data would all be explained by astrophysical sources and hadronic physics, without modifications to the standard model of particle physics. It was an impressive and lively defense. And Dr MSM has had a huge impact on my Department, co-founding a graduate-student political group and helping us work through issues of race, representation, and culture. I can't thank him enough, and I will miss his presence.


Dr Sicheng Lin

Today Sicheng Lin (NYU) defended his PhD dissertation on the connections between galaxies and the dark-matter field in which they live. He worked on elaborations of what's known as “halo occupation”, “abundance matching”, and the like. At the end, I asked my standard questions about how the halo occupation fits into ideas we have about gravity, and the symmetries of physical law. After all, “haloes” aren't things that exist in the theory of gravity. And yet, the model is amazingly successful at explaining large-scale structure data, even down to tiny details. That led to a very nice and very illuminating discussion of all the things that could matter to galaxy clustering and dark-matter over-densities, including especially time-scales. An important dissertation in an important area: I learned during the defense that the DESI project has taken more than one million spectra in its “science verification” phase. Hahaha! It makes all my work from 1994 to 2006-ish seem so inefficient!


microscopy of fiber robots

Conor Sayres (UW) and I continued today our discussion of the data-analysis challenges associated with the SDSS-V focal-plane system (the fiber robots). Today we discussed the microscopy of the robots. Sayres has images (like literally RGB JPEG images) from a setup in which each fiber robot is placed into a microscope. From this imaging, we have to locate the three fibers (one for the BOSS spectrograph, one for the APOGEE spectrograph, and one for a back-illumination system used for metrology), all relative to the outer envelope of the robot arm. And we have to do this for 300 or 600 robots. The fibers appear as bright, resolved circles in the imaging, but on a background that has lots of other detail, shading, and variable lighting. This problem is one that comes up a lot in astrophysics: You want to measure something very specific, but in a non-trivial image, filled with other kinds of sources and noise. We discussed options related to matched filtering, but we sure didn't finish.
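
Here's a minimal sketch of the matched-filtering option (a toy image, not Sayres's microscopy data): cross-correlate with a zero-mean disk template, so a smooth background contributes almost nothing, and read off the correlation peak.

```python
import numpy as np

def matched_filter_peak(image, radius):
    # Cross-correlate with a zero-mean disk template; the zero mean
    # makes a smooth background contribute (almost) nothing, so the
    # correlation peak marks the bright circle.
    r = int(radius)
    yy, xx = np.mgrid[-r:r + 1, -r:r + 1]
    template = (xx ** 2 + yy ** 2 <= r ** 2).astype(float)
    template -= template.mean()

    # FFT-based circular cross-correlation
    f_img = np.fft.rfft2(image)
    f_tmp = np.fft.rfft2(template, s=image.shape)
    corr = np.fft.irfft2(f_img * np.conj(f_tmp), s=image.shape)
    iy, ix = np.unravel_index(np.argmax(corr), corr.shape)
    sy, sx = image.shape
    return (iy + r) % sy, (ix + r) % sx  # undo the template's corner origin

# toy image: smooth lighting gradient, noise, one bright fiber circle
rng = np.random.default_rng(3)
img = np.linspace(0.0, 1.0, 200)[None, :] * np.ones((200, 200))
yy, xx = np.mgrid[0:200, 0:200]
img += 2.0 * ((xx - 130) ** 2 + (yy - 60) ** 2 <= 25)
img += 0.05 * rng.standard_normal(img.shape)

peak = matched_filter_peak(img, 5)
print(peak)  # should be near (60, 130)
```

The real problem is harder (three fibers per robot, variable lighting, RGB), but the zero-mean template is the key trick for structured backgrounds.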


Stanford talk; heretics

I spoke today at Stanford, about the ESA Gaia Mission and its promise for mapping (and, eventually, understanding) the dark matter in the Milky Way. I spoke about virial and Jeans methods, and then methods that permit us to image the orbits, like streams, the Snail, and orbital torus imaging. At the end of the talk Roger Blandford (Stanford) asked me about heretical ideas in gravity and dark matter. I said that there hasn't been a huge amount of work yet from the Gaia community testing alternative theories of gravity, but there could be, and the data are public. I also said that it is important to do such work, because gravity is the bedrock theory of astrophysics (and physics, in some sense). ESA Gaia might deliver the best constraints over a large range of scales.


the orthogonal group is very simple

Soledad Villar (JHU) and I have been kicking around ideas for machine learning methods that are tailored to classical (mechanical and electromagnetic) physical systems. The question is: What is the simplest representation of objects in this theory that permits highly expressive machine-learning methods that are nonetheless constrained to obey fundamental symmetries, like translation, rotation, reflection, and boost? Since almost all (maybe exactly all) of classical physics obeys rotation and reflection, one of the relevant groups is the orthogonal group O(3) (or O(d) in general). This group turns out to be extremely simple (and extremely constrained). We might be able to make extremely expressive machines with very simple internals, if we have this group deliver the main symmetry or equivariance. We played around with possible abstracts or scopes for a paper. Yes, a purely theoretical paper for machine learning. That puts me out of my comfort zone! We also read some group theory, which I (hate to admit that I) find very confusing.


open research across the University

Scott Collard of NYU Libraries organized an interdisciplinary panel across all of NYU today to discuss open research. I often talk about “open science”, but this discussion was explicitly to cover the humanities as well. We talked about the different cultures in different fields, and the roles of funding agencies, universities, member societies, journals, and so on. One idea that I liked from the conversation was that participants should try to ask what they can do from their position and not try to ask what other people should do from theirs. We had recommendations for making modifications to NYU promotion and tenure, putting open-research considerations into annual merit review, and asking the Departments to think about how, in their field, they could move to the most open edge of what's acceptable and conventional. Another great idea is that open research is directly connected to ideas of inclusion and equity, especially when research is viewed globally. That's important.


adversarial attacks and robustness

Today Teresa Huang (JHU) re-started our conversations about adversarial attacks against popular machine-learning methods in astrophysics. We started this project (ages ago, now) thinking about test-time attacks: You have a trained model, how does it fail you at test time? But since then, we have learned a huge amount about training-time attacks: If you add a tiny change to your training data, can you make a huge change to your model? I think some machine-learning methods popular in astronomy are going to be very susceptible to both kinds of attacks!

When we discussed these ideas in the before times, one of the objections was that adversarial attacks are artificial and meaningless. I don't agree: If a model can be easily attacked, it is not robust. If you get a strange and interesting result in a scientific investigation when you are using such a model, how do you know you didn't just get accidentally pwned by your noise draw? Since—in the natural sciences—we are trying to learn how the world works, we can't be putting in model components or pipeline components that are capable of leading us very seriously astray.


how accurately can we do closed-loop robot control?

Conor Sayres (UW) and I spoke again today about the fiber positioning system (fiber robots) that lives in the two focal planes of the two telescopes that take data for SDSS-V. One of the many things we talked about is how precisely we need to position the fibers, and how accurately we will be able to observe their positions in real time. It's interestingly marginal; the accuracy with which the focal-plane viewing system (literally a camera in the telescope that looks the wrong way) will be able to locate the fiber positions depends on details that we don't yet know about the camera, the optics, the fiducial-fiber illumination system, and so on. There are different kinds of sensible procedures for observatory operations that depend very strongly on the accuracy of the focal-viewing system.


vectors and scalars

If you have a set of vectors, what are all the scalar functions you can make from those vectors? That is a question that Soledad Villar (JHU) and I have been working on for a few days now. Our requirement is that the scalar be rotationally invariant: The scalar function must not change as you rotate the coordinate system. Today Villar proved a conjecture we had, which is that any rotationally invariant scalar function of the vectors can only depend on the scalar products (dot products) of the vectors. That is, you can replace the vectors with all their dot products and lose no expressivity.
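
One direction of the theorem is easy to check numerically: any scalar function that sees only the Gram matrix of dot products is automatically invariant under orthogonal transformations (rotations and reflections). A quick sketch (the particular scalar function is an arbitrary example):

```python
import numpy as np

rng = np.random.default_rng(8)

def random_orthogonal(d):
    # QR of a Gaussian matrix gives a random orthogonal matrix
    # (possibly including a reflection; dot products don't care)
    q, r = np.linalg.qr(rng.standard_normal((d, d)))
    return q * np.sign(np.diag(r))

def scalar_fn(vectors):
    # an arbitrary scalar function that sees only the dot products
    gram = vectors @ vectors.T
    return np.sin(gram).sum() + np.trace(gram) ** 2

vecs = rng.standard_normal((4, 3))  # four vectors in 3-space
Q = random_orthogonal(3)

before = scalar_fn(vecs)
after = scalar_fn(vecs @ Q.T)  # transform every vector the same way
print(abs(after - before))  # ~ 0 to machine precision
```

Villar's proof goes the other, much harder way: every such invariant scalar function can be written this way.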

After that proof, we argued about vector functions of a set of vectors. Here it turns out that there are a lot more options if you want your answer to be equivariant (not invariant but equivariant) to rotations alone than if you want your answer to be equivariant to rotations and parity swaps. We still don't know what our options are, but because it's so restrictive, I think parity is a good symmetry to include.


the theory is a grid

I had a great conversation with Andy Casey (Monash) at the end of the day. We discussed many things related to APOGEE and SDSS-V. One of the things I need is the code that makes the synthetic (physical model) spectra for the purposes of obtaining parameter estimates in APOGEE and the derivatives of that model with respect to stellar parameters. That is, I want the physical-model derivatives of spectral expectation with respect to parameters (like temperature, surface gravity, and composition). It turns out that, at this point, the model is a set of synthetic spectra generated on a grid in parameter space! So the model is the grid, and the derivatives are the slopes of a cubic-spline interpolation (or something like that). I have various issues with this, but I'll be fine.


Dr Shengqi Yang

I've had the pleasure of serving on the PhD committee of Shengqi Yang (NYU), who defended her PhD today. She worked on a range of topics in cosmological intensity mapping, with a concentration on the aspects of galaxy evolution and galaxy formation that are important to understand in connecting the intensity signal to the cosmological signal. But her thesis was amazingly broad, including both theoretical work and observational measurements, and ranging from galaxy evolution to tests of gravity. Great stuff, and a well-earned PhD.


Dr Jason Cao

Today Jason Cao (NYU) defended his PhD on the galaxy–halo connection in cosmology. He has built a version of subhalo abundance matching with a tunable stochastic component, so he can tune the information content in the galaxies about their host halos. This freedom in the halo occupation permits the model to match more observations, and it is sensible. He also explored a bit the properties of the dark-matter halos that might control halo occupation, but he did so observationally, using satellite occupation as a tracer of halo properties. These questions are all still open, but he did a lot of good work towards improving the connection between the dark sector and the observed galaxy populations. Congratulations, Dr Cao; welcome to the community of scholars!


finding the fiber robots in the SDSS-V focal planes

At the end of the day I met with Conor Sayres (UW) to discuss the problem of measuring the position of focal-plane fiber-carrying robots given images from in-telescope cameras (focal viewing cameras) inside the telescopes that are operating the SDSS-V Project. We have not installed the fiber robots yet, but Sayres has a software mock-up of what the focal viewing camera will see and all its optics. We also discussed some of the issues we will encounter in commissioning and operation of this viewing system.

Later, in the night, I worked on data-driven transformations between focal-plane position (in mm) in the telescope focal plane and position in the focal viewing camera detector plane (in pixels). I followed the precepts and terminology described in this paper on interpolation-like problems. My conclusion (which agrees with Sayres's) is that if these simulations are realistic, the fitting will work well, and we will indeed know pretty precisely what all the fiber robots are doing.
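
As a sketch of the kind of fitting involved (the affine map, distortion, and all numbers here are invented for illustration), a low-order polynomial least-squares fit between the two planes looks like this:

```python
import numpy as np

rng = np.random.default_rng(42)

# Fake fiducial positions in the focal plane (mm), observed at "pixel"
# positions under an invented affine map plus a small quadratic distortion.
xy = rng.uniform(-300.0, 300.0, size=(64, 2))
affine = np.array([[0.08, 0.001], [-0.001, 0.08]])
uv = xy @ affine.T + 1024.0 + 1e-6 * xy ** 2

# Low-order polynomial design matrix: 1, x, y, x^2, xy, y^2.
x, y = xy.T
A = np.stack([np.ones_like(x), x, y, x ** 2, x * y, y ** 2], axis=1)

# Linear least squares for both pixel coordinates at once.
coeffs, *_ = np.linalg.lstsq(A, uv, rcond=None)
resid = uv - A @ coeffs
print(np.max(np.abs(resid)))  # tiny, because the basis contains the true map
```

With realistic measurement noise the residuals won't vanish like this, but the linear-algebra structure of the problem is the same.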


problems for gauge-invariant GNNs

Today Kate Storey-Fisher (NYU) and I spent more time working with Weichi Yao (NYU) and Soledad Villar (JHU) on creating a good, compact, but real test problem for gauge-invariant graph neural networks. We discussed a truly placeholder toy example in which we ask the network to figure out the identity of the most-gravitationally-bound point in a patch of a simulation. And we discussed a more real problem of inferring things about the occupation or locations of galaxies within the dark-matter field. Tomorrow Storey-Fisher and I will look at the IllustrisTNG simulations, which she has started to dissect into possible patches for Yao's model.


unwinding a spiral

A lot of conversations in the Dynamics group at Flatiron recently have been about spirals: Spirals in phase space, spirals in the disk, even spirals in the halo. In general, as a perturbed dynamical system (like a galaxy or a star cluster) evolves towards steady-state, it goes through a (or more than one) spiral phase. We've (collectively) had an interest in unwinding these spirals, to infer the initial conditions or meta-data about the events that caused the disequilibrium and spiral-winding. Jason Hunt (Flatiron) discussed these problems with Adrian Price-Whelan (Flatiron) and me today, showing some attempts to unwind (what I call) The Snail. That led to a long conversation about what would make a good “loss function” for unwinding. If something was unwinding well, how would we know? That led to some deep conversations.


geometric data analysis

I got some real hacking time in this afternoon with Gaby Contardo (Flatiron). We worked through some of the code issues and some of the conceptual issues behind our methods for finding gaps in point clouds using (what I call) geometric data analysis, in which we find critical points (saddles, minima, maxima) and trace their connections to map out valleys and ridges. We worked out a set of procedures (and tested some of them) to find critical points, join them up with constrained gradient descents, and label the pathways with local meta-data that indicate how “gappy” they are.


extremely precise spectrophotometric distances

Adrian Price-Whelan is building a next-generation spectrophotometric distance estimation method that builds on things that Eilers, Rix, and I did many moons ago. Price-Whelan's method splits the stars up in spectrophotometric space and builds local models for different kinds of stars. But within those local patches, it is very similar to what we've done before, just adding some (very much) improved regularization and a (very much) improved training set. And now it looks like we might be at the few-percent level in terms of distance precision! If we are, then the entire red-giant branch might be just as good for standard-candlyness as the red clump. This could really have a big impact on SDSS-V. We spent part of the day making decisions about spectrophotometric neighborhoods and other methodological hyper-parameters.


how to calibrate fiber robots?

Today we had the second in a series of telecons to discuss how we get, confirm, adjust, and maintain the mapping, in the SDSS-V focal planes (yes there are two!) between the commands we give to the fiber-carrying robots and the positions of the target stellar images. It's a hard problem! As my loyal reader might imagine, I am partial to methods that are fully data-driven, and fully on-sky, but their practicality depends on a lot of prior assumptions we need to make about the variability and flexibility of the system. One thing we sort-of decided is that it would be good to get together a worst-case-scenario plan for the possibility that we install these monsters and we can't find light down the fibers.


re-scoping our gauge-invariant GNN project

I am in a project with Weichi Yao (NYU) and Soledad Villar (NYU) to look at building machine-learning methods that are constrained by the same symmetries as Newtonian mechanics: Rotation, translation, Galilean boost, and particle exchange, for examples. Kate Storey-Fisher (NYU) joined our weekly call today, because she has ideas about toy problems we could use to demonstrate the value of encoding these symmetries. She steered us towards things in the area of “halo occupation”, or the question of which dark-matter halos contain what kinds of galaxies. Right now halo occupation is performed with very blunt tools, and maybe a sharp tool could do better? We would have the advantage (over others) that anything we found would, by construction, obey the fundamental symmetries of physical law.


domain adaptation and instrument calibration

At the end of the day I had a wide-ranging conversation with Andy Casey (Monash) about all things spectroscopic. I mentioned to him my new interest in domain adaptation, and whether it could be used to build data-driven models. The SDSS-V project has two spectrographs, at two different telescopes, each of which observes stars down different fibers (which have their own idiosyncrasies). Could we build a data-driven model to see what any star observed down one fiber of one spectrograph would look like if it had been observed down any other fiber, or any fiber of the other spectrograph? That would permit us to see what systematics are spectrograph-specific, and whether we would have got the same answers with the other spectrograph, and other questions like that.

There are some stars observed multiple times and by both observatories, but I'm kind-of interested in whether we could do better using the huge number of stars that haven't been observed twice instead. Indeed, it isn't clear which contains more information about the transformations. Another fun thing: The northern sky and the southern sky are different! We would have to re-build domain adaptation to be sensitive to those differences, which might get into causal-inference territory.


The Practice of Astrophysics (tm)

Over the last few weeks—and the last few decades—I have had many conversations about all the things that are way more important to being a successful astrophysicist than facility with electromagnetism and quantum mechanics: There's writing, and mentoring, and project design, and reading, and visualization, and so on. Today I fantasized about a (very long) book entitled The Practice of Astrophysics that covers all of these things.


best setting of hyper-parameters

Adrian Price-Whelan (Flatiron) and I encountered an interesting conceptual point today in our distance estimation project: When you are doing cross-validation to set your hyper-parameters (a regularization strength in this case), what do you use as your validation scalar? That is, what are you optimizing? We started by naively optimizing the cost function, which is something like a weighted L2 of the residual and an L2 of the parameters. But then we switched from the cost function to just the data part (not the regularization part) of the cost function, and everything changed! The point is duh, actually, when you think about it from a Bayesian perspective: You want to improve the likelihood not the posterior pdf. That's another nice point for my non-existent paper on the difference between a likelihood and a posterior pdf. It also shows that, in general, the data and the regularization will be at odds.
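
A minimal sketch of the point, using a toy ridge regression (everything here is made up): the validation scalar should be the data (likelihood) term alone, not the penalized cost.

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy ridge regression: y = X beta + noise, with a train/validation split.
n, p = 80, 20
X = rng.normal(size=(n, p))
beta = rng.normal(size=p)
y = X @ beta + 0.5 * rng.normal(size=n)
train, valid = np.arange(60), np.arange(60, 80)

results = {}
for lam in [0.01, 1.0, 100.0]:
    # Fit on the training split with regularization strength lam.
    b = np.linalg.solve(X[train].T @ X[train] + lam * np.eye(p),
                        X[train].T @ y[train])
    # Validation scores: the data term alone (~ -2 log likelihood) versus
    # the full cost including the penalty (~ -2 log posterior).
    data_term = np.sum((y[valid] - X[valid] @ b) ** 2)
    full_cost = data_term + lam * np.sum(b ** 2)
    results[lam] = (data_term, full_cost)
    print(lam, data_term, full_cost)
```

Ranking the hyper-parameter settings by `data_term` and by `full_cost` can give different answers, which is exactly the trap we fell into.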


strange binary star system; orbitize!

Sarah Blunt (Caltech) crashed Stars & Exoplanets Meeting today. She told us about her ambitious, community-built orbitize project, and also results on a mysterious binary-star system, HD 104304. This is a directly-imaged binary, but radial-velocity measurements imply a primary mass that is way too high for its color and luminosity. The beauty of orbitize is that it can take heterogeneous data, and it uses brute-force importance sampling (like my one true love The Joker), so she can deal with very non-trivial likelihood functions and low signal-to-noise, sparse data.

The crowd had many reactions, one of which is that probably the main issue is that ESA Gaia is giving a wrong parallax. That's a boring explanation, but it opens a nice question of using the data to infer or predict a distance, which is old-school fundamental astronomy.


causal-inference issues

I had a nice meeting (in person, gasp!) with Alberto Bolatto (Maryland) about his beautiful results in the EDGE-CALIFA survey of galaxies, and (yes) patches of galaxies. Because they have an IFU, they can look at relationships between gas, dust, composition, temperature, star-formation rate, mean stellar age, and so on, both within and across galaxies. He asked me about some difficult situations in understanding empirical correlations in a high dimensional space, and (even harder) how to derive causal conclusions. As my loyal reader might guess, I wasn't much help! I handed him a copy of Regression and Other Stories and told him that it's going to get harder before it gets easier! But damn what a beautiful data set.


is the simulation hypothesis a physics question?

Against my better judgement, I am writing a paper on the question of whether we live inside a computer simulation. Today I was discussing this with Paula Seraphim (NYU), who has been doing research with me on this subject. We decided to re-scope the paper around the question “Is the simulation hypothesis a physics question?” instead of the direct question “Do we live in a simulation?”, which can't be answered very satisfactorily. But I think when you flow it down, you conclude that this question is, indeed, a physics question! And the simulation hypothesis motivates searches for new physics in much the same way that dark matter and inflation do: The predictions are not specific, but there are general signatures to look for.


stating a transfer-learning problem

I am trying to re-state the problem of putting labels on SDSS-IV APOGEE spectra as a transfer learning problem, since the labels come from (slightly wrong) stellar models. Or maybe domain adaptation. But the form of the problem we face in astronomy is different from that faced in most domain-adaptation contexts. The reasons are: The simulated stars are on a grid, not (usually) drawn from a realistically correct distribution. There are only labels on the simulated data, not on the real data (labels only get to real data through simulated data). And there are selection effects and noise sources that are unique to astronomy.


geometry of gradients and second derivatives

Building on conversations we had yesterday about the geometry and topology of gradients of a scalar field, Gaby Contardo (Flatiron) and I worked out at the end of the day today that valleys of a density field (meaning here a many-times differentiable smooth density model in some d-dimensional space) can be traced by looking for paths along which the density gradient has zero projection onto the principal component (largest-eigenvalue eigenvector) of the second-derivative tensor (the Hessian, to some). We looked at some toy-data examples and this does look promising as a technique for tracing or finding gaps or low-density regions in d-dimensional point clouds.
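
Here's a small numerical sketch of that criterion, with a made-up two-blob density whose valley is the line x = 0; the gradients and Hessians come from finite differences.

```python
import numpy as np

# Density: two Gaussian blobs; the valley between them is the line x = 0.
def density(p):
    x, y = p
    return (np.exp(-0.5 * ((x - 2.0) ** 2 + y ** 2))
            + np.exp(-0.5 * ((x + 2.0) ** 2 + y ** 2)))

def gradient(p, h=1e-5):
    g = np.zeros(2)
    for i in range(2):
        e = np.zeros(2)
        e[i] = h
        g[i] = (density(p + e) - density(p - e)) / (2.0 * h)
    return g

def hessian(p, h=1e-4):
    H = np.zeros((2, 2))
    for i in range(2):
        for j in range(2):
            ei = np.zeros(2)
            ei[i] = h
            ej = np.zeros(2)
            ej[j] = h
            H[i, j] = (density(p + ei + ej) - density(p + ei - ej)
                       - density(p - ei + ej) + density(p - ei - ej)) / (4.0 * h * h)
    return H

def valley_statistic(p):
    # |cosine| of the angle between the gradient and the top Hessian
    # eigenvector; this is near zero along a valley.
    g = gradient(p)
    w, vecs = np.linalg.eigh(hessian(p))
    top = vecs[:, -1]  # eigenvector with the largest eigenvalue
    return abs(g @ top) / (np.linalg.norm(g) + 1e-30)

print(valley_statistic(np.array([0.0, 1.0])))  # ~0: on the valley
print(valley_statistic(np.array([1.0, 1.0])))  # order unity: off the valley
```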


massive revision of a manuscript

Teresa Huang's paper with Soledad Villar and me got a very constructive referee report, which led to some discoveries, which led to more discoveries, which led to a massive revision and increase in scope. And all under deadline, as the journal gave us just 5 weeks to respond. It is a really improved paper, thanks to Huang's great work and the referee's inspiration. Today we went through the changes. It's hard to take a paper through a truly major revision: Everything has to change, including the parts that didn't change! Because: Writing!



fast multipole methods and graph neural networks

Today Soledad Villar (JHU) and I discussed the possibility of building something akin to a graph neural network, but one that takes advantage of the n log(n) scaling of a fast-multipole-method hierarchical summary graph. The idea is to make highly connected or fully connected graph neural networks fast through the same trick that makes the FMM work: Nearby points in the graph talk precisely, but distant parts talk through summaries in a hierarchical set of summary boxels. We think there is a chance this might work, in the context of the work we are doing with Weichi Yao (NYU) on gauge-invariant graph neural networks. The gauge invariance is such a strict symmetry, it might permit transmitting information from distant parts of the graph through summaries, while still preserving full (or great) generality. We have yet to figure it all out, but we spent a lot of time drawing boxes on the board.


we were wrong about the lower main sequence

I wrote ten days ago about a bimodality in the lower main sequence that Hans-Walter Rix (MPIA) found a few weeks ago. I sent it to some luminaries and my very old friend John Gizis (Delaware) wrote back saying that it might be issues with the ESA Gaia photometry. I argued back at him, saying: Why would you not trust the Gaia photometry, it is the world's premier data on stars? He agreed, and we explored issues of stellar variability, spectroscopy, and kinematics. But then, a few days later, Gizis pointed me at figure 29 in this paper. It looks like we just rediscovered a known data issue. Brutal! But kudos to Gizis for his great intuition.


two kinds of low-mass stars

I showed the Astronomical Data Group meeting the bifurcation in the lower main sequence that Hans-Walter Rix (MPIA) found a few weeks ago. Many of the suggestions from the crew were around looking at photometric variability: Does one population show different rotation or cloud cover or etc than the other?


not much to report

not much! Funny how a day can be busy but not involve any things that I'd call research.


no more randoms

In large-scale-structure projects, when galaxy (or other tracer) clustering is measured in real space, the computation involves spatial positions of the tracers, and spatial positions of a large set of random points, distributed uniformly (within the window function). These latter points can be thought of as a comparison population. However, it is equally true that they can be thought of as performing some simple integrals by Monte Carlo method. If you see them that way—as a tool for integrating—it becomes obvious that there must be far better and far faster ways to do this! After all, non-adaptive Monte Carlo methods are far inferior to even stupidly adaptive schemes. I discussed all this with Kate Storey-Fisher (NYU) yesterday and today.
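
A toy illustration of the randoms-as-integrators point (one dimension, made-up integrand): a plain Monte Carlo estimate with many points is handily beaten by a simple midpoint rule with far fewer evaluations.

```python
import numpy as np

rng = np.random.default_rng(0)

# Integrate f over the unit interval: the truth is 1/3.
f = lambda x: x ** 2
truth = 1.0 / 3.0

# "Randoms": plain Monte Carlo with N uniform points.
N = 10000
mc = np.mean(f(rng.uniform(0.0, 1.0, N)))

# A non-random alternative: midpoint rule with 100x fewer evaluations.
m = 100
grid = (np.arange(m) + 0.5) / m
quad = np.mean(f(grid))

print(abs(mc - truth), abs(quad - truth))
```

Real survey window functions are not this smooth, of course, but the scaling argument (root-N Monte Carlo error versus much faster quadrature or adaptive convergence) is the heart of the complaint.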


writing about data-driven spectrophotometric distances

I wrote like mad in the paper that describes what Adrian Price-Whelan (Flatiron) and I are currently doing to estimate stellar distances using SDSS-IV APOGEE spectra (plus photometry). I wrote a long list of assumptions, with names. As my loyal reader knows, my position is that if you get the assumptions written down with enough specificity, the method you are doing becomes the only thing you can do. Or else maybe you should re-think that method?


setting hyper-parameters

Adrian Price-Whelan (Flatiron) and I are working on data-driven distances for stars in the SDSS-IV APOGEE data. There are many hyper-parameters of our method, including the number K of leave one-Kth-out splits of the data, the regularization amplitude we apply to the spectral part of the model (it's a generalized linear model), and the infamous Gaia parallax zero-point. These are just three of many, but they span an interesting range. One is purely operational, one restricts the fit (introduces bias, deliberately), and one has a true value that is unknown. How to optimize for each of these? It will be different in each case, I expect.


a split in the main sequence?

I did some actual, real-live sciencing this weekend, which was a pleasure. I plotted a part of the lower-main sequence in ESA Gaia data where Hans-Walter Rix (MPIA) has found a bimodality that isn't previously known (as far as we can tell). I looked at whether the two different kinds of stars (on each side of the bimodality) are kinematically different and it doesn't seem like it. I sent the plots to some experts to ask for advice about interpretation; this is out of scope for both Rix and me!


predicting the future of a periodic variable star

Gaby Contardo (Flatiron) showed me an amazingly periodic star from the NASA Kepler data a few days ago, and today she showed me the results of trying to predict points in the light curve from prior points in the light curve (like in a recurrent method). When the star is very close to periodic, and when the region of the star used to predict a new data point is comparable in length to the period or longer, then even linear regression does a great job! This all relates to auto-regressive processes.
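
A minimal sketch of that observation, on a made-up noiseless periodic signal: with a lag window longer than the period, plain linear regression predicts the future essentially perfectly.

```python
import numpy as np

# A noiseless periodic "light curve" with a 37-sample period.
t = np.arange(600)
flux = np.sin(2 * np.pi * t / 37.0) + 0.3 * np.sin(4 * np.pi * t / 37.0)

p = 40  # lag window, a bit longer than the period
X = np.stack([flux[i:i + p] for i in range(len(flux) - p)])
y = flux[p:]

# Fit linear (auto-regressive) weights on the first half of the windows,
# then predict the second half.
half = len(y) // 2
coef, *_ = np.linalg.lstsq(X[:half], y[:half], rcond=None)
pred = X[half:] @ coef
print(np.max(np.abs(pred - y[half:])))  # essentially zero
```

This works because a sum of sinusoids satisfies an exact low-order auto-regressive recursion; real light curves with noise and drift degrade gracefully from this ideal.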


we have new spectrophotometric distances

After a couple of days of hacking and data munging—and looking into the internals of Jax—Adrian Price-Whelan and I produced stellar distance estimates today for a few thousand APOGEE spectra. Our method is based on this paper on linear models for distance estimation with some modifications inspired by this paper on regression. It was gratifying! Now we have hyper-parameters to set and validation to do.


what is a bolometric correction?

Today Katie Breivik (Flatiron) asked me some technical questions about the bolometric correction. It's related to the difference between a relative magnitude in a bandpass and the relative magnitude you would get if you were using a very (infinitely) broad-band bolometer. Relative magnitudes are good things (AB magnitudes, in contrast, are bad things, but that's for another post): They are relative fluxes between the target and a standard (usually Vega). If your target is hotter than Vega, and you choose a very blue bandpass, the bandpass magnitude of the star will be smaller (relatively brighter) than the bolometric magnitude. If you choose a very red bandpass, the bandpass magnitude will be larger (relatively fainter) than the bolometric magnitude. That's all very confusing.

And bolometric is a horrible concept, since most contemporary detectors are photon-counting and not bolometric (and yes, that matters: the infinitely-wide filter on a photon-counting device gives a different relative magnitude than the infinitely-wide filter on a bolometer). I referred Breivik to this horrifying paper for unpleasant details.
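
Here's a toy numerical version of that point (blackbody spectral shapes, made-up temperatures, an absurdly broad band): the photon-counting magnitude and the energy-integrating magnitude of the very same star, relative to the very same reference, differ.

```python
import numpy as np

# Dimensionless blackbody shape B(lam, T) ~ lam^-5 / (exp(c2 / (lam T)) - 1),
# with c2 = hc/k in micron-Kelvin units; the temperatures are made up.
c2 = 14387.77  # micron K
lam = np.linspace(0.3, 1.0, 2000)  # an absurdly broad band, in microns

def bb(T):
    return lam ** -5 / np.expm1(c2 / (lam * T))

target, ref = bb(8000.0), bb(9602.0)  # target star and a Vega-like reference

# Energy-integrating ("bolometer") vs photon-counting relative magnitudes:
# photon counting weights the integrand by an extra factor of lam.
m_energy = -2.5 * np.log10(np.sum(target) / np.sum(ref))
m_photon = -2.5 * np.log10(np.sum(target * lam) / np.sum(ref * lam))
print(m_energy, m_photon)  # they differ!
```

The photon weighting tilts the effective bandpass redward, where the cooler target is relatively brighter, so the photon-counting magnitude comes out brighter here.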


low-pass filter for non-uniformly sampled data

Adrian Price-Whelan (Flatiron) and I used the new FINUFFT non-uniform fast Fourier transform code to build a low-pass filter for stellar spectra today. The idea is: There can't be any spectral information in the data at spectral resolutions higher than the spectrograph resolution. So we can low-pass filter in the log-wavelength domain and that should enforce finite spectral resolution. The context is: Making features to use in a regression or other machine-learning method. I don't know, but I think this is a rare thing: A low-pass filter that doesn't require uniformly or equally-spaced sampling in the x direction or time domain.
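
As a sketch of the idea (not the FINUFFT implementation; this dense least-squares version just shows the principle on made-up data): fitting a band-limited Fourier basis to non-uniform samples acts as a low-pass filter, with no uniform grid required.

```python
import numpy as np

rng = np.random.default_rng(8)

# Non-uniform sample times and a signal with low- and high-frequency parts.
T = 1.0
t = np.sort(rng.uniform(0.0, T, 500))
low = np.sin(2 * np.pi * 2 * t / T)
high = np.sin(2 * np.pi * 40 * t / T)
y = low + high

# Least-squares fit of a band-limited Fourier basis (|k| <= kmax); the fit
# can only represent frequencies below the band limit.
kmax = 5
cols = [np.ones_like(t)]
for k in range(1, kmax + 1):
    cols += [np.cos(2 * np.pi * k * t / T), np.sin(2 * np.pi * k * t / T)]
A = np.stack(cols, axis=1)
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
smooth = A @ coef

print(np.sqrt(np.mean((smooth - low) ** 2)))  # small: high frequency removed
```

The fast-transform machinery matters only for speed: for realistic spectra with hundreds of thousands of pixels, you want FINUFFT-style scaling rather than this dense solve.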


correcting wrong simulations, linear edition

Soledad Villar (JHU) and I spent some time today constructing (on paper) a model to learn simultaneously from real and simulated data, even when the simulations have large systematic problems. The idea is to model the joint distribution of the real data, the simulated data, and the parameters of the simulated data. Then, using that model, infer the parameters that are most appropriate for each real data point. The problem setup has two modes. In one (which applies to, say, the APOGEE stellar spectra), there is a best-fit simulation for each data example. In the other, there is an observed data set (say, a cosmological large-scale structure survey) and many simulations that are relevant, but don't directly correspond one-to-one. We are hoping we have a plan for either case. One nice thing is: If this works, we will have a model not just for APOGEE stellar parameter estimation, but also for the missing physics in the stellar atmosphere simulations!


stellar flares

Gaby Contardo (Flatiron) and I have been trying to construct a project around light curves, time domain, prediction, feature extraction, and the arrow of time, for months now. Today we decided to look closely at a catalog of stellar flares (which are definitely time-asymmetric) prepared by Jim Davenport (UW). Can we make a compact or sparse representation? Do they cluster? Do those properties have relationships with stellar rotation phase or other context?



astronomy in film

One of my jobs at NYU is as an advisor to student screenwriters who are writing movies that involve science and technology. I didn't get much research done today, but I had a really interesting and engaging conversation with film-writers Yuan Yuan (NYU) and Sharon Lee (NYU) who are writing a film that involves the Beijing observatory, the LAMOST project, and the Cultural Revolution. I learned a lot in this call!


when do you ever sum over all the entries in a matrix?

Imagine you have $n$ measurements of a quantity $y$. What is your best estimate of the value of $y$? It turns out that if you have an estimate for the covariance matrix of $y$, the information in (expected inverse variance from) your $n$ data points is given by the sum of the entries of the inverse of that covariance matrix. This fact is obvious in retrospect, but also confused me, since this is such a non-coordinate-free thing to do to a matrix!
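
Here's a numerical check of the sum-over-all-entries fact, with a made-up covariance matrix: the inverse variance of the optimal (generalized-least-squares) mean is the quadratic form of the inverse covariance with the ones vector, which is literally the sum of all its entries.

```python
import numpy as np

rng = np.random.default_rng(1)

# A random valid covariance matrix for n = 4 repeated measurements of y.
n = 4
L = rng.normal(size=(n, n))
C = L @ L.T + n * np.eye(n)
Cinv = np.linalg.inv(C)

# The information (expected inverse variance) in the n measurements:
ones = np.ones(n)
info = ones @ Cinv @ ones
print(np.isclose(info, np.sum(Cinv)))  # True: the sum of ALL the entries

# Empirical check: the optimal (GLS) mean estimator has variance 1 / info.
y_true = 1.7
draws = y_true + rng.multivariate_normal(np.zeros(n), C, size=200000)
weights = Cinv @ ones / info
estimates = draws @ weights
print(np.var(estimates), 1.0 / info)  # these should agree
```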


why make a dust map? and Bayesian model elaboration

Lauren Anderson (Carnegie) and I had a wide-ranging conversation today. But part of it was about the dust map: We have a project with statisticians to deliver a three-dimensional dust map, using a very large Gaussian-process model. Right now the interesting parts of the project are around model checking and model elaboration: How do you take a model and decide what's wrong with it, in detail? Meaning: Not compare it to other models (that's a solved problem, in principle), but rather compare it to the data and see where it would benefit from improvement.

One key idea for model elaboration is to check the parts of the model you care about and see if those aspects are working well. David Blei (Columbia) told us to climb a mountain and think on this matter, so we did, today. We decided that our most important goals are (1) to deliver accurate extinction values to stellar targets, for our users, and (2) to find interesting dust structures (like spiral arms) if they are there in the data.

Now the challenge is to convert these considerations into posterior predictive checks that are informative about model assumptions. The challenge is that, in a real-data Bayesian inference, you don't know the truth! You just have your data and your model.


best RV observing strategies

I really solidly did actual coding today on a real research problem, which I have been working on with Megan Bedell (Flatiron) for a few years now. The context is: extreme precision radial-velocity surveys. The question is: Is there any advantage to taking one observation every night relative to taking K observations every K nights? I succeeded!

I can now show that the correlations induced in adjacent observations by asteroseismic p-modes make it advantageous to do K observations every K nights. Why? Because you can better infer the center-of-mass motion of the star with multiple, coherently p-mode-shifted observations. The argument is a bit subtle, but it will have implications for Terra Hunting and EXPRES and other projects that are looking for long-period planets.


EPRV capabilities

I had a conversation with Jacob Bean (Chicago) and Ben Montet (UNSW) about various radial-velocity projects we have going. We spent some time talking about what projects are better for telescopes of different apertures, and whether there is any chance the EPRV community could be induced to work together. I suggested that the creation of a big software effort in EPRV could bring people together, and help all projects. We also talked about data-analysis challenges for different kinds of spectrographs. One project we are going to do is get a gas cell component added in to the wobble model. I volunteered Matt Daunt (NYU) in his absence.


asteroseismic p-mode noise mitigation

I had a call with part of the HARPS3 team today, the sub-part working on observations of the Sun. Yes, Sun. That got us arguing about asteroseismic modes and me claiming that there are better approaches for ameliorating p-mode noise in extreme precision radial-velocity measurements than setting your exposure times carefully to null the modes. The crew asked me to get specific, so I had a call with Bedell (Flatiron) later in the day to work out what we need to assemble. The issues are about correlated noise: Asteroseismic noise is correlated; those correlations can be exploited for good, or ignored for bad. That's the argument I have to clearly make.


the Lasso as an optimizer

In group meeting, and in other conversations today, I asked about how to optimize a very large parameter vector, when my problem is convex but has an L1 penalty term. Both Gaby Contardo (Flatiron) and Soledad Villar (JHU) said: Use the standard Lasso optimizer. At first I thought “but my problem doesn't have exactly the Lasso form!”. But then I realized that it is possible to manipulate the operators I have so that the problem takes exactly the Lasso form, and then I can just use a standard Lasso optimizer! So I'm good and I can proceed.
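
Here's the kind of manipulation I mean, in a made-up instance: a penalty on ||W x||_1 with an invertible W becomes a standard Lasso in the variable z = W x, solvable with an off-the-shelf optimizer (note that sklearn's alpha scales the data term by 1/(2n), so it is not exactly my lambda).

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(5)

# Problem: min_x ||y - A x||^2 + lam * ||W x||_1, with W invertible.
# Substitute z = W x: min_z ||y - (A W^-1) z||^2 + lam * ||z||_1,
# which is exactly the standard Lasso form.
n, p = 200, 10
A = rng.normal(size=(n, p))
W = np.eye(p) + 0.1 * rng.normal(size=(p, p))  # made-up invertible operator

z_true = np.zeros(p)
z_true[[1, 4, 7]] = [1.0, -1.5, 0.8]
x_true = np.linalg.solve(W, z_true)
y = A @ x_true + 0.01 * rng.normal(size=n)

# Standard Lasso on the transformed design matrix.
model = Lasso(alpha=0.001, fit_intercept=False, max_iter=50000)
model.fit(A @ np.linalg.inv(W), y)
x_hat = np.linalg.solve(W, model.coef_)

print(np.max(np.abs(x_hat - x_true)))  # small
```

For huge problems you would apply W inverse as an operator rather than forming the dense matrix, but the reduction to Lasso form is the same.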


can we turn an image into a colormap?

I talked to Kate Storey-Fisher (NYU) about a beautiful rainbow quartz rock that she has: It is filled with colors, in a beautiful geological palette. Could we turn this into a colormap for making plots, or a set of colormaps? We discussed considerations.


constructing a bilinear dictionary method for light curves

After having many conversations with Gaby Contardo (Flatiron) and Christina Hedges (Ames) about finding events of various kinds in stellar light curves (from NASA Kepler and TESS), I was reminded of dictionary methods, or sparse-coding methods. So I spent some time writing down a possible sparse-coding approach for Kepler light curves, and even a bit of time writing some code. But I think we probably want something more general than the kind of bilinear problem I find it easy to write down: I am imagining a set of words, and a set of occurrences (and amplitudes) of those words in the time domain. But real events will have other parameters (shape and duration parameters), which suggests using more nonlinear methods.


discovering and measuring horizon-scale gradients in large-scale structure

Kate Storey-Fisher (NYU) and I are advising an undergraduate research project for Abby Williams (NYU) in cosmology. Williams is looking at the question: How precisely can we say that the large-scale structure in the Universe is homogeneous? Are there gradients in the amplitude of galaxy clustering (or other measures)? Her plan is to use Storey-Fisher's new clustering tools, which can look at variations in clustering without binning or patchifying the space. In the short term, however, we are starting in patches, just to establish a baseline. Today things came together and Williams can show that if we simulate a toy universe with a clustering gradient, she can discover and accurately measure that gradient, using analyses in patches. The first stage of this is to do some forecasts or information theory.
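
A toy version of the baseline patch analysis (all numbers invented): give a Gaussian field an amplitude gradient, estimate the amplitude patch by patch, and fit a line.

```python
import numpy as np

rng = np.random.default_rng(11)

# Toy universe: the fluctuation amplitude grows linearly across the box,
# sigma(x) = 1 + 0.5 x. Measure the amplitude in patches and fit the trend.
n_patch, n_per = 10, 5000
centers = (np.arange(n_patch) + 0.5) / n_patch
amps = np.empty(n_patch)
for i, x in enumerate(centers):
    field = rng.normal(0.0, 1.0 + 0.5 * x, size=n_per)
    amps[i] = np.std(field)  # per-patch amplitude estimate

slope, intercept = np.polyfit(centers, amps, 1)
print(slope, intercept)  # recovers the input gradient and amplitude
```

The real analysis uses clustering amplitudes rather than a pointwise variance, but the discover-and-measure logic of the patch baseline is the same.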


machine learning and ODEs

Today Soledad Villar (JHU) and I discussed different ways to structure a machine-learning method for a cosmological problem: The idea is to use the machine-learning method to replace or emulate a cosmological simulation. This is just a toy problem; of course I'm interested in data analysis, not theory, in the long run. But we realized today that we have a huge number of choices about how to structure this. Since the underlying data come from an ordinary differential equation, we can structure our ML method like an ordinary differential equation, and see what it finds! Or we can give it less structure (and more freedom) and see if it does better or worse. That is, you can build a neural network that is, on the inside, a differential equation. That's crazy. Obvious in retrospect but I've never thought this way before.
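
A skeleton of that structure, with random placeholder weights standing in for a trained network: the right-hand side of the ODE is a tiny multi-layer perceptron, and an off-the-shelf integrator does the rest.

```python
import numpy as np
from scipy.integrate import solve_ivp

rng = np.random.default_rng(7)

# A "neural ODE" skeleton: the right-hand side of the ODE is a tiny MLP.
# The weights here are random placeholders; in a real emulator they would
# be trained against simulation trajectories.
W1 = rng.normal(size=(8, 2)) / np.sqrt(2)
b1 = np.zeros(8)
W2 = rng.normal(size=(2, 8)) / np.sqrt(8)

def rhs(t, state):
    hidden = np.tanh(W1 @ state + b1)
    return W2 @ hidden  # bounded, so the trajectory grows at most linearly

sol = solve_ivp(rhs, (0.0, 10.0), y0=[1.0, 0.0])
print(sol.y.shape, np.all(np.isfinite(sol.y)))
```

Training would then mean differentiating the integrated trajectory with respect to the weights, which is exactly where the less-structured (free-form network) alternative becomes a meaningful comparison.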


elaborating the causal structure of wobble

Lily Zhao (Yale) and Megan Bedell (Flatiron) and I are working on measuring very precise radial velocities for very small data sets, where (although there are hundreds of thousands of pixels per spectrum) there are only a few epochs of observations. In these cases, it is hard for our data-driven method to separate the stellar spectrum model from the telluric spectrum model—our wobble method makes use of the independent covariances of stellar and telluric features to separate the star from the sky. So we discussed the point that really we should use all stars to learn the (maybe flexible) telluric model. That's been a dream since the beginning (it is even mentioned in the original wobble paper), but execution requires some design thinking: We want the optimizations to be tractable, and we want the interface to be sensible. Time to go to the whiteboard. Oh wait, it's a pandemic.


Cannon, neural network, physical model

In my weekly meeting with Teresa Huang (JHU) and Soledad Villar (JHU), we went through our methods for putting labels on stellar spectra (labels like effective temperature, surface gravity, and metallicity). We have all the machinery together now to do this with physical models, with The Cannon (a data-driven generative model), and with neural networks (deep learning, or other data-driven discriminative models). The idea is to see how well these different kinds of models respect our beliefs about stars and spectroscopic observations, and how they fit or over-fit, as a function of training and model choices. We are using the concept of adversarial attacks to guide us. All our pieces are in place now to do this full set of comparisons.