## 2021-07-29

### Dr Dou Liu

Today I had the pleasure of sitting on the PhD defense of Dou Liu (NYU), who has been working on AGN in the centers of galaxies, using *MaNGA* data from *SDSS-IV*. The part of Liu's thesis that is most exciting to me (perhaps not surprisingly) is the technical chapter, in which he finds a new method for combining irregularly dithered integral-field-unit spectroscopy exposures into a full data cube, with sky coordinates on two axes, and wavelength on the third. In this final data cube, his method gets much better final resolution, and lower pixel-to-pixel covariances in the noise, relative to the standard pipelines. His trick? He has generalized *spectro-perfectionism* (invented for spectral extraction) to the multi-dimensional spectral domain. It's beautiful stuff, and has implications for all sorts of imaging and spectroscopy projects going forward. Congratulations Dr. Liu and thank you!
## 2021-07-28

### my own special FMM algorithm

I was in a location with no internet and no computing, so I spent an hour or so writing down what I think is the fast multipole method. It involves building an octree, balanced in volume (not necessarily point content), and computing recursively the multipole amplitudes in all nodes (starting with the points in the leaves). Once that is done, at evaluation time, you do different things at different levels, depending on the radius out to which things are computed exactly. One thing I'm interested in is: Can you simplify that evaluation if you, say, build multiple trees?
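Here is a minimal sketch of just the tree-building and upward pass described above, not a full fast multipole method. It assumes a fixed-depth octree that splits space evenly in volume, keeps only the monopole and dipole amplitudes, and skips empty cells. The names `Node` and `build` are hypothetical, and the evaluation step is left out entirely.

```python
import numpy as np


class Node:
    """One cubic cell of the tree (hypothetical helper class)."""

    def __init__(self, center, half_size):
        self.center = np.asarray(center, dtype=float)
        self.half_size = float(half_size)
        self.children = []          # stays empty for a leaf
        self.mass = 0.0             # monopole amplitude
        self.dipole = np.zeros(3)   # dipole amplitude about self.center


def build(points, masses, center, half_size, depth):
    """Split cells evenly in volume down to a fixed depth, then fill multipoles."""
    node = Node(center, half_size)
    if depth == 0:
        # Leaf: amplitudes come directly from the points in the cell.
        offsets = points - node.center
        node.mass = masses.sum()
        node.dipole = (masses[:, None] * offsets).sum(axis=0)
        return node
    # Assign each point to one of the eight octants of this cell.
    bits = (points >= node.center).astype(int)
    octant = bits[:, 0] + 2 * bits[:, 1] + 4 * bits[:, 2]
    for i in range(8):
        inside = octant == i
        if not inside.any():
            continue  # assumption: empty cells are simply not stored
        sign = 2 * np.array([i & 1, (i >> 1) & 1, (i >> 2) & 1]) - 1
        child = build(points[inside], masses[inside],
                      node.center + 0.5 * node.half_size * sign,
                      0.5 * node.half_size, depth - 1)
        node.children.append(child)
        # Upward pass: shift the child's amplitudes to the parent's center.
        node.mass += child.mass
        node.dipole += child.dipole + child.mass * (child.center - node.center)
    return node


rng = np.random.default_rng(0)
points = rng.uniform(0.0, 1.0, size=(1000, 3))
masses = rng.uniform(0.5, 1.5, size=1000)
root = build(points, masses, center=[0.5, 0.5, 0.5], half_size=0.5, depth=3)
```

The root amplitudes should agree with direct sums over all the points, which is a cheap sanity check on the shifting step.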

## 2021-07-27

## 2021-07-26

### exoplanet atmospheres projects

I had a nice conversation today with Laura Kreidberg (MPIA) and Jason Dittmann (MPIA) about projects in ultra-precise spectroscopy for exoplanet-atmosphere science. I was pushing for projects in which we get closer to the metal (moving spectroscopy to two dimensions), and in which we use theoretical ideas to improve extraction of very weak spectral signals. Kreidberg was pushing for projects in which we use planet kinematics to separate very weak planet signals from star signals. On the latter, I recommended starting with co-adds of residuals.

## 2021-07-25

### units symmetry and dimensionless scalar quantities

I got in some quality time with Bernhard Schölkopf (MPI-IS) this weekend, in which we discussed many things related to contemporary machine learning and its overlap with physics. In particular, we discussed the point that the nonlinearities in machine-learning methods (like deep learning) have to be interpretable as nonlinear functions of dimensionless scalars, if the methods are going to be like laws of physics. That is a strong constraint on what can be put into nonlinear functions. We understood part of that in our recent paper, but not all of it. In particular, thinking about units or dimensions in machine learning might be extremely valuable.

## 2021-07-22

### band-limited image models

I had my quasi-weekly call with Dustin Lang (Perimeter) today, in which we discussed how to represent the differences between images, or represent models of images. I am obsessing about how to model distortions of spectrograph calibration images (like arcs and flats), and Lang and I endlessly discuss difference imaging to discover time-variable phenomena and moving objects. One thing Lang has observed is that sometimes a parametric model of a reference image makes a better reference image for image differencing than the reference image itself. Presumably this is because the model de-noises the data? But if so, could we just use a band-limited non-parametric model, say? We discussed many things related to all that.

## 2021-07-21

### Gaia BP-RP spectral fitting

Hans-Walter Rix and I discussed with Rene Andrae (MPIA) and Morgan Fouesneau (MPIA) how they are fitting the (not yet public) BP-RP spectra from the ESA *Gaia* mission. The rules are that they can only use certain kinds of models and they can only use *Gaia* data, nothing external to *Gaia*. It's a real challenge! We discussed methods and ideas for verifying and testing results.

## 2021-07-20

### more MPIA dust mapping

At Milky Way Group Meeting at MPIA, Thavisha Dharmawardena (MPIA) showed her first results on building pieces of a three-dimensional dust map from observed extinctions/attenuations to stars. As usual, the problem is to infer a three-dimensional map, preferably non-parametrically, from measurements of line-of-sight integrals through the map. She uses a Gaussian process, variational inference, and inducing points. She has some nice features in her maps (she started with star-formation regions with interesting morphologies). She sees extinction-law variations too; we discussed how those might be incorporated.

## 2021-07-19

### latent-variable model for black-hole masses

This morning Christina Eilers showed me that a Gaussian-process latent-variable model seems to be able to predict quasar spectra and black-hole masses, such that she can perform inferences to learn the black-hole masses just from the bolometric luminosities and the spectral shapes redward of Lyman alpha.

## 2021-07-18

### representing spectrograph calibration images

In a spectrograph, many of the raw calibration images are taken with something like arcs or flat-field lamps illuminating the slits or fibers, so that the spectral traces are lit up with calibration light. If the spectrograph flexes or changes, these traces move in two dimensions on the spectrograph image plane, or relative to the detector pixels. I am looking at whether we could have a *generative model* for these calibration images, so I spent the weekend toying with ways to represent the relevant kinds of small distortions of images. If the distortions are extremely small (as they are, say, for *EXPRES*) they can be represented with just a compact linear basis. But if they are a few pixels (as they appear to be for *BOSS*), I need a better representation.
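To illustrate the extremely-small-distortion case, here is a minimal sketch that assumes the distortion is a pure rigid shift of a small fraction of a pixel. Under that assumption the shifted image is, to first order, the reference image plus a linear combination of its two spatial derivatives, so the shift can be fit by linear least squares. The toy image and the function `toy_arc_image` are hypothetical and have nothing to do with real *EXPRES* or *BOSS* data.

```python
import numpy as np


def toy_arc_image(shift=(0.0, 0.0), shape=(64, 64), width=2.0):
    """Toy calibration frame: Gaussian spots displaced by shift = (dx, dy) pixels."""
    y, x = np.mgrid[0:shape[0], 0:shape[1]].astype(float)
    image = np.zeros(shape)
    for y0 in (16.0, 32.0, 48.0):
        for x0 in (12.0, 26.0, 40.0, 54.0):
            r2 = (x - x0 - shift[0]) ** 2 + (y - y0 - shift[1]) ** 2
            image += np.exp(-0.5 * r2 / width ** 2)
    return image


reference = toy_arc_image()
true_shift = (0.05, -0.03)  # pixels; small enough for a first-order expansion
observed = toy_arc_image(shift=true_shift)

# First-order Taylor expansion: I(x - d) ~= I(x) - d . grad I(x),
# so the two (negated) derivative images form a linear basis for small shifts.
d_dy, d_dx = np.gradient(reference)  # finite differences along rows, then columns
basis = np.column_stack([-d_dx.ravel(), -d_dy.ravel()])
residual = (observed - reference).ravel()
fitted_shift, *_ = np.linalg.lstsq(basis, residual, rcond=None)
print(fitted_shift)  # should land close to true_shift
```

Because the derivatives are finite-difference estimates, the fitted shift carries a bias of a few percent, and the whole approach breaks down once the shift approaches the width of the spots. That is the regime where a better representation is needed.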

## 2021-07-16

### talk on instrument calibration

I spent most of the day preparing this talk on instrument calibration, which I gave in the Königstuhl Colloquium this afternoon. I got excellent questions and I actually enjoyed it, despite the Zoom/hybrid format, which doesn't always work well.

## 2021-07-15

### maps of Hessian eigenvalues for gap-finding

As my loyal reader knows, Gaby Contardo (Flatiron) and I have been looking for gaps (valleys, voids) in point clouds using geometric methods on density estimates. Today she just did the very simplest thing of estimating the largest eigenvalue of the second-derivative tensor (Hessian of density with respect to position), and visualizing it for different density estimates (different bandwidths) and different bootstrap resamplings of the data. It is obvious, looking at these plots, that we can combine these maps into good gap-finders! This is simpler than our previous approaches, and will generalize better to higher dimensions. It's also slow, but we don't see anything that can be fast, especially in “high” dimensions (high like 3 or 4!!).
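A minimal sketch of that kind of map follows, assuming an isotropic Gaussian kernel density estimate (for which the Hessian is analytic) and a toy two-blob point cloud. The function name `largest_hessian_eigenvalue` and the final combination rule (the fraction of maps with a positive eigenvalue) are assumptions for illustration, not necessarily what we will end up using.

```python
import numpy as np


def largest_hessian_eigenvalue(data, query, bandwidth):
    """Largest eigenvalue of the Hessian of a Gaussian KDE at each query point."""
    n, dim = data.shape
    diff = query[:, None, :] - data[None, :, :]            # (Q, N, D)
    r2 = (diff ** 2).sum(axis=-1)
    norm = n * (2.0 * np.pi * bandwidth ** 2) ** (dim / 2)
    kernel = np.exp(-0.5 * r2 / bandwidth ** 2) / norm      # (Q, N)
    # Hessian of each kernel: K * (d d^T / h^4 - I / h^2), summed over data points.
    outer = np.einsum("qn,qni,qnj->qij", kernel, diff, diff) / bandwidth ** 4
    trace_part = kernel.sum(axis=1)[:, None, None] * np.eye(dim) / bandwidth ** 2
    return np.linalg.eigvalsh(outer - trace_part)[:, -1]    # eigenvalues ascend


rng = np.random.default_rng(0)
# Toy point cloud: two blobs separated by a gap along the x axis.
blob = rng.normal(0.0, 0.5, size=(500, 2))
data = np.vstack([blob[:250] + [-2.0, 0.0], blob[250:] + [2.0, 0.0]])

gx, gy = np.meshgrid(np.linspace(-4, 4, 30), np.linspace(-2, 2, 15))
grid = np.column_stack([gx.ravel(), gy.ravel()])

# One map per (bandwidth, bootstrap resampling) pair.
maps = []
for bandwidth in (0.4, 0.6, 0.8):
    for _ in range(5):
        resample = data[rng.integers(0, len(data), size=len(data))]
        maps.append(largest_hessian_eigenvalue(resample, grid, bandwidth))
maps = np.array(maps)                                       # (n_maps, n_grid)

# One hypothetical way to combine: how often is the largest eigenvalue positive?
positive_fraction = (maps > 0).mean(axis=0).reshape(gx.shape)
```

The largest eigenvalue is positive in the valley between the blobs (the density curves upward across the gap) and negative at the blob centers. The pairwise-difference array scales as queries times points times dimensions, which is one concrete reason this is slow.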

## 2021-07-14

### a sandbox for Milky Way dynamics

Today I met with Micah Oeur (Merced) and Juan Guerra (Yale) to discuss the project they are doing as part of the Flatiron summer school on stellar dynamics. Their project is to make a sandbox for testing different methods for inferring force laws (or gravitational potentials or mass distributions) from stellar kinematic data. We are starting by building very simple potentials (like literally simple one-d potentials like the simple harmonic oscillator) and very simple distribution functions (like isothermal) and seeing how the different methods (virial, Jeans, Schwarzschild, and torus imaging) work. The medium-term goal is to figure out how these methods are sensitive to their assumptions, and robust to data that violate those assumptions. Also some information theory!
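As a minimal sketch of the simplest possible sandbox case, take a one-dimensional simple harmonic oscillator with an isothermal distribution function and recover the frequency with the virial theorem. The parameter values and the function name `virial_omega` are hypothetical. For this potential the virial relation reduces to mean(v^2) = omega^2 mean(x^2), and the isothermal distribution function makes both positions and velocities Gaussian.

```python
import numpy as np


def virial_omega(x, v):
    """Virial estimate of the oscillator frequency: <v^2> = omega^2 <x^2>."""
    return np.sqrt(np.mean(v ** 2) / np.mean(x ** 2))


rng = np.random.default_rng(42)
omega_true = 1.7    # assumed frequency of the toy potential
sigma_v = 0.8       # assumed velocity dispersion of the isothermal DF
n_stars = 100_000

# Isothermal DF, f ~ exp(-E / sigma_v^2) with E = v^2 / 2 + omega^2 x^2 / 2,
# factorizes into independent Gaussians in x and v.
x = rng.normal(0.0, sigma_v / omega_true, size=n_stars)
v = rng.normal(0.0, sigma_v, size=n_stars)

omega_virial = virial_omega(x, v)
print(omega_true, omega_virial)
```

The interesting experiments start when the data violate the assumptions, for example when the sample is not phase-mixed or the selection depends on position.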

## 2021-07-13

### how much calibration data does a spectrograph need?

There was an *SDSS-V* operational telecon today, in which we discussed the plans for the first year of data, and how those plans should depend on, or be conditional on, what we learn in the commissioning phase. One of the most important things about *SDSS-V* is that it is robotic and fiber-fed, so we can move fast and do time-domain things. But how fast? This depends, in turn, on how much calibration data we need as a function of time. I proposed that we could ask how well we can *synthesize* calibration data in one telescope configuration at one time, given calibration data from other telescope configurations at other times. This is not unlike the approach we took to calibrating *EXPRES*. So of course it was put back to me: Design the experiments that will answer this question! The main issue is that the *BOSS* spectrographs hang off the back of the pointing, tracking telescope.

## 2021-07-12

### optimal sizes of sails?

I spent some time in-between vacationing to look at the relative sizes of sails and keels on sailboats (yes, sailboats). I find that a boat sailing cross-wind sails fastest when the ram-pressure force prefactors (effective area times density) of the sail and the keel are comparable. That is, you want the effective area of the sail to be something like 800 times larger than the effective area of the keel! Strange, but maybe not false for the fastest competition sailboats?
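The factor of 800 is just the density ratio of water to air. As a rough worked version, assuming round-number densities of about 1.2 kg/m³ for air and 1000 kg/m³ for water (my assumed values), matching the prefactors gives:

$$
\rho_{\mathrm{air}}\,A_{\mathrm{sail}} \approx \rho_{\mathrm{water}}\,A_{\mathrm{keel}}
\quad\Longrightarrow\quad
\frac{A_{\mathrm{sail}}}{A_{\mathrm{keel}}} \approx \frac{\rho_{\mathrm{water}}}{\rho_{\mathrm{air}}} \approx \frac{1000}{1.2} \approx 830
$$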

## 2021-07-07

### nothing really

Just some writing in my physics-of-sailing project, and getting ready for a short vacation (gasp!).

## 2021-07-06

### PCA and convex optimization

Soledad Villar (JHU) and I talked about bi-linear problems today, in the context of instrument calibration and computer vision. We looked at the kinds of optimizations that are involved in these problems. She showed me that, if you think of PCA as delivering a projection operator (that is, not the specific eigenvectors and amplitudes, but just the projection operator you would construct from those eigenvectors), that projection operator can be derived from a convex optimization in which the objective is purely mean squared error. That was news to me!
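Here is a small numerical illustration of the projection-operator view. It does not solve the convex program itself, and the helper names `pca_basis`, `mse`, and `random_projector` are hypothetical. It only checks two things that make the operator the natural object: it does not depend on which orthonormal basis of the subspace you pick, and it has lower mean squared error than other rank-k orthogonal projectors.

```python
import numpy as np


def pca_basis(X, k):
    """Top-k principal directions of mean-subtracted X, as columns of a (D, k) array."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return vt[:k].T


def mse(X, P):
    """Mean squared reconstruction error of projecting mean-subtracted X with P."""
    Xc = X - X.mean(axis=0)
    return np.mean((Xc - Xc @ P) ** 2)


def random_projector(rng, dim, k):
    """Orthogonal projector onto a random k-dimensional subspace."""
    Q, _ = np.linalg.qr(rng.normal(size=(dim, k)))
    return Q @ Q.T


rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 5))
k = 2

V = pca_basis(X, k)
P = V @ V.T                                  # the projection operator

# Rotating the basis within the subspace leaves the projector unchanged.
R, _ = np.linalg.qr(rng.normal(size=(k, k)))
P_rotated = (V @ R) @ (V @ R).T

pca_mse = mse(X, P)
random_mse = [mse(X, random_projector(rng, 5, k)) for _ in range(100)]
print(pca_mse, min(random_mse))
```

The individual eigenvectors are only defined up to sign (and up to rotation when eigenvalues are degenerate), but `P` has none of that ambiguity.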

## 2021-07-05

### physics-of-sailing literature

I sucked it up and read a bunch of the physics-of-sailing literature today (and on the weekend). Some of the books very correctly attribute the forces on sails and wings to momentum transport. Some of the books very incorrectly attribute them to differences of pressure calculable from the Bernoulli effect alone. But in reading it all, I did come to the conclusion that no-one is working in precisely the space we want to work, so I do think there is a (correctly scoped) paper to write. Of course even if there weren't, I couldn't stop myself!

## 2021-07-02

### statistics translation project?

I had a wide-ranging conversation today with former NYU undergraduate Hilary Gao. One thing we discussed is the idea that physicists (and biologists, chemists, and so on) know a lot about statistics and evidence, and yet often find it hard to understand social-science research (like the epidemiology around Coronavirus and the data around race and policing, for two contemporary examples). This is (in my opinion) partly because the social-science literature involves aspects of causal inference that are fundamental, but don't appear in the same form in the natural sciences. We discussed what it would take to usefully write about or intervene into this quasi-translation project.

## 2021-07-01

### the geometry of gaps in point clouds

Gaby Contardo (Flatiron) and I had our weekly today on our geometric data-analysis methods for finding and characterizing gaps or valleys in point clouds. We are starting at 2D data, which is the simplest case, and it is interesting and we have nice results. But scaling up to more dimensions is hard. For one, there is the curse of dimensionality, such that anything that relies on, or approximates, density estimation gets hard fast. And for another, the kinds of geometric structures or options for gaps blow up combinatorially (or faster than linearly anyway) with the number of dimensions. Do we have to enumerate the possibilities and track them all? Or are there more clever things? We don't yet have answers, even for 3D, let alone 6D!