I had a great conversation today with Matt Daunt (NYU), building on yesterday's discussion with Megan Bedell (Flatiron), about how to simulate data from an extreme-precision radial-velocity spectrograph. We decided to simulate the star, the atmosphere, and the (gasp!) gas cell all at very high resolution, then combine them physically, then degrade to the spectrograph resolution (which is nonetheless very high), and then sample and noisify the resulting data. The idea is: Make the structure of the code match the structure of our physical, or causal, beliefs. We decided to fork this data simulation into its own project.
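The pipeline order (simulate components at high resolution, multiply, convolve, sample, noisify) can be sketched in a few lines. This is a toy version with made-up line positions, depths, and a made-up resolving power, not the real project code:

```python
import numpy as np

rng = np.random.default_rng(17)

# Hypothetical numbers throughout; the real simulation is much richer.
R_spec = 100_000                       # spectrograph resolving power
dlnlam = 1.0 / (8 * R_spec)            # high-res grid, well above R_spec
lnlam = np.arange(np.log(15_100.0), np.log(15_200.0), dlnlam)

def line(center, depth, width):
    """One toy Gaussian absorption line in ln-lambda."""
    return 1.0 - depth * np.exp(-0.5 * ((lnlam - np.log(center)) / width) ** 2)

# Step 1: each physical component at very high resolution.
star = line(15_150.0, 0.6, 3e-6)
atmosphere = line(15_170.0, 0.3, 2e-6)
gas_cell = line(15_130.0, 0.2, 2e-6)

# Step 2: combine them physically (multiplicative transmission).
true_flux = star * atmosphere * gas_cell

# Step 3: convolve down to the spectrograph resolution (Gaussian LSF,
# FWHM = 1/R_spec in ln-lambda).
sigma_pix = (1.0 / R_spec) / 2.355 / dlnlam
kx = np.arange(-5 * sigma_pix, 5 * sigma_pix + 1)
kernel = np.exp(-0.5 * (kx / sigma_pix) ** 2)
kernel /= kernel.sum()
observed = np.convolve(true_flux, kernel, mode="same")

# Step 4: sample onto a coarser detector grid, then noisify.
pixels = observed[::4]
data = pixels + 0.01 * rng.normal(size=pixels.shape)
```

The point is that each causal stage of the physics is one stage of the code, in the same order.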
As my loyal reader knows, I am interested in the fast multipole method (FMM) and whether it could be used to improve or speed up machine-learning methods on graphs or spatial point clouds. Over the last months, I have learned about many limitations of FMMs, some of which we have discussed here. I'm still interested! But when I last spoke with Leslie Greengard (Flatiron), he indicated that if you want to scale FMMs up to very clustered data in high dimensions, you may have to think about truly adaptive trees (not the fixed tree of an FMM), such as kd-trees. Today Soledad Villar (JHU) and I discussed this idea. The question is: What could be proved about such an approach, and are there variants that come with accuracy guarantees? The FMM has the beautiful property that you can compute the precision of your approximation and dial up the order to get better precision.
Today Christina Eilers (MIT) updated Hans-Walter Rix (MPIA) and me on our project to self-calibrate the element-abundance measurements in APOGEE. We are looking at self-consistency of the abundance distribution as a function of actions; in a well-mixed Galaxy this could be used to calibrate the biases of the abundance measurements with surface gravity (a known effect in the data) and spectral resolution (a possible effect). Eilers has beautiful results: The abundances get better and the abundance gradients in the Galaxy (with radius or azimuthal action, and with vertical height or vertical action) become more clear and more sensible. So we have a paper to write!
Today Soledad Villar (JHU), Kate Storey-Fisher (NYU), Weichi Yao (NYU), and I crashed the machine-learning group meeting hosted by Shirley Ho (Flatiron) and Gaby Contardo (Flatiron). Villar presented our new paper on gauge-invariant functions and we started the conversation about what to do with it. We vowed to come back to the meeting to discuss that: What are the best applications of machine learning in cosmology and astrophysics right now?
I've had a lifetime of conversations with Hans-Walter Rix (MPIA) about the point that you could in principle sail a sailboat with flat sails: Nothing about the curvature of the sails is essential to sailing. The curvature helps, but it isn't necessary. I have had another lifetime of conversations with Matt Kleban (NYU) about the point that sailing depends on the relative velocity between the air and the water, which leads to some hilarious physics problems involving sailing on rivers in zero wind (it's possible, because a flowing river is moving relative to the dead air).
These worlds collided this weekend because—inspired by a twitter conversation—I finally built a proper ram-pressure model of a flat-sail, flat-keel sailboat and got it all working. It's sweet! It sails beautifully. Much more to say, but the question is: Is there a paper to write?
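The core of a ram-pressure model is tiny. Here is my own minimal sketch (not necessarily the model built above, and with made-up areas and speeds): a uniform stream hitting a flat plate delivers a force along the plate normal, proportional to the square of the normal component of the flow.

```python
import numpy as np

def ram_pressure_force(flow, normal, rho, area):
    """Ram-pressure force of a uniform stream on a flat plate.

    Toy model: the plate intercepts the normal component of the
    incident momentum flux, so the force is rho * area * v_n**2
    directed along the plate normal.
    """
    n = normal / np.linalg.norm(normal)
    v_n = flow @ n                         # normal component of the flow
    return rho * area * v_n * abs(v_n) * n

# Hypothetical numbers: wind along +x, a flat sail tilted 60 degrees.
rho_air, sail_area = 1.2, 10.0             # kg / m^3, m^2
wind = np.array([8.0, 0.0])                # m / s, in the boat frame
theta = np.deg2rad(60.0)
sail_normal = np.array([np.cos(theta), np.sin(theta)])

force = ram_pressure_force(wind, sail_normal, rho_air, sail_area)
# The force has a component perpendicular to the wind; a flat keel
# resisting sideways motion through the water turns that into
# forward progress.
```

The same function applies to the keel, with the water density and the boat's velocity relative to the water, which is exactly where the air-versus-water relative velocity enters.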
I worked today with Katherine Alsfelder (NYU) to develop statistics on APOGEE spectra: There are two spectrographs (one in the North and one in the South) and there are 300 fibers per spectrograph. How many stars have been observed in each of the 600 different options, and how many of the 600-choose-2 pairs of options have seen the same star? This is all in preparation for empirical cross-calibration of the spectrographs. There is a lot of data! But 600-choose-2 is a huge number.
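Just how huge is easy to check:

```python
from math import comb

n_spectrographs = 2
n_fibers = 300
n_options = n_spectrographs * n_fibers  # 600 (spectrograph, fiber) options
n_pairs = comb(n_options, 2)            # pairs needing a star in common

print(n_options, n_pairs)               # 600 179700
```

So a complete pairwise cross-calibration would need common stars in nearly 180,000 distinct pairs of (spectrograph, fiber) combinations.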
Today I gave a colloquium at the University of Cambridge. My slides are here. I spoke about how to make precise measurements, how to design surveys, and how to exploit structure in noise. It's a rich set of things, and most of the writing about information theory in astronomy is only in the cosmology domain. Time to change that, maybe? It is also the case that the best book about information and inference ever written was written in Cambridge! So I was bringing coals to Newcastle, ish!
Today I spoke at the “meeting-in-meeting” on machine learning at the summer AAS meeting. My slides are here. I started out a bit negative but I ended up saying very positive things about what machine learning can do for astrophysics. I got as much feedback on the twitters afterwards (maybe more) as I did in real time. Several of the other speakers in my session mentioned or discussed contrastive learning, which looks like it might be an interesting unsupervised technique.
I'm giving two talks this week, one at #AAS238 and one at the University of Cambridge. Because I am a masochist (?) I put in titles and abstracts for both talks that are totally unlike those for any talks I have given previously. So I have to make slides entirely from scratch! I spent every bit of time today not in meetings working on slides. I'm not at all ready!
One of my PhD advisors—my official advisor—was Roger Blandford (now at Stanford). Blandford, being old-school, responded to a tweet thread I started by sending me email. I am trying to move over to always describing tensors and rotation operators and Lorentz transformations and the like in terms of unit vectors, and I realized that the most enlightened community along these lines are the quantum mechanics. Probably because they work in infinite-dimensional spaces often! Anyway, there are deep connections between vectors in a space and functions in a Hilbert space. I'm still learning; I think I will never fully get it.
Adrian Price-Whelan and I discussed today some oddities that Matt Daunt (NYU) is finding while trying to measure radial velocities in extremely noisy, fast APOGEE sub-exposures. He finds that the objective function we are using is not obviously smooth on 10-ish km/s velocity scales. Why not? We don't know. But what we do know is that a spectrograph with resolution 22,500 cannot put sharp structures into a likelihood function on scales smaller than about 13 km/s.
There's a nice paradox here, in fact: The spectrograph can't see features on scales smaller than 13 km/s, and yet we can reliably measure radial velocities much better than this! How? The informal answer is that the radial-velocity precision is roughly 13 km/s divided by a suitably defined signal-to-noise ratio of the spectrum. The formal answer involves information theory—the Fisher information, to be precise.
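Both answers can be seen in a toy calculation. Assuming a single Gaussian absorption line whose width is set by the resolution element (all numbers illustrative, not APOGEE's), the Fisher information for the velocity shift gives a Cramér–Rao bound far below 13 km/s:

```python
import numpy as np

c = 299_792.458                 # speed of light, km / s
R = 22_500                      # spectrograph resolving power
dv_res = c / R                  # resolution element, ~13.3 km / s

# Toy spectrum: one Gaussian line with FWHM equal to the resolution
# element, on a pixel grid in velocity units.
v_grid = np.arange(-100.0, 100.0, 3.0)       # km / s
sigma_line = dv_res / 2.355                  # Gaussian sigma from FWHM

def model(v_shift):
    return 1.0 - 0.5 * np.exp(-0.5 * ((v_grid - v_shift) / sigma_line) ** 2)

# Fisher information for the shift, with per-pixel noise sigma_n
# (signal-to-noise ~100 per pixel):
sigma_n = 0.01
eps = 1e-3
dmodel_dv = (model(eps) - model(-eps)) / (2 * eps)
fisher = np.sum(dmodel_dv ** 2) / sigma_n ** 2
sigma_v = 1.0 / np.sqrt(fisher)              # Cramér–Rao bound, km / s
```

With these numbers the bound comes out around a tenth of a km/s: the resolution element divided by an effective signal-to-noise of order a hundred, just as the informal answer says.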
I had the great honor to be on the PhD committee of Lily Zhao (Yale), who defended her dissertation today. It was great and remarkable. She has worked on hardware, calibration, software, stellar astrophysics, and planets. Her seminar was wide-ranging, and the questions she fielded were legion in number and scope. She has already had a big impact on extreme-precision radial-velocity projects, and she is poised to have even more impact in the future. One of the underlying ideas of her work is that EPRV projects are integrated hardware–software systems. This idea should inform everything we do, going forward. I asked a million technical questions, but I also asked questions about the search for life, and the astronomical community's management and interoperation of its large supply of diverse spectrographs. In typical Zhao fashion, she had interesting things to say about all these things.
Soledad Villar (JHU) and I further discussed the problem of orthogonalization of vectors—or finding orthonormal basis vectors that span a subspace—in special (and general) relativity. She proposed a set of hacks that correct the generalization of Gram–Schmidt orthogonalization that I proposed a week or so ago. It's complicated: although the straightforward generalization of GS works with probability one, there are cases you can construct that bork completely. The problem is that the method involves division by an inner product, and if a vector becomes light-like, that inner product vanishes.
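To see the failure mode concretely, here is my own toy version of the straightforward generalization (orthogonalization only, no normalization), with the Minkowski metric in place of the Euclidean dot product:

```python
import numpy as np

eta = np.diag([-1.0, 1.0, 1.0, 1.0])   # Minkowski metric, signature (-,+,+,+)

def mdot(u, v):
    """Minkowski inner product."""
    return u @ eta @ v

def gram_schmidt(vectors):
    """Naive Gram–Schmidt with the Minkowski inner product (a sketch).

    Works for generic inputs, but divides by mdot(e, e), which
    vanishes whenever an intermediate vector comes out light-like.
    """
    basis = []
    for v in vectors:
        w = np.array(v, dtype=float)
        for e in basis:
            w = w - (mdot(w, e) / mdot(e, e)) * e  # blows up if mdot(e, e) == 0
        basis.append(w)
    return basis

# Generic case: works, and the outputs are eta-orthogonal.
b1, b2 = gram_schmidt([np.array([2.0, 1.0, 0.0, 0.0]),
                       np.array([0.0, 1.0, 0.0, 0.0])])

# Pathological case: a light-like vector has zero Minkowski norm,
# so the very next projection step would divide by zero.
null_vec = np.array([1.0, 1.0, 0.0, 0.0])
```

Note that normalization is just as bad: dividing by the square root of the (absolute) self-inner-product fails for the same null vectors.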
In a heroic final push, Soledad Villar (JHU) finished our paper for NeurIPS submission today. We showed that you can make gauge-invariant neural networks without using the irreducible representations of group theory, or any other heavy computational machinery, at least for large classes of groups—indeed, for all the groups that appear in classical physics (orthogonal, rotation, Euclidean, Lorentz, Poincaré, and permutation groups). Our contribution is pure math! It is only about machine learning inasmuch as it suggests future methods and simplifications. We will post it to arXiv next week.
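A cartoon of the idea, in the simplest case (this is my illustration, far less general than the paper's construction): for the orthogonal group, scalar functions of a set of vectors can be built from the pairwise inner products, which are invariant by construction—no irreps needed.

```python
import numpy as np

rng = np.random.default_rng(0)

def invariant_features(points):
    """All pairwise dot products of the input vectors.

    These Gram-matrix entries are unchanged under any orthogonal
    transformation of the points, so any function of them is an
    O(d)-invariant function of the point set.
    """
    return points @ points.T

# Check invariance under a random orthogonal transformation.
pts = rng.normal(size=(5, 3))
q, _ = np.linalg.qr(rng.normal(size=(3, 3)))  # random orthogonal matrix
same = np.allclose(invariant_features(pts), invariant_features(pts @ q.T))
```

A network that takes the Gram matrix as input is then invariant for free; the interesting work is showing that such constructions lose no expressive power, and extending them to the other groups.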