GPRV, day 4

Today was day 4, the last day of GPRV in Oxford. The day ended with a discussion led by Heather Cegla (Warwick) and Jennifer Burt (JPL) about EPRV and national priorities. Exoplanet science is obviously extremely important to the 2021 Decadal Survey, but in detail, the first seven chapters of that Survey (the chapters to which NASA and NSF must respond) do not actually mention radial velocity! The conversation in the room today was extremely wide-ranging; it covered hardware, software, science goals, and community-building goals, on time-scales from months to years to decades.

The highest-level recommendation of the Decadal Survey was that we need to do preparatory work to design, and to assess the feasibility of, a large IR–visible–UV telescope that will discover habitable worlds. There is no doubt (I think it's uncontroversial) that this preparatory work will require lots and lots of EPRV science and observations. Of course, the fact that this is obvious doesn't settle the question of whether there will be abundant funding!

It will come as no surprise to my loyal reader that I was a proponent, in this discussion, of building open-science communities around open data, open-source software, and open science collaborations. I think we have so much evidence now that open-science communities do science way better. What I loved is that there were absolutely no objections in the room to this idea. The only controversies were about exactly how open data should be managed and released in that utopian future. I'm optimistic about this business!

And I thank Suzanne Aigrain (Oxford) and her OC for a great meeting!


GPRV, day 3

Day 3 of GPRV continued great! There were a few talks and discussions of very young stars that got everyone in the room quite excited, from Di Maio (INAF), Suárez Mascareño (IAC), and Nielsen (Oxford). The activity signals are huge, but the planets are extremely interesting, so how do we approach this? Tons of observing time? Cleverness? Give up? Since I think it is so important to understand how planetary systems form and evolve, I would be willing to spend the telescope time.

In the morning, Luger (Flatiron) gave a seminar and then a tutorial about modeling stellar surfaces and predicting spectroscopic quantities. The tutorial was fun; his code Starry does everything an astronomer could want, and beautifully (and, of course, blazingly fast). We had fun playing with it in a group hacking session.


GPRV, day 2

Today was day two of GPRV. It was a delight! Here's another highly unfair summary of the day:

Hara (Geneva) kicked it off with a discussion of a Bayesian-decision-theory-like method for deciding on the reality and correctness of exoplanet discoveries. He made clever choices to deliver really strong probabilistic results. I was about to object to all this but then he disarmed me at the end of the talk by noting that everything is extremely sensitive to noise models and that is the biggest issue. He gave some chilling examples.

Shahaf (Tel Aviv) showed some very nice results in old-school statistics that generalize the periodogram to a correlation between phase differences and distances between pairs of any quantities you like, as a function of period. This can be used to perform causal inferences for periods seen in naive periodograms. He uses a very interesting phase variable for these phase differences; this is extremely relevant to things I have been discussing with Zhao and Bedell.

Mortier (Cambridge) mesmerized the room with her work with six (yes 6) years of solar data from HARPS-N (maybe). She can show amazing relationships between pipeline RVs, activity indicators, and spectral shape measures. But she showed that often the correlations are not at zero time lag. Often the correlations are strongest with delays of 1/9 to 1/8 of a rotation period. When she sub-samples the data down to the cadences of the typical long-term monitoring campaigns we are going to run on distant stars, the results are a bit scary. That led to a lot of discussion over lunch and dinner.

Zhao (Flatiron), Buchhave (DTU), and Dumusque (Geneva) led discussions on community-building, hardware, instrument calibration, and other things. The meeting is set up for lots of discussion and is itself an extremely good example of a community-building activity, around the hard challenges of EPRV. I opined in one of these sessions that EPRV now looks like cosmology around 2000, when everything was just about to go open and the world community started working together. This meeting is part of the change that we want to see.

Finally, a theme of the day was representations for spectral signals. Dumusque (inadvertently, I think) made a strong case that we should be working in the 2-d spectrograph images directly! That's music to my heart. He also emphasized that the stellar surface is a complex physical place. I agree! And Cretignier (Geneva) showed a beautiful representation of the spectral residuals that disentangles Doppler shifts from spectral-variability signals. I think his work and Shahaf's could be combined in interesting ways; I am excited to get back to the lab.


GPRV, day 1

The GPRV meeting started in Oxford today. The meeting brings together people working on data analysis in extreme precision radial-velocity projects, but united by interests in and uses of Gaussian processes. The first day ended with a very nice tutorial by Foreman-Mackey (Flatiron) on applied-math and computational tools for scalable Gaussian processes. He even live-coded and blew everyone's mind with jax in Python.
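For readers who haven't seen it, the core computation that all these scalable tools accelerate is the GP log marginal likelihood. Here is a minimal, deliberately naive O(N^3) numpy sketch of that quantity for a squared-exponential kernel (my own toy, not Foreman-Mackey's code; the scalable methods get the same answer much faster for structured kernels):

```python
import numpy as np

def gp_lnlike(t, y, sigma2, amp2, ell):
    """Log marginal likelihood of a zero-mean GP with a squared-exponential
    kernel plus white noise, via the naive O(N^3) Cholesky route."""
    dt = t[:, None] - t[None, :]
    K = amp2 * np.exp(-0.5 * dt**2 / ell**2) + sigma2 * np.eye(len(t))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    # ln|K| = 2 * sum(log(diag(L)))
    return (-0.5 * y @ alpha
            - np.log(np.diag(L)).sum()
            - 0.5 * len(t) * np.log(2.0 * np.pi))

rng = np.random.default_rng(17)
t = np.sort(rng.uniform(0.0, 10.0, 50))
y = np.sin(t) + 0.1 * rng.standard_normal(50)
print(gp_lnlike(t, y, 0.01, 1.0, 1.0))
```

The point of tools like jax is that a function like this can be jit-compiled and differentiated automatically with respect to the kernel parameters, which is what makes serious GP fitting practical.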

Many talks (including those by Barragán (Oxford), Delisle (Geneva), and Tran (UT Austin)) are using a Gaussian process and its derivative, or a pair of Gaussian processes, to model the star's variability, with photometry, radial-velocity measurements, and activity indicators modeled as linear combinations of these latent processes. That's a really interesting theme, and connects somehow to my evil plan (with Bedell, Luger, Zhao, et al) of modeling the whole stellar surface. It is definitely an exciting time.
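The "GP and its derivative" construction is worth unpacking: because differentiation is linear, a GP and its time derivative are jointly Gaussian, with cross-covariances obtained by differentiating the kernel. Here is a sketch (my construction, not any particular speaker's code) for a squared-exponential kernel:

```python
import numpy as np

def joint_cov(t, amp2=1.0, ell=1.0):
    """Joint covariance of [f(t), f'(t)] for a GP f with kernel
    k(t, t') = amp2 * exp(-(t - t')^2 / (2 ell^2)).
    The cross and derivative blocks come from differentiating k."""
    d = t[:, None] - t[None, :]                   # Delta = t - t'
    k = amp2 * np.exp(-0.5 * d**2 / ell**2)       # Cov[f, f]
    k_fdf = k * d / ell**2                        # Cov[f, f'] = dk/dt'
    k_dfdf = k * (1.0 / ell**2 - d**2 / ell**4)   # Cov[f', f'] = d2k/dtdt'
    top = np.hstack([k, k_fdf])
    bot = np.hstack([k_fdf.T, k_dfdf])
    return np.vstack([top, bot])

t = np.linspace(0.0, 5.0, 30)
K = joint_cov(t)
# a valid joint covariance must be (numerically) positive semi-definite:
print(np.linalg.eigvalsh(K).min())
```

Observables (photometry, RVs, indicators) are then modeled as fixed linear combinations of these two latent columns, which is what couples the data sets.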

One issue that came up is how to judge or assess over-fitting. There was no consensus or answer, and most of the GP practitioners are very Bayesian. But Bayesian approaches aren't always sensitive to true statistical violations of the model; I want to see some cross-validation in this house.

In other news, Halverson (JPL) told us about publicly available solar data (and lots of it) from NASA NEID. I might want to play with that when I get home!


the black-body law

Today Miles Cranmer (Princeton) chatted with Weichi Yao (NYU), Soledad Villar (JHU), and me about things related to symbolic regression, dimensional analysis, and so on. He brought up a very interesting problem in the history of physics: The black-body radiation law, which is attributed to Planck. Planck knew about temperature, wavelength, the speed of light, and Boltzmann's constant k. Dimensionally, these can be combined into only one thing that has units of intensity, and that one thing is the long-wavelength black-body law. At short wavelengths, the behavior can't be explained without the introduction of a new constant, and that constant has to have non-trivial dimensions (units). He figured it out, and that constant ended up governing the hydrogen atom spectrum, quantum mechanics, and everything else. Indeed that constant, h, bears his name. Could we have learned this ourselves just from the data directly with a machine? After all, that's what Planck did, right?
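The dimensional argument can be checked numerically: with only c, k, T, and lambda, the unique combination with units of spectral radiance is the Rayleigh–Jeans law B = 2ckT/lambda^4, and Planck's law (which needs the new constant h) reduces to it at long wavelengths. A quick sketch in SI units:

```python
import numpy as np

# SI values of the constants (CODATA exact/defined values)
h = 6.62607015e-34   # Planck constant [J s]
c = 2.99792458e8     # speed of light [m / s]
k = 1.380649e-23     # Boltzmann constant [J / K]

def planck(lam, T):
    """Planck spectral radiance B_lambda [W m^-3 sr^-1]."""
    return (2.0 * h * c**2 / lam**5) / np.expm1(h * c / (lam * k * T))

def rayleigh_jeans(lam, T):
    """The only intensity-dimensioned combination of c, k, T, lam."""
    return 2.0 * c * k * T / lam**4

T = 5800.0    # roughly solar
print(planck(1.0, T) / rayleigh_jeans(1.0, T))   # long wavelength: ratio near 1
print(planck(5e-7, T) / rayleigh_jeans(5e-7, T)) # optical: Wien suppression, ratio < 1
```

The short-wavelength departure from the h-free prediction is exactly where the data forced Planck to introduce the new dimensionful constant.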


six-quark state?

Today was a great blackboard talk at CCPP by Glennys Farrar (NYU) about a possible six-quark state in QCD. She has been thinking about this for a decade or so, because it might have implications for dark matter and issues in QCD. Today she focused on the latter: There are terms in the g−2 calculation for the muon that can be estimated either with lattice QCD or by integrating some observed branching ratios from experiment. These two methods disagree, and the observational method disagrees (more strongly) with the g−2 measurement. But Farrar shows that if there is a long-lived 6-quark state, it can potentially affect the QCD calculation (implicitly) but would be evaded by the branching-ratio measurements (because it would evade all event triggers). Her model requires some good luck with QCD parameters and bound states, but if that luck holds, she can pull dark matter into the standard model and solve some precision-measurement issues! After her talk we discussed a bit about just how hard lattice QCD is. It's absurd!


coordinate freedom

Bernhard Schölkopf (MPI-IS) and I spent time drinking coffee this weekend. Among the many subjects we discussed was language around invariance, equivariance, covariance, coordinate freedom, and symmetry. These words! I have strong opinions, as my loyal reader might know. But during the conversation I had an epiphany in which I understood why Einstein called the symmetry of general relativity “general covariance”: He was probably keying off the mathematicians, who used covariance back then the way we use equivariance now.

I don't like the word “equivariance” at all! Why do we want to write the laws of physics in a rotationally-symmetric (or rotationally equivariant or orientation-free) way? There are two completely different reasons! One is that the laws of physics are observed to be rotationally invariant (which leads to conservation of angular momentum and so on). The other is the theoretical idea that the laws of physics can't depend on investigator choices about coordinates. These are completely different, and the latter is extremely strong. We debated whether there was something to write about all this somewhere.


how to build an astrophysics program?

In group meeting today John Forbes (Flatiron) asked an interesting question: How do you build a good astrophysics program at a small place? He's thinking about this because he is on the job market. My own answer is a bit weird: It is to work with people you trust, since when a place is small, and resources are shared, trust is paramount. But this conversation was interesting, and it was also an illustration of an amusing fact: many of the most interesting discussion topics at the Astronomical Data Group meeting at Flatiron are not about astronomical data analysis!


causality and time ordering

I had a nice chat with David Blei (Columbia) at the end of the day about the question of whether causal inference (a subject in statistics) can be re-phrased in terms of making predictions about the time-ordering of events. He was not extremely positive about that project! But we talked about the causal-inference approaches. I don't like many of them! Because many of them somehow assume that it is possible to intervene on the situation, and how can you intervene on a unitary system (like, say, the Universe)? Does causality not exist in physics? Does the force cause the acceleration or does the acceleration cause the force? There isn't an answer to that in physics.


writing about radial-velocity precision

Today I went through, with Megan Bedell (Flatiron), the paper we code-name EPRV, which is about the precision with which you can measure a (change in a) radial velocity using spectroscopy. One of the points we discussed is how the results depend on stellar temperature, spectrograph resolution, and wavelength coverage. There is no simple expression of course, because stars vary in such complicated ways with temperature, and the line lists are immense. So we end up having methods that are useful, but not simple back-of-envelope anything. Another point we discussed is that the assumption that the star doesn't vary is a very wrong assumption, and the whole point of the whole literature these days! We care about this point and want to address it in the next work we do here. But how do we discuss future directions in the present paper? I don't like promising things.

We also discussed our writing styles, which are hella different. I think that's good in a collaboration, of course!


me reading?

As my collaborators and friends know, if there is one thing I hate to do, it is spend all day reading the literature. I love and respect the literature! But don't make me actually read it. But today I sucked it up and read some 20-ish papers about characterizing dark-matter halo shapes, to find out if the coordinate-free shape measurements that Kate Storey-Fisher (NYU) and I are making are new. I think they are! In almost every paper I read, the word "shape" translated to eigenvalues of the positional variance tensor, or maybe ratios of those. Am I wrong?


what is a transmission function?

Mike Blanton (NYU) and I agree and disagree on almost everything about fundamental astronomy. As my loyal reader knows, I am writing something on how apparent magnitudes work, and also absolute, bolometric, reddening-corrected, and so on. The apparent magnitude of a star depends (among other things) on a filter bandpass or a transmission function. There are two possible definitions of this. One is the fraction of light (as a function of wavelength or frequency) that makes it through the system, from the top of the atmosphere to stimulating the detector. The other is the mean contribution to the total counts read out by the detector of a photon (of a particular wavelength or frequency) impinging on the top of the atmosphere. These might sound similar, and indeed they are identical when your detector is a photon counter. But they aren't identical if your detector is, say, a bolometer. I struggled with how to simply communicate all this today.
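In equations (my notation here, but standard in the synthetic-photometry literature), the two definitions weight the source spectrum f_lambda differently. With T(lambda) the fraction of energy transmitted, a photon counter and an energy integrator accumulate

```latex
% photon-counting detector: each transmitted photon of energy hc/lambda
% contributes one count, so f_lambda picks up a factor of lambda:
N_{\mathrm{count}} \propto \int f_\lambda(\lambda)\, T(\lambda)\,
  \frac{\lambda}{h c}\, \mathrm{d}\lambda

% energy-integrating detector (bolometer): each photon contributes its
% energy, so the extra factor of lambda disappears:
N_{\mathrm{energy}} \propto \int f_\lambda(\lambda)\, T(\lambda)\,
  \mathrm{d}\lambda
```

The factor of lambda inside the first integral is the whole difference between the two definitions, and it is why a "transmission function" quoted without saying which convention it uses is ambiguous.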


shapes of dark-matter halos

I had a very very long meeting today with Kate Storey-Fisher (NYU) in which we talked through every aspect of our current project, at every level of abstraction. It was great! And at the end of it, I had a way simpler description of our project than I think I could have articulated even yesterday: We are asking whether high-order, coordinate-free measurements of dark-matter halo shapes can predict galaxy contents.


are data-driven approaches to RV measurement biased?

Matt Daunt (NYU) is re-building our wobble code for measuring stellar radial velocities without any stellar or telluric model. He is finding that it is slightly biased towards smaller radial-velocity amplitudes than what we inject into fake data. This also mirrors things that Bedell (Flatiron) and I have seen in various experiments. I think there is something going on with spectral edges: At the edge of the observed spectral domain, some observations have lines shifted into and out of the observed range. The mean spectrum obtained from those measurements isn't necessarily capturing all of this fairly. Or at least it has to be handled carefully. Are we doing this right? Experiments we have suggest that we don't have this quite right yet.


I'm wrong about HORIZONS

The JPL HORIZONS system is amazing! You can compute the position of anything in the Solar System, at any time. With Weichi Yao (NYU) and others, I have been looking at Halley's Comet, with the thought of making a machine-learning benchmark data set (this is an idea from Soledad Villar, JHU). When we look up Halley in HORIZONS, we find many Halleys, not just one. I hypothesized that this is because there are different solutions for Halley on different apparitions. But somehow I am sort-of wrong: That's true for most of the Halleys in the system. But then today in our meeting Yao showed that there's one that seems to do well at all epochs. Huh? Anyway, HORIZONS is better on content than documentation!


Buckingham pi theorem is bad?

The Buckingham Pi theorem is about making physics problems dimensionless. It says that if you have a law of physics that you can manipulate into the form f(inputs) = 0, you can re-write that law with fewer, dimensionless inputs. It's interesting, and important, and it motivated the work that Soledad Villar (JHU) and I are doing on making machine-learning methods obey exact dimensional scalings and unit conversion symmetries.
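A textbook example (mine, not from our discussion) of how the theorem collapses a law: the period of a pendulum.

```latex
% Pendulum period: T = f(L, g, m), with dimensions
% [T] = s, [L] = m, [g] = m\,s^{-2}, [m] = kg.
% Four quantities, three base dimensions (kg, m, s), so Buckingham Pi
% promises 4 - 3 = 1 dimensionless group. The mass cannot appear
% (nothing else carries kg), and the unique group is
\Pi = T \sqrt{g / L} \,,
% so the law must collapse to \Pi = \mathrm{const}, that is,
T = C \sqrt{L / g} \,.
```

One scalar relation among four dimensional quantities becomes one relation among a single dimensionless quantity, with the mass dependence ruled out entirely; that is the dimensionality reduction the theorem promises.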

However, I am not sure the Buckingham Pi theorem works (or is useful) when the function f() is a vector-valued or tensor-valued function with vector-valued and tensor-valued inputs, as it is, say, in Maxwell's equations or the equations of general relativity. Villar and I discussed ways to save Buckingham Pi, but I think the main results might either not be correct at all, or not reduce dimensionality. I got upset about it! But it raises an interesting question: Can Buckingham Pi be saved?

My point is: Physics is full of vectors and tensors and the laws are coordinate-free. If going dimensionless is a good idea, then it should be a good idea for vector and tensor expressions that are coordinate-free!


how can linear regression be hard?

Maybe I'm known in astronomy for being both a machine-learning developer and a machine-learning skeptic. I hope so! Anyway, I love linear regression, because it has a lot of the power of bigger ML models, but it's easy to implement and to understand. And yet!

Today Kate Storey-Fisher (NYU) and I looked at her code to predict galaxy properties given dark-matter-halo properties in a set of n-body simulations. We are doing very simple regressions but the condition numbers of the matrices are blowing up and some of our answers don't look great. And this is generic: Many linear-regression models are messed up by condition numbers and numerical linear algebra, and it is hard to diagnose, and it is hard to treat. And if linear regression is hard—and hard for us—why do I believe anything that involves 42 layers of fully-connected ReLU network?
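To make the condition-number point concrete, here is a toy (not our actual code): a polynomial regression on a feature with raw values in the hundreds. The design matrix is catastrophically ill-conditioned; simply centering and scaling the feature fixes it, and an SVD-based solver like lstsq is far safer than the normal equations either way.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1000.0, 200)            # raw feature with large values
y = 3.0 + 2e-3 * x + rng.standard_normal(200)

# cubic-polynomial design matrix in the raw units:
A_raw = np.vander(x, 4, increasing=True)     # columns 1, x, x^2, x^3

# the same model after centering and scaling the feature:
z = (x - x.mean()) / x.std()
A_scaled = np.vander(z, 4, increasing=True)

print(np.linalg.cond(A_raw))     # enormous: mixing scales of 1 and ~1e9
print(np.linalg.cond(A_scaled))  # modest: all columns are order unity

# lstsq uses an SVD, which degrades far more gracefully than solving
# the normal equations A^T A beta = A^T y directly (which squares the
# condition number):
beta, *_ = np.linalg.lstsq(A_scaled, y, rcond=None)
```

The fixes are boring (rescale your features, use QR or SVD solvers, maybe regularize), but the diagnosis is the hard part, which is the point of the entry above.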


generative models for quasars

I spent part of the day working with Christina Eilers on her Gaussian process latent-variable model for quasar spectra and physical properties. We re-wrote our title and abstract and went through the math in the paper. It's time to finish this up! We find that we can predict quasar masses with good accuracy (based on held-out data) based on single-epoch, limited-coverage optical spectra. It's sweet. And Eilers has beautiful demonstrations that she can predict unobserved spectral regions, because the model is trained on different quasars at different redshifts with different data. The big problem with this model is that it scales poorly; we can't imagine training on thousands of objects without substantial engineering efforts (and maybe not ever).


MIT visit

I spent the day at MIT today. I learned a huge amount! One highlight was that Rob Simcoe (MIT) showed me the hardware in his lab, and we discussed trade-offs between software and hardware in instrument design. Another was talking to a great group of graduate students over lunch. My talk was about the ontology and epistemology of machine learning. My slides are here. At dinner, Deepto Chakrabarty (MIT) encouraged me to complete my pedagogical note on bolometric magnitudes and so on.