Terra Hunting Fall Science Meeting, day 4

Today we delved into even more detail about how the HARPS3 instrument works, looking at engineering drawings and discussing how charge-coupled devices (CCDs) read out. We discussed the time stability of various parts of the instrument and electronics. We are all very excited about assembly, verification, and testing in Cambridge this summer.


Terra Hunting Fall Science Meeting, day 3

Today was a delight! In a working session, Clark Baker (Cambridge) gave a beautiful, conceptual and concrete description of how an echelle spectrograph works, including the blaze function, the resolution, and so on. My favorite moment was the aha! moment I had when he described the Littrow condition. This was followed by Alicia Anderson (Cambridge) explaining how the data reduction proceeds. Then she and Federica Rescigno (Exeter) helped us install the data-reduction software for the ESO instruments (ESPRESSO, HARPS-N, etc) and we started reducing raw echelle data.

Before all this there was a wide-ranging discussion of measuring 3-point functions of radial-velocity time series data. This was inspired by the question: Is a Gaussian process a good model for these data? I hope this turns into a project or set of projects.
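As a first step toward that question, here is a minimal numpy sketch (toy data, made-up numbers, not any of our actual time series): estimate the empirical 3-point function of an evenly sampled series, which should be consistent with zero at every lag pair if the series really is Gaussian.

```python
import numpy as np

rng = np.random.default_rng(17)

def three_point(x, lag1, lag2):
    """Estimate E[x(t) x(t+lag1) x(t+lag2)] for an evenly sampled, mean-zero series."""
    n = len(x) - max(lag1, lag2)
    return np.mean(x[:n] * x[lag1:lag1 + n] * x[lag2:lag2 + n])

# A Gaussian series (here, smoothed white noise) has a 3-point function
# consistent with zero at every lag pair; a non-Gaussian transform of it does not.
x = np.convolve(rng.normal(size=5000), np.ones(20) / 20.0, mode="valid")
x = (x - x.mean()) / x.std()
y = x ** 2 - np.mean(x ** 2)   # non-Gaussian: squared and re-centered

print(three_point(x, 3, 7))    # consistent with zero
print(three_point(y, 3, 7))    # significantly nonzero
```

A real analysis would have to handle uneven sampling and estimate uncertainties on the estimator, but the zero-versus-nonzero contrast is the point.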


Terra Hunting Fall Science Meeting, day 2

So many good things happened in the meeting today! Highlights were presentations by Niamh O'Sullivan (Oxford) and Ben Lakeland (Exeter), who showed amazing results running models of stellar variability on data from the Sun. O'Sullivan can see that the Sun goes through many different phases of spots, granulation, and super-granulation. She finds these by fitting Gaussian processes of certain forms. Related: Suzanne Aigrain (Oxford) showed that even in very gappy data, the GP fits are unbiased, whereas naive use of periodograms is biased!
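Aigrain's point about gappy sampling can be illustrated with the spectral window of the sampling pattern. A minimal sketch (made-up survey numbers, not her analysis): seasonal gaps put big side lobes into the window, and a naive periodogram convolves the true spectrum with that window, leaking power to wrong frequencies.

```python
import numpy as np

def spectral_window(t, freqs):
    """Power |W(f)|^2 of the sampling pattern: nearly a delta function for
    dense even sampling, but with big side lobes for gappy sampling."""
    phase = 2.0 * np.pi * freqs[:, None] * t[None, :]
    return (np.abs(np.exp(1j * phase).sum(axis=1)) / len(t)) ** 2

t_even = np.arange(0.0, 200.0, 0.5)        # dense, even sampling (days)
t_gappy = t_even[(t_even % 30.0) < 15.0]   # "observe" half of each 30-day season

freqs = np.linspace(0.02, 0.5, 500)        # cycles per day
w_even = spectral_window(t_even, freqs)
w_gappy = spectral_window(t_gappy, freqs)

# the gappy window has a large alias peak near 1/30 cycles per day
print(w_even.max(), w_gappy.max())
```

A GP fit, by contrast, works in the time domain and never has to deconvolve this window, which is (I think) the intuition behind the unbiasedness result.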

Lakeland showed that super-granulation can in principle be modeled in the Solar time series, and maybe there is the tiniest hint that, when he corrects for super-granulation well, the RV variability is even lower than at times when there is no super-granulation in play at all. Does super-granulation suppress other kinds of variability?

I'm very optimistic—between Liang yesterday, Zhou's work at Flatiron, and these presentations—that we will be able to mitigate many difficult sources of stellar variability. I was inspired to outline a conceptual paper on why or how this is all going to work.


Terra Hunting Fall Science Meeting, day 1

Today was the first day of the Terra Hunting annual science meeting. One highlight of the day was a presentation by Yan Liang (Princeton), who is modeling stellar spectral variability (the tiny variability) that affects extremely precise radial-velocity measurements. Her method involves a neural network, which is trained to distinguish RV variations and spectral shape variations through a self-supervised approach (with a data augmentation). Then it separates true stellar RV variations from spurious, spectral-variability-induced RV variations by requiring (essentially) that the RV variations be uncorrelated with the (latent) description of the stellar spectral shape. This connects to various themes I am interested in, including wobble by Bedell, a spectral variability project by Zhao, and causal structure in machine learning.


double periodogram

Cole Johnston (Leuven) is in New York this week. We discussed the problem of finding oscillation modes in the photometry of stars in the presence of a large, binary-induced periodicity. What he kind-of wants is a simultaneous fitting of a flexible periodic function plus a periodogram. We did some experiments (very promising!) and discussed the elements that will come together to make this all happen. The final method will look like a double Fourier transform, in which one frequency grid gets the periodic part, and the other grid gets the rest of the modes and noise.
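Here is a minimal sketch of the idea as I understand it (toy numbers, not Johnston's actual data): fit, in one linear least squares, sinusoids at the orbital frequency and its harmonics (the flexible periodic part) together with sinusoids on a second frequency grid, and read the weak mode off the second grid.

```python
import numpy as np

rng = np.random.default_rng(3)
t = np.linspace(0.0, 10.0, 1000)

f_orb = 0.31    # big, non-sinusoidal binary signal at this frequency
f_mode = 2.7    # weak oscillation mode we want to recover
y = (2.0 * np.sin(2 * np.pi * f_orb * t)
     + 0.8 * np.sin(4 * np.pi * f_orb * t + 0.4)   # harmonic content
     + 0.05 * np.cos(2 * np.pi * f_mode * t)       # the buried mode
     + 0.01 * rng.normal(size=t.size))

def sin_cos(freqs, t):
    """Design-matrix columns: cosines then sines at the given frequencies."""
    arg = 2.0 * np.pi * freqs[:, None] * t[None, :]
    return np.concatenate([np.cos(arg), np.sin(arg)], axis=0).T

X_orb = sin_cos(f_orb * np.arange(1, 6), t)   # orbital frequency + 4 harmonics
grid = np.linspace(2.0, 3.5, 16)              # second grid, for the modes
X_grid = sin_cos(grid, t)

X = np.concatenate([np.ones((t.size, 1)), X_orb, X_grid], axis=1)
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

a = coef[1 + 2 * 5:]                          # just the mode-grid coefficients
amp = np.hypot(a[:grid.size], a[grid.size:])  # amplitude spectrum on that grid
print(grid[np.argmax(amp)], amp.max())
```

The grid here is spaced at the Rayleigh resolution (1 over the time baseline); a denser grid makes the problem ill-conditioned and would need regularization, which I suspect is where the real work will be.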


grant proposals

There is a non-wrong view of academic science that it is all about applying for funding, and evaluating the proposals of others for funding. That's all I did today (evaluated proposals for a foreign funding program; I submitted my own proposal to the NSF yesterday).


postdoc applications

There is a non-wrong view of the academic enterprise that it is entirely about getting hired, evaluating people for hire, and hiring. That's all I did today (okay the latter two, not the first).


conjectures about pre-training

On Monday of this week, Shirley Ho (Flatiron) gave a talk at NYU in which she mentioned the unreasonable effectiveness of pre-training a neural network: If, before you train your network on your real (expensive, small) training data, you train it on a lot of (cheap, approximate) pre-training data, you get better overall performance. Why? Ho discussed this in the context of PDE emulation: She pre-trains with cheap PDEs and then trains on expensive PDEs and she gets way better performance than she does if she just trains on the expensive stuff.

Why does this work? One interesting observation is that even pre-training on cat videos helps with the final training! Ho's belief is that the pre-training gets the network to understand time continuity and other kinds of smoothness. My conjecture is that the pre-training teaches the network about (approximate) diffeomorphism invariance (coordinate freedom). The cool thing is that these conjectures could be tested with interventions!


radical papers I want to write (or will never write)

I have to finish my NSF proposal with Mike Blanton (NYU), so naturally I am in procrastination mode. Here are three papers I wish I would write. Maybe I should post them on my ideas blog:

Occam's Razor is wrong: This paper, co-authored with Jennifer Hill (NYU), would be about the fact that, in the real, observed world, the simplest explanation is always wrong or at least incomplete.

Causation is just causality: This paper, maybe co-authored with David Blei (Columbia) or Bernhard Schölkopf (MPI-IS) or Hill, shows that you don't need to have free will in order to have cogent causal explanations of data. That is, you don't need to phrase causality in terms of predictions for counter-factual experiments that you might have chosen to do.

You don't ever want evidence: This paper shows that any time you are computing the Bayesian evidence—what I call the fully marginalized likelihood (fml)—you are doing the wrong integral and solving the wrong problem, for both practical and theoretical (principled) reasons.


data augmentation

A highlight of my day was a colloquium by Renée Hložek (Toronto) about cosmology and event detection with the LSST/Rubin. Importantly (from my perspective), she has run a set of challenges for classifying transients, based on simulations of the output of the very very loud LSST event-detection systems. The results are a bit depressing, I think (sorry Renée!), because (as she emphasized), all the successful methods (and none were exceedingly successful) made heavy use of data augmentation: They noisified things, artificially redshifted things, dropped data points from things, and so on. That's a good idea, but it shows that machine-learning methods at the present day can't easily (or ever?) be told what to expect as an event redshifts or gets fainter or happens on a different night. I'd love to fix those problems. You can almost think of all of these things as group operations. They are groups acting in a latent space though, not in the data space. Hard problems! But worthwhile.
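For concreteness, the kinds of augmentations she described look roughly like this toy numpy sketch (the dimming factor is a crude stand-in, not a proper K-correction, and all the numbers are made up):

```python
import numpy as np

rng = np.random.default_rng(8)

def augment(t, flux, z=0.1, extra_noise=0.02, drop_frac=0.3):
    """Toy augmentations of a transient light curve (times t, fluxes flux):
    redshifting stretches the time axis and dims the source; then we
    noisify and drop points to mimic realistic survey sampling."""
    t_aug = t * (1.0 + z)                      # cosmological time dilation
    flux_aug = flux / (1.0 + z) ** 2           # crude dimming stand-in
    flux_aug = flux_aug + extra_noise * rng.normal(size=flux.size)
    keep = rng.uniform(size=flux.size) > drop_frac
    return t_aug[keep], flux_aug[keep]

t = np.linspace(0.0, 50.0, 200)
flux = np.exp(-0.5 * ((t - 20.0) / 5.0) ** 2)  # toy Gaussian transient
t_aug, flux_aug = augment(t, flux)
print(t_aug.size, flux_aug.size)
```

My complaint above, restated: each of these operations is a group action (on a latent description of the event, not on the raw pixels), and it would be better to build that structure into the model than to brute-force it with augmented training examples.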


writing proposal

Mike Blanton (NYU) and I are writing an NSF proposal. That took up most of my research time today!


linear regression

Valentina Tardugno (NYU) and I are looking at the NASA TESS housekeeping data: What parts of it are relevant to understanding the light curves? The weird thing is: We are asking this by asking: What housekeeping data can be reliably predicted using the light curves? Why this way? Because the light curves are higher in signal-to-noise (in general) than most channels of the housekeeping data. Today we went through all the relevant linear algebra for big linear models (which is where we are starting, of course!).
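The starting point is textbook: solve the regularized normal equations for a big linear model. A toy sketch (fake data and shapes, not the actual TESS light curves or housekeeping channels):

```python
import numpy as np

rng = np.random.default_rng(0)

n_times, n_stars = 500, 50
X = rng.normal(size=(n_times, n_stars))          # stand-in for light curves
w_true = rng.normal(size=n_stars)
y = X @ w_true + 0.1 * rng.normal(size=n_times)  # one "housekeeping" channel

def ridge_fit(X, y, lam=1e-3):
    """Solve (X^T X + lam I) w = X^T y; the regularization keeps the big
    linear model well-conditioned when columns are correlated."""
    n = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ y)

w = ridge_fit(X, y)
resid = y - X @ w
print(np.std(resid))   # should approach the injected noise level
```

The interesting questions start where this sketch ends: how to choose the regularization, and how to decide when a housekeeping channel is "reliably predicted" rather than overfit.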


abundance gradients wrt positions or actions

It is traditional to plot things like the mean iron abundances of stars (or ratios of magnesium to iron, or other ratios) as a function of position in the Galaxy. However, stars change their positions over time, so the gradients (the features in any abundance–position plots) will be smeared out over cosmic time by their motions.

At the same time, stars have approximately invariant actions or integrals of motion, which don't change (much) as they orbit. These invariants are only approximate, both because the Galaxy isn't exactly integrable, and also because we don't know or measure everything we need to compute them precisely for any observed star.

Putting these two ideas together, the abundance–action features, or really the abundance–invariant features, should be much clearer and more informative than the abundance–position features. Awesome, let's go! The only problem is: Selection effects are often simple in the position space, but are almost never simple in the space of dynamical invariants. So any plots are harder to interpret generally.

These are issues that I have discussed over many years with Hans-Walter Rix (MPIA). Today I discussed them with Danny Horta (Flatiron) and Adrian Price-Whelan (Flatiron), in preparation for an exploratory study by Horta.


predicting spectra from spectra

Saakshi More (NYUAD) came into my office during office hours today to ask about possible data science projects in physics. I pitched to her predicting ESA Gaia RVS spectra from Gaia XP spectra, and vice versa. Has anyone done that? In one direction, you have to predict high-resolution detail from low-resolution input; in the other direction, you have to predict a wide wavelength range from narrow input. It seems perfect for something like a linear auto-encoder (at least for a small patch of the color–magnitude diagram; non-linear for a large patch). Later in the day I talked to Gaby Contardo and she said: If you want to go simple, how about nearest neighbor? Good idea!
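Contardo's nearest-neighbor baseline is only a few lines. Here's a toy sketch with fake low- and high-dimensional "spectra" drawn from a shared one-parameter family (nothing here is real Gaia data):

```python
import numpy as np

rng = np.random.default_rng(1)

# fake "XP" (low-dimensional) and "RVS" (higher-dimensional) spectra,
# both controlled by one hidden label so the mapping is learnable
n_train, n_xp, n_rvs = 300, 10, 40
labels = rng.uniform(size=n_train)
xp_train = (np.outer(labels, np.linspace(-1, 1, n_xp))
            + 0.01 * rng.normal(size=(n_train, n_xp)))
rvs_train = (np.outer(labels, np.sin(np.linspace(0, 6, n_rvs)))
             + 0.01 * rng.normal(size=(n_train, n_rvs)))

def predict_rvs(xp_query, xp_train, rvs_train):
    """Nearest-neighbor prediction: return the RVS spectrum of the training
    star whose XP spectrum is closest (Euclidean) to the query."""
    d2 = ((xp_train - xp_query[None, :]) ** 2).sum(axis=1)
    return rvs_train[np.argmin(d2)]

label_q = 0.5
xp_q = label_q * np.linspace(-1, 1, n_xp)
rvs_pred = predict_rvs(xp_q, xp_train, rvs_train)
print(rvs_pred.shape)
```

On real data the hard parts are the distance metric (weighting by the XP uncertainties) and the fact that the color–magnitude diagram is not one-dimensional; but as a baseline to beat, this is exactly the right place to start.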


unitary evolution of the Universe

I spent the day with Juna Kollmeier (CITA) talking about epistemology, physical cosmology, and project management (especially academic management). I found myself saying to her the following argument (which I have not seen written down anywhere): Imagine that our Universe is Hamiltonian (or Lagrangian; it doesn't matter for these purposes). And imagine that our Universe is a simulation being run inside some bigger universe, which is also Hamiltonian.

If our Universe is being observed in any sense by any system in that bigger universe, then there ought to be a loss of unitarity in our Universe. That is, there should be a violation of Liouville's theorem, or a violation of key conservation laws, or an information sink. And there is! At black hole horizons, there is an information paradox: Information that goes in never comes back (an evaporating black hole evaporates thermally, or so we think). Thoughts?


ARC BOG meeting

I stayed on at Cloudcroft after the SDSS-V Advisory Council meeting for the ARC Board of Governors meeting, which is the meeting of the organization that runs the Apache Point Observatory. I spent a lot of the meeting learning about the 3.5m and the site, which was interesting, and which made me think about how we apportion our resources in astronomy. These are huge facilities, run very lean (money-wise), and they produce a lot of science. The SDSS family of projects has had simply immense scientific impact.

One success of the meeting: I have successfully coined and propagated the term SDSS Classic to mean SDSS-I and SDSS-II. Multiple people at the meeting now use this terminology!


SDSS-V AC meeting

Today I chaired the annual Advisory Council (AC) meeting for the SDSS-V project. The AC protects the interests of the partners, who gave money and other resources to the project. We had many presentations from different parts of the project and OMG this project is amazing. I learned a ton and feel very happy that our money is well spent. This activity counts as research because project management is a key part of science.

The AC meeting was followed by touring the observing hardware (I love it; the SDSS Telescope is incredibly important to everything I have done since the late 1990s), followed by actually looking through the 3.5m telescope at Apache Point Observatory.


toning down my language

I spent travel time (at airports and on airplanes) working on the title, abstract, and introduction of the forthcoming paper with Andy Casey (Monash) about combining visit spectra into mean spectra. This was mainly about me changing the tone from “You are all doing it wrong!” to a tone more like “Here's a way to think about it, and the consequences thereof.” After all, no method is the best for all situations and cases. Our method is best for situations where the individual visit spectra are barely sampled or under-sampled.


M dwarfs

I had a great phone call with Madyson Barber (UNC) and Andrew Mann (UNC) today about M dwarf stellar spectroscopy. I love the problem of understanding the spectra of M dwarfs because this is a subject where there is no ground truth: No physical models of M dwarf photospheres work very well! Why not? Probably because they depend on lots of molecular transitions and band heads, the properties of which are not known (and very sensitive to conditions).

I love problems where there is no ground truth! After all, science as a whole has no ground truth! So the M-dwarf spectroscopy problem is a microcosm of all of science. I went off the deep end on this call, and we were all left knowing less than we knew when we started the call. By this post, I apologize to Barber and Mann.