#undislo, day 4: tidal tensor

I worked a little bit today on the question of whether we could possibly measure the local Milky Way gravitational tidal tensor using nearly-comoving stars in ESA Gaia. The idea is that each pair of stars, in some range of separations, should be a tiny two-star tidal stream. The problem looks hard, but there are a lot of pairs!
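
For reference, a sketch of the standard definition, in notation that is assumed here and not taken from the project: $\Phi$ is the gravitational potential, $\bar{x}$ is the mean position of the pair, and $\Delta x$ is the separation vector. Sign conventions for the tidal tensor vary between authors.

```latex
T_{ij} = \left.\frac{\partial^2 \Phi}{\partial x_i\,\partial x_j}\right|_{\bar{x}}
\qquad
\frac{\mathrm{d}^2 \Delta x_i}{\mathrm{d}t^2} \approx -\sum_j T_{ij}\,\Delta x_j
```

To first order in the separation, the relative acceleration of a pair is linear in $\Delta x$, so each pair is sensitive to some combination of the six independent components of the symmetric tensor $T$.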


#undislo, day 3: more comoving stars

After carefully writing down a likelihood function for velocity differences between stellar pairs, Price-Whelan (Flatiron) and I had the Duh realization that the maximum-likelihood velocity difference is just the precise velocity difference you would have computed by plugging the ESA Gaia DR2 catalog entries into the most straightforward formula. Oh well! But of course the likelihood function is still useful, because it can be evaluated at alternative hypotheses (such as: velocity difference vanishes) and it can be integrated in Bayes to make evidences (for what those are worth).
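
A minimal sketch of that point, under a strong simplifying assumption: each star is treated as having a directly measured 3-vector velocity with Gaussian noise, so the noise on the difference has covariance `C1 + C2`. The real catalog entries are parallaxes, proper motions, and radial velocities, so this is an illustration and not the likelihood we wrote down. The function name and all numbers are made up.

```python
import numpy as np

def ln_likelihood(dv, v1, C1, v2, C2):
    """Gaussian log-likelihood for a hypothesized true velocity difference dv.

    Assumes (for this sketch) that each star has a measured 3-vector velocity
    with independent Gaussian noise, so the measured difference v1 - v2 has
    covariance C1 + C2.
    """
    resid = (np.asarray(v1) - np.asarray(v2)) - np.asarray(dv)
    C = np.asarray(C1) + np.asarray(C2)
    _, ln_det = np.linalg.slogdet(2.0 * np.pi * C)
    return -0.5 * (resid @ np.linalg.solve(C, resid) + ln_det)

# Made-up numbers (km/s and (km/s)^2), purely for illustration.
v1 = np.array([10.0, -5.0, 3.0])
v2 = np.array([9.5, -5.2, 3.1])
C1 = np.diag([0.04, 0.04, 0.25])
C2 = np.diag([0.04, 0.04, 0.25])

dv_ml = v1 - v2  # the maximum-likelihood difference is the plain difference
ln_l_max = ln_likelihood(dv_ml, v1, C1, v2, C2)
ln_l_null = ln_likelihood(np.zeros(3), v1, C1, v2, C2)

# Likelihood-ratio statistic for the "velocity difference vanishes" hypothesis.
delta_chi2 = 2.0 * (ln_l_max - ln_l_null)
```

The same function can be evaluated on a grid of `dv` values, which is all you need to integrate it against a prior.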

The short-term consequence of all this is that it is extremely easy to make a good comoving-star catalog from Gaia DR2. So easy it is barely a publication! Should we do it anyway?


#undislo, day 1: comoving stars

Yesterday Adrian Price-Whelan told me about an extremely interesting pair of comoving stars that Christina Hedges (NASA) found in a planet search. That had the effect of reminding him that we never ran a comoving-pair search in ESA Gaia DR2 data; we only ever did DR1 (where it was harder!). Add that to the point that Price-Whelan and I have been looking for straightforward, fun inference projects to do during this pandemic. Today I took some time (I am in an undisclosed location for a few days) to write up one (of many) possible methodologies for this project. One thought: Switch from a model-comparison framework (that we used in Oh et al.) to a parameter-estimation framework. This brings lots of advantages in terms of critical assumption-making, and computation. Another thought: Switch from Bayes to frequentism. Before you get too shocked: The DR2 RVS sample is so informative and precise that frequentist and Bayesian parameter estimations will be almost identical.


#sdss2020, day 3+1

If yesterday was day 3, then the one-day working meeting on Monday regarding SDSS-IV was day zero, and today—a working meeting for SDSS-V—was day 3+1. The highlight for me today was a discussion led by Hans-Walter Rix (MPIA) and Kevin Covey (WWU) of what we might do with extra fiber–visits.

SDSS-V is a robot-positioned multi-object fiber-fed spectroscopic survey, with both optical and infrared spectrographs. It works in a few modes, but most of them involve jumping around the sky, taking (relatively) short spectroscopic observations of hundreds of stars at a time. The operations are complex: There are multiple target categories with different cadence requirements, and there are positional constraints on what the fiber robots can do. All this means that there are many, many (like millions of) unassigned fiber–visits.

The range of projects proposed was breathtaking, from Cepheids to microlensing events to nearby galaxy redshifts to quasar catalogs. And all of them so clever and thought-out that they were all compelling. And this wasn't even an official call for proposals: It was just a brainstorming session. Towards the end, there were some ideas (that I loved) about taking a union of the star-oriented suggestions and making a project with excellent data volume and legacy value. I love this Collaboration.

Want a piece of this? Consider buying in. Send me email if you want to discuss that, for yourself or for your institution.


reading a difficult (to me) paper

I participated in day 3 of #sdss2020 today, and even started to pitch a project that could make use of the (literally) millions of unassigned fiber–visits in SDSS-V. Yes, the SDSS-V machines are so high throughput that, even doing multiple, huge surveys, there will be millions of unassigned fiber–visits. My pitch is with Adrian Price-Whelan; it is our project to get a spectrum of every possible “type” of star, where we have a completely algorithmic definition of “type”. More on this tomorrow, I hope.

In the afternoon, I spent time with Soledad Villar (NYU) reading this paper (Hastie et al 2019) on regression. It contains some remarkable results about what they call “risk” (and I call mean squared error) in regression. This paper is one of the key papers analyzing the double-descent phenomenon I described earlier. The idea is that when the number of free parameters of a regression becomes very nearly equal to the number of data points in the training set, the mean squared error goes completely to heck. This is interesting in its own right (I am learning about the eigenvalue properties of random matrices), but it is also avoidable with regularization. The paper explains both why and how. Villar and I are interested in avoiding it with dimensionality reduction, which is another kind of regularization, in a sense.
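
A toy sketch of the effect, assuming random Gaussian features, minimum-norm least squares (through the pseudo-inverse), and ridge regression as the regularizer. The setup, function name, and numbers are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def heldout_mse(n_train, n_params, ridge=0.0, noise=0.5, n_trials=50, seed=0):
    """Average squared prediction error (against the noiseless truth) for
    linear regression on random Gaussian features.

    ridge=0 uses the minimum-norm least-squares solution (pseudo-inverse);
    ridge>0 uses ridge regression with that regularization strength.
    """
    rng = np.random.default_rng(seed)
    mses = []
    for _ in range(n_trials):
        beta = rng.normal(size=n_params) / np.sqrt(n_params)  # true coefficients
        X = rng.normal(size=(n_train, n_params))
        y = X @ beta + noise * rng.normal(size=n_train)
        if ridge > 0.0:
            A = X.T @ X + ridge * np.eye(n_params)
            beta_hat = np.linalg.solve(A, X.T @ y)
        else:
            beta_hat = np.linalg.pinv(X) @ y
        X_new = rng.normal(size=(500, n_params))  # held-out inputs
        mses.append(np.mean((X_new @ (beta_hat - beta)) ** 2))
    return float(np.mean(mses))

n_train = 50
mse_under = heldout_mse(n_train, 10)    # far fewer parameters than data
mse_equal = heldout_mse(n_train, 50)    # as many parameters as data
mse_over = heldout_mse(n_train, 200)    # far more parameters than data
mse_equal_ridge = heldout_mse(n_train, 50, ridge=1.0)
```

In this toy, the error at `n_params == n_train` should sit well above both neighbors, and the ridge version at the same point should be far better behaved, because the tiny singular values of the nearly square design matrix no longer get inverted.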

Related somehow to all this, I have been reading a new (new to me, anyway) book on writing, aimed at mathematicians. The Hastie et al paper is written by math-y people, and it has some great properties, like giving a clear summary of all of its findings up-front, a section giving the reader intuitions for each of them, and clear and timely reminders of key findings along the way. It's written almost like a white paper. It's refreshing, especially for a non-mathematician reader like me. As you may know, I can't read a paper that begins with the word Let!


life in pandemic

Yesterday and today I had the first face-to-face meetings I have had since early March. Both were in Washington Square Park, and both were with sweaty bike-riding astrophysicists who live in Brooklyn. You can guess who they were! It was a pleasure though. I have missed face-to-face astronomy and teaching. Really. And I don't know how anyone gets anything done during this time. If you are having a hard time: You are not alone.

It was a low-research day as I had Departmental responsibilities. But I did get one single (long) paragraph written in a new plan of research for my work with Eilers (MIT), based on the discussions yesterday of self-calibration of element abundances in red-giant stars, and the inference of birth (as opposed to present-day surface) abundances. One paragraph of writing isn't much. But since a typical scientific paper is only 2^5 to 2^6 paragraphs, it is a relevant unit. Good luck yall.


#sdss2020, day 1

I presented my ideas about self-calibration at the SDSS-IV & SDSS-V meeting today. During my talk, Bovy (Toronto) pointed out that there is no way to distinguish stellar-evolutionary effects from surface-gravity-related measurement biases. Good point! But then in the discussion afterwards, Ness (Columbia) pointed out that the causal arguments I am employing can be used to infer true birth abundances, that is, the abundances the star had long ago at its formation. That's potentially incredibly valuable for projects like chemical tagging and related work. That's a great insight.

There were many thought-provoking talks today about developing stellar spectral libraries and improving stellar-parameter and abundance inferences, including by Chen (NYUAD), Hill (Portsmouth), Lazarz (Kentucky), Xiang (MPIA), Imig (NMSU), and Wheeler (Columbia). Many of these were driven by the MaSTAR project to use the BOSS spectrographs to perform stellar-population inferences in nearby galaxies. It's great that the galaxy-evolution and stellar-spectroscopy worlds are colliding in the SDSS-IV project. I can't help mentioning that Imig did a great job of summarizing (and using) The Cannon in this context.



I made my slides today for my talk at the SDSS 2020 meeting that is going on this week. I am speaking tomorrow. I had to make slides today because there is a rule of the meeting that you must submit your slides the day before your talk slot. The meeting is a carefully thought-out exercise in remote meetings. I am looking forward to seeing if I can generate some discussion. My talk is about self-calibration: How it was used in the original SDSS imaging, how it has been used in every extremely precise photometric project since, and how it might be used to assist in spectroscopy in the future. I spend some time on the causal attitude towards self-calibration, which my student Dun Wang pioneered. I really should write a paper synthesizing all of this at some point!



I guess I did do some research today: I wrote in my note on uncertainty estimation and I spoke with my student advisees about their projects. But it felt like nothing. The day included heavy stuff on what we are doing wrong as a department around race, and what to do to fix it. These things take lots of time and fill up mental space because they are ethical, important, and upsetting. But black lives matter.


discriminative models are weak

If you have a training set of data X and (real-valued) labels Y, how do you learn a model that can predict a new Y given a new X? Soledad Villar (NYU) and I have been working on this for quite some time, and just in the context of linear models and linear operators (that is, no deep learning or anything). We have come up with so many options. You can learn a traditional discriminative model: Do least squares on Y given X. That's standard practice and guaranteed to be optimal in certain (very limited) senses. You can learn a minimal generative model, in which you generate X given Y and then impute a new Y by inference. That's how The Cannon works. You can build a latent-variable model that generates both X and Y with a set of latent variables and then (once again) impute a new Y by inference. And this latter option has many different forms, we realized, even if you stick only to purely linear predictors of X given Y! Or you can build a Gaussian process in which you find the parameters of a Gaussian that can generate both X and Y jointly and then use the Gaussian-process math. (It turns out that this delivers something identical to the discriminative model for some choices.) Or you can do a low-rank approximation to that. OMG so many methods, every one of which is perfectly linear. That is, every one of these options can be used to make a linear operator that takes in a new X and, by a single matrix multiply, produces an estimate of the new Y.
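
A minimal sketch of the first two options, to make the "single matrix multiply" point concrete. It assumes unweighted least squares, mean-zero data, and purely linear dependence, which is much cruder than The Cannon itself. The function names and the toy data are hypothetical.

```python
import numpy as np

def discriminative_operator(X, Y):
    """Least squares of Y on X: returns W such that Y is approximated by X @ W."""
    W, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return W

def generative_operator(X, Y):
    """Fit X as Y @ A by least squares, then invert by least-squares inference.

    For a new x, the label estimate is x @ A.T @ inv(A @ A.T), which is
    again a single matrix multiply.
    """
    A, *_ = np.linalg.lstsq(Y, X, rcond=None)
    return A.T @ np.linalg.inv(A @ A.T)

# Toy data (entirely synthetic): 2 labels, 100 features, 40 training objects.
rng = np.random.default_rng(42)
n_train, n_test, n_feat, n_lab = 40, 200, 100, 2
A_true = rng.normal(size=(n_lab, n_feat))

def make_data(n):
    Y = rng.normal(size=(n, n_lab))
    X = Y @ A_true + 0.5 * rng.normal(size=(n, n_feat))
    return X, Y

X_train, Y_train = make_data(n_train)
X_test, Y_test = make_data(n_test)

W_disc = discriminative_operator(X_train, Y_train)
W_gen = generative_operator(X_train, Y_train)

# Held-out mean-squared error for each linear operator.
mse_disc = np.mean((X_test @ W_disc - Y_test) ** 2)
mse_gen = np.mean((X_test @ W_gen - Y_test) ** 2)
```

Both operators have the same shape and are applied the same way. Note that the toy data here are drawn from the generative model, so any comparison of `mse_disc` and `mse_gen` in this sketch is rigged and says nothing about real data.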

In the last few weeks, we have found that, in many settings, the standard discriminative method—the method that is protected by proofs about optimality—is the worst or near-worst of these options, as assessed by mean-squared error. The proofs are all about variance for unbiased estimators. But in the Real World (tm), where you don't know the truth, you can't know what's unbiased. (Empirically assessing bias is almost impossible when X is high-dimensional, because the relevant bias is conditional on location in the X space.) In the real world, you can only know what performs well on held-out data, and you can only assess that with brute measures like mean-squared error. So we find that the discriminative model—the workhorse of machine learning—should be thought of as a tool of last resort. In this linear world.

Will these results generalize to deep-learning contexts? I bet they will.


Gaia and EPRV

Megan Bedell (Flatiron) and I discussed some possible projects that connect ESA Gaia astrometric exoplanet information with ground-based spectroscopic radial-velocity information to discover sub-threshold planets that aren't detectable in either data set individually. We are planning possible undergraduate research projects for summer students. And we are thinking about the conditions under which a planet can be said to be discovered or confirmed.


systematics in element abundance measurements

Today Christina Eilers (MIT) and I looked at plots she made of element abundances in red-giant-branch stars as measured in the APOGEE data. She plotted abundances as a function of position in the Galaxy and surface gravity. Our expectation is that the abundance distribution should be a strong function of galactocentric radius and height above the disk plane, but a very weak (or null) function of stellar surface gravity: After all, stars at different surface gravities (and we are only looking at the stars above the red clump) are just at marginally different evolutionary stages; they shouldn't (on average) be very different in age or formation. This is a baby step towards building an empirical model of element-abundance measurement systematics, to self-calibrate a better set of abundances. We discussed details of visualization and communication of the results, for presentation to the APOGEE team for discussion.


social-science research

In the weekly Astronomical Data Group meeting, some of the questions that came up were about social-science research: Can we bring our knowledge of data and statistics, knowledge gained while working on astronomical problems, to bear on important social-science results on race, policing, hiring, and so on? We discussed some specific results in social-science research, and we discussed differences between the kinds of questions asked in social sciences and those asked in the natural sciences. There are lots of commonalities, of course! But two important differences between social-science research and much of the current work in the Group right now are: Social scientists are much more interested in establishing causal relationships between factors, and less interested in measuring parameters. And many studies are exploratory: The investigators have an interesting survey and are looking for good results within it.

These two properties (exploratory and causal) are shared with many studies in particular astronomy areas, such as galaxy evolution (does a change of environment cause a galaxy to quench its star formation, say?) and exoplanet abundances (do stars with higher metallicity form more planets, say?). Because there are sub-areas of astronomy that look a lot like social science in their research style, I came away from these discussions thinking that interactions between astronomers and social scientists could be very valuable for the astronomers, at least. (And I have had some valuable interactions in the past.)


Not much; what constitutes a discovery?

Today I spent a lot of time on things that started yesterday when I was on strike, which includes revising how we run certain meetings at Flatiron, and establishing a committee to do work against racism in the Department of Physics at NYU. My only traditional research to speak of today was a discussion with Megan Bedell (Flatiron) about whether or how we should summarize and write up what she has learned about the ways people have claimed detections or discoveries in exoplanet searches. It's a huge, huge literature, bigger than I expected. And very diverse. One idea I like is to figure out how the different methods for assessing and claiming discoveries are related and connected.


strike for black lives

I did no research today, because I was on strike, as part of the international strike for black lives. It was an amazing and exhausting day. I hope we successfully make real changes.


bias–variance trade-off in histograms

Kate Storey-Fisher (NYU) and I discussed various things, but one of the important points of her project relates to the properties of histograms. My loyal reader knows that she is replacing the usual two-point correlation function estimator for large-scale structure surveys with one that does not require that the galaxy pairs be put into bins by radial separation. One of the main points of the method is that it permits continuous-function bases that are better at expressing expected correlation functions than the usual binned estimators, which are like histograms. We are trying to visualize these points in the paper, showing that as you go to larger bins, the usual binned estimator becomes more precise (lower variance) but less accurate (more bias) and vice versa. I also (a week or so ago) demonstrated this for just normal histograms in this notebook.
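
A small sketch of the plain-histogram version of this trade-off. This is not the notebook mentioned above; it assumes standard-normal data and a single bin centered on the evaluation point, and the function names are hypothetical.

```python
import numpy as np

def bin_density_estimate(samples, x0, width):
    """Histogram-style density estimate at x0 from one bin centered on x0."""
    in_bin = np.abs(samples - x0) < 0.5 * width
    return in_bin.sum() / (len(samples) * width)

def bias_and_variance(width, x0=0.0, n_samples=1000, n_trials=400, seed=1):
    """Empirical bias and variance of the estimate for standard-normal data."""
    rng = np.random.default_rng(seed)
    estimates = np.array([
        bin_density_estimate(rng.normal(size=n_samples), x0, width)
        for _ in range(n_trials)
    ])
    truth = np.exp(-0.5 * x0 ** 2) / np.sqrt(2.0 * np.pi)
    return estimates.mean() - truth, estimates.var()

bias_narrow, var_narrow = bias_and_variance(width=0.1)
bias_wide, var_wide = bias_and_variance(width=2.0)
```

The wide bin averages the density over a region where it curves, so its estimate is biased low at the peak, but it catches many more points and so has much lower variance. The narrow bin does the opposite.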



No research today, except for a valuable call with Christina Eilers (MIT) about visualizing systematic (unphysical) effects in APOGEE abundance measurements. The rest of the day was spent taking care of extended family.


giving old spectrographs new capabilities

As my loyal reader knows, Lily Zhao (Yale) has been working with Megan Bedell (Flatiron) and me on how we can parameterize and measure the physical state of a spectrograph, for wavelength-calibration purposes. What she's shown is that the spectrograph is indeed a low-dimensional object, in the sense that the wavelength solutions you find fall in a tiny subspace of all possible wavelength solutions (not surprising). Her calibration method Excalibur (which we are writing up now) is built on these results.
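
One generic way to check for this kind of low dimensionality is a principal-component analysis of the calibration-line positions across exposures. The sketch below is not Excalibur and uses no real data: the two-component synthetic "spectrograph" and every variable name are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(7)
n_exposures, n_lines, n_latent = 60, 200, 2

# Synthetic stand-in for calibration data: each row holds the measured
# positions of n_lines calibration lines in one exposure.
state = rng.normal(size=(n_exposures, n_latent))    # hidden spectrograph state
response = rng.normal(size=(n_latent, n_lines))     # how each line responds
positions = state @ response + 0.01 * rng.normal(size=(n_exposures, n_lines))

# Principal components of the exposure-to-exposure variations.
deviations = positions - positions.mean(axis=0)
singular_values = np.linalg.svd(deviations, compute_uv=False)
variance_fraction = np.cumsum(singular_values ** 2) / np.sum(singular_values ** 2)
```

If the first few entries of `variance_fraction` are already close to one, the variations live in a small subspace. Here that is true by construction, since the toy was built with two hidden components.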

As I mentioned a few weeks ago, if the spectrograph state is low-dimensional, in general any change to the state will appear not just in the wavelength solution but also in anything else we can precisely measure about the device. And we realized that we can precisely measure the locations of the spectral traces in the spatial (as opposed to spectral) direction. Today Zhao showed very good evidence that such measurements are extremely precise measures (in the EXPRES spectrograph) of the calibration state of the device. Our next step is to explicitly upgrade these to an effective “simultaneous reference” that keeps track of wavelength-solution variations throughout the observing night, making every exposure more precisely calibrated, and reducing the demands for calibration data. If this works, we have potentially provided a simultaneous reference for many spectrographs that don't have one, and made many spectrographs both more precise and more efficient.


reverse-engineering the temperature derivative

I spent time on the phone with Teresa Huang (NYU) and Soledad Villar (NYU) trying to work out whether we have a good estimate of the derivative of the temperature with respect to spectral shape for stars in SDSS-IV APOGEE data. We took a ball of synthetic spectra (theory-generated spectra) in the vicinity of the target spectrum, and then regressed against temperature to figure out how the theoretical best-fit temperature ought to depend on spectral shape or spectral features. That's good, but now how do we validate that we got the right answer? We are trying to avoid learning any actual stellar physics here.
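
A toy sketch of the regression step, with a made-up linear "spectral model" standing in for real synthetic spectra. The pixel count, the gradients `g_teff` and `g_logg`, and the noise level are all assumptions; nothing here comes from APOGEE or from an actual synthetic-spectrum code.

```python
import numpy as np

rng = np.random.default_rng(3)
n_pix, n_ball = 30, 200

# Toy stand-in for a synthetic-spectrum grid: flux responds linearly to small
# offsets in temperature (K) and surface gravity (dex) around the target.
s0 = np.ones(n_pix)                        # target spectrum
g_teff = 1e-4 * rng.normal(size=n_pix)     # d(flux)/d(Teff), per K
g_logg = 1e-2 * rng.normal(size=n_pix)     # d(flux)/d(logg), per dex

d_teff = rng.uniform(-100.0, 100.0, size=n_ball)
d_logg = rng.uniform(-0.3, 0.3, size=n_ball)
ball = (s0 + np.outer(d_teff, g_teff) + np.outer(d_logg, g_logg)
        + 1e-5 * rng.normal(size=(n_ball, n_pix)))

# Regress temperature offset on spectral offset: d_teff ~ (ball - s0) @ w,
# so w estimates d(Teff)/d(flux) at the target spectrum.
w, *_ = np.linalg.lstsq(ball - s0, d_teff, rcond=None)

# Consistency checks that are only possible because the toy model is known.
roundtrip = g_teff @ w                      # should be close to 1
new_spectrum = s0 + 40.0 * g_teff + 0.1 * g_logg
predicted_offset = (new_spectrum - s0) @ w  # should be close to 40 K
```

In the toy, the round-trip check works because the forward gradient `g_teff` is known. With real synthetic spectra, one analogous check might be to perturb the temperature in the spectral code by a small known amount and see whether `w` recovers it.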


how fast do things move on the red-giant branch?

Adrian Price-Whelan (Flatiron) and I—after spending lots of time talking about police violence against black people—wrote down some considerations for a possible model of binary companions on the red-giant branch. We have found that the binary fraction (and binary period distribution) is a function of position on the H–R diagram, as expected if stars engulf their close-in companions as they evolve. We think we might be able to constrain some hard-to-measure things about stellar evolution using the phenomenology here.

My view about data analyses (as my loyal reader knows) is that once you write down your assumptions in sufficient detail, the data-analysis method flows directly from those assumptions. So we wrote down assumptions! I think we have enough to make a method and make this work. One thing we discussed is that the assumptions are strong, but they are also testable. So even the assumptions themselves can become interesting parts of the project.

My day also had great conversations with Ana Bonaca (Harvard) about forward-modeling asteroseismic signals and with Megan Bedell (Flatiron) about designing surveys to detect particular kinds of exoplanets.



Today was a lost day. Lost to rage about police violence against black people. Black lives matter.


uncertainty estimates

I'm trying to write something about how you estimate the uncertainty on any measurement you make. It's slow going. I had a useful conversation today with Christina Eilers (MIT), including a bit about over- and under-estimated uncertainties. Late in the day I went out to join the protests because Black Lives Matter.