my definition of an adversarial attack

Based on conversations with Soledad Villar, Teresa Huang, Zach Martin, Greg Scanlon, and Eva Wang (all NYU), I worked today on establishing criteria for a successful adversarial attack against a regression in the natural sciences (like astronomy). The idea is that you add a small, irrelevant perturbation u to your data x, and the inferred labels y change by an unexpectedly large amount. Or, to be more specific:

  • The squared L2 norm (u.u) of the vector u should equal some small number Q
  • The vector u should be orthogonal to v, your expectation of the gradient dy/dx
  • The change in the inferred labels at x+u relative to x should be much larger than you would get for the same-length move in the v direction!
The first criterion is that the change is small. The second is that it is irrelevant. The third is that it produces a big change in the regression's output. One issue is that you can only execute this when you have v, or an expectation for dy/dx independent of your regression model. That's true in some contexts (like spectroscopic parameter estimation) but not others.
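To make the three criteria concrete, here is a toy numerical sketch. Everything in it is invented for illustration: the "model", the data x, and the expected-gradient direction v are all made up; it just searches random directions satisfying the first two criteria for the one that moves the output most.

```python
import numpy as np

# Toy regression standing in for a trained ML method; the weight matrix W,
# the data x, and the expected-gradient direction v are all invented.
rng = np.random.default_rng(17)
W = rng.normal(size=(8, 64))

def model(x):
    return np.sin(W @ x).sum()  # scalar label y from a 64-pixel "spectrum" x

x = rng.normal(size=64)
v = rng.normal(size=64)          # stand-in for the expectation of dy/dx
v /= np.linalg.norm(v)
Q = 1e-2                         # the small attack norm

# Search random directions satisfying criteria 1 and 2 (|u| = Q, u . v = 0)
# for the one producing the biggest change in the output (criterion 3).
best_u, best_dy = None, 0.0
for _ in range(1000):
    u = rng.normal(size=64)
    u -= (u @ v) * v             # project out the "relevant" direction v
    u *= Q / np.linalg.norm(u)   # rescale to the attack norm Q
    dy = abs(model(x + u) - model(x))
    if dy > best_dy:
        best_u, best_dy = u, dy

# Compare to the change from a same-length move along v itself:
dy_expected = abs(model(x + Q * v) - model(x))
print(best_dy, dy_expected)
```

A successful attack would be one where best_dy comes out much larger than dy_expected; for a smooth toy model like this one it typically won't, which is part of why the definition needs care.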


a tiny bit of progress

Conversations with Soledad Villar (NYU) got me closer to understanding what's different between an adversarial attack against a classification and a regression.


automatically tuning a radio

As my loyal reader knows, Abby Shaum (NYU) and I are building a software analog of an FM radio to find binary companions to stars with coherent oscillation modes. We have a good signal! But now we want to optimize the carrier frequency and some other properties of the radio. We discussed how to do that today.


products of Gaussians, Joker

As my loyal reader knows, I love that products of Gaussians are themselves Gaussians! One consequence is that a Gaussian can be factorized into products of Gaussians in many different ways. As my loyal reader also knows, Adrian Price-Whelan (Flatiron) and I found a bug in our code The Joker, which fits radial-velocity data with Keplerian orbital models; this bug is related to the fundamental factorization of Gaussians that underlies the method. Today Price-Whelan showed me results from the fixed code, and we discussed them (and the priors we are using in our marginalization), along with the paper we are writing about the factorization. Yes, people, this is my MO: When you have a big bug—or really a big think-o or conceptual error—don't just fix it, write a paper about it! That's the origin of my paper on the K-correction. We are also contemplating writing a note about how you can constrain time-domain signals with periods longer than the interval over which you are observing them!
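As a sanity check on the underlying identity, here is a one-dimensional numerical sketch (arbitrary made-up means and variances): the product of two Gaussian densities in x is itself a Gaussian in x, times an x-independent constant.

```python
import numpy as np

# Numerical check (1-d): N(x|a,A) N(x|b,B) is an x-independent constant
# times N(x|c,C), with C = 1/(1/A + 1/B) and c = C (a/A + b/B); the
# constant works out to N(a|b, A+B).
def gauss(x, mu, var):
    return np.exp(-0.5 * (x - mu)**2 / var) / np.sqrt(2.0 * np.pi * var)

a, A = 1.0, 2.0    # mean and variance of the first Gaussian
b, B = -0.5, 0.7   # mean and variance of the second
C = 1.0 / (1.0 / A + 1.0 / B)
c = C * (a / A + b / B)

x = np.linspace(-5.0, 5.0, 1001)
ratio = gauss(x, a, A) * gauss(x, b, B) / gauss(x, c, C)
print(ratio.std() / ratio.mean())  # ~ machine precision: constant in x
```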


disk and black holes

Today was the last day of the visit by Christina Eilers (MIT). We decided that we have a clear scope for our Milky Way disk paper, and we have a full outline and figures, so it was a success. But we don't have a good picture of our black-hole and quasar projects, which include finding a data-driven model that generates quasar properties given a black-hole mass (and possibly many latent parameters). The data are messy! And we don't know what to believe. To be continued.


dynamical spiral structure in the disk

Today Christina Eilers (MIT) and I got in a call with Hans-Walter Rix (MPIA) to discuss our Milky-Way disk project. He opined that our measurement of the spiral structure is indeed both the clearest picture of it in the $X-Y$ plane of the Milky Way disk, and also the only measurement of its dynamical influence or amplitude. So that was a good boost for morale, and we promised to send him a full outline of our paper, with figures and captions and an abstract, next week.


hierarchical, non-parametric calibration

It being Wednesday, I worked today with many people. Many highlights! One was the following: As my loyal reader knows, I am calibrating, with Lily Zhao (Yale), the EXPRES spectrograph, which has both laser-frequency-comb and thorium-argon-lamp calibration data. Zhao and I have figured out that we can go both hierarchical and non-parametric with the calibration: hierarchical in the sense that we will use all the calibration frames to calibrate every exposure, and non-parametric in the sense that we won't choose the order of a polynomial; we will use interpolation or a Gaussian process.

Today we improved the order of operations for this project. At first we were interpolating, and then building the hierarchical model. But today (forced by computational cost) we realized that we can build the hierarchical model on the calibration data prior to the interpolation. That's lower dimensional. It sped things up a lot, and simplified the code. We did some robust things to deal with missing data, and Zhao did some clever things to make her code work with the arc lamp just as well as it does with the laser-frequency comb.

Our current plan is to assess our calibration quality by looking at the measured radial velocity (which should be exactly zero, I hope) of the thorium-argon lamps as calibrated by the LFC, and look at the velocity of the LFC as calibrated by the ThAr lamps. That is, a cross-validation between the lamps and the comb.


What's an adversarial attack against a regression?

I had a very brief but useful conversation today with Soledad Villar (NYU) about the strategy and meaning of adversarial attacks against regression methods. We have been working on this all semester, but I am still thinking about the fundamentals. One thing I am confident about, even in the trivial machine-learning methods I have used in astronomy, is that there will be successful single-pixel attacks against standard regressions that we use. That is, you will find that the ML method is very sensitive to particular pixels! But this is a conjecture. We need to make a very clear definition of what constitutes a successful attack against a regression. In the case of classification, it seems like the definition is “The authors of the method are embarrassed”. But that doesn't seem like a good definition! Aren't we scientists? And open scientists, at that.



Christina Eilers (MIT) is in town for the week. We realized that we have four projects: Work on the kinematic signatures of spiral arms in the Milky Way disk; design a self-calibration program for stellar element abundances; create a latent-variable model (like The Cannon) for estimating black-hole masses from quasar spectra; infer simultaneously the large-scale structure towards luminous quasars and the quasar lifetimes using rest-frame ultraviolet spectra.

Because it is the most mature, our highest priority is the disk paper. We discussed the scope of this paper, which is: Good visualization of the velocity structure; a toy model to relate the velocity amplitude with the density amplitude of any dynamically-driven perturbation; rough measurement of the pitch angle; comparison to other claims of spiral structure in the neighborhood. We think we have the clearest view of the spiral structure, and the only truly dynamical measurement.


black physicists

I spent the last two days at the National Society of Black Physicists meeting in Providence, RI. It was a great meeting, with a solid mix of traditional physics, strategizing about the state of the profession, and offline conversations about politics and the many communities of physicists. Many great things happened. Here are some random highlights: I learned from Bryen Irving (Stanford) that stiffer neutron-star equations of state lead to larger tidal effects on binary inspiral. After all, a stiffer equation of state means a larger radius, and a larger radius means more tidal distortion of the surface equipotential. Deep! I enjoyed very much a comment by Richard Anantua (Harvard) about “the importance of late-time effects on one's career”. He was talking about the point that there are combinatorially many ways to get from point A to point B in your career, and it is your current state that matters most. Beautiful! There was an excellent talk by Joseph Riboudo (Providence College) that was simultaneously about how to influence the community with a Decadal-survey white paper and about primarily undergraduate institutions and how we should be serving them as a community. He was filled with wisdom! And learning. Eileen Gonzalez (CUNY) showed her nice results understanding incredibly cool (and yes, I mean low-temperature) star binaries. She is finding that data-driven atmospheric retrieval methods plus clouds work better than grids of ab initio models. That's important for the JWST era. And I absolutely loved off-session chatting with Dara Norman (NOAO) and others. Norman is filled with conspiracy theories and I have to tell you something: They are all True. Norman also deserves my thanks for organizing much of the astrophysics content at the meeting. It was a great couple of days.


not a thing

All NSF proposals, all day! That doesn't count as research (see Rules). Maybe I should change the rules?


exciting stars

Stars and Exoplanets Meeting at Flatiron was a delight today. Lachlan Lancaster (Princeton) showed his results on a really interesting object he found in the ESA Gaia data. He was inspired by the idea that star clusters might have central black holes, which might retain a very dense, very luminous nuclear star cluster even after the cluster disrupts. But his search of the Gaia data was so simple: Look for things that are apparently bright but low in parallax (large in distance). Duh! And what he found is a very bright “star” that is variable, shows emission lines, and is above the top of the H–R diagram! The ideas from the room ranged from extremely young star to microquasar to technosignatures (who suggested that?). And the thing is incredibly variable.
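Lancaster's selection is simple enough to sketch in a few lines. The catalog values and the magnitude threshold below are invented; this is just the shape of the cut, not his actual query.

```python
import numpy as np

# Hypothetical Gaia-like catalog: apparent G magnitudes and parallaxes
# (made-up numbers). The cut: apparently bright but tiny parallax, which
# implies an absurdly high luminosity.
g_mag = np.array([5.2, 9.0, 6.1, 13.5])          # apparent magnitudes
parallax_mas = np.array([25.0, 0.03, 0.5, 2.0])  # parallaxes, milliarcseconds

# Absolute magnitude from the parallax: M = m + 5 log10(parallax / 100 mas)
abs_mag = g_mag + 5.0 * np.log10(parallax_mas / 100.0)
candidates = np.where(abs_mag < -8.0)[0]  # above the top of the H-R diagram
print(candidates)
```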

But there was lots more! I won't do everything, but I will say that Thankful Cromartie (Virginia) showed data from pulsar monitoring (as part of a pulsar-timing project for gravitational waves). She showed that she can very clearly see the Shapiro time delay in the pulses when they pass near the companion star that orbits the pulsar. This lets them measure the mass of the pulsar accurately. It is very massive! I think it must be one of the most massive neutron stars known, which, in turn, will put pressure on the equations of state. Beautiful results from beautiful data.



Jim Peebles

An all-proposal day was interrupted by a three-hour lunch with Jim Peebles (Princeton) and Kate Storey-Fisher (NYU). We discussed many things, but a major theme was curiosity-driven research, which Peebles wants to speak about at the Nobel Prize ceremony next month.


inferring star-formation histories

Grace Telford (Rutgers) showed up in NYC today and we discussed the inference of star-formation histories from observations of resolved stellar populations. We discussed the point that the space is high dimensional (because, say, the star-formation history is modeled as a set of 30-ish star-formation rates in bins), which leads to two problems. The first is that a maximum-likelihood or maximum-a-posteriori setting of the SFH will be atypical (in high dimensions, optima are atypical relative to one-sigma-ish parameter settings). The second is that the results are generally extremely prior-dependent, and the priors are usually made up by investigators, not any attempt to represent their actual beliefs. We talked about ways to mitigate these issues.
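The first problem is easy to demonstrate numerically. A sketch with a 30-dimensional standard Gaussian, standing in for a 30-bin SFH posterior (the distribution and dimension are just for illustration): the density peaks at the origin, but typical draws live on a shell far from it.

```python
import numpy as np

# In high dimensions the density maximum (the MAP point) is atypical:
# draws from a d-dimensional standard Gaussian concentrate on a shell
# of radius ~ sqrt(d), and essentially never land near the mode at zero.
rng = np.random.default_rng(42)
d = 30
samples = rng.normal(size=(100_000, d))
radii = np.linalg.norm(samples, axis=1)

print(radii.mean())  # close to sqrt(30) ~ 5.5: typical draws sit on a shell
print(radii.min())   # nowhere near the mode at radius zero
```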


interpolation duh

As my loyal reader knows, I am working with Lily Zhao (Yale) to calibrate the EXPRES spectrograph. Our approach is non-parametric: We can beat any polynomial calibration with an interpolation (we are using splines, but one could also use a Gaussian Process or any other method, I think). The funniest thing happened today, which surprised me, but shouldn't have! When Zhao plotted a histogram of the differences between our predicted line locations (from our interpolation) and the observed line locations (of held-out lines, held out from the interpolation), they were always redshifted! There was a systematic bias everywhere. We did all sorts of experiments but could find no bug. What gives? And then we had a realization which is pretty much Duh:

If you are doing linear interpolation (and we were at this point), and if your function is monotonically varying, and if your function's first derivative is also monotonically varying, the linear interpolator will always be biased to the same side! Hahaha. We switched to a cubic spline and everything went unbiased.
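Here is a toy version of the effect, using exp(x) as a stand-in for the true wavelength solution (it is monotonic with a monotonic first derivative, i.e., convex): the linear interpolant always overshoots, so every held-out residual lands on the same side, and a cubic spline collapses the bias.

```python
import numpy as np
from scipy.interpolate import CubicSpline

# exp(x) is convex, so the chord between knots always lies above the curve.
x_knots = np.linspace(0.0, 2.0, 11)            # "lines" kept for interpolation
x_test = 0.5 * (x_knots[:-1] + x_knots[1:])    # held-out points in between

# Linear interpolation: every held-out residual has the same sign.
lin_resid = np.interp(x_test, x_knots, np.exp(x_knots)) - np.exp(x_test)
print(np.sign(lin_resid))                      # a one-sided bias

# A cubic spline through the same knots collapses the bias.
spline_resid = CubicSpline(x_knots, np.exp(x_knots))(x_test) - np.exp(x_test)
print(np.abs(spline_resid).max() / np.abs(lin_resid).max())
```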

In detail, of course, interpolation will always be biased. After all, it does not represent your beliefs about how the data are generated, and it certainly does not represent the truth about how your data were generated. So it is always biased. It's just that once we go to a cubic spline, that bias is way below our precision and accuracy (under cross-validation). At least for now.


disk–halo (and halo–halo) mean-velocity differences

I had a meeting with Emily Cunningham (Flatiron) to discuss any projects of mutual interest. She has been looking at simulations of the Milky Way (toy simulations) in which the LMC and SMC fall in. The simulated Galaxy gets tidally distorted by the infall, and various observational consequences follow. For example, the disk ends up having a different mean velocity than the halo! And for another, different parts of the halo move relative to one another, in the mean. Cunningham's past work has been on the velocity variance; now it looks like she has a project on the velocity mean! The predictions are coming from toy simulations (from the Arizona group) but I'm interested in the more general question of what can be learned from spatial variations in the mean velocity in the halo. It might put strong constraints on the recent-past time-dependence.



With teaching, NSF proposals, and job season, I didn't get anything done, research-wise. The closest thing was some twitter (tm) conversations about space debris and phase demodulation.


phase demodulators to find planets

Oh what a great day! Not a lot of research got done; NSF proposals, letters of recommendation, and all that. But in the afternoon, undergraduate researcher Abby Shaum (NYU) and I looked at her project to do frequency demodulation on asteroseismic modes to find orbital companions and we got one. Our target is a hot star that has a few very strong asteroseismic modes (around 14 cycles per day in frequency), and our demodulator is actually a phase demodulator (not frequency) but it's so beautiful:

The idea of the demodulator is that you mix (product) the signal (which, in this case, is bandpass-filtered NASA Kepler photometric data) with a complex sinusoid at (as precisely as you can set it) the asteroseismic carrier frequency. Then you Gaussian smooth the real and imaginary parts of that product over some window timescale (the inverse bandwidth, if you will). The resulting extremely tiny phase variations (yes these stars are coherent over years) have some periodogram or power spectrum, which shows periodicity at around 9 days, which is exactly the binary period we expected to find (from prior work).
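For concreteness, here is a sketch of that procedure on fake data: a simulated coherent mode with an injected 9-day phase wobble. All amplitudes and timescales are invented, and I use a boxcar rather than a Gaussian smoothing for simplicity.

```python
import numpy as np

# Fake data: one coherent mode at 14 cycles per day whose phase wobbles
# with a 9-day period (hypothetical amplitudes; Kepler-ish cadence).
dt = 0.02                                   # days
t = np.arange(0.0, 200.0, dt)
f_carrier = 14.0                            # cycles per day
P_orb = 9.0                                 # injected companion period, days
signal = np.cos(2.0 * np.pi * f_carrier * t
                + 0.01 * np.sin(2.0 * np.pi * t / P_orb))

# Mix with a complex sinusoid at the carrier frequency...
mixed = signal * np.exp(-2.0j * np.pi * f_carrier * t)

# ...and smooth the real and imaginary parts over a window timescale
# (a ~2-day boxcar here; the inverse bandwidth, if you will).
width = int(2.0 / dt)
smooth = np.convolve(mixed, np.ones(width) / width, mode="same")

# The phase of the smoothed product tracks the tiny phase variations,
# and its power spectrum peaks at the orbital period.
phase = np.unwrap(np.angle(smooth))
freqs = np.fft.rfftfreq(len(t), d=dt)
power = np.abs(np.fft.rfft(phase - phase.mean()))**2
best_period = 1.0 / freqs[np.argmax(power[1:]) + 1]
print(best_period)                          # recovers roughly 9 days
```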

I'm stoked! The advantages of our method over previous work are: Our method can easily combine information from many modes. Our method can be tuned to any modes that are in any data. We did not have to bin the lightcurve; we only had to choose an effective bandwidth. The disadvantages are: We don't have a probabilistic model! We just have a procedure. But it's so simple and beautiful. I'm feeling like the engineer I was born to be.


calibrating a stable spectrograph with a laser frequency comb

It was a great research day today. I worked with Lily Zhao (Yale) on the wavelength calibration of the EXPRES spectrograph, which my loyal reader knows is a project of Debra Fischer (Yale). Lily and I cleaned up and sped up (by a lot) the polynomial fitting that the EXPRES team is doing, and showed (with a kind of cross-validation) that the best polynomial order for the fit is in the range 8 to 9. This is for a high-resolution, laser-frequency-comb-calibrated, temperature-controlled, bench-mounted, dual-fiber spectrograph.

But then we threw out that polynomial fit and just worked on interpolating the laser frequency-comb line positions. These are fixed in true wavelength and dense on the detector (for many orders, anyway). Oh my goodness did it work! When we switched from polynomial fitting to interpolation, the cross-validation tests got much better, and the residuals went from being very structured and repeatable to looking like white noise. When we averaged solutions, we got very good results, and when we did a PCA of the differences away from the mean solution, it looks like the variations are dominated by a single variability dimension! So it looks like we are going to end up with a very very low-dimensional, data-driven, non-parametric calibration system that hierarchically pools information from all the calibration data to calibrate every single exposure. I couldn't be more stoked!
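The cross-validation comparison can be sketched on fake data. Everything below is invented: a smooth wavelength solution with a small non-polynomial wiggle, comb lines every 10 pixels, and every fifth line held out for testing.

```python
import numpy as np
from scipy.interpolate import CubicSpline

# Fake comb lines: pixel positions x and a smooth wavelength solution
# with a small non-polynomial wiggle (all numbers invented).
x = np.arange(0.0, 3000.0, 10.0)
wave = 5000.0 + 0.01 * x + 1.0e-7 * x**2 + 1.0e-3 * np.sin(x / 50.0)

held_out = np.zeros(len(x), dtype=bool)
held_out[2::5] = True                       # hold out every fifth line
xt, wt = x[~held_out], wave[~held_out]
xv, wv = x[held_out], wave[held_out]

# Compare an order-8 polynomial fit (the best order found above) to a
# plain cubic-spline interpolation, evaluated on the held-out lines.
coef = np.polyfit(xt / 3000.0, wt, deg=8)
poly_rms = np.sqrt(np.mean((np.polyval(coef, xv / 3000.0) - wv)**2))
spline_rms = np.sqrt(np.mean((CubicSpline(xt, wt)(xv) - wv)**2))
print(poly_rms, spline_rms)                 # the interpolation wins, by a lot
```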