On a (nearly) no-research day, we realized that Gaia DR2 will tell us more about ourselves than it will about the Milky Way. And it will tell us a lot about the Milky Way! Can't wait.
It was Fisher-information time today with Bedell (Flatiron), where we looked at whether spectro-perfectionism can carry the full radial-velocity information from the 2-d spectrograph data into its 1-d extraction. The answer is unclear. There are two steps to s-p: The first is a least-squares fit, which is definitely information-preserving, but the second is a smoothing back to natural resolution, which might be lossy. We are still working on it. My linear algebra is pushed to its limits.
Ruth Murray-Clay (UCSC) was in town, and we discussed exoplanets at lunch, and she gave a great talk late in the afternoon. At lunch, a highlight was discussing how we might update the expectations for exoplanet discoveries in the Gaia Mission; the papers on this are now way out of date (I think?). In her seminar, a highlight was a very simple, high-level picture of what current theory says about exoplanet formation, and some very simple ideas about critically testing this high-level model.
Today Megan Bedell (Flatiron) and I called Julian Stuermer (Chicago) and Ben Montet (Chicago) to talk a bit about spectrographic measurements of radial velocity. We are looking at different extraction methods and how much information they sacrifice: Is it better to measure the radial velocity in the 2-d image plane of the spectrograph than in the 1-d extracted spectrum? This is not yet clear, but we have formulated the question in an information-theoretic framework.
This all relates to ancient conversations I had with Sam Roweis about spectroscopy and spectro-perfectionism (s-p). There are so many questions! In s-p, the assumption is that your spectrograph is perfectly calibrated in every way; in this case, how do you extract all the information? But the real world isn't so great, so there might be lots of experiments to do in different regimes of realism, either about the spectrograph, or about the noise, or about calibration imperfections. I promised the crew that I would figure out the Cramér–Rao bound on radial velocity in the 2-d spectrograph image and in the s-p extraction under perfect conditions.
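For concreteness, here is a toy numpy sketch of the kind of Cramér–Rao computation I have in mind, for the simplest case: the Fisher information for radial velocity in a 1-d extracted spectrum with a single Gaussian absorption line, perfect calibration, and known independent Gaussian pixel noise. All the numbers (line depth, width, noise level) are invented for illustration; the same machinery would apply pixel-by-pixel in the 2-d image.

```python
import numpy as np

c = 299792.458  # speed of light, km/s

# Toy 1-d extracted spectrum: one Gaussian absorption line on a flat
# continuum. All numbers here are invented for illustration.
lam = np.linspace(5000.0, 5010.0, 2000)   # wavelength grid, Angstroms
lam0, depth, width = 5005.0, 0.5, 0.1     # line center, depth, width (Angstroms)
sigma_flux = 0.01                         # per-pixel Gaussian flux uncertainty

def spectrum(v):
    """The toy spectrum Doppler-shifted to radial velocity v (km/s)."""
    shifted = lam0 * (1.0 + v / c)
    return 1.0 - depth * np.exp(-0.5 * ((lam - shifted) / width) ** 2)

# Fisher information on v for independent Gaussian pixel noise:
# I(v) = sum_i (df_i/dv)^2 / sigma^2, derivative taken by finite difference.
dv = 1e-3  # km/s
dfdv = (spectrum(dv) - spectrum(-dv)) / (2 * dv)
fisher_v = np.sum(dfdv ** 2) / sigma_flux ** 2
crlb = 1.0 / np.sqrt(fisher_v)  # Cramér-Rao bound on v, km/s
print(f"Cramér-Rao bound on the radial velocity: {crlb:.4f} km/s")
```

The interesting question is then how this number changes when you do the sum in the 2-d image plane instead, or after the s-p reconvolution step.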
Group meetings were fun today. In Gaia DR2 prep meeting, I worked with Megan Bedell (Flatiron) to get some plots ready for Gaia DR2. That is, we planned what we will plot the moment that the data release happens. The goal is to look at physical and kinematic properties of exoplanet host stars.
In Stars meeting, JJ Hermes (UNC) showed some incredible WD lightcurves, which appear to come from rotating white dwarfs that have quadrupolar temperature distortions on their surfaces. There appears to be a common sub-type of white dwarfs that shows evidence of magnetism and strong surface temperature variations. We discussed things to do with Gaia and other data sources.
John Brewer (Yale) showed hot-off-the-presses results on chemical-abundance variations within Praesepe. This cluster has some really strange properties, like an amazingly low velocity dispersion. He finds chemical-abundance variations not only in iron but also in elements ratioed to iron. This is in qualitative disagreement with work I have done with Melissa Ness (Columbia), so there is something to work out there. We discussed critical tests of his methods and results.
Today Kathryn Johnston (Columbia) organized a few-hour meeting at Flatiron to discuss kinematic or dynamical models of the Milky Way that would have far more flexibility than the models we have used up to now. That is, employing function expansions or highly parameterized models of perturbations away from the toy models that are currently used in Galaxy dynamics. Part of the discussion was about expansions that help with making simulations more accurate, but some (and the part I cared about) was about making data analyses better.
Many good ideas came up for near-term projects. One was a refinement of an idea with Chervin Laporte (UVic): use his disk simulations to make empirical basis functions from simulation snapshots, which would permit us to build flexible but interpretable models of the disk in the Gaia data. Connected to this was the idea of making such basis functions not in 3-d density or potential space but in 6-d phase-space-density space. That could be valuable both for data analysis with Gaia and for theory. Indeed, the basis-function ideas that Martin Weinberg (Amherst) has been developing might be expandable to 6-d.
There was much discussion about how such basis-function expansions might make data or theory descriptions of the Milky Way (or simulations thereof) compact. This is a dimensionality-reduction issue. There was more-or-less consensus that we should mainly be thinking about linear dimensionality reduction (which is good, because it can often be cast as a convex optimization problem), though non-linear generalizations could be worth exploring.
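A minimal sketch of the linear version of the empirical-basis-function idea, with fake low-rank data standing in for simulation snapshots: take the top singular vectors of the snapshot matrix (that is, PCA) as the basis, and represent each snapshot by a few linear amplitudes.

```python
import numpy as np

rng = np.random.default_rng(17)

# Fake stand-in for simulation snapshots: each row is one snapshot's density
# field flattened onto a common grid. The fake data are built to be low rank.
n_snapshots, n_grid, n_modes = 50, 400, 3
amp_true = rng.normal(size=(n_snapshots, n_modes))
modes_true = rng.normal(size=(n_modes, n_grid))
snapshots = amp_true @ modes_true + 0.01 * rng.normal(size=(n_snapshots, n_grid))

# Empirical basis functions = top right-singular vectors of the
# mean-subtracted snapshot matrix (that is, PCA): linear dimensionality reduction.
mean = snapshots.mean(axis=0)
U, s, Vt = np.linalg.svd(snapshots - mean, full_matrices=False)
k = 3
basis = Vt[:k]                              # k basis functions on the grid
amplitudes = (snapshots - mean) @ basis.T   # each snapshot as k numbers

# A rank-k reconstruction should capture nearly all of the variance.
recon = mean + amplitudes @ basis
frac_var = 1.0 - np.var(snapshots - recon) / np.var(snapshots - mean)
print(f"variance captured by {k} empirical basis functions: {frac_var:.5f}")
```

The 6-d phase-space version would just flatten a 6-d histogram (or some smooth representation of the phase-space density) into each row instead of a 3-d density field.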
In some ways, the most impressive aspect of the day was the community-building activity. Johnston got together groups of people that have not usually collaborated and set up the conditions under which they might actually collaborate. She is not just an extremely insightful and accomplished physicist: She is really thinking about improving the long-term health of the fields in which she works.
There is a Monday seminar at Princeton run by the astrophysics graduate students that focuses on useful skills and knowledge around research, rather than research results. That's a good idea!
I gave the seminar today; I spoke about machine learning in astronomy. I started with my ML taxonomy and my recommendation to understand five beautiful, simple, and instructive examples: SVM, linear regression, PCA, k-means, and GMM with the EM algorithm. How's that for acronyms! I think each of these five methods is so beautiful, everyone should know how each of them works and generalizes.
Each of these methods is in a different taxonomic category (in order: classification, regression, dimensionality reduction, clustering, and density estimation). The first three are linear and convex, and each (for related reasons) can be generalized with the kernel trick. In the second half of my talk I discussed this, but my explanation went off the rails. I think I left everyone confused. Time to do more homework.
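Since my kernel-trick explanation went off the rails, here is the cleanest version of the point I was trying to make, as a toy numpy sketch (with invented data): ridge regression, a linear and convex method, solved in its primal form and in its dual, kernelized form. With a linear kernel the two give identical predictions; swapping in a nonlinear kernel function is then the generalization.

```python
import numpy as np

rng = np.random.default_rng(42)

# Fake regression problem, purely for illustration.
X = rng.normal(size=(30, 5))                 # training inputs
y = X @ rng.normal(size=5) + 0.1 * rng.normal(size=30)
X_new = rng.normal(size=(4, 5))              # test inputs
lam = 0.5                                    # ridge regularization strength

# Primal ridge regression: w = (X^T X + lam I)^{-1} X^T y.
w = np.linalg.solve(X.T @ X + lam * np.eye(5), X.T @ y)
pred_primal = X_new @ w

# Dual (kernelized) ridge with the linear kernel K = X X^T:
# alpha = (K + lam I)^{-1} y, prediction = k(x_new, X) @ alpha.
K = X @ X.T
alpha = np.linalg.solve(K + lam * np.eye(30), y)
pred_dual = (X_new @ X.T) @ alpha

# The two agree; replacing the inner products X X^T and X_new X^T with any
# positive-definite kernel function is the kernel trick.
print(np.allclose(pred_primal, pred_dual))
```

The dual form only ever touches the data through inner products, which is why the trick works for all three of the linear methods.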
The day started with a discussion or break-out about making a latent-variable structure for the incredible result by Guy Davies (Birmingham) that the power spectra of red giants in an open cluster lie on a one-dimensional locus. Details include: He is only looking at the overall envelope of the power spectrum, parameterized by 8-ish parameters. His 8-ish parameters follow a one-dimensional locus of power laws with respect to each other, all except one. That one is the white-noise level, which it makes sense would differ from star to star. So he has a two-dimensional model that seems to fit every single stellar power spectrum in an open cluster observed by Kepler extremely well!
This discussion merged into a longer discussion, code-named Light-Curve Cannon with contributions from many people looking at how time-domain behavior of stars on different time scales can be used to predict or infer stellar parameters. It is extremely promising that TESS-like time-domain data will be able to tell you stellar parameters at comparable precision to contemporary spectroscopic modeling! Ruth Angus (Columbia) did a great job of bringing together the threads in these discussions: There are many papers to write.
The day ended with a wrap-up in which everyone contributed one slide and spoke for less than two minutes. Here are the wrap-up slides. They only give you the tiniest hint at all the things that happened this week!
Thank you to Dan Foreman-Mackey (Flatiron) and the Flatiron CCA staff and the Simons Foundation events staff for an absolutely great meeting. In particular, Foreman-Mackey's vision, leadership, technical abilities, and good nature got everyone participating and working together. That's community building.
Today was a short day at #TESSninja for me, because I had [life events]. But in the morning, I spent some time working with [unnamed participants] and I managed, through my efforts, to fully bork their code. I guess I really, really don't understand Python packages. I felt bad about that. You are supposed to move fast and break things and fail fast but I often participate in projects in such a way that I feel like I make them worse!
I also spoke with Ellie Schwab Abrahams (AMNH) and Ben Montet (Chicago) about linear regression to calibrate a Kepler light curve. You can think of calibration as a kind of regression (predicting data using housekeeping data); we worked out what that would look like and got Schwab Abrahams on to gathering the housekeeping data.
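A toy sketch of what that calibration-as-regression looks like, with invented stand-ins for the housekeeping time series: fit the light curve as a linear combination of the housekeeping vectors (plus a constant), and keep the residual as the calibrated flux.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500  # number of cadences

# Invented stand-ins for spacecraft housekeeping time series (for example,
# temperatures and pointing offsets); the real ones come from the archive.
housekeeping = rng.normal(size=(n, 3))
systematics = housekeeping @ np.array([0.02, -0.01, 0.005])
stellar = 0.003 * np.sin(np.linspace(0.0, 20.0, n))  # the astrophysics we want
flux = 1.0 + stellar + systematics + 1e-4 * rng.normal(size=n)

# Calibration as regression: predict the flux from the housekeeping data
# (plus a constant), and keep the residual as the calibrated light curve.
A = np.column_stack([np.ones(n), housekeeping])      # design matrix
coeffs, *_ = np.linalg.lstsq(A, flux, rcond=None)
calibrated = flux - A @ coeffs

# The residual should retain the stellar signal (up to its mean) while
# the housekeeping-correlated systematics are removed.
rms_before = np.std(flux - 1.0 - stellar)
rms_after = np.std(calibrated - (stellar - stellar.mean()))
print(f"systematics rms before: {rms_before:.5f}, after: {rms_after:.5f}")
```

The obvious danger, as always with detrending, is that any stellar signal that happens to correlate with the housekeeping data gets fit away too.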
My plan for #TESSninja is to work on automated approaches to radial-velocity follow-up of TESS discoveries. I am bringing some new things to this question. The first is that I am not going to ask “when should I next observe this planet candidate?”, I am going to ask “I have telescope time right now, which of my follow-up objects should I observe next?”. The second new thing is that I think that it is insufficient to make this decision only on the basis of information obtained in this observation. It should be made based on the future discounted information that it unlocks or makes available, under assumptions about observing into the future.
This second point was a breakthrough for me. It comes from this point: Imagine that you are using RV measurements to measure precise periods, and you want period information. The first observation you make gives you no period information whatsoever: It only constrains the overall system velocity! So you would never make that first observation if you cared only about the immediate information gain on period. You have to think about the future information-gain potential that your observation unlocks, discounted by your discount rate. Or even more complex objectives (yes, cash flow ought to be involved).
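The period point is easy to verify with a Fisher-matrix toy model (all parameter values invented): with one RV observation, the Fisher matrix for (systemic velocity, amplitude, period) has rank one, so the marginal period information is exactly zero; with several observations it becomes invertible and the Cramér–Rao bound on the period is finite.

```python
import numpy as np

# Toy circular-orbit RV model: v(t) = gamma + K sin(2 pi t / P).
# Parameter values are invented; theta = (gamma, K, P).
gamma, K, P = 5.0, 30.0, 7.3   # km/s, km/s, days
sigma = 1.0                    # per-observation RV uncertainty, km/s

def jacobian(times):
    """Derivatives of the model at each time with respect to (gamma, K, P)."""
    phase = 2.0 * np.pi * times / P
    d_gamma = np.ones_like(times)
    d_K = np.sin(phase)
    d_P = -K * np.cos(phase) * 2.0 * np.pi * times / P ** 2
    return np.column_stack([d_gamma, d_K, d_P])

def fisher(times):
    J = jacobian(times)
    return J.T @ J / sigma ** 2

# One observation: the Fisher matrix has rank 1, so only one linear
# combination of the parameters is constrained -- no period information.
F1 = fisher(np.array([1.0]))
print("rank with 1 observation:", np.linalg.matrix_rank(F1))

# Five observations: the matrix is invertible and the marginal period
# uncertainty (the Cramér-Rao bound) is finite.
F5 = fisher(np.array([1.0, 2.5, 4.0, 6.2, 9.1]))
sigma_P = np.sqrt(np.linalg.inv(F5)[2, 2])
print("rank with 5 observations:", np.linalg.matrix_rank(F5))
print(f"period uncertainty bound: {sigma_P:.4f} days")
```

The scheduling question is then which next observation time maximizes not this immediate information gain, but the discounted information of the whole future campaign it enables.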
In other news, Guy Davies (Birmingham) made a nice point in discussion of the time-domain behavior of stars in an open cluster observed by Kepler: Because these stars ought to be the same age, and the same composition, and (on the red-giant branch) nearly the same mass, the asteroseismological (and jitter) signals ought to—in some sense—lie along a one-dimensional sequence in the relevant space. That's a great idea; I want to test that.
The highlight of #TESSninja today was Ashley Villar (Harvard) showing the lightcurve of a supernova discovered in the K2 mission, with models over-plotted. It appears that the supernova is a type Ia, but the early-time light curve (and K2 was observing the location well before the explosion) is not consistent with any null type-Ia models. The early time requires an interaction of the explosion with some nearby material, probably a companion star! This is an important discovery and (I think) a first!
Earlier in the day I worked with Ellie Schwab (AMNH) and Ben Montet (Chicago) on detrending a particular low-mass star that Schwab is interested in. We discussed how to combine the full-frame image information (where we know more about calibration and integrated photometry) with the long-cadence data (where we have a limited aperture and know less).
Today was the first day of Preparing for TESS, organized by Dan Foreman-Mackey (Flatiron) and others. It is organized like the #GaiaSprint in that it is a hack week, starting with pitches and dedicated to getting stuff done. The crew pitched some great ideas on day one and then hacked. I am trying to work on algorithmic approaches to efficient radial-velocity follow-up.
Melissa Ness (Columbia) and Megan Bedell (Flatiron) started an interesting project to follow up anomalous stars in an open cluster: Do the stars with element-abundance anomalies also show anomalies in the time domain or in asteroseismology? Many other projects are working towards obtaining cleaned or calibrated light curves, although my heart sang when various people (notably Rodrigo Luger at UW) pointed out that we don't want to de-trend, we want to have a model that explains every light curve as a combination of spacecraft and stellar variability (and planets).
Armin Rest (STScI) gave a nice talk about time-domain astronomy, with stuff about finding Earth-impactors and also light echoes. After his talk, I told him the insane project conceived by Rix, Schölkopf, and me about modeling the whole Milky Way as a set of flickering light sources and a three-dimensional map of dust, using time-domain imaging at very low brightness. That's probably not possible! Rest is part of a big new sky survey for near-earth asteroids, which will also do a lot of variable-star science.
After that, Sarah Richardson (Microbyre) talked about automating various aspects of phylogeny for various kinds of microbes. I was impressed by the robotics setups available to biologists! Her talk also contained a lot of biology-101 content for the physicists and engineers; I learned a lot (and felt, once again, my regret that I didn't take more biology in college!).
Late in the day, Josh Bloom (Berkeley) and I did some real-time decision-making at the Emeryville card room.
Today was road-traffic day at Real-time Decision Making at Berkeley. Jane MacFarlane (LBNL) and Alexandre Bayen (Berkeley) gave great talks about road dynamics. In MacFarlane's talk I learned that the mobile-phone location information delivered by the providers is posterior information, not likelihood information. And the priors are outrageously informative (like the prior that every phone is on the midline of a known road!). That is good for the user (the mobile-phone owner), who wants navigation information, but not good for anyone trying to do hierarchical inference over phones or people! This is very related to the issues that Alex Malz (NYU) is working on in cosmology.
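Here is a toy version of why posterior outputs break hierarchical inference (all numbers invented): phones have true offsets from the road midline drawn from a population; the provider reports posterior means under a strong on-the-midline prior; the population spread naively estimated from those posteriors is shrunk far below the truth, while the raw, likelihood-level measurements recover it.

```python
import numpy as np

rng = np.random.default_rng(1)
n_phones = 10000

# Invented 1-d setup: each phone has a true offset from the road midline,
# drawn from a population whose spread we want to infer hierarchically.
true_spread = 5.0                                     # meters
offsets = rng.normal(0.0, true_spread, size=n_phones)
noise_sigma = 3.0
raw = offsets + rng.normal(0.0, noise_sigma, size=n_phones)  # likelihood-level fixes

# The provider reports posterior means under a strongly informative prior
# that every phone sits on the midline: prior N(0, 1 m).
prior_var, noise_var = 1.0 ** 2, noise_sigma ** 2
reported = raw * prior_var / (prior_var + noise_var)

# Naive hierarchical inference from the reported posteriors is shrunk far
# below the truth; the raw measurements recover it (subtracting the known
# noise variance in quadrature).
spread_from_posteriors = np.std(reported)
spread_from_raw = np.sqrt(np.var(raw) - noise_var)
print(f"truth: {true_spread}, from posteriors: {spread_from_posteriors:.2f}, "
      f"from raw data: {spread_from_raw:.2f}")
```

You can in principle undo the shrinkage if you know the provider's prior exactly, which is precisely the interim-prior problem Malz worries about with photometric-redshift posteriors.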
Bayen focused on the influence of mobile phones on traffic, which has been immense! As mobile phones have gained traction with drivers, they have driven traffic patterns to a non-optimal Nash equilibrium, where all paths from point A to B take the same amount of time. But these same phones also create crazy new nonlinear dynamics, because all drivers get re-routed simultaneously to a small number of alternate routes when something goes wrong. And it is like a repeating multiplayer game, because each routing company is constantly learning the dynamics induced by all the other companies! But this game is played out in the parameters of a set of differential equations, so it is crazy.
Things would be better if we could find a way to cooperate; this led to great lunch discussions with Josh Bloom (Berkeley). We discussed ways to capitalize on the fact that different drivers have different objectives. No existing apps capture this at all: They all optimize for the triviality of minimum expected travel time!
I arrived in Berkeley last night and today was my first day at a full-week workshop on real-time decision-making at the Simons Institute for the Theory of Computing at UC Berkeley. The day started with amazing talks about Large Hadron Collider hardware and software by Caterina Doglioni (Lund) and Benjamin Nachman (LBNL). The cut from collisions to disk-writing is a factor of 10 million, and they are writing as fast as they can.
The triggers (that trigger a disk-writing event) are hardware-based close to the metal, and then software-based in a second layer. This means that when they upgrade the triggers, they are often doing hardware upgrades! Some interesting things came up, including the following:
Simulating is much slower than the real world, so months of accelerator run-time requires years of computing on enormous facilities just for simulation. These simulations need to be sped up, and machine-learning emulators are very promising. Right now events are stored in full, but only certain reconstructed quantities are used for analysis; in principle, if these quantities could be agreed upon and computed rapidly, the system could store less per event and then many more events, reducing the insanity of the triggers. And every interesting (and therefore triggered, saved) event is simultaneous with many uninteresting events, so in principle right now the system saves a huge control sample, which hasn't been fully exploited, apparently.
Of course the theme of the meeting is decision-making. So much of the discussion was about how you run these experiments so that you decide to keep the events that will turn out to be most interesting, when you don't really know what you are looking for!
First thing in the morning, I met with Judy Hoffman (Berkeley) to discuss her computer-vision and machine-learning work. She suggested that machine-learning methods that are auto-encoder-like could be repurposed to make predictions from one kind of data to another kind of data on the same object. For instance, we could train an encoder to predict exoplanet RV signal, given Kepler light curve. Or etc! This appeals to me because it uses machine learning to connect data to data, without commitment to latent quantities or true labels for anything. She pointed me (relatedly) to a new kind of model called ADDA, for which she is responsible.
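A linear toy version of the data-to-data idea, with fake paired data standing in for, say, light curves and RV series of the same objects: learn a map from one data type straight to the other, with no labels or latent physical quantities anywhere in the loss. In the real version, a neural encoder-decoder (as in Hoffman's work) would replace the linear least-squares map used here.

```python
import numpy as np

rng = np.random.default_rng(3)

# Fake paired data for the same objects: x is one data type (say, a binned
# light curve) and y is another (say, an RV time series). Both are generated
# from a shared hidden state that the model never sees.
n_obj, n_x, n_y, n_hidden = 200, 32, 16, 4
hidden = rng.normal(size=(n_obj, n_hidden))
x = hidden @ rng.normal(size=(n_hidden, n_x)) + 0.05 * rng.normal(size=(n_obj, n_x))
y = hidden @ rng.normal(size=(n_hidden, n_y)) + 0.05 * rng.normal(size=(n_obj, n_y))

# Data-to-data prediction: fit a map from x to y on a training split.
# No commitment to "true labels" for either side.
n_train = 150
W, *_ = np.linalg.lstsq(x[:n_train], y[:n_train], rcond=None)
pred = x[n_train:] @ W

# Held-out check: most of y's variance should be predictable from x,
# because both data types are driven by the same hidden state.
frac_var = 1.0 - np.var(y[n_train:] - pred) / np.var(y[n_train:])
print(f"held-out variance explained: {frac_var:.3f}")
```

The appeal is exactly that the shared physics lives in the hidden state, and the model exploits it without us ever naming it.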
In the afternoon, Chiara Mingarelli (Flatiron) gave the NYU Astro Seminar about pulsar timing and gravitational radiation, expressing the hope and expectation that this method will deliver signals soon. She told a very interesting story about a false-positive detection that nearly went to press before they figured out that it resulted from residuals in the Solar System ephemerides. The SS comes in because you have to correct Earth-bound timings to a frame that is at rest (or in constant-velocity motion) with respect to the SS barycenter.
This isn't the first time I have heard this complaint. The astronomical community really needs an open-source and probabilistic SS ephemeris, so we can use the SS model responsibly inside of inferences. Freedom-of-information act time?