stars, planets, SPHEREx, and black-hole dark matter

In the last stars group meeting of the year, we had special guests John Brewer (Yale, AMNH) talking about the chemical abundances of stars hosting planets, Ellie Schwab (CUNY) talking about magnetic activity in low-mass stars and brown dwarfs, and Jackie Faherty (AMNH) talking about searches for long-period companions to solar-like stars. Brewer killed the diamond-planet hypothesis that was so cool a few years ago. Ue-Li Pen (CITA) commented to Schwab that 21-cm surveys (see yesterday's post) will and even already do have time-domain radio observations of thousands to millions of stars. And Faherty showed that searches for long-period companions have been incredibly productive, even though they haven't led to exoplanet discoveries (yet).

In the last cosmology group meeting of the year, we had special guests Roland de Putter (Caltech) talking about the observing plans for SPHEREx and Yacine Ali-Haimoud (JHU) talking about black-hole dark matter (my favorite theory of dark matter). SPHEREx will perform a very cleverly designed 0.75-5 micron low-resolution spectral survey of every point on the sky. It will get redshifts for hundreds of millions of sources, with small photometric-redshift uncertainties. De Putter talked about primordial non-Gaussianity; the survey will deliver limits on (or a detection of) fNL at the level of <1. The audience was very interested in foregrounds, including Milky-Way stars, and even the zodiacal light in the Solar System.

Ali-Haimoud spoke about 2-body and 3-body effects in a black-hole theory of dark matter to get rates for LIGO. With careful re-analysis, he revises (heavily) the Ricotti et al 2008 limit on BHs as a dark-matter candidate and greatly weakens the constraints from CMB spectral distortions and anisotropies. But in the end he was very careful not to endorse black holes as a dark-matter candidate. I'm stoked nonetheless!


reconstructing the initial conditions

Today was a 21-cm cosmology meeting at Flatiron. Unfortunately I could only do the morning. Ue-Li Pen (CITA) spoke about reconstruction using a simplified dynamics. I suggested that anything that could be done with simplified dynamics could be done better with machine learning. I think I can even prove this, since the machine learning could be trained on the residuals away from the simplified dynamics reconstruction! In his talk, however, he mentioned this incredible Wang et al paper that does full reconstruction of the initial conditions for the entire SDSS Main Sample volume! This gives me hope for the future of cosmology.
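A toy sketch of the residual-learning argument, under loud assumptions: the "simplified dynamics" is a made-up imperfect linear map (`simplified_reconstruction` is hypothetical), the "machine learning" is plain least squares on a small polynomial basis, and the data are random numbers. None of this is Pen's method or the Wang et al method. The only point is that a correction fit to the residuals cannot do worse than the baseline on the training set, because the zero correction is inside the model class. Performance on held-out data is a separate question.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 500, 5

# Toy stand-ins: "features" plays the observed field, "truth" the initial conditions.
features = rng.normal(size=(n, p))
true_w = rng.normal(size=p)
truth = features @ true_w + 0.3 * features[:, 0] ** 2 + 0.1 * rng.normal(size=n)


def simplified_reconstruction(x):
    """Hypothetical simplified-dynamics estimate: linear, with imperfect weights."""
    return x @ (0.8 * true_w)


baseline = simplified_reconstruction(features)
residual = truth - baseline

# Learn a correction to the baseline by least squares on a small basis.
# Setting all coefficients to zero recovers the baseline exactly.
basis = np.hstack([features, features ** 2, np.ones((n, 1))])
coeffs, *_ = np.linalg.lstsq(basis, residual, rcond=None)
corrected = baseline + basis @ coeffs

mse_baseline = np.mean((truth - baseline) ** 2)
mse_corrected = np.mean((truth - corrected) ** 2)
print(mse_baseline, mse_corrected)
```

The guarantee here is for training error only; any real claim of "better" would need a validation set.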


small planets are all rocky?

At lunch today, Angie Wolfgang (PSU) gave a talk at the CCA on hierarchical inference of small and rocky exoplanet population properties. She made a nice set of arguments for the hierarchical Bayesian methodology, which was preaching to the converted (but good). She showed her results on exoplanet compositions and H/He envelopes, both of which are impressive, and then she went on to look at parametric and non-parametric fitting of the mass–radius relationship at small radii. She excludes zero mass scatter at fixed radius at all except the smallest radii. There is a consistent story emerging that the very smallest planets are indeed (pretty much) all rocky.

I had some back-and-forth with Megan Bedell (Chicago) about derivatives of our spectral model with respect to parameters. She has this all correct but I recommended parameterization changes, and the whole time, in the background, Dan Foreman-Mackey (UW) was saying things like “you should never take your own derivatives” and “you should use Theano”. I ignored him, probably at my peril.



Andy Casey (Cambridge) organized a call today in which we discussed his project to re-factor the ESO HARPS archive of high-resolution stellar spectra into a more useful form. We discussed our needs for this new archive for our own personal science projects. The idea is to have the schema and work on the archive be directly related to short-term science goals, and then hope that it will be useful for many other goals.


map-making, self-calibration, stellar rotation

In my stars group meeting at CCA, Ruth Angus (Columbia) told us about her work to replace standard methods for determining stellar rotation with a probabilistic model, based on Gaussian Processes with a quasi-periodic kernel. This seems to work extremely well! There are some pesky outlier stars, even in simulated data, in which all methods (including Angus's best) seem to get the wrong answer for the stellar rotation; these are interesting for further investigation.
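For concreteness, here is a minimal sketch of one common quasi-periodic covariance function (a squared-exponential envelope times a periodic term) and the Gaussian Process log-likelihood built from it. The function names, the parameterization, and the toy light curve are assumptions for illustration; they are not necessarily what Angus uses.

```python
import numpy as np


def quasi_periodic_kernel(t1, t2, amp, ell, gamma, period):
    """k(dt) = amp^2 * exp(-dt^2 / (2 ell^2)) * exp(-gamma * sin^2(pi dt / period))."""
    dt = np.subtract.outer(np.asarray(t1, dtype=float), np.asarray(t2, dtype=float))
    envelope = np.exp(-0.5 * dt ** 2 / ell ** 2)               # slow loss of coherence
    periodic = np.exp(-gamma * np.sin(np.pi * dt / period) ** 2)  # rotation signal
    return amp ** 2 * envelope * periodic


def gp_log_likelihood(t, y, yerr, amp, ell, gamma, period):
    """Zero-mean GP log-likelihood, with yerr^2 added to the diagonal."""
    K = quasi_periodic_kernel(t, t, amp, ell, gamma, period)
    K[np.diag_indices_from(K)] += yerr ** 2
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L, y)          # L^{-1} y, so alpha @ alpha = y^T K^{-1} y
    log_det_half = np.sum(np.log(np.diag(L)))  # equals 0.5 * log det K
    return -0.5 * alpha @ alpha - log_det_half - 0.5 * len(y) * np.log(2.0 * np.pi)


# Toy light curve with a 10-day period (made-up numbers).
t = np.linspace(0.0, 50.0, 60)
y = np.sin(2.0 * np.pi * t / 10.0)
print(gp_log_likelihood(t, y, 0.05, amp=1.0, ell=20.0, gamma=2.0, period=10.0))
```

In practice one would sample or optimize over `period` (and the other hyperparameters) rather than evaluate the likelihood at a single point.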

In my cosmology group meeting, I got fully taken to school. First Brian Keating (UCSD) told us about self-calibration of CMB polarimetry. It turns out that you can't self-calibrate for some of the most important science. That is, your signal might be in exactly the modes to which the self-calibration is orthogonal! That's bad. And an aspect of self-calibration that I haven't thought about before. He discussed many crazy and creative ways to do absolute calibration of polarimetry devices; none of them look good enough (or cheap enough) at this point.

Then Colin Hill (Columbia) told us about map-making things he is working on in CMB data. I got all crazy because he is only considering linear combinations of observed data to (say) produce the thermal S-Z map (from, say, Planck data). But then he pointed out (correctly) that all least-squares methods return linear combinations of the data! Oh duh! All L2-like methods return linear combinations of the data. So then we went on to think about combined L1 and L2 methods that could permit him to open up his model space without enormously over-fitting. At the end of the discussion I had a job to do: Write down the largest class of convex map-making methods I can, given what I know about L1 and L2.
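The "oh duh" point can be sketched in a few lines: for a linear model d = A x + noise, the weighted least-squares estimate is W d, with W fixed by the mixing matrix and the noise covariance and independent of the data. The mixing matrix, noise levels, and component count below are made-up numbers, not anything from Planck or from Hill's analysis.

```python
import numpy as np


def least_squares_weights(A, Cinv):
    """Return W such that x_hat = W @ d for weighted linear least squares."""
    return np.linalg.solve(A.T @ Cinv @ A, A.T @ Cinv)


rng = np.random.default_rng(42)
n_channels, n_components, n_pixels = 6, 2, 1000

A = rng.normal(size=(n_channels, n_components))   # hypothetical mixing matrix
sigma = rng.uniform(0.5, 2.0, size=n_channels)     # hypothetical per-channel noise
Cinv = np.diag(1.0 / sigma ** 2)

# The weights depend only on A and the noise model, never on the data.
W = least_squares_weights(A, Cinv)

x_true = rng.normal(size=(n_components, n_pixels))
d = A @ x_true + sigma[:, None] * rng.normal(size=(n_channels, n_pixels))

# Every component map is a fixed linear combination of the channel maps.
x_hat = W @ d
print(W.shape, x_hat.shape)
```

Adding an L1 term to the objective breaks this property: the estimate is then no longer a fixed linear operator acting on the data, which is what opens up the model space.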

In between group meetings Cameron Hummels (Caltech) talked about open-source codes he is building that take simulation outputs from cosmological hydro simulations and predict observables, especially those that relate to the inter-galactic medium and circum-galactic medium. We talked a lot about the differences between resolution effects and sub-grid physics choices, which are confusingly inter-related.


Spitzer Oversight, day 2

Today was the second day of the Spitzer Oversight Committee meeting. Far and away the highlight for me was a presentation by Sean Carey (IPAC) about the overall health of the spacecraft and the imaging instrument. He showed the photometric throughput of the system as a function of time, which has been amazingly constant (sub-percent level), and yet shows a consistent and repeatable trend (of less than one mmag per year, ish?). He showed the bad pixel count, which has risen linearly with time, but only to a few hundred (and many of those are nonetheless still calibrated and useful pixels). He showed the astrometric wobble and drifts, associated with spacecraft thermal events. These are substantial, but changing, as the spacecraft goes through more and more extreme solar-angle events to downlink its data to the Deep Space Network. Spitzer is in an Earth-trailing orbit, so as time goes on, it has to point at worse and worse Sun angles to send its data home.

The latter point is very interesting: Carey showed that the batteries are still just as good as new, but that the spacecraft draws 11 A (yes, 11 Amps at 300 Volts!) on average (don't ask why), and when it is in downlink to Earth, the solar panels are not getting enough insolation to cover it. This leads to non-trivial scheduling in which the spacecraft must point near-orthogonal to the Sun vector for a while (hours) after downlink. This complexity is handled without issues by the non-trivial scheduling systems, and the overall spacecraft health is excellent. The mission can run until March 2019, when it is cut off, both by funding and by these Sun-angle issues. Carey also showed the long-term performance of the solar panels. This also declines linearly with time, but is easily within spec to keep the mission running (barring any severe micrometeorite hit). Knock wood!

It has been a great two days, with an absolutely great team, working on an absolutely great mission.


Spitzer Oversight, day 1

Today was the first of two days at the Spitzer Science Center, where I am (for the 8th year in a row) helping to advise the Spitzer mission as part of the Spitzer Oversight Committee, chaired by Mike Hauser (STScI, retired). The mission is in its last years, funded to continue observing to March 2019, then closing out for a year-ish afterwards, with no opportunity for further extension. Its lifetime is not set by money alone, however: Geometric constraints on its downlink to the Deep Space Network and its power and insolation and thermal needs during that downlink take the spacecraft out of safe operating conditions in 2019.

One of the things that was discussed today was synergies with JWST and TESS. There will be more than a year in which TESS and Spitzer are simultaneously flying. This creates a lot of interesting opportunities. I made a mental note to discuss this with some of my more ambitious exoplanet people. The JWST Early Release Science call has been released, and the calibration and photometry things we have learned (over many years) in the Spitzer, Kepler, and SDSS contexts (and Euclid and LSST contexts) could form the basis of a great ERS proposal. Let's not wait for five to ten years to figure out how to do the best possible science with JWST! Again, if I am going to go there, I need to assemble the right team.

Before the start of the meeting, I had lunch with Kat Deck (Caltech). We had a wide-ranging conversation, which included exoplanets, proto-planetary disks, the theory of planet formation, and gravitational radiation backgrounds and sources. On the latter, I really have to get involved; there are so many simple questions at the intersection of theory and data analysis.


transits of swarms of debris

I met with Ellie Schwab (CUNY) and Kelle Cruz (CUNY) to discuss Schwab's model of stellar activity in low-mass stars. We checked her MCMC sampling diagnostics and worked out how to make her model more general. It is a mixture model, with an active and inactive population of stars, mixed.
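A minimal sketch of a two-population mixture likelihood of this general kind, assuming (purely for illustration) that each star has one scalar activity measurement and that each population is Gaussian. The function names and parameters are hypothetical; Schwab's actual model is not specified here and is surely richer.

```python
import numpy as np


def log_gaussian(x, mu, sigma):
    """Log of a normalized Gaussian density."""
    return -0.5 * ((x - mu) / sigma) ** 2 - np.log(sigma) - 0.5 * np.log(2.0 * np.pi)


def _component_logs(x, f_active, mu_a, sig_a, mu_i, sig_i):
    """Log of (mixing weight times density) for the active and inactive components."""
    log_active = np.log(f_active) + log_gaussian(x, mu_a, sig_a)
    log_inactive = np.log1p(-f_active) + log_gaussian(x, mu_i, sig_i)
    return log_active, log_inactive


def mixture_log_likelihood(x, f_active, mu_a, sig_a, mu_i, sig_i):
    """Sum over stars of log[f N(x | active) + (1 - f) N(x | inactive)]."""
    log_active, log_inactive = _component_logs(x, f_active, mu_a, sig_a, mu_i, sig_i)
    return np.sum(np.logaddexp(log_active, log_inactive))


def active_probability(x, f_active, mu_a, sig_a, mu_i, sig_i):
    """Posterior probability that each star belongs to the active population."""
    log_active, log_inactive = _component_logs(x, f_active, mu_a, sig_a, mu_i, sig_i)
    return np.exp(log_active - np.logaddexp(log_active, log_inactive))


x = np.array([-1.0, 0.0, 3.0])  # made-up activity measurements
print(active_probability(x, 0.3, 3.0, 0.5, 0.0, 0.5))
```

Making the model more general would mean replacing the Gaussians with whatever per-population densities (and dependence on stellar properties) the data call for; the mixture structure stays the same.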

I met with Caroline Kaler (NYU) to get her started looking at the Kepler data. Inspired by my visit to Rochester this week, I have her looking at the Boyajian Star. I have a crazy thought that we might be able to use the smoothness (or not) to limit (or measure) the number of bodies contributing to the light-curve events, if those events are multi-object transits.


detailed abundance trends with binarity

Taisiya Kopytova (ASU) arrived in New York today for two days to work on the detailed chemical abundances of giant stars in APOGEE that host substellar and low-mass stellar companions. We have cast this problem as a very simple comparison between the companion-hosting population and a control sample. The control sample is carefully selected to be identical in stellar parameters. It looks like she has a result, and the result is interesting. It isn't a large effect, however. We discussed how to show it, convincingly, and how to describe it, accurately.

Lauren Anderson (Flatiron) and I continued to debate how to put spectroscopic labels onto Gaia TGAS stars without spectroscopy, and how to (before that) de-noise the labels themselves, which we are getting from the RAVE-on catalog of Andy Casey et al. We are a bit confused about how principled to be in the latter de-noising. It seems crazy to proceed without doing it, but building a full hierarchical model seems like overkill when the main point is just that stars on the main-sequence must have large surface gravities!



I spent the day at the University of Rochester, where I gave the Physics Colloquium. I spoke about data-driven models. Before my talk, I had many interesting and valuable conversations with faculty and students. One highlight was work that Alice Quillen (Rochester) is doing on tidal dissipation. She is building mechanical models of solid bodies (think: planets) to parameterize tidal dissipation and look at tidal locking mechanisms, and spin–orbit resonances and dynamics.

Another highlight was a long conversation with Eva Bodman (Rochester) who (among other things) has been looking at extra-solar comets in the Kepler data. We discussed things she has done, but also the low-hanging fruit for future work on comets around other stars. She has built a model of the strange behavior of the Boyajian Star in terms of a (bizarre, huge) comet population; this made me think that there are lots of things we might do with comet population models, or other models of swarms of debris.


foregrounds and optimization

It was a low-research day today. But I did get in a short and valuable discussion of CMB foregrounds with Boris Leistedt (NYU). The approach I want to pursue is to make a latent-variable model, which posits a set of scalar fields, and nonlinear functions that convert them into (high resolution) maps, that are compared to the data through the relevant beams. I think this will (almost provably) beat current approaches. I also had some conversations with Bedell about optimization. We are trying to fit for stellar spectra and radial velocities, and (as usual) we are finding that out-of-the-box optimizers don't work well!


better exoplanet searches through chemistry

Today I had the great privilege of spending the day with the group of Karin Öberg (CfA) at Harvard and also the NG Next team. Öberg's group is doing so many great things, related to astronomical observations of proto-planetary disks and also real lab experiments on ices and solid-state chemistry relevant to interstellar and accretion-disk physical conditions. Here are some highlights:

Ellen Price (CfA) showed a consistent chemical model in which they evolve the molecular contents of gas as it orbits in the evolving accretion disk. This is based on a NLTE chemical model built by Ilse Cleeves (CfA). She can see big changes as gas crosses the snow line (or various snow lines for different species). Edith Fayolle (CfA) showed absolutely incredible ALMA observations they have (with Cleeves and also Ryan Loomis, CfA) of proto-planetary disks around young stars. In these observations, there is so much, I could fill a whole separate set of blog posts: They see various kinds of organics that weren't expected to be formed in abiotic conditions. They also can image the disk in two spatial dimensions and the radial velocity dimension in thousands of chemical species. This is unprecedented detail on a disk, and also unprecedented information about molecules in these conditions. We discussed ways we could simultaneously model all of this and make very sensitive measurements of what is going on at various places in the disk. As part of this discussion, Öberg and I discussed the problem that there isn't a good out-of-the-box imaging pipeline for ALMA, in part because different users with different targets have very different priors and goals.

But then we switched to lab stuff! Mahesh Rajappan (CfA) described the Öberg-lab experimental setups, in which they can deposit ices, including multilayer things, and then radiate them or heat them, to measure solid-state chemistry processes directly. Jennifer Bergner (CfA) is doing lab experiments to find and measure configurational rate constants for chemical processes in ices. These rates relate to the processes by which molecules find one another and reorient to permit solid-state reactions to take place. She was working in particular on O+CH4 to CH3OH. One theme of the day's conversations is that the organic chemistry of proto-planetary disks is seriously complex and contains everything that is needed for life (we think).

At the end of the day we discussed research synergies. I think the biggest is in building consistent models of the thousands of molecules, in the dynamical disk. One incredible idea is that a forming gas-giant planet should be hot (gravitational or accretion energy); this could affect the local chemistry in the disk: We could see the thermal signature of a forming planet in molecular species! That's a great goal for the near future. Öberg's group (and especially Cleeves and Loomis) have the data in hand, or coming when ALMA gets to their targets.


stars, disruption, photometric redshifts

Today began with a meeting about GALEX, where Steven Mohammed (Columbia) showed that there is great metallicity information in the overlap of GALEX and Gaia, and we discovered that something must be seriously wrong with the astrometry in our re-calibration of the data.

Andy Casey (Cambridge) organized a phone meeting in which a bunch of us discussed possible scientific exploitation of the data in the ESO HARPS archive, which contains thousands of stars, each of which has tens to thousands of epochs, each of which is signal-to-noise of hundred-ish, and resolution of 100,000. Incredibly huge amounts of data. Huge. Casey asked each of us to describe low-hanging fruit, and take on short-term tasks. One thing we might do is re-factor the archive into something more directly useful to investigators.

Sjoert Van Velzen (JHU) gave the astrophysics seminar about tidal disruption events. He has a great set of results, starting from search and discovery, going through theory and models, and continuing on to multi-wavelength follow-up. The most intriguing result is that the TDEs are amazingly over-represented in post-starburst (E+A) galaxies (which I used to work on). It is hard to imagine any origin for TDEs that would so strongly concentrate them into these environments. It makes me wonder whether the things they are seeing aren't TDEs at all?

After the seminar, Boris Leistedt (NYU) posted to the arXiv our new paper on photometric redshifts. The idea is that we use what we know about Doppler Shift and bandpasses and calibration of photometry, but let the galaxy SEDs themselves be inferred, latent variables. This combines the best properties of machine-learning methods (that is, flexibility, non-parametrics) with the best properties of template-based methods (that is, regularization to physically realizable models, a generative model, and interpretability). It seems to work very well!



It's job season and my head is only just above water! Adam Riess (JHU) gave a nice colloquium at NYU today about the distance scale, and the comparison between the distance ladder and the cosmic microwave background.


Solar twins, probabilism, model complexity, spectral hacking, and so much more

Today was the usual research-packed day at Flatiron. In the stars group meeting, Megan Bedell (UChicago) told us about her multi-epoch survey of Solar twins. Because they are twins, they have similar logg and Teff values, so she can get very precise differential abundances. Her goal is to understand the relationships between abundances and planets; she gave us mechanisms in which the stellar abundances could affect planet formation, mechanisms in which planet formation could affect stellar surface abundances, and common causes that could affect both. She has measured 20-ish detailed abundances at high precision in 88 stars with (because: multi-epoch) SNR 2000-ish!

Doug Finkbeiner (Harvard) and Stephen Portillo (Harvard) told us about probabilistic catalogs, a project they are doing that builds on work Brewer, Foreman-Mackey, and I did a few years ago. They find (like us) that a probabilistic catalog (a sampling of the posterior in catalog space) can reliably find fainter sources than any standard point-estimate catalog, even one built using crowded-field software. They use HST to deliver ground truth. They aren't going fully hierarchical; we discussed that in the meeting, and the relative merits of probabilistic catalogs and delivering an API to the likelihood function (my new baby).

Neven Caplar (ETHZ) went off-topic in the meeting to describe some results on the time-variability of AGN. Sensibly, he wants to use time-domain data to test accretion disk models. He is working with PTF data, which he had to recalibrate in a self-calibration (he even shouted out our uber-calibration of SDSS). He is computing structure functions (which look random-walk-like) and also doing inference in the context of CARMA models. He pointed out that there must be a long-term damping term in the covariance kernel, but no-one can see it, even with years of data. That's interesting; AGN really are like random walkers on very long timescales.
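To make the damping point concrete, here is a sketch using the simplest CARMA model, the damped random walk (CARMA(1,0), also known as an Ornstein-Uhlenbeck process). This particular kernel and the numbers are assumptions for illustration, not necessarily the models Caplar fits. The squared structure function rises linearly with time lag (random-walk-like) for lags much shorter than the damping time tau, and only flattens at lags of order tau or longer. If tau exceeds the survey baseline, the flattening is never observed.

```python
import numpy as np


def drw_kernel(dt, sigma, tau):
    """Damped-random-walk covariance: sigma^2 * exp(-|dt| / tau)."""
    return sigma ** 2 * np.exp(-np.abs(dt) / tau)


def drw_structure_function_sq(dt, sigma, tau):
    """Expected squared difference E[(y(t + dt) - y(t))^2] = 2 [k(0) - k(dt)]."""
    return 2.0 * (drw_kernel(0.0, sigma, tau) - drw_kernel(dt, sigma, tau))


sigma, tau = 0.2, 1000.0           # made-up amplitude (mag) and damping time (days)
lags = np.array([1.0, 10.0, 100.0, 1.0e5])
print(drw_structure_function_sq(lags, sigma, tau))
print(2.0 * sigma ** 2 * lags / tau)  # pure random-walk prediction, valid for lags << tau
```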

In the cosmology group meeting, Phil Bull (JPL) worked us through a probabilistic graphical model that replaces simple halo occupation models with something that is a bit more connected to what we think is going on with galaxy evolution. Importantly, it permits him to do large-scale structure experiments with multiple overlapping tracers from different surveys. Much of the discussion was about whether it is better to have a more sophisticated model that is more realistic, or whether it is better to have a simpler model that is more tractable. This is an important question in every data analysis and my answer is very different in different contexts.

Between these two meetings, Bedell and I worked out the simplest representation for our Avast model of stellar spectra and Bedell went off to implement it. She crushed it! She has code that can optimize a smooth model given a set of noisily measured different epochs, accounting for differences in throughput and radial velocity. Not everything is working—we need to diagnose the optimizer we are using (yes, optimization is always the hardest part of any of my projects)—but Bedell did in one afternoon more than I have got done in the last three months! Now we are in a state to make bound-saturating radial-velocity measurements and look for covariant spectral variations in an agnostic way. I couldn't have been more excited at the end of the day.


radial-velocity precision

Today was a low-research day. Megan Bedell (Chicago) arrived in NYC for the start of a two-day visit. We discussed our plans; we want to actually accomplish something this week, related to our project to find stellar spectral variations that co-vary with stellar surface motions (to improve radial-velocity measurement precision).


stellar masses without models; light scalar dark matter

I discussed with Lauren Anderson (Flatiron) our project to use photometry and parallax to transfer spectroscopic labels to stars without spectroscopy (and, first, to de-noise the spectroscopic labels). This got me confused about how to explain the project to spectroscopists and non-spectroscopists alike: We have a way to use Gaia parallaxes to put logg values onto stars, but making no use whatsoever of stellar structure or evolution models, nor even scalings. Not even in the training set of labels! Indeed, I think we have a way to measure stellar masses with no use of physical models of stellar structure. I called Hans-Walter Rix (MPIA) to discuss further.

At lunch time there was an excellent brown-bag talk on light scalar dark matter by Ken Van Tilburg (NYU). He made beautiful, simple arguments about computing the properties of light scalar dark matter, and also very simple arguments about limiting the mass scale. When the dark matter gets very light, it becomes like a field of radio waves, but with a strange dispersion relation (because the particle rest mass isn't zero). This leads to highly observable effects. Huge interesting regions of parameter space are unexplored, experimentally, but there are prospects for both astrophysical and laboratory tests. There is an interesting regime at the massive end, where occupation numbers get small and the dark matter could even show macroscopic wave–particle duality effects. Overall it was highly educational, and a perfect example of the interdisciplinarity of the CCPP.
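For reference, the "strange dispersion relation" is presumably the standard one for a free massive scalar field; the relations below are textbook results written in my own notation, not taken from Van Tilburg's slides. Here m is the particle mass, v the typical dark-matter velocity, and rho the local dark-matter mass density.

```latex
% Dispersion relation for a free scalar field of mass m:
\omega^2 = c^2 k^2 + \left(\frac{m c^2}{\hbar}\right)^2

% Non-relativistic limit, with \hbar k = m v and v \ll c:
\omega \approx \frac{m c^2}{\hbar}\left(1 + \frac{v^2}{2 c^2}\right)

% Rough occupation number per de Broglie volume:
N \sim \frac{\rho}{m}\,\lambda_{\rm dB}^3 ,
\qquad
\lambda_{\rm dB} = \frac{2 \pi \hbar}{m v}
```

So the field oscillates at very nearly the Compton frequency, with a fractional spread of order v^2/c^2, and at fixed density and velocity the occupation number falls as the inverse fourth power of the mass. That is why the classical-wave description holds at the light end and breaks down at the massive end.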


#ken75, day 4

Today was day 4 of Galactic Archaeology and Stellar Physics. The day ended with my summary talk, which was unfair, scoldy, and mean and which was shouted (by me) over slides available here. Those slides will be incomprehensible without the things I said alongside them, but they will give you a sense of what themes I assembled (in real time) from the talks. A few non-representative highlights from today:

Binney kicked off the day with a discussion of analytic modeling of galaxies. His talk contained many valuable insights. For example, he showed that very simple distribution functions (in action space) can nonetheless create very non-trivial distributions in configuration space. He gave his usual (but excellent) argument for working in action–angle space: It is a consistent, continuous, conjugate coordinate system in which inference is possible. He also showed that high-quality modeling of the nearby Solar neighborhood can make good predictions for the position–velocity relationships for a larger patch of the Galaxy; that is, good modeling makes for highly predictive theories. At the end of his talk, he was asked about a radical dark-matter model, and he answered in terms of researcher utility, which was music to my ears.

Wegg showed an extremely good model of the mass function and microlensing optical depth towards the bulge, and constrained the dark-matter fraction. His results rule out a strong cusp in the dark matter, which is consistent with other things that were said at the meeting.

Grillmair and Carlberg talked about cold stellar streams. Grillmair showed marginal evidence for many more streams, consistent with a steep mass function in such objects. He name-checked our work on chaos and stream fanning. Carlberg shocked me by saying that the stellar streams are shorter than expected in theory. I just straight-up disagree with that, but maybe he is right when you take the full complement of streams together.

Côté brought us back to the subject of nucleosynthesis. He showed that there are many competing nucleosynthetic models that can produce the same data, but if you look on the inside, they imply very different things about the latent (physical) parameters. He breaks some of these by looking to LIGO and the rate of neutron-star mergers, which are probably involved in r-process. I loved the connections he drew between stellar chemical abundances, nuclear physics, and gravitational wave astronomy.


#ken75, day 3

Day 3 of Galactic Archaeology and Stellar Physics opened with talks by Lind and Ness about traditional and new ways of measuring stellar parameters and chemical abundances. Both of them were effectively very critical of the traditional method, where there are large inconsistencies in atomic assumptions between giants and dwarfs, and there are many nuisances that affect the data as strongly as the chemical abundances in question. Lind also compared 1D and 3D models, and LTE and NLTE models and made some general statements about each quadrant. She implicitly suggested that limb darkening (or the spectral version of that) and time-domain spectroscopy might both be filled with information, because some of the 3D effects show up most strongly in the variations of the spectrum with position and time.

These introductions were followed by a set of talks that assessed various aspects of the feasibility of chemical tagging, that is, finding pairs or groups of widely separated stars that were born together in the same molecular cloud or association. This subject is dear to my heart! Blanco-Cuaresma clearly articulated the two questions of chemical tagging, which have also come up here in this forum a few times. My phrasing of these questions would be the following: Two stars that were born together: How different can they be? And: Two stars that were born apart: How similar can they be? He then proceeded to do stuff with PCA and k-means that I didn't love; I don't think vanilla machine learning will solve this problem. However, he did (inadvertently, it seems) show great evidence that chemical tagging is conceivable. Similarly Kos did things with t-SNE that I didn't love, but which also show great evidence for an optimistic view! Carrera showed that open clusters have amazingly uniform chemical abundances. Ting argued that we might have to take a more probabilistic approach to chemical tagging than originally hoped. He called this the “pessimistic regime” of chemical tagging; no reason for pessimism there, but I get why he called it that.

In related news (and related to things Price-Whelan and I talk about), Mike Ireland showed an example of running the clock back on Gaia TGAS (plus spectroscopic RVs) to find the ages of disrupting stellar associations. He finds that you get more accurate ages if you take a probabilistic approach, which is music to my ears.

The afternoon was dominated by the Galactic Bulge, which appears to have sub-components formed by monolithic collapse and by long-term evolution out of the disk. The X-shape is primary evidence that the latter is the dominant process, though controversies continue. Unfortunately only one speaker showed the absolutely gorgeous Ness & Lang image, which I have the honor to have elicited with a tweet (tm) a year or so ago.

The day ended with Andy Casey talking about anomalies in the Solar abundances that run systematically with condensation temperature. Tantalizing to think it might have to do with the fact that the Sun hosts rocky exoplanets! These anomalies exist all over, however (or so it seems). They probably have something to do with dust depletion and dust accretion, which can spatially separate the high condensation-temperature elements from the low condensation-temperature elements. The talk was a reminder of how hard it is going to be to get a straightforward interpretation of anything in the high dimensional chemical-abundance space.


#ken75, day 2

Today was day 2 of Galactic Archaeology and Stellar Physics. Again, a great day; here only a few highlights:

Else Starkenburg gave a review on first stars. There is no true first star known (nothing with primordial abundances), but there is one at or near −7 (that is, 7 orders of magnitude below Solar). This star, like many extremely metal-poor stars, is low in iron but very high in carbon relative to iron. That is a mystery, with many conceivable solutions. Starkenburg spoke about binary-star (mass transfer) ideas. The talk left me wondering: Do we know what a primordial-abundance star would look like? Afterwards, Schlaufman argued to me that we do, at least pretty accurately.

This was followed by a bunch of other low-metallicity star talks. DaCosta, Venn, and Schlaufman all spoke about searches for extremely metal-poor stars using clever photometric techniques. This ties in with my comment yesterday that we might be able to do a lot of Galactic Archaeology science with photometric surveys (possibly backed up by spectroscopy for calibration or training). It also bodes extremely well for Gaia Bp–Rp narrow-band photometry, which will be laden with stellar information.

Stello and Huber gave talks about asteroseismology. In Stello's review, I learned that the brightness variations are temperature variations (or really temperature–size variations), not pure size variations. This surprised me, and then was immediately obvious: The atmosphere reacts adiabatically to fast changes. He also very clearly connected the mode properties to the stellar properties, and explained the important point that dwarfs, subgiants, and red giants have different physics connecting their seismic modes to their masses, ages, and bolometric luminosities. Huber compared existing asteroseismology to Gaia data, showing that there is a consistent story, but also showing that for almost all asteroseismic targets, the asteroseismology will provide more precise distances than even end-of-mission parallaxes.

There was a set of nucleosynthesis and supernova-yield talks. My personal highlight here was a talk by Hampel about neutron-capture physics. She starts with the observation that between s-process and r-process there is a whole range of neutron densities, and at different densities, you get different abundance yields. She then used real stellar data to measure the neutron density of a process intermediate between s and r, calling it i. This talk stood out among the nucleosynthesis talks for containing (like Stello's talk on asteroseismology) clearly explained fundamental physics.


#ken75, day 1

Today was the first day of the meeting Galactic Archaeology and Stellar Physics in honor of Ken Freeman (MSSSO). As per usual when I am at a meeting, this blog can't convey the full day of talks, so I will just put here very personal highlights.

Freeman opened the conference, giving his overview of what he wants to know about the Galaxy. He is excited about the revolution happening now in which we might have 6-d phase space and 30-ish chemical abundances for stars all over the Galaxy. He brought up two themes that would be very important in today's talks, the bimodality in the alpha/Fe distribution (and its connection to different disk components), and radial migration. On the former, he uses the bimodality to separate the thin and thick disks; he is so confident that he literally calls a chemically separated component the “thick disk”. On the latter, he showed some results I hadn't seen on velocities as a function of metallicity that he argued make the radial migration clear. I have to figure that out! Relevant to things we worked on in the Gaia Sprint, he asked whether the disk components are different heights because of heating or a big event. I think we now know that at small heights it is heating. But the alpha-rich component might be thick because of an event.

Hekker discussed the SAGE project to get a uniform catalog of masses and ages for stars out of non-uniform inputs. She referenced The Cannon but is taking an opposite tack: She is trying to homogenize the data by making all the data constrain the same physical model.

Ruiz-Dern discussed red-clump stars, and in particular building a data-driven model of the relationships between spectroscopic parameters and photometric colors. She showed very good evidence that we could do a lot of the science we do with spectroscopy with photometry instead! That was not her goal, but it got me thinking in a totally new way about my project with Lauren Anderson.

Bovy discussed his results of dissecting the Galaxy into narrow chemical-abundance slices. Where Freeman had used the differing amplitudes in the alpha/Fe bimodality as a function of position to show how different different parts of the Galaxy are, Bovy used the same data to show how similar different parts are! That's a great property of a good scientific result: It can be interpreted either way! He discussed in detail what aspects of his Galaxy decomposition results are consistent and inconsistent with ideas from radial migration.

Talks by Duong and Chiappini again used chemistry to investigate the thin and thick disks, and Chiappini explicitly warned the audience that we will get different results if we split the Galaxy on chemical or structural lines. This also mirrored comments by Bovy.

Toyouchi looked at explaining the alpha/Fe bimodality with an event in the Milky Way's past. This got me thinking about the question: How can we tell whether the bimodality is a fundamental property of the chemical enrichment of molecular clouds or whether it is just the result of some very specific event in the Milky Way's particular past?

Feuillet showed amazing age-abundance relationships for the 19-ish elements that APOGEE observes. It is a goldmine of empirical results. She finds a few highly problematic elements. Like us, she finds that alpha/Fe is strongly correlated with age, at all alpha/Fe values and at all ages.

Buder talked about the GALAH survey and what has been learned and improved with the Gaia DR1 TGAS release. He announced that GALAH is using The Cannon as part of its data analysis pipeline. He said (and I believe him) that they are using it to speed up the code. I like that; it's good for my brand!


APOGEE and time variability

The day started with a conversation with new NYU graduate student Marc Williamson about a project I would like to do in the APOGEE data, looking for time variability in stellar spectra. There should be variations there, because of convection and star spots; the question is whether we can see it, and whether we learn anything about stars from it. I am optimistic. It also connects to ideas I have about improving radial-velocity measurements.

I had lunch with Jo Bovy (Toronto), with whom I discussed what Gaia and APOGEE will jointly reveal about the Milky Way. I like to say that there is great complementarity, because APOGEE is infrared, and its red giants span much of the disk, while Gaia is optical and its sensitivity is best in the halo. However, Gaia is a very sensitive system overall, so it is a detailed question just how much or little Gaia data we will get on the APOGEE giants. Of course I ended up deciding that even if we do get lots of Gaia information about APOGEE targets, that only strengthens my view!


emission-line galaxy spectra

The day opened with a conversation with Guangtun Zhu (formerly JHU) who has been doing great things with the eBOSS and MaNGA spectra from SDSS-IV. On the former, he has made a composite (average) spectrum and can see many things that haven't been seen before in galaxies like these. He can see fluorescence from the outer ISM (or maybe IGM) and he can see the effects of other extremely weak emission and absorption lines. He can also see that the emission lines are due to outflows, but in great detail: Different lines with different relative amounts of absorption and emission have different profiles and he has a consistent story for all of these.

I ended the day by working on the text in the paper on image modeling (image differencing) by Dun Wang (NYU) and in the paper on data-driven galaxy SED models by Boris Leistedt (NYU).


substructures, spots, gender, filtering, and more

Today was the usual action-packed day at the CCA. Before the group meetings, Lauren Anderson (CCA) showed me work on stellar twins in photometric (APASS + Gaia) space, and how similar they are in RAVE-on spectroscopic parameters. It looks like she might be able to put (through a machine-learning-like method) spectroscopic parameters on every star in TGAS. This would be yet another data-driven model of stars.

At stars group meeting, Tjitske Starkenburg (CCA) and Keith Hawkins (Columbia) discussed this paper about chocolates which seems to be the only paper so far about new substructures in the Gaia data. There was some discussion about how they converted parallax to distance (apparently Binney has an opinion), and how the sub-structure is found via a cross-correlation between the data and random realizations. The evidence is a bit weak. However, some of the substructures look worth following up in chemistry and in other stellar samples.

This was followed by Michael Gully-Santiago (formerly Kavli, Beijing) showing his work on figuring out the ages and masses of really young (few Myr) stars, and (more ambitiously) getting the spectra of star spots on their surfaces! His project is very related to our spectroscopic binary work: How to measure a cold spot on a hot star? His approach is to modify the (Czekala et al) Starfish code to also fit for cold patches (at the same metallicity and logg as the main surface). For his particular case (LkCa-something), he finds a preference for a spot temperature of 2700-ish K covering 80-ish percent (!) of the star. This was followed and interrupted by lots of discussion about stellar binaries, long-term evolution, longitude effects, and so on.

Between group meetings, Sandro Tacchella (ETHZ) talked about his paper on gender bias in astronomy. The paper gathers good data, and is (properly) limited in its conclusions. It ends with a relatively sophisticated causal inference that is fairly convincing that women are cited less than men for papers with otherwise similar properties. This involved building a predictive model for citations. That led to a good discussion!

In the cosmology group meeting at the end of the day, Kris Sigurdson (UBC, NYU) spoke about CHIME 21-cm observations and foreground mitigation. Foregrounds should be smooth in the frequency direction, unlike the narrow-band 21-cm emission. We discussed the differences between filtering-like (or K-L-like) techniques that deliver minimum-variance modes, and subtraction-like techniques that try to build an explicit model of the foregrounds. We vowed to continue the discussion.
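To make the filtering-versus-subtraction distinction concrete, here is a toy sketch. Everything in it is an assumption made for illustration: the band edges, the power-law foregrounds, the white-noise stand-in for the 21-cm signal, and the use of plain PCA mode removal as a simple proxy for a K-L-like filter. It is not the CHIME analysis.

```python
import numpy as np

rng = np.random.default_rng(42)
n_pix, n_freq = 500, 64
freq = np.linspace(400.0, 800.0, n_freq)   # MHz; band edges are an assumption

# Smooth foregrounds: a power law in frequency for every pixel (toy model).
amp = 1e3 * rng.lognormal(mean=0.0, sigma=0.5, size=n_pix)
index = rng.normal(-2.7, 0.1, size=n_pix)
foreground = amp[:, None] * (freq[None, :] / 600.0) ** index[:, None]

# Toy signal: independent from channel to channel, far fainter than foregrounds.
signal = rng.normal(0.0, 1.0, size=(n_pix, n_freq))
data = foreground + signal


def subtract_smooth_model(data, freq, order=2):
    """Subtraction-like: fit an explicit smooth foreground model per pixel.

    The model is a low-order polynomial in log(flux) versus log(frequency),
    so this toy version requires strictly positive data.
    """
    x = np.log(freq / freq.mean())
    coeffs = np.polynomial.polynomial.polyfit(x, np.log(data).T, order)
    model = np.exp(np.polynomial.polynomial.polyval(x, coeffs))
    return data - model


def filter_leading_modes(data, n_modes=4):
    """Filtering-like: project out the highest-variance frequency modes."""
    _, _, vt = np.linalg.svd(data, full_matrices=False)
    modes = vt[:n_modes]                   # shape (n_modes, n_freq)
    return data - (data @ modes.T) @ modes


def correlation(a, b):
    """Pearson correlation between two maps, flattened."""
    return np.corrcoef(a.ravel(), b.ravel())[0, 1]


cleaned_sub = subtract_smooth_model(data, freq)
cleaned_filt = filter_leading_modes(data)
print("raw data vs signal:   ", correlation(data, signal))
print("subtraction vs signal:", correlation(cleaned_sub, signal))
print("filtering vs signal:  ", correlation(cleaned_filt, signal))
```

Both approaches also remove a little of the signal along with the foregrounds: the polynomial fit absorbs the smooth part of the signal, and the mode projection deletes whatever signal lives in the removed modes. That signal loss is part of what the two families of techniques trade off.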


another data-driven model of stars

My people are tired of hearing endlessly about data-driven models of stars. But today Boris Leistedt (NYU) created a new one, and I am extremely excited about it.

The idea—which is more-or-less my unattempted Gaia Sprint idea—is to build a very flexible model in color–magnitude space, and then generate noisy parallaxes. That is, a hierarchical model for the parallaxes, with a color–magnitude diagram that is learned simultaneously. Today, Leistedt had the breakthrough that this could be done in bins in color and magnitude, with a Dirichlet model. That is out-of-the-box inference; he got it working and it looks nice! This is all on the path to removing physical models from (what you might call) the Gaia distance ladder (which starts at parallaxes, and ends with some kind of distance estimate for everything that can be detected).
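For concreteness, one way a binned model of this kind could be written down is sketched here. This is an illustrative guess at the structure, not necessarily Leistedt's formulation: the symbols (bin fractions f, Dirichlet hyperparameter alpha, bin assignment b_n) are assumptions, extinction is ignored, and parallaxes are in milliarcseconds.

```latex
\begin{align}
f &\sim \mathrm{Dirichlet}(\alpha)
  && \text{fraction of stars in each color--magnitude bin} \\
b_n &\sim \mathrm{Categorical}(f)
  && \text{bin of star } n \\
(c_n, M_n) &\sim \mathrm{Uniform}(\text{bin } b_n)
  && \text{color and absolute magnitude} \\
\varpi_n &= 10^{(M_n - m_n + 10)/5}\ \mathrm{mas}
  && \text{true parallax, given apparent magnitude } m_n \\
\hat{\varpi}_n &\sim \mathcal{N}(\varpi_n, \sigma_n^2)
  && \text{observed (noisy) parallax}
\end{align}
```

Marginalizing over the bin assignments turns the likelihood for each observed parallax into a mixture over bins weighted by f, which is why the color–magnitude diagram and the distances get learned together.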

(The first sentence of the second paragraph of this post uses all three kinds of dashes: em, en, and hyphen. Bring it on, typography nerds!)


Saturn's rings and moons

My nearly-null research day was saved by a brown-bag talk by Bob Johnson (UVa, NYU) about Saturn's moons and rings, and what we have learned from the Cassini Mission. The spacecraft takes images but also in-situ spectrometer measurements, so it can measure interplanetary plasma. There are many surprises, including oxygen ions all around, which means (for example) that Titan has a lot of chemical diversity. Johnson's research is about understanding the plasma physics given heating, transport, and chemical processes. The system is very seasonal, as the rings get heated from below, then edge-on, and then above. And there are apparent changes in the plasma near the co-rotation of the magnetic field and the rings.



Today's astro seminar was by Pieter van Dokkum (Yale), about the Dragonfly project and their discovery of large, dark-matter-only-ish galaxies. That is, they have found galaxies that have way fewer stars per unit of dark-matter mass than anything previously. That's exciting. We had lots of discussion during the talk and before about the technical value and also challenges of working with small telescopes, and in particular refractors. van Dokkum put a lot of weight on the existence (in the digital photography world) of nano-scale structured anti-reflection coatings, which are available in (high end) commercial lenses but not in refracting telescopes. He discussed the value of having no supports in the aperture, and no mirrors (yes, mirrors are bad for ultra-low surface brightness). There are many connections between this work and the future coronagraphy we need to do to directly image exoplanets. Dragonfly is a beautiful, sensible, low-cost project that is incredibly successful.


empirical binaries

My only real research today was a conversation with new NYU graduate student Shengqi Yang, who is going to start on a project to look at binary stars in the APOGEE data as linear combinations of single stars. That is, a purely empirical model (for now). We discussed how to find similar stars, and how to decide that they are similar, and what a binary star would look like in the continuum-normalized spectral space. The nice thing about the problem is that, in principle, a search for binaries among N spectra might take N^3 time, since any star could be a mixture of any two others! However, we are going to start with the Gaia TGAS overlap, which reduces the numbers and also trims the tree, because there are many triples that—given what we know from Gaia—couldn't be so related.
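The brute-force version of that search is easy to sketch. The function below is a hypothetical illustration, not the project code: `best_pair` and its arguments are made-up names, the fit is unweighted least squares, and non-negativity is imposed by crude clipping rather than a proper constrained solver.

```python
import itertools
import numpy as np


def best_pair(target, library):
    """Find the pair of library spectra whose linear combination best fits target.

    target  : (n_pix,) array, the candidate binary spectrum
    library : (n_star, n_pix) array of single-star spectra
    Returns (i, j, weights, chi2) for the best-fitting pair.
    """
    best = None
    for i, j in itertools.combinations(range(len(library)), 2):
        design = library[[i, j]].T                   # (n_pix, 2)
        weights, *_ = np.linalg.lstsq(design, target, rcond=None)
        weights = np.clip(weights, 0.0, None)        # crude non-negativity
        chi2 = np.sum((target - design @ weights) ** 2)
        if best is None or chi2 < best[3]:
            best = (i, j, weights, chi2)
    return best


# Toy check: build a fake "binary" from two known library members.
rng = np.random.default_rng(0)
library = rng.normal(1.0, 0.1, size=(8, 50))
target = 0.7 * library[2] + 0.3 * library[5]
i, j, weights, chi2 = best_pair(target, library)
print(i, j, weights)
```

Each target costs of order N^2 pair fits, so scanning all N spectra as targets gives the N^3 scaling. Any external cut that rules out pairs before the loop (for example, incompatible Gaia information) shrinks the list passed to `itertools.combinations`.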


image differencing and stellar rotation

As always, the stars group meeting at the CCA was a research paradise. We opened (at the suggestion of Ruth Angus) by acknowledging that a terrible thing had happened in the United States and that we were collectively sorry about this. That was necessary.

We moved on to a presentation by Dun Wang (NYU) of his image modeling for the Kepler K2C9 imaging campaign to measure microlensing events in the Galactic Bulge. He described his radical technique for image differencing, which has produced (I think) the most precise image differences in the history of astronomy! They have requirements, however, that not all future time-domain imaging surveys would successfully meet. He has some really impressive movies, and we stared at them like zombies for quite a long time.

After that, Ruth Angus (Columbia) presented the recent paper by Davenport about stellar rotation, informed by the Gaia DR1 TGAS data (and thanking the Gaia Sprint!). The paper confirms and extends a discovery of a gap in rotation-inferred ages of stars, which either points to some kind of non-linearity in the evolution of stellar rotation, or else points to a really non-trivial event in the star-formation history of the Galaxy. It could also be that there are just different classes of stars (in terms of rotation). We argued about all this. Hawkins (Columbia) recommended that we look at the relevant stars in chemical-abundance space; this should resolve some of the possible explanations.

After group meeting, Tim Morton (Princeton), Adrian Price-Whelan (Princeton), and I sat with Semyeong Oh (Princeton) to go over the draft (and especially figures) of her new paper on co-moving stars in Gaia TGAS. Lots of great stuff in there, and a valuable tool for stellar astrophysics.



Today was a total fail, making it two days in a row. The political events of the United States sure don't make for a conducive environment for getting research done.


not much

In a low-research day, Hans-Walter Rix (MPIA) called me to discuss double-lined spectroscopic binaries, and what we might be able to do in the low-hanging-fruit category. I gave him some of the ideas that we have been playing around with in the context of The Cannon. It was basically a fail today, though, research-wise.


the 21-cm signal from reionization

Adrian Liu (Berkeley) gave today's Astro Seminar. He spoke about the radio instruments PAPER and HERA, which are trying to see the 21-cm signal from reionization. He showed reasonable evidence that this signal will be strongest at around redshift 8. HERA is being built as a pre-cursor mission to the SKA. PAPER is done, and although the final data analyses are not complete, Liu claims that it has the best limits on the 21-cm emission from reionization, and they are a factor of tens higher than the expected signal. So there is work to do. He admitted that they are not doing the best they can on foregrounds: They are avoiding rather than modeling the foregrounds. After the talk we discussed this a bit; it connects to the things I am interested in regarding stage-four cosmology.

Late in the day, Megan Bedell (Chicago) pointed out to me that there are hundreds of stars with reduced, high resolution, high signal-to-noise spectra in the HARPS archive. Almost no-one (to our knowledge) is doing anything with these data. Crazy! We discussed plans to exploit this amazing data set.


how precise is The Cannon?

It was a low-research day! But Hans-Walter Rix (MPIA) called me to discuss the endless problem that we don't have label uncertainties that we believe in the output of The Cannon. The context is Ness's work to measure the abundance variability within open clusters (which are famously close to single-abundance populations).

Our formal uncertainties with The Cannon are tiny, but under-estimated because they don't properly account for the choices we made in optimizing the internals. Our cross-validation uncertainties are much better, but are still over-estimates because they effectively include systematic terms that go beyond precision. That is, if we only care about precision in a single cluster, the cross-validation is an over-estimate. And we can see that empirically, because with a single-abundance fit we get chi-squared values that are much smaller than the number of degrees of freedom.

My view is that we should use the open clusters themselves to set the uncertainties. This sounds circular: How can we estimate the intrinsic abundance spreads if we set our observational uncertainties assuming that the spreads are zero? But it isn't: For one, different open clusters are different in their abundance spreads. For two, there are long tails of abundance differences even in the best clusters. For three, even if there were neither of these effects, we would still get great upper limits!
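The chi-squared check, and the idea of letting a cluster set the uncertainties, can be sketched in a few lines. This is a minimal illustration under stated assumptions: independent Gaussian errors, a single shared abundance per cluster, and hypothetical function names. Rescaling this way assumes zero intrinsic spread, so the rescaled uncertainties are upper limits on the per-star precision.

```python
import numpy as np


def single_value_chi2(values, sigmas):
    """Fit one common abundance to all cluster members.

    Returns (weighted_mean, chi2, dof).
    """
    values = np.asarray(values, dtype=float)
    ivar = 1.0 / np.asarray(sigmas, dtype=float) ** 2
    mean = np.sum(ivar * values) / np.sum(ivar)   # inverse-variance weighted mean
    chi2 = np.sum(ivar * (values - mean) ** 2)
    dof = len(values) - 1
    return mean, chi2, dof


def rescale_sigmas(sigmas, chi2, dof):
    """Scale the uncertainties so the single-value fit has chi2 equal to dof."""
    return np.asarray(sigmas, dtype=float) * np.sqrt(chi2 / dof)


# Made-up numbers: four members with cross-validation-style uncertainties.
abundances = [0.02, -0.01, 0.00, -0.01]
sigmas = [0.05, 0.05, 0.05, 0.05]
mean, chi2, dof = single_value_chi2(abundances, sigmas)
new_sigmas = rescale_sigmas(sigmas, chi2, dof)
print(chi2, dof, new_sigmas)
```

In this toy case chi-squared comes out far below the number of degrees of freedom, which is the signature of over-estimated uncertainties described above.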

The long-term solution is to go fully Bayesian. I became motivated to work on this now. I owe ideas about this to various people, including Jonathan Weare (Chicago).


group meetings awesome

My goodness Wednesdays are good research days! As per usual, I spent the day at the CCA, where the research activity was dominated by my two group meetings. Here are some notes I took at each:

Sarah Pearson (Columbia) introduced herself at the stars group meeting; she talked about stellar streams and using them to constrain the shape of the MW dark-matter halo. With Pal 5 stream morphology alone, she can show that the Law & Majewski potential (used to explain Sagittarius) must be wrong, because it would create very strong stream fanning (which happens in chaotic potentials). Pearson is also thinking about finding thin streams in external galaxies, because morphology alone is so potentially constraining.

After Pearson, I asked about the projects from the Gaia Sprint, and their statuses. Leistedt (NYU) said things about modeling the color-magnitude diagram. He complained about the difficulty of fitting mixtures of Gaussians, especially because of initialization-dependence. Morton (Princeton) talked about his code to fit stellar properties given photometry and parallaxes. He complained about working with extinction in the broad G bandpass; this required a structural change to his code. (He mentioned also a high-resolution spectroscopic survey of transiting planet hosts, that could be relevant to us.) Morton also brought up the interesting and fun issue that if you have a sample of transiting planet hosts, it is hard to get a control sample, because in no sense does “no transiting planet” mean “has no planet”. This is worth more discussion! Angus (Columbia) talked about her work on rotation for stars with parallaxes; she is computing rotational ages. She has looked at the age–dispersion relation and also the ages of stellar pairs, to test age estimators and check that the wide pairs have small ages. In this discussion, the idea came up to compare photometric rotation periods with spectroscopic v sini measurements. Starkenburg (CCA) talked about RAVE+Gaia data on known stellar cluster members, to observe the kinematic escape of stars. We discussed a possible self-consistent model, accounting for the observed stellar mass. Anderson (CCA) spoke about photometric twins. These should be spectroscopic twins, asteroseismic twins, and (as Angus pointed out) even photometric light-curve twins! Leigh (AMNH) talked about numerical scattering experiments he is doing to look at the hypervelocity stars coming from the Galactic Center. He is comparing competing models for the scattering mechanism at the SMBH. Hawkins (Columbia) spoke about his hierarchical model of the population of red clump stars. He can show that they are standard candles to 0.08-ish mag, depending on which sample he chooses. His hierarchical model de-noises the parallaxes and makes very strong predictions for future Gaia data releases!

In the cosmology group meeting, Francisco Villaescusa-Navarro (CCA) spoke about HI gas and intensity mapping, in the context of large-scale structure with the SKA. There are strong trade-offs between resolution for angular and line-of-sight measures of the BAO. We discussed how to suppress the noise and foregrounds in practice for future measurements. He gets percent-level expectations for the Hubble Law at redshifts 0.5<z<2.5, which is promising, but will it satisfy our demands? That might be beaten by Euclid! But he is considering only radial modes, not transverse modes. The discussion included lots of talk about beams, foregrounds, estimators, deconvolution, calibration, ancillary data, and self-calibration, with heavy participation by Kris Sigurdson (UBC).


the bento-box MCMC sampler

In a low-research day I did get in a conversation with NYU student Neelang Parghi about an old idea (code-named “bento box”) I have had (with my MCMC collaborators) about an MCMC method that would identify multi-modality, split the space, and recursively sample simpler and simpler problems until it is sampling much smaller, simpler problems that are technically trivial. Parghi is going to try to make this happen as part of his Computational Physics class term project.


cosmology with imperfect supernova data

Alex Malz (NYU), Fed Bianco (NYU) and I discussed a possible hierarchical model for doing supernova cosmology in the face of uncertain supernova classifications and start times, which will be generic in time-domain surveys that are sparse or not designed with supernovae in mind. We started with a pretty complicated graphical model, but we were able to pare it down and then a bit more, and finally got to something awesomely simple: If we represent all supernova types with a bag of templates, and each has some prior pdf for peak brightness, then we can simultaneously model the cosmological parameters, the type probabilities, the peak-brightness distributions, and the photometry of every supernova, no matter what the bandpasses and cadences. The cool thing is that this project can (almost) be assembled from off-the-shelf parts, in the form of sub-systems of other supernova projects. And the project should create new opportunities for projects and legacy data sets.


my practice is unethical?

Today begins a few-day vacation, but I was still working in the morning: Before we left the undisclosed location of the #dsesummit, I had a conversation with Ariel Rokem (Berkeley) and Josh Greenberg (Sloan Foundation) and Jake Vanderplas (UW) and others about developing and writing out in the open: This behavior is what I do, but it is antithetical to double-blind reviewing and other kinds of referee privilege. Since those models (and especially double-blind) are ethical models (that is, they are predicated on a set of ethical principles), could it be that my work-in-the-open practice is thereby unethical? I had this to think about on the bus back to NYC.

While not wracked with existential dread on the bus, I wrote notes and issues for our paper on Hack Weeks. There is so much to say in this document, and it has several audiences.


#dsesummit, day 2

My day started with a long breakfast conversation with Yann LeCun (NYU) about adversarial methods in deep learning. In these methods, a generator and discriminator are trained simultaneously, and against one another. It is a great method for finding or describing complex density functions in high dimensions, and people in the business have high hopes. In particular, it is crushing in image applications. We discussed the problem that is currently on my mind, which is modeling the color–magnitude diagram of stars in Gaia, using one of these adversarial systems, plus a good noise model for the parallaxes. I would love to do that, and it should be much easier than the image problems, because the data are much lower in dimensionality.

I ran a very amusing session at the Summit, in which we had participants bring figures and we crowd-sourced a reaction, critique, and to-do list for each of them. We looked at a figure from politics from Michael Gill (NYU), making a causal claim about regulations and how meeting minutes are kept, a figure from geophysics from Nicholas Swanson-Hysell (Berkeley) showing the data and a model for polar wander, and a figure from neuroscience from Bijan Pesaran (NYU) showing brain region classifications. The feedback from the group was great and useful and constructive (though not always polite; my apologies!). One theme of our discussion ended up being consistency across figure elements. I feel like this crowd-sourcing session was a model for future sessions; it would even be fun to make this a regular event in some forum in NYC.

There was a lot of non-research today, but in the remainder of my research time, I worked on outline material for our growing paper on Hack Weeks.


#dsesummit, day 1

I'm at the Moore-Sloan Data Science Environments annual summit. Much of what we have been doing doesn't exactly count as research, by my (constantly weakening) standards. However, there was an absolutely great and wide-ranging discussion of Hack Weeks and Sprints and their role in education and scientific investigation. This led to a group of us committing to start a paper on the subject (not a white paper, but a paper). The just-started draft is here, and we accept pull requests.

There were some great lightning talks at dinner time. My personal favorite was Kellie Ottoboni (Berkeley) talking about the finiteness of the state space of random number generators. She (with Stark and Rivest) is looking at the possibility of random number generators with an infinite state space, capitalizing on the ideas around cryptographic hash functions. She sowed some (deserved) fear about using a 32-bit random-number generator in combinatoric contexts. Since our own emcee makes combinatoric choices, this could conceivably be relevant to our master branch!
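The counting argument behind that fear fits in a few lines. The sketch below assumes that "32-bit" means a generator with 2^32 internal states, and the function name is hypothetical. A generator with S states can produce at most S distinct output streams, so it cannot reach every permutation of n items once n! exceeds S.

```python
def max_shuffle_length(state_bits):
    """Largest n for which n! does not exceed 2**state_bits.

    Beyond this n, a generator with that many bits of state cannot
    possibly produce every permutation of n items.
    """
    limit = 2 ** state_bits
    n, factorial = 1, 1
    while factorial * (n + 1) <= limit:
        n += 1
        factorial *= n
    return n


print(max_shuffle_length(32))      # 12
print(max_shuffle_length(19937))   # 2080
```

So with 32 bits of state, even shuffling a list of 13 items cannot reach all orderings. The second line uses the 19937-bit state of the Mersenne Twister, for which the bound is a list of 2080 items.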


#GaiaSprint, day 5

Today was the final day for the Sprint, and included an incredible wrap-up. My best way to communicate the awesome is just to link out to the final wrap-up slides, which we all edited simultaneously. Each participant was permitted one slide, and we worked through the full crowd (one presentation each, and questions) in a few hours. Amazing things were accomplished this week, and I anticipate multiple papers submitted to the refereed literature. My own work was on vertical heating of the Milky Way disk, measurement of the disk mid-plane location and tilt (yes, I think we have a result), and the metallicities of co-moving star pairs. The day ended with a short talk by Jim Simons (Simons) who told us about his plans for the CCA and his other centers for computational science.


#GaiaSprint, day 4

Today was another impressive day at the Sprint. Jonathan Bird (Vanderbilt) got together a break-out session to talk about low-hanging projects in Gaia DR1 that no-one is currently doing, just to record ideas and inspire conversation. That led to this impressive list! Not everything on that list is low-hanging (and not everything in this telegraphic document is really comprehensible), but there are lots of Gaia projects that could be done right now.

Meanwhile, Adrian Price-Whelan (Princeton) noticed that thousands (yes, thousands) of the co-moving stellar pairs found by Semyeong Oh (Princeton) and us have both members observed by RAVE-on. He started making plots of their differences in velocity and abundances. It looks like there are some interlopers (more than we expect from naive contamination estimates), but a big core of pairs that have both identical velocities and identical abundances. Exciting! Now if only we can convince Keith Hawkins (Columbia) to measure detailed abundances...?

In the afternoon, Jackie Faherty (AMNH), David Rodriguez (AMNH), and Brian Abbott (AMNH) came to show us a visualization tool with the TGAS data uploaded. The most fun visualization was the one that runs the clock forwards and backwards on the proper motions! They are also looking forward to putting Gaia data on the dome of the Rose Center Planetarium!

In the evening check-in, there were some impressive results. Doug Finkbeiner (Harvard) showed us his pip-installable and software-operable tools (built with Greg Green) to access the 3-d dust map built from the PanSTARRS data. Jason Sanders (Cambridge) compared age–velocity relationships expected from toy models with that observed in the TGAS+RAVE data, where he estimates ages using isochrone fitting and photometry. He finds heating at very short ages, which is apparently not surprising. Dan Foreman-Mackey (UW) showed fits that he and Tim Morton (Princeton) have been doing to get better parameters for exoplanet host stars and the input catalog to the Kepler mission. They are literally doing the entire input catalog, because this is necessary for populations studies. One thing they find is that some conclusions about planet insolation (think: habitability) will change in this era of Gaia.

I mentioned PanSTARRS above, but I should note that Finkbeiner could not actually work on the PanSTARRS data at the Gaia Sprint, because we had rules about open-ness and data sharing, which you can read on the meeting page. I can't adequately say just how appreciative we all are of the Gaia DPAC teams for making their data public. I should also say how appreciative we all are of the other surveys and collaborations and tool builders who make their data and software public for us all to use. Of course the data and tool releasers benefit from these releases enormously, but these releases also require a certain level of bravery, honesty, and time commitment; it isn't easy.


#GaiaSprint, day 3

(As usual, these blog notes are only biased, imperfect, personal highlights. They are not minutes of the meeting in any sense!) Anthony Brown (Leiden) kicked off the day by comparing the all-sky image of the Gaia TGAS catalog with the all-sky image of the stars that Gaia uses to set its attitude. This latter catalog is close to a random sampling of stars, so it makes a beautiful all-sky image.

Yesterday's check-in meeting continued this morning with Bovy showing the Oort constants. He claimed that he needed something to do while his data files unzipped, so he decided to measure the Oort constants, including constant C, which he claims has never really been measured before! This continues the theme of the awesomeness of the Gaia data: You measure things that have never before been possible while your files are unzipping. Bovy also gave us a tiny reminder of what the Oort constants are. Years ago, Bovy and I (more-or-less) failed to measure these constants in the SDSS data.
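As a slightly longer reminder, the textbook forms are sketched here. This is standard material rather than anything taken from Bovy's slides; solar-motion terms are omitted, and the proper motion is expressed as velocity per unit distance.

```latex
\begin{align}
A &= \frac{1}{2}\left(\frac{V_c}{R} - \frac{dV_c}{dR}\right), &
B &= -\frac{1}{2}\left(\frac{V_c}{R} + \frac{dV_c}{dR}\right)
  && \text{(axisymmetric disk, evaluated at the Sun)} \\
v_{\rm los} &\approx d\,(K + A\sin 2l + C\cos 2l)\cos^2 b, &
\mu_l &\approx (A\cos 2l - C\sin 2l + B)\cos b
  && \text{(nearby stars at distance } d\text{)}
\end{align}
```

Here A measures the local shear and B the local vorticity of the velocity field. C and K vanish for a steady axisymmetric disk, so a non-zero C is a direct sign of non-axisymmetric or non-circular streaming.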

Daniel Michalik (Lund) came in by phone to tell us about the construction of the TGAS Catalog, and Alcione Mora (ESAC) told us about the Gaia Archive and how to use it. In Michalik's talk I was reminded that there are two small circles on the sky (small as in not great) where there will be close to 200 observations per star; these are great places to concentrate observing programs: Why wait until after Gaia to do the follow-up observing on the amazing time-domain astrophysics that will be discovered in those sky regions?

I spent my sprinting time working with Price-Whelan on the mid-plane of the Milky Way disk, with Bird on the age-velocity relationship, including a generative model for the ages, and with Ness on the causal relationships between metallicity, age, and vertical kinematics. On the latter, the question is: Is heating “caused” by age or by metallicity? (Or maybe some more sophisticated question than that.) The answer seems to be that in some parts of abundance space it is clearly age, and in others it is clearly metallicity. I hope this holds up!

At the evening check-in session, Ruth Angus (Columbia) showed that, of Semyeong Oh's comoving pairs of stars that both have gyrochronology ages, they seem (usually) to show the same-ish age. It is early, but it looks like a possible confirmation of the effectiveness of the gyrochronology, possibly in parts of the H-R diagram where it hasn't been well tested previously.

After that, Vasily Belokurov (Cambridge) blew us all away by punking the Gaia DR1 uncertainty model to find time-variable sources in the billion-star catalog. He then found a bridge of variable stars connecting the LMC to the SMC! That made me afraid, very afraid.


#GaiaSprint, day 2

It is only Tuesday, and yet there are already incredible results flowing in from the Gaia Sprint. I won't do justice in any way to what I saw today, but here are a few very personal highlights from the day. Although everyone spent the day working—the Sprint has almost no formal program—these results are from the morning and evening check-in discussions:

In the morning check-in, Branimir Sesar (MPIA) showed us the results of a hierarchical model of the RR Lyrae stars in TGAS, where he simultaneously fit for the period–luminosity relation parameters, and also parameters of a flexible model for the noise (bias and variance) in the Gaia parallaxes. He confirms the Gaia noise model and gets absolutely beautiful parameter constraints. That was a pretty good result for one day of work!

Sven Buder (MPIA) and Johanna Coronado (MPIA) used dynamical actions computed with the help of Wilma Trick (MPIA) and Jo Bovy (Toronto) to investigate the heating mechanisms in the Milky Way disk. They can clearly show that stars that are older (at least according to spectroscopic parameters and stellar models) have larger vertical actions. This was nice, but they have really beautiful gradients in vertical action across the red clump, consistent with the expected gradient in age across the red clump. This suggests that their stellar labels (from GALAH and LAMOST, respectively) and stellar models and Gaia kinematics are all consistent. Crazy! And beautiful. Is the vertical action the new age estimator?

Semyeong Oh (Princeton) showed her results (also with Price-Whelan, Spergel, and me) on co-moving pairs of stars, and their locations in the color-magnitude diagram. This led to a lot of discussion about what can be concluded and what can be predicted. In particular, we expect no old stars (like no red giants, even) for the widest-separation co-moving pairs.

Sergey Koposov (Cambridge) showed us results from an insane and massive project to measure proper motions from the comparison of the SDSS imaging to the Gaia billion-source list. This project involved a complete recalibration of SDSS astrometry! His proper motions look great, and he is using them to search for substructure and analyze Milky Way structure. A simply insane project. All the more insane, because his catalog will be superseded by Gaia at the next data release in a year! And I mean “insane” in the best possible way.

With a tiny bit of consulting from me, Jonathan Bird (Vanderbilt) converted his project to measure the age–velocity relation (the vertical velocity dispersion in the disk as a function of stellar age) into a generative model for the ages. This isn't working yet, but he showed results for the relation when he assumes that the ages are God's truth. This project is one he was working on for a long time with GCS data, but with the TGAS data he obviated all his previous work in one single day. Damn, I love good data.


#GaiaSprint, day 1

Today was the first day of the NYC Gaia Sprint, with 50 participants from around the world. I had an absolutely great research day. The meeting began with a set of pitches, one per participant, that included an introduction, a statement of expertise (what that participant brings to the meeting), and a statement of goals (what that participant hopes to take home from the meeting). Pitches were all over the place: Milky Way disk and halo, testing stellar models, exoplanet science, calibration, target selection, future missions, you name it! This session took two hours. But that pitch session was the entirety of the formal program of the 45-hour meeting! That is, everyone is just supposed to work from here on. Of course we will have break-out sessions, and informal discussions, check-in and wrap-up sessions, and lots and lots of co-working. But that was it.

I started working with Boris Leistedt (NYU) on modeling a slice of the color-magnitude diagram of stars, to build a data-driven photometric distance indicator (that will beat the parallax for most TGAS stars). I also started working with Adrian Price-Whelan (Princeton) on his discovery (this morning!) that the TGAS Catalog contains the most precise measurement of the Milky Way disk midplane ever. That displaced some of our plans for running the clock back on disrupting binaries and associations.

We had two break-outs, one on likelihood formulations for doing inference with parallaxes, and another on data quality and data issues in the Gaia DR1 data sets. This latter talk was by Anthony Brown (Leiden), who is the chair of the entire Gaia DPAC data processing effort. I learned a huge amount in both of these break-outs about the noise model for the TGAS parallaxes, which I ought to be using in my own inferences.

One thing we have done in this meeting—which is standard practice for me at scientific meetings now—is open a shared, editable web document to record notes. By mid afternoon this document was more than 20 pages long, filled with crowd-sourced notes about pitches, projects, data sets, and software tools. We will preserve and publish these notes after the meeting in an informal form. One of the big outcomes of this meeting could be some standard tools, standard data sets, and advice about how to use these to do reliable science. Thinking about that as we continue to hack on the data.


GALEX, Gaia, and MCMC

Early in the morning, I met with Dun Wang (NYU), Steven Mohammed (Columbia), and David Schiminovich (Columbia) to discuss our GALEX imaging of the Galactic Plane. We gave Wang and Mohammed tasks of writing titles and abstracts for their papers on the subject. Also, Mohammed showed us his exploration of the GALEX–TGAS match, which looks like it is filled with good stuff.

In the afternoon, Dan Foreman-Mackey (UW) and I met to discuss exoplanet results, where Foreman-Mackey has new results on multiplicity based on ABC inference. We followed this with parallel work on our Data Analysis Recipes tutorial on MCMC inference. We re-organized some of the content, reduced scope very slightly, and tried to close issues.

I also worked on posterior samplings for star distances, given parallaxes. I am using Simple Monte Carlo, with two techniques, one that works well for high signal-to-noise parallaxes, and one that works well for low signal-to-noise. The issues are very subtle; a uniform-density prior has a lot of very bad properties in parallax space. I got something working and posted a gif on the twitters.
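
To make the two-technique idea concrete, here is a minimal sketch of one way it could go; it is not the sampler described above. It assumes an exponentially decreasing volume-density prior, p(d) ∝ d² exp(−d/L), purely for illustration (the post does not say which interim prior was used), with parallaxes in mas and distances in kpc. The first function proposes from the prior and rejects on the likelihood, which is efficient when the parallax is noisy; the second proposes from the likelihood and importance-weights by the prior, which is efficient when the parallax is precise. All function names are hypothetical.

```python
import math
import random

def sample_prior_reject(plx_obs, plx_err, scale_kpc, n_draws, rng):
    """Low-S/N strategy: draw distances from the prior and keep each draw with
    probability equal to the peak-normalized Gaussian parallax likelihood."""
    kept = []
    for _ in range(n_draws):
        d = rng.gammavariate(3.0, scale_kpc)  # density proportional to d^2 exp(-d/L)
        like = math.exp(-0.5 * ((plx_obs - 1.0 / d) / plx_err) ** 2)
        if rng.random() < like:
            kept.append(d)
    return kept

def sample_likelihood_weight(plx_obs, plx_err, scale_kpc, n_draws, rng):
    """High-S/N strategy: draw true parallaxes from the likelihood and weight each
    by the prior in parallax space (prior in d times the Jacobian |dd/dplx| = d^2)."""
    dists, weights = [], []
    for _ in range(n_draws):
        plx = rng.gauss(plx_obs, plx_err)
        if plx <= 0.0:
            continue  # a non-positive true parallax has zero prior probability
        d = 1.0 / plx
        dists.append(d)
        weights.append(d ** 4 * math.exp(-d / scale_kpc))
    return dists, weights

# At intermediate signal-to-noise (here 10) the two strategies should agree.
rng = random.Random(1)
kept = sample_prior_reject(2.0, 0.2, 1.0, 200_000, rng)
dists, weights = sample_likelihood_weight(2.0, 0.2, 1.0, 20_000, rng)
mean_reject = sum(kept) / len(kept)
mean_weight = sum(d * w for d, w in zip(dists, weights)) / sum(weights)
```

The d⁴ factor in the weights is the prior expressed in parallax space. It grows steeply toward small parallax, which is one way to see why priors that look innocent in distance behave badly there.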


Simple Monte Carlo

I worked in the morning to build a custom (Simple Monte Carlo) sampler that samples the posterior pdf for the true parallax given a noisy parallax measurement and a sensible but interim distance prior. The problem is very ill-behaved; for many (most, even) TGAS stars, there is almost no support for the likelihood under the prior and vice versa. In related news, in the evening, I worked on the paper that Adrian Price-Whelan and I are writing on a custom (Simple Monte Carlo) sampler that samples orbital parameters for single-line binary-star systems. I raised many issues and edited the text directly.


stellar parameters; machine learning in cosmology

A research-filled day started with a discussion with Vakili about final changes to his paper on centroiding compact sources. We are responding to a constructive and useful referee report. The day ended with me sending a long email out to the #GaiaSprint participants with their homework assignments, some of which are pretty non-trivial!

At stars group meeting, we heard from Tim Morton (Princeton), who has been building a system to get the best possible stellar parameters (radii, densities, distances) for exoplanet (and binary-star) host stars, given all available data. His system is very flexible in what can be used to constrain the system: photometry, spectroscopy, asteroseismology, and astrometry. What I was even more impressed with is its handling of binary stars and more complex hierarchies of stellar systems: You can have some photometry that constrains the sum of the star brightnesses, and other photometry that constrains the difference. And you can fit the binaries fixing the ages and metallicities to agree. That makes his code very useful for unresolved and marginally resolved binaries, which are always a nuisance when you want to fit models.
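
The sum-and-difference point can be sketched in a few lines. This is a toy illustration, not Morton's code or its API: it treats the two stars' magnitudes in a single band as free parameters (a real fit would tie them to stellar models with shared age and metallicity), and the function names are hypothetical.

```python
import math

def combined_mag(mag1, mag2):
    """Magnitude of the summed flux of two unresolved stars."""
    flux = 10.0 ** (-0.4 * mag1) + 10.0 ** (-0.4 * mag2)
    return -2.5 * math.log10(flux)

def chi_squared(mag1, mag2, total_obs, total_err, diff_obs, diff_err):
    """Blended photometry constrains the sum; resolved photometry constrains the difference."""
    r_sum = (combined_mag(mag1, mag2) - total_obs) / total_err
    r_diff = ((mag2 - mag1) - diff_obs) / diff_err
    return r_sum ** 2 + r_diff ** 2

# Two equal stars look 2.5 log10(2), about 0.75 mag, brighter than either alone.
blend = combined_mag(10.0, 10.0)
```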

At cosmology group meeting, Tjitske Starkenburg (CCA) and Lauren Anderson (CCA) reviewed these two papers by Kamdar, Turk, & Brunner about using machine learning to model the outputs of cosmological simulations. The matters of greatest interest to the group were relegated to a short appendix of the first paper! These papers don't directly solve anyone's current problems, but they represent a start for using machine learning in cosmology. We closed the meeting with a discussion about where we might most productively point traditional machine-learning techniques towards unsolved problems in cosmology. Our ideas were about training on simulations but applying to real data: Maybe we could infer the (unobserved, latent) dark-matter properties given the observed galaxy properties. Or maybe we could use ideas from ML to find better statistics (that is, summary statistics from a galaxy survey) for constraining cosmological parameters.


binary stars; deconvolving the color–magnitude diagram

The day started with a long call with Adrian Price-Whelan (Princeton) regarding binary-star systems, their detection and analysis. We discussed the oddities we are finding in Gaia TGAS. This led to a call later in the day with Semyeong Oh (Princeton), who showed us really nice results. We definitely see the Gaia DR1 exposure map in the sky positions of our detected binaries, but this is as expected: We are more sensitive where we have better data! I am starting to think that we have a reliable catalog. Also on that call we found this highly relevant paper from the Hipparcos era.

Later in the day, Dustin Lang (Toronto) and I discussed my project (which is constantly shrinking in scope) to build a data-driven, noise-deconvolved model of the color–magnitude distribution of stars in TGAS. He endorsed the very simple project and had some good ideas. I now think I might actually have a do-able project for the #GaiaSprint. It also fits in to things that Lang and I have been saying for years, about shrinkage and hierarchical thinking.


supernovae in Kepler

Peter Garnavich (Notre Dame) gave our astrophysics seminar about supernovae and other oddities in Kepler and K2 time-domain data. His lightcurves of supernovae start very early and show beautiful features and regularities. It looks to me like he can confidently classify supernovae on the basis of the light curve alone! He showed some odd things which might be (the elusive) SN 0.1a. A great re-use of #OtherPeoplesData by an admirable #ResearchParasite!


#GaiaSprint brain-storming

I had great Gaia DR1 and Gaia Sprint brain-storming sessions today with Adrian Price-Whelan and Dan Foreman-Mackey. Price-Whelan proposed that we “run back the clock” on the TGAS stars to see if any pairs emerge from disruption events in the past. I commented that if we go to short times in the past, you need neither distances nor radial velocities: The angular motions suffice. It must be that as the time scale gets longer, distances and radial velocities matter more and more; it would be nice to see this come in continuously. We left that as a conceptual to-do, but we worked out a plan that ought to work in general. This project has a lot in common with the Kinematic Consensus projects that Hans-Walter Rix (MPIA) and I have worked on in the past.
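
The short-time limit, in which angular motions suffice, can be written down directly. The sketch below assumes a flat tangent plane and straight-line proper motions, which only makes sense for small angular separations and short look-back times; the function name is hypothetical and this is not the plan we worked out.

```python
def closest_angular_approach(delta_pos, delta_pm):
    """Time (yr) and separation (arcsec) of closest approach for linear sky motion.

    delta_pos: relative position (x, y) today, in arcsec on a flat tangent plane.
    delta_pm:  relative proper motion (x, y), in arcsec/yr.
    """
    rx, ry = delta_pos
    mx, my = delta_pm
    speed2 = mx * mx + my * my
    if speed2 == 0.0:
        # No relative motion: the separation never changes.
        return 0.0, (rx * rx + ry * ry) ** 0.5
    t = -(rx * mx + ry * my) / speed2  # negative t means the approach was in the past
    sx, sy = rx + mx * t, ry + my * t
    return t, (sx * sx + sy * sy) ** 0.5

# Two stars 100 arcsec apart and separating at 0.01 arcsec/yr coincided 10,000 yr ago.
t_min, sep_min = closest_angular_approach((100.0, 0.0), (0.01, 0.0))
```

Distances and radial velocities enter once the look-back time is long enough that the line-of-sight motion changes the projected geometry, and this flat-sky version has no way of showing that.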

I described to Foreman-Mackey some of my ideas about a data-driven model of the color-magnitude diagram. I have this idea of transforming the space to one in which the noise model is Gaussian, but that kind of thing always makes me feel dirty. We discussed the possibility that I could make my model only one-dimensional (in the parallax direction) by cutting the data into tiny color boxes. That's crazy, but the Gaia Sprint is a hack week, where experiments reign and we want to make things work. So maybe I will write that down this weekend.


carbon stars, data-driven models, simulating faint galaxies

In the morning I hosted our weekly stars meeting at the CCA. Jill Knapp (Princeton) came in! She talked to us about carbon stars, why they are interesting, where they are found, and what we might learn about them from Gaia DR1. We proposed that we could get her results pretty fast! Kathryn Johnston (Columbia) talked to us about the use of stellar models to get very precise distances to all kinds of stars, which triggered many conversations. One is whether and how we could make sure we have such technology up and running at the Gaia Sprint in 1.5 weeks. Another is whether we could get percent-level distances to stars without using stellar models. That is part of my evil plans.

In the afternoon I hosted our weekly cosmology meeting. Lauren Anderson (CCA) talked about her matched dark-matter-only and baryonic (SPH) simulations, and how she is unsatisfied with how they understand the faint end of the galaxy population. How can we understand the completeness of the simulations, or get the most out of the galaxies that are at the low-mass end? We discussed this and the conversation edged into halo-occupation territory. I then started saying crazy stuff about machine learning and everyone quietly left the room, claiming other commitments and, oh my, look at the time!


JHU, chemical tagging; binary star relative velocities

I had a great visit to JHU today, with extensive conversations with JHU locals Ali-Haimoud, Wyse, Szalay, Kuntz, Menard, Tollerud, van Velzen, Nataf, and visitor Dalya Baron (TAU). With Wyse I caught up after many years (a decade?); we discussed chemical tagging, radial migration, and stellar spectroscopy. I gave a late-afternoon talk on The Cannon and its use in measuring stellar parameters in an interactive forum. I also had a great chance to talk flat-fielding with the HST calibration team, who are among the world's best calibrators.

Before all that, I did more pair coding with Price-Whelan (Princeton) on the likelihood ratios, and we had a realization: If we are looking for binary stars, we do expect finite, detectable velocity differences as the separation between the pairs of stars gets small. He implemented this before the day was out and the results look very good.


I've got 99 problems, and every one of them is a units conversion

Today I spent some research time pair-coding (yet more) with Price-Whelan the fully marginalized likelihood ratio test we are using for our comoving-stars project. There are subtleties! Every time we look at the code we find bugs and think-os. I am reminded of the point that we generally find bugs continuously and with no sign of slowing until we are happy that the code seems to pass our unit and functional tests, at which point we stop looking. That doesn't mean the code is correct! A corollary: Almost no-one has ever wasted time testing code. It turns out that one of the most troublesome parts of this project is units conversion. Not surprising, really, when we have AU, pc, km/s, mas/yr, the inverse squared of these (in inverse variance matrices) and many, many more.
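
For the record, here is a minimal sketch of the conversion that bites most often: proper motion times distance to a tangential velocity, and the matching rescaling of an inverse variance. The helper names are hypothetical and this is not the project code. It takes the distance as given, because converting a noisy parallax to a distance is a separate inference problem.

```python
AU_KM = 149_597_870.7            # astronomical unit in km (IAU 2012 definition)
JULIAN_YEAR_S = 365.25 * 86_400  # Julian year in seconds
KMS_PER_AU_PER_YR = AU_KM / JULIAN_YEAR_S  # about 4.74047 km/s

def tangential_velocity_kms(pm_mas_yr, distance_pc):
    """mu [arcsec/yr] times d [pc] is a speed in AU/yr; convert that to km/s."""
    return KMS_PER_AU_PER_YR * (pm_mas_yr / 1000.0) * distance_pc

def ivar_masyr_to_kms(ivar_masyr, distance_pc):
    """Inverse variances carry the inverse square of the unit factor."""
    factor = KMS_PER_AU_PER_YR * distance_pc / 1000.0  # (km/s) per (mas/yr)
    return ivar_masyr / factor ** 2

# 10 mas/yr at 100 pc is one AU per year, or about 4.74 km/s.
v_example = tangential_velocity_kms(10.0, 100.0)
```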


stars at low resolution, pair coding, accretion

Today Kathryn Johnston (Columbia) convened the first of (we hope) many Local-Group group meetings up at Columbia, with people present from NYU, CCA, Columbia, and Princeton. Johnston reported on things she learned at the recent meeting in Paris. Of particular interest to her—and everyone, apparently—was work by Yuan-Sen Ting (Harvard), building on experiments by Anna Y. Q. Ho (Caltech), showing that it is possible to get detailed stellar abundances, even without huge covariances, out of low-resolution spectra. The point is that as resolution decreases, the information per line gets worse, but you also (usually) get more spectral coverage, and this (mostly) compensates. This could have a huge impact on the future of stellar astrophysics.

I spent time today pair-coding (with Adrian Price-Whelan, Princeton) the analytic marginalized likelihood that Semyeong Oh (Princeton) and Price-Whelan and I have been working on. We found a couple bugs and by the end of the screen-sharing video call (yes, that's the way we do it), we had a marginalized likelihood ratio that seems to be delivering very good answers, and fast! Very excited.

The research day ended with a great astrophysics seminar at NYU by Zoltan Haiman (Columbia, NYU) about fast growth of black holes in the early Universe. He has found a spherically symmetric, steady-state, achievable accretion process that is (much) faster than Eddington, using the same assumptions (essentially). I need to think about it and understand it better. The Eddington limit is one of the most secure, robust, and well-tested arguments in all of astrophysics!


radio stars, comoving stars, orbiting stars

In the morning I met with Kelle Cruz (CUNY) and Ellie Schwab (CUNY), to discuss the statistics component of their project to measure and model radio emission from brown dwarf stars. We worked through a mixture model, in which some are emitting in the radio and some aren't, and how we could do inference in that model.
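
A minimal sketch of that kind of mixture model, under assumptions that are not from the project itself: non-emitters have exactly zero true flux, emitters have true fluxes drawn from a Gaussian with known mean and variance, the measurement noise is Gaussian, and the only free parameter is the emitting fraction with a flat prior. All names are hypothetical.

```python
import math
import random

def normal_pdf(x, mean, var):
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def log_likelihood(frac, fluxes, sigmas, mean_on, var_on):
    """Each source emits with probability frac; non-emitters have true flux zero."""
    total = 0.0
    for f, s in zip(fluxes, sigmas):
        on = normal_pdf(f, mean_on, var_on + s * s)   # emitter, noise convolved in
        off = normal_pdf(f, 0.0, s * s)               # non-emitter, pure noise
        total += math.log(frac * on + (1.0 - frac) * off)
    return total

def posterior_on_grid(fluxes, sigmas, mean_on, var_on, n_grid=199):
    """Posterior for the emitting fraction on a grid, with a flat prior on (0, 1)."""
    grid = [(i + 1) / (n_grid + 1) for i in range(n_grid)]
    logp = [log_likelihood(q, fluxes, sigmas, mean_on, var_on) for q in grid]
    peak = max(logp)
    post = [math.exp(lp - peak) for lp in logp]
    norm = sum(post)
    return grid, [p / norm for p in post]

# Synthetic data with a made-up true fraction of 0.3.
rng = random.Random(7)
TRUE_FRAC = 0.3
sigmas = [1.0] * 400
fluxes = [(rng.gauss(5.0, 1.0) if rng.random() < TRUE_FRAC else 0.0) + rng.gauss(0.0, s)
          for s in sigmas]
grid, post = posterior_on_grid(fluxes, sigmas, 5.0, 1.0)
frac_mean = sum(q * p for q, p in zip(grid, post))
```

In a real analysis the emitter flux distribution would itself have free parameters, and upper limits would enter through the same per-object likelihood rather than being dropped.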

Having yesterday written math for the comoving stars paper with Adrian Price-Whelan (Princeton) and Semyeong Oh (Princeton), today I wrote a draft title and abstract. My view is that projects should more-or-less start with a title and abstract, in part because these are the most important parts of the paper, and in part because it helps guide work towards the true critical path.

The research day ended with a Physics Colloquium by Andrea Ghez (UCLA). She talked about the stellar orbits in the Galactic Center and their demonstration of the existence of a black hole there. She showed that (in principle) the black hole was discovered in the 1980s, but the discoverers were very circumspect and conservative. There are lots of remaining puzzles and projects, with existing data and new data. As I said yesterday, this is a very fruitful context for thinking about new engineering challenges.


we do it all: engineering, stars, galaxies

Today was a good research day at the CCA. The day started with Adrian Price-Whelan (Princeton) and me arguing about the cleanest notation in which to cast our complete-the-squares math that is relevant to our wide-binary marginalization efforts. Once we decided, I went off to write LaTeX, and Price-Whelan and Semyeong Oh (Princeton) went off to pair-code it.

In our stars group meeting, Andrea Ghez (UCLA) dropped in; we talked about engineering improvements that could make the observations they do of the (time-variable, crowded) Galactic Center much more productive and precise. These ranged from adaptive coronagraphy (something I would love to think about) to data-analysis methods that can infer the properties of sources too faint or too crowded to individually measure at high precision. Oddly, there is a comedy of the commons in which technical advances we want for exoplanet research would almost all be useful also in the Galactic Center.

Also in the stars meeting, we put Semyeong Oh on the spot, getting her to visualize what we know about widely separated pairs of comoving stars in the Gaia DR1 TGAS sample. She was able to show us that at least some of our widely separated pairs are members of young, open clusters. She was also able to show us that the photometric properties of the stars are consistent with the stars being young and the pairs being short-lived. It was an extremely impressive session, because everyone in the room was shouting out changes they wanted to see in the notebook, and she just calmly executed.

In the Blanton–Hogg cosmology group meeting, we talked about AS4 proposals—the proposals for what to do after the end of SDSS-IV. Most of these are about stars, but there are some about spatially resolving more galaxies. We discussed a bit what we expect from this process. After that, MJ Vakili (NYU) took us through the definition of assembly bias, and his work to show that this effect is likely present—but at a low level—in the SDSS galaxy samples. That led to a more general discussion of the occupation of galaxies in the dark-matter field, which is something I fantasize about working on, from a data-driven perspective. I happened to run into Roman Scoccimarro (NYU) on my way home, and he disabused me of some of my dumbest ideas there.


linear algebra; and spectroscopic parallax

I spent some stolen research time today working out a simple notation for completing the square in the marginalization that Adrian Price-Whelan (Princeton), Semyeong Oh (Princeton), and I are working on for the Gaia DR1 TGAS data. It isn't hard, but you sure have to keep your head screwed on when non-square matrices are flying around, and some matrices have zero or infinite eigenvalues.
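
For reference, the standard linear-Gaussian identity behind this kind of completing-the-square is written out below. The symbols are illustrative and are not the notation of the paper draft: y is the data vector, v the nuisance vector to be marginalized out, and M a matrix that need not be square.

```latex
% Assumed setup (illustrative notation):
%   y = M v + e,   e ~ N(0, C),   v ~ N(m, V),   M not square in general.
\begin{align}
\mathcal{N}(y \mid M v,\, C)\;\mathcal{N}(v \mid m,\, V)
  &= \mathcal{N}(v \mid a,\, A)\;\mathcal{N}(y \mid M m,\, C + M V M^{\top}) \\
A^{-1} &= V^{-1} + M^{\top} C^{-1} M \\
a &= A\,\bigl(V^{-1} m + M^{\top} C^{-1} y\bigr)
\end{align}
```

Integrating over v removes the first factor on the right-hand side, so the marginalized likelihood is the second factor. An infinite eigenvalue in V corresponds to a zero eigenvalue in V⁻¹; the expressions for A and a can still make sense then, provided A⁻¹ is invertible, but the normalization of the marginal needs separate care.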

Anna Y. Q. Ho (Caltech) and I discussed things she might do at the #GaiaSprint next month. One option would be to figure out how you can infer parallax from spectrum, or spectrum from parallax. The big issue for naive approaches is that the distance or absolute magnitude uncertainties are asymmetric (think Lutz-Kelker bias and all that), but parallax uncertainties are symmetric. I suggested that we could work in the inverse-square-root-luminosity space (yes, insane) for modeling purposes and see if that helps? We would also want to use the extension of The Cannon built by Christina Eilers (MPIA) this past summer, to deal with uncertainties in the labels.
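
One reading of the inverse-square-root-luminosity idea: the quantity 10^(0.2 M), where M is the absolute magnitude, is proportional to luminosity to the power −1/2 and is exactly linear in the parallax at fixed apparent magnitude. So a symmetric parallax uncertainty stays symmetric, and zero or negative measured parallaxes are not a problem. The sketch below assumes parallaxes in mas, ignores extinction and photometric noise, and uses hypothetical function names.

```python
import math

def inv_sqrt_lum(parallax_mas, app_mag):
    """Return 10**(0.2 * M), which scales as luminosity**(-1/2) and is linear in parallax."""
    return parallax_mas * 10.0 ** (0.2 * app_mag - 2.0)

def inv_sqrt_lum_sigma(parallax_err_mas, app_mag):
    """A symmetric parallax uncertainty maps to a symmetric uncertainty in this space."""
    return parallax_err_mas * 10.0 ** (0.2 * app_mag - 2.0)

def abs_mag(parallax_mas, app_mag):
    """Absolute magnitude via the distance modulus; undefined for parallax <= 0."""
    return app_mag + 5.0 * math.log10(parallax_mas) - 10.0

# A star at 100 pc (parallax 10 mas) with apparent magnitude 10 has M = 5.
f_example = inv_sqrt_lum(10.0, 10.0)
```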


stars that were born together

I spent research time today working through the first draft on a paper by Melissa Ness (MPIA) about the chemical homogeneity of clusters of stars. She is using very close stars in abundance space to look at what we can say about stars that are (and aren't) born together. I also spoke with Adrian Price-Whelan (Princeton) about marginalizing the likelihoods for our binary-star and not-binary-star hypotheses in Gaia DR1 TGAS. After some terrifying experiments over the weekend with numerical marginalizations, we decided that we have to bite the bullet and do the analytic marginalization, which requires completing the square. We both agreed to write down math.

At lunch time I gave the Brown-Bag talk at the CCPP. I spoke about Ness's work, and also about Price-Whelan's work. I see these things as related, because Ness finds stars that are clearly co-eval in chemical space, while Price-Whelan finds stars that are clearly co-eval in phase space.


fitting spectroscopic systematics and marginalizing out #GaiaDR1

In the morning, I discussed new NYU graduate student Jason Cao's project to generalize The Cannon to fit for radial-velocity offsets and line-spread function variations at test time. This involves generalizing the model, but in a way that doesn't make anything much more computationally complex.

In the afternoon, I had a realization that we probably can compute fully marginalized likelihoods for the wide-separation binary problem in Gaia DR1 TGAS. The idea is that if we treat the velocity distribution as Gaussian, and the proper-motion errors as Gaussian, then at fixed true distance there is an analytic velocity integral. That reduces the marginalization to only two non-analytic dimensions (the true distances to the two stars). I started to work out the math and then foundered on the rocks of completing the square in the case of non-square matrix algebra. No problem really; we have succeeded before (in our K2 work).
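
Here is a one-dimensional toy version of that analytic velocity integral, checked against brute-force quadrature. It assumes a single proper-motion component, a zero-mean Gaussian velocity prior, and a fixed true distance; the real problem has vector velocities and a non-square projection matrix. The names and numbers are hypothetical.

```python
import math

K = 4.74047  # km/s per (mas/yr * kpc)

def normal_pdf(x, mean, var):
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def marginal_analytic(pm, pm_err, dist_kpc, vel_sigma):
    """Gaussian velocity prior integrated out analytically at fixed distance."""
    scale = 1.0 / (K * dist_kpc)  # mas/yr per km/s at this distance
    return normal_pdf(pm, 0.0, pm_err ** 2 + (scale * vel_sigma) ** 2)

def marginal_numeric(pm, pm_err, dist_kpc, vel_sigma, n=20001, span=10.0):
    """The same integral by the trapezoid rule over velocity, as a check."""
    scale = 1.0 / (K * dist_kpc)
    lo, hi = -span * vel_sigma, span * vel_sigma
    step = (hi - lo) / (n - 1)
    total = 0.0
    for i in range(n):
        v = lo + i * step
        w = 0.5 if i in (0, n - 1) else 1.0
        total += w * normal_pdf(pm, scale * v, pm_err ** 2) * normal_pdf(v, 0.0, vel_sigma ** 2)
    return total * step

# Made-up numbers: 3 +/- 0.5 mas/yr at 0.2 kpc with a 30 km/s velocity dispersion.
analytic = marginal_analytic(3.0, 0.5, 0.2, 30.0)
numeric = marginal_numeric(3.0, 0.5, 0.2, 30.0)
```

With the velocity gone analytically, what remains for each pair is a two-dimensional integral over the two true distances, which is cheap enough to do numerically.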


binaries, velocities, Gaia

Early in the day, I discussed with Hans-Walter Rix (MPIA) the wide-separation binaries that Adrian Price-Whelan (Princeton) and I are finding in the Gaia DR1 data. He expressed some skepticism: Are we sure that such pairs can't be produced spuriously by the pipelines or systematic errors? That's important to check; no need to hurry out a wrong paper!

Late in the day, I had two tiny, eensy breakthroughs: In the first, I figured out that Price-Whelan and I can cast our binary discovery project in terms of a ratio of tractable marginalized likelihoods. That would be fun, and it would constitute a (relatively) responsible use of the (noisy) parallax information. In the second, I was able to confirm (by experimental coding) the (annoyingly correct) intuition of Dan Foreman-Mackey (UW) that the linearized spectral shift is not precise enough for our extreme-precision radial-velocity needs. So I have to do full-up redshifting of everything.
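
One simple way to probe the linearization question is to shift a model line exactly, shift it to first order, and compare. The sketch below uses a single Gaussian absorption line on a pixel grid with made-up depth and width. It does not reproduce the experiment described above, and whether a given error is acceptable depends on the signal-to-noise and the number of pixels contributing.

```python
import math

def line_profile(x, center=0.0, depth=0.5, width=2.0):
    """Continuum-normalized Gaussian absorption line (x and width in pixels)."""
    return 1.0 - depth * math.exp(-0.5 * ((x - center) / width) ** 2)

def line_derivative(x, center=0.0, depth=0.5, width=2.0):
    """Analytic derivative of line_profile with respect to x."""
    z = (x - center) / width
    return depth * z / width * math.exp(-0.5 * z * z)

def max_linearization_error(shift, half_range=10.0, n=2001):
    """Largest difference between the exactly shifted line and its first-order version."""
    worst = 0.0
    for i in range(n):
        x = -half_range + 2.0 * half_range * i / (n - 1)
        exact = line_profile(x - shift)
        linear = line_profile(x) - shift * line_derivative(x)
        worst = max(worst, abs(exact - linear))
    return worst

err_small = max_linearization_error(0.01)
err_double = max_linearization_error(0.02)
```

The error of the first-order model grows as the square of the shift (the leading term is half the shift squared times the second derivative of the profile), so doubling the shift roughly quadruples it.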


group meetings

At my morning group meeting, Will Farr (Birmingham) told us about CARMA models and their use in stellar radial velocity analysis. His view is that they are a possible basis or method for looking (coarsely) at asteroseismology. That meshes well with things we have been talking about at NYU about doing Gaussian Processes with kernels that are non-trivial in the frequency domain to identify asteroseismic modes.

In the afternoon group meeting, we had a very wide-ranging conversation, about possible future work on CMB foregrounds, and about using shrinkage priors to improve noisy measurements of SZ clusters and other low signal-to-noise objects. We also discussed the recent Dragonfly discovery of a very low surface-brightness galaxy, and whether it presents a challenge for cosmological models.


data-driven models of images and stars

Today was a low-research day! That said, I had two phone conversations of great value. The first was with Andy Casey (Cambridge), about possibly building a fully data-driven model of stars that goes way beyond The Cannon, using the Gaia data as labels, and de-noising the Gaia data themselves. I am trying to conceptualize a project for the upcoming #GaiaSprint.

I also had a great phone conference with Dun Wang (NYU), Dan Foreman-Mackey (UW), and Bernhard Schölkopf (MPI-IS) about image differencing, or Wang's new version of it, that has been so successful in Kepler data. We talked about the regimes in which it would fail, and vowed to test these in writing the paper. In traditional image differencing, you use the past images to make a reference image, and you use the present image to determine pointing, rotation, and PSF adjustments. In Wang's version, you use the past images to determine regression coefficients, and you use the present image to predict itself, using those regression coefficients. That's odd, but not all that different if you view it from far enough away. We have writing to do!
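
The regression flavor of image differencing can be sketched as follows. This is a toy and not Wang's pipeline: it predicts a single target pixel from two predictor pixels, uses ordinary least squares, and generates synthetic data in which two shared systematics (pointing jitter, say) drive all pixels. The names, gains, and noise levels are all made up.

```python
import random

def solve(matrix, rhs):
    """Solve a small square linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for i in range(n):
        pivot = max(range(i, n), key=lambda r: abs(aug[r][i]))
        aug[i], aug[pivot] = aug[pivot], aug[i]
        for r in range(i + 1, n):
            factor = aug[r][i] / aug[i][i]
            for c in range(i, n + 1):
                aug[r][c] -= factor * aug[i][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x

def fit_coefficients(predictors, target):
    """Least squares for target ~ c0 + sum_j c_j * predictor_j over the past epochs."""
    rows = [[1.0] + list(p) for p in predictors]
    k = len(rows[0])
    normal = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    moment = [sum(r[i] * y for r, y in zip(rows, target)) for i in range(k)]
    return solve(normal, moment)

def predict(coeffs, predictor_values):
    return coeffs[0] + sum(c * p for c, p in zip(coeffs[1:], predictor_values))

def make_epoch(rng, transient=0.0):
    """One synthetic image: two predictor pixels and one target pixel."""
    s1, s2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)  # shared systematics
    preds = (10.0 + 2.0 * s1 + 0.5 * s2 + rng.gauss(0.0, 0.01),
             20.0 - 1.0 * s1 + 1.5 * s2 + rng.gauss(0.0, 0.01))
    targ = 5.0 + s1 + s2 + transient + rng.gauss(0.0, 0.01)
    return preds, targ

rng = random.Random(3)
past = [make_epoch(rng) for _ in range(50)]
coeffs = fit_coefficients([p for p, _ in past], [t for _, t in past])

# Present image: its own predictor pixels predict the target; a transient shows up as residual.
present_preds, present_target = make_epoch(rng, transient=5.0)
residual = present_target - predict(coeffs, present_preds)
```

The failure regimes are visible even in the toy: if the transient also lands on the predictor pixels, or if the present image is affected by a systematic that never appeared in the past images, the prediction absorbs or misses signal.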


measuring and modeling radial velocities

Dan Foreman-Mackey (UW) appeared for a few days in New York City. I had various conversations with him, including one in which I sanity-checked my data-driven model for radial velocities. He was suspicious that I can take the first-order (linear) approximation on the velocities. I said that they are a thousandth of a pixel! He still was suspicious. I also discussed with him the point of—and the mathematical basis underlying—the project we have with Adrian Price-Whelan (Princeton) on inferring companion orbits from stellar radial-velocity data. He agrees with me that we have a point in doing this project despite its unbelievably limited scope! Remotely, I worked a bit more on the wide-separation binaries in Gaia DR1 with Price-Whelan.


data-driven radial velocities

In my weekend research time, I worked out a fully data-driven method for measuring radial velocities in an extreme-precision (or even normal-precision) spectroscopic survey. The idea is to simultaneously fit for the spectrum of the star and its radial-velocity offset; you need multiple epochs of observations to get both (at least at high signal-to-noise). Because the model is fully data-driven, it won't give absolute radial velocities; it will only give relative velocities. That's always the cost of being data driven—the loss of interpretability.

I also added to the model some flexibility to capture spectral variations with time, especially those that might project onto the radial-velocity direction or measurement. That would permit us to discover and characterize spectral changes that co-vary with surface radial-velocity perturbations or jitter. I am trying to write something down that would be practical to apply to HARPS data, but I'm all theory right now.


Gaia thinking

I continued to think about and write about Gaia DR1 projects today. In particular, I tried to write down a responsible way to measure the standardness of standard stars, given noisy parallaxes. I also tried to understand whether we have a scope and interesting-enough results on wide-separation binary stars to merit a paper.


cross-correlation, Disco, Gaia, and life

My day started with a discussion of determination of stellar radial velocities by cross-correlation with a weighted mask (as is done in the HARPS pipeline) with Megan Bedell (Chicago). We talked about the subtleties of doing this, when there are partial-pixel shifts and we want answers that are continuous.

There was a substantial phone call today organized by Jonathan Bird (Vanderbilt) to talk about the After-SDSS-IV proposal for use of the 2.5-m telescope and its instruments. We are working on a proposal called Disco to do dense sampling of the Milky Way disk (looking in the infrared through the dust).

By text message, Adrian Price-Whelan (Princeton) and I tentatively decided that we would pursue a paper about wide binaries with the Gaia DR1 T-GAS data that we were exploring yesterday. I really hope we have a paper to write, because it would be fun to be in the first set of papers. Of course Gaia DR1 papers appeared on the arXiv already tonight!

At the end of the day, Sean Solomon (Columbia) gave the Departmental Colloquium, about Mercury and the Messenger mission. It was a great talk, showing that there are water and volatiles on Mercury, which naive models do not predict. At the end, Jasna Brujic (NYU) asked him about life on Mercury and elsewhere in the Solar System. He described the evidence that rocks are thrown from planet to planet and expressed the view (also held by me) that it is quite likely that there is life elsewhere in the Solar System. That made me happy!


#GaiaDR1 zero-day

Today (at 06:30 New York time) Gaia released its DR1 data, and in particular the T-GAS sample of stars with five-parameter solutions and photometry. What a great day it was! I assembled with Kathryn Johnston (Columbia), David Spergel (Princeton), Adrian Price-Whelan (Princeton), Ruth Angus (Columbia), Keith Hawkins (Columbia), and others to get, play with, and make figures from the new data. Many amusing things happened, and this blog post will not capture them all.

Hawkins immediately plotted the velocity distribution of disk stars in the U-V plane, using the overlap between T-GAS and RAVE. He confirms the velocity structure Bovy, Roweis, and I predicted based on (clever, if I say so myself) de-projections of the Hipparcos data. Right as we were looking at this, Bovy tweeted the same thing. Hawkins has access to our RAVE-on data with detailed abundances, so he can show that the velocity structures are chemically inhomogeneous; the questions that are easy to ask are: Are they all inhomogeneous in the same ways, or are there differences? And can we see any spatial dependence within T-GAS of the velocity structure? He moved on to looking at the candle-standardness of the red clump.

Andy Casey (Cambridge), working remotely, made temperature-magnitude diagrams for the RAVE-on sample. I asked him to show what happens as you harden the cut on the parallax signal-to-noise (parallax over parallax uncertainty). He tweeted the answer. It really looks like we might be able to use Gaia to build a completely data-driven model of all aspects of stars.

Price-Whelan and I looked at various things. We started by trying to see if there is vertical velocity structure in the nearby disk that might show evidence for disk warping, or horizontal velocity structure that might look like spiral arm perturbations. The figures are confusing! There seems to be a very cold bubble around the Sun in the Galactocentric U velocity, which is odd. After spending lots of time confused about that, we looked for very wide separation binary stars, and we see lots! Indeed, it looks like we have evidence for binaries with separations larger than 1 pc! That's worth following up, especially if we have any overlapping spectra. Finally, Price-Whelan also showed that the Kepler-identified transiting exoplanet host stars are all on disk orbits; that is, we don't have (yet) any halo exoplanets. But these are early days!

That's just a tiny slice of the things we started to think about and play with. It is the beginning of a new era. Thank you to the Gaia Mission and all the people who gave years of their professional and scientific lives to this project.


a tiny bit of Gaia

Today was taken out by teaching and the job season, but in my tiny bit of research time, I worked on what I am going to do tomorrow with the Gaia DR1 data. That got me on the phone to Adrian Price-Whelan. We talked about Gaia and also our binary-star sampler.