Dr Pearson

It was my great pleasure to sit on the PhD defense committee for the successful defense of Sarah Pearson. She wrote a thesis about low-mass galaxies and globular clusters, considering both their interactions with each other and with the bigger galaxies into which they later fall. She has some nice analyses of the Palomar 5 tidal stream, and what its morphology might tell us about the Milky Way halo and bar. She also has nice results on gas bridges and streams around pairs of dwarf galaxies.

I was most interested in her stellar-stream results, including several things I hadn't thought about before: One is that prograde streams are more affected by the bar and spiral arms in the disk than retrograde streams. Another is that we might be able to find globular-cluster streams around other galaxies nearby. That would be incredible! And since (as she showed) you can learn a lot about a galaxy just from the shape of a stream, we might not need to do much more than detect streams around other galaxies to learn a lot. It was a pleasure to serve on the committee, and it is a beautiful body of work.


#TASI, day 9

Today was my last day and fifth lecture at TASI. This lecture was crowd-sourced in content! I spoke about Fisher information, linear algebra tips and tricks, and decision theory and model selection. On the last of these I strongly advocated engineering methods like cross-validation!
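For readers who want something concrete on the Fisher-information point, here is a minimal sketch, not taken from the lecture itself. It assumes a linear model y = A θ + noise with independent Gaussian noise of known standard deviation, for which the Fisher matrix is A^T C^-1 A. The function names and the straight-line example are hypothetical, chosen only for illustration.

```python
import numpy as np

def fisher_information(A, sigma):
    """Fisher matrix F = A^T C^-1 A for y = A @ theta + noise, C = diag(sigma^2)."""
    ivar = 1.0 / np.asarray(sigma, dtype=float) ** 2  # inverse variances
    return (A.T * ivar) @ A

def cramer_rao_uncertainties(F):
    """Square roots of the diagonal of F^-1 (the Cramer-Rao bounds)."""
    # Obtain the covariance by solving F @ cov = I.
    cov = np.linalg.solve(F, np.eye(F.shape[0]))
    return np.sqrt(np.diag(cov))

# Illustrative (assumed) setup: straight line y = m x + b at 50 points.
x = np.linspace(0.0, 10.0, 50)
A = np.vstack([x, np.ones_like(x)]).T   # design matrix, columns: [x, 1]
sigma = np.full(x.shape, 0.5)           # assumed per-point noise level
F = fisher_information(A, sigma)
errs = cramer_rao_uncertainties(F)      # best-case uncertainties on (m, b)
```

Note that the Fisher matrix here depends only on the design matrix and the noise model, not on the data values, which is what makes it useful for planning an experiment before taking any data.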

Over lunch I had a great set of conversations with Zach Berta-Thompson about precise measurement for exoplanets, and also hack weeks like the #GaiaSprint. We went deep into the limits on ultra-precise photometry from the ground. We wondered at the point that the best imaging systems get the best precision (on photometry, of point sources) by de-focusing. That has always struck me as somehow absurd, though it's true that you don't have to understand your system nearly so well when you are out of focus (for many reasons).

We had one very good idea: Instead of de-focusing, put in an objective prism! You could get many of the benefits of de-focus but also get far more information about the atmosphere and speckle and scintillation and so on. In principle, you might beat the best measurements made to date. And it is a cheap experiment to perform.


#TASI, day 8

Today my lecture was crowd-sourced! In response to popular opinions from the students, I spoke about cosmological large-scale structure experiments. I spoke about how the large surveys are collapsed to symmetry-respecting mean, variance, and three-point functions, and how simulations of large-scale structure are used to build surrogate likelihood functions for these summary statistics.


#TASI, day 7

Today was my second day of lecturing at TASI, and I gave one (morning) lecture, on the use of MCMC sampling. In the afternoon, I looked at (for the first time) the GALAH data on detailed element abundances of stars. I looked at the question of whether the chemical abundances could be used to predict the Galactocentric radii. The idea is: If the gas involved in star formation is azimuthally mixed, there ought to be relationships between radius in the disk and chemical abundances. They didn't jump out! I have various ideas about why, but for now this will be back-burner.


#TASI, day 6

Today was the sixth day (but my first day) of the Theory Advanced Study Institute summer school at CU Boulder. I gave two 75-minute lectures on data analysis, my first two of five lectures this week. In the first, I tried to boil down data-analysis to a set of over-arching principles. I got 8 principles. Maybe this is the introduction to the book I will never write! In the second lecture I spoke about fitting a model, from a frequentist perspective, but with a focus on the likelihood function. I am loving the interactive audience. See the wiki for a (constantly updating) description of the lectures I am giving.


#GaiaSprint, day 5

After a long week (and some great success), all Christina Eilers (MPIA) and I had in us to do today was make the short-term to-do list for our spectroscopic parallax project (which, by the way, Hans-Walter Rix thinks we shouldn't name that way!) and our related Milky-Way mapping project. In my wrap-up slide, I used my two minutes to speak of the conceptual things we learned about linear models and their power.

The wrap-up was given in two separate groups, in parallel. We were forced to this by space and the size of the Sprint. There were many complaints! But if you want to look at the incredible set of wrap-up slides, look here! You will see some amazing things in there. I was blown away, and several participants told me that it was a very important meeting for them. Our explicit (not implicit) goal is to increase the scientific productivity of Gaia and the community of astrophysics that it supports; I very much hope we succeed in doing that. Today, I am optimistic that we can.

Because we did so many experiments this year, with selection, with splitting the group, with communication, and so on, we learned a lot. We made many mistakes. I hope we can capitalize on these mistakes to learn for future projects, like the next Sprint, and all the other hacking and sprinting and parallel-working things we do.


#GaiaSprint, day 4

It's hard work, this full week of sprinting! Especially following a week of hacking in preparation! I was exhausted today (and I can't entirely blame Andy Casey, though I'd like to). Christina Eilers (MPIA) continued with her map-making work. We started the day by trying to find chemical-abundance neighborhoods (that is, regions of element-abundance-space) where the stars lie on a ring in the Milky-Way disk. There should be such rings if we can measure the abundances well enough! But we failed.

In other news, Andy Casey (Monash) and Adrian Price-Whelan (Princeton) asserted to me that they can take the stars in APOGEE with multi-modal posterior pdfs in orbital-companion space (that we produced here) and rule out some modes just with the Gaia DR2 radial-velocity mean and variance (which is all we get!). I hope this is true.

And in yet other news, David Spergel (Flatiron) and Megan Bedell (Flatiron) not only found co-moving stars in the halo, but also found that as the separations get large, the velocity vectors point parallel to (or anti-parallel to) the separation vector between the stars. Duh! Disrupted binaries are two-star streams. I pointed out (to some skepticism) that the velocity differences between very wide pairs of nearly-comoving stars could be used to make local acceleration maps of the Milky-Way halo. Stoked!


#GaiaSprint, day 3

Boris Leistedt (NYU) and I have been talking for a while about a set of subjects related to the point that proper motions and parallaxes are both inversely related to distance, so you can use them to inform one another. This is a covariance induced by the geometry! Today he got this all working, along with a hierarchical inference of the velocity distribution in the Milky-Way halo. It is early days, but it looks like he substantially improves the parallax estimates for most stars. And, importantly, he can produce improved parallax likelihoods, not just improved parallax posteriors. That is, they have wider use in downstream inference than, say, the Bailer-Jones et al. distances. But still they will be hard to use absolutely correctly.
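The geometric point can be written down in a couple of lines. This is only the standard textbook relation, not Leistedt's hierarchical model; the symbols (parallax \varpi in mas, distance d in kpc, proper motion \mu in mas/yr, tangential velocity v_\perp in km/s) are chosen here for illustration.

```latex
\varpi = \frac{1}{d},
\qquad
\mu = \frac{v_\perp}{\kappa\, d} = \frac{v_\perp}{\kappa}\,\varpi,
\qquad
\kappa \approx 4.74~\mathrm{km\,s^{-1}}\ \text{per}\ \left(\mathrm{mas\,yr^{-1}\,kpc}\right)
```

So, given any prior knowledge of the distribution of v_\perp, a measured proper motion carries information about \varpi, and vice versa.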

Andy Casey (Monash) and I discussed a possible non-parametric model for the radial-velocity scatter delivered in Gaia DR2. This model would compare any star to its neighbors in relevant parameters (like color and apparent magnitude and housekeeping flags) to establish whether it has enough of an RV excess to be considered a likely binary.
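One way such a neighbor comparison could look is sketched below. This is not the model Casey and I discussed, only a guess at its simplest form: a brute-force k-nearest-neighbors search in standardized feature space, with a robust (median and MAD) estimate of the typical scatter among neighbors. The function name, the choice of k, and the use of the MAD are all assumptions.

```python
import numpy as np

def rv_excess_score(features, rv_scatter, k=32):
    """Robust z-score of each star's RV scatter relative to its k nearest
    neighbors in feature space (e.g. color, apparent magnitude)."""
    features = np.asarray(features, dtype=float)
    rv_scatter = np.asarray(rv_scatter, dtype=float)
    # Standardize each feature so that distances are comparable.
    z = (features - features.mean(axis=0)) / features.std(axis=0)
    # Brute-force pairwise squared distances: O(N^2) memory, sketch only.
    d2 = ((z[:, None, :] - z[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(d2, np.inf)  # a star is not its own neighbor
    idx = np.argpartition(d2, k, axis=1)[:, :k]  # k nearest, unordered
    neighbors = rv_scatter[idx]
    med = np.median(neighbors, axis=1)
    mad = np.median(np.abs(neighbors - med[:, None]), axis=1)
    # 1.4826 * MAD approximates a Gaussian standard deviation.
    return (rv_scatter - med) / (1.4826 * mad)
```

Stars with a large positive score have more RV scatter than comparable stars, and would be the binary candidates. A real implementation would need a tree-based neighbor search to scale to the full catalog.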

Ana Bonaca (Harvard) showed me maps of the Jhelum stellar stream which make it look (to my eye) like a fold caustic! Many moons ago, Scott Tremaine (IAS) asked me if we could find various kinds of catastrophes in the stellar density, and I (and friends) responded with this paper on the cusp catastrophe. Maybe Bonaca has found one, but a fold! (And folds should be more common than cusps.)

Christina Eilers (MPIA) and I temporarily paused our methodological developments on our spectroscopic-parallax project and made maps of the Milky-Way disk. We tried plotting velocities, abundances, and vertical distortions (warps). Getting good visualizations is hard because the APOGEE selection function is so featured. That reminds me of why I am such a big fan of SDSS-V!

Many interesting things were shown in the afternoon check-in, but incredibly Sihao Cheng (JHU) and Sergey Koposov (CMU) found that galaxies appear in the Gaia DR2 data as variable stars! Why? Because the asymmetric Gaia point-spread function projects onto the complex galaxy morphology differently at different spacecraft orientations. That rocks! In principle the galaxy morphologies could be inferred from the time-variable data...


#GaiaSprint, day 2

Oh yes it worked! Today Christina Eilers (MPIA) clearly got 10-percent parallax precision with her linear, data-driven spectroscopic parallax model, making use of APOGEE data and WISE and Gaia photometry. The model is literally a linear combination of inputs, with a hard regularization and cross-validation to protect against over-fitting. Because our outputs are the cross-validation predictions, every spectroscopic parallax we produce is technically independent of the Gaia training data (although there are some residual correlations etc if you really want to go deep). From Daniel Michalik (ESTEC) we learned a lot about both astrometric and photometric data-quality filtering for the Gaia data, which (we can see in the residuals) will further improve our results!
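The idea of reporting only cross-validation predictions can be sketched as follows. This is not our actual code: it swaps in a plain L2 (ridge) penalty with a closed-form solution as a stand-in for the regularization we used, and the function names, fold count, and penalty strength are hypothetical.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge regression: minimize |y - X b|^2 + lam |b|^2."""
    P = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(P), X.T @ y)

def heldout_predictions(X, y, lam, n_folds=5, seed=0):
    """Predict every object from a model trained only on the other folds."""
    rng = np.random.default_rng(seed)
    fold = rng.permutation(len(y)) % n_folds  # random, balanced fold labels
    pred = np.empty(len(y), dtype=float)
    for f in range(n_folds):
        test = fold == f
        beta = ridge_fit(X[~test], y[~test], lam)  # train without fold f
        pred[test] = X[test] @ beta                # predict fold f only
    return pred
```

Each entry of the returned array comes from a model that never saw that object's own label, which is the sense in which the predictions are independent of the training labels.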

Because the model is purely linear, we can propagate uncertainties easily, and we can “run it both ways” as it were. We can definitely do better if we go to a nonlinear model, because linearity is such an absurdly difficult constraint. However, it is so beautiful to have a linear relationship, we might stay here for a few papers!

Many incredible results appeared today, but one that struck me is the following: Laura Inno (MPIA) looked at the Cepheid variables in the data, where she has ages and photometric distances and kinematics. She clearly sees a substantial warp in the outer disk. The question came up: Is this the same as the warp in the gas disk?


#GaiaSprint, day 1

Today was the first day of the 2018 NYC Gaia Sprint, with satellite events in Santa Barbara and Seattle. 90 astrophysicists converged on Flatiron to pitch and start hacking. For those who don't know the event, the idea is that it is a working meeting, where participants are asked to move their scientific projects forward and start new ones, and there is (almost) no formal schedule at all. Everything other than the pitch at the beginning and the wrap-up at the end is crowdsourced.

One of the two scheduled hours of the entire week was the introductory pitches today. The pitches ranged across a huge range of topics and interests. The pitch slides are here.

As my loyal reader knows, there are far more projects to do with Gaia DR2 than years left in my life, so I have to choose! I decided, with little consideration, to concentrate on the spectroscopic-parallax project with Eilers (MPIA). This is a tool-building project, with methodological aspects that are interesting, and so it isn't a terrible choice. It also serves the long-term goals of SDSS-V. In service of this project, we matched our sample to the WISE data and removed our Galactic latitude cuts so that the model could automatically capture the dust reddening and extinction. We'll see if that works, tomorrow.

At the evening check-in, some great stuff was shown, especially some extremely odd kinematics of the Milky Way disk, and some population results on binary stars.


regularized regression, self-calibration of spectroscopists

Christina Eilers (MPIA) and I went off the reservation today and implemented an L1-regularized linear regression method for our spectroscopic-parallax project. This permitted us to consider using the full spectrum as a feature vector, and not just derived quantities. That is, it obviated a lot of our feature engineering! But we also discovered a massive bug: We had been using the uncertainties where we should have had our inverse variances! That's been done before. But when we made these changes, we got better results; it looks like we might be able to beat 10-percent distances with a little more work.
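To make the inverse-variance point concrete, here is a minimal sketch of an L1-regularized, inverse-variance-weighted linear fit. It is not our project code: the solver is plain proximal gradient descent (ISTA), chosen here only because it fits in a few lines, and the function names, iteration count, and objective scaling are assumptions.

```python
import numpy as np

def soft_threshold(z, t):
    """Proximal operator of the L1 norm: shrink z toward zero by t."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def weighted_lasso(X, y, sigma, lam, n_iter=2000):
    """Minimize 0.5 * sum_i ivar_i (y_i - x_i . b)^2 + lam * sum_j |b_j|."""
    # The weights are INVERSE VARIANCES, 1 / sigma^2, not sigma itself.
    ivar = 1.0 / np.asarray(sigma, dtype=float) ** 2
    XtW = X.T * ivar
    A = XtW @ X                              # X^T W X
    b = XtW @ y                              # X^T W y
    step = 1.0 / np.linalg.eigvalsh(A)[-1]   # 1 / Lipschitz constant
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = A @ beta - b                  # gradient of the chi-squared term
        beta = soft_threshold(beta - step * grad, step * lam)
    return beta
```

Passing sigma where the inverse variance belongs would up-weight the noisiest data points, which is exactly the kind of bug described above. The L1 penalty drives most coefficients exactly to zero, which is what makes using the full spectrum as a feature vector tractable.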

Andy Casey (Monash) and Natalie Hinkel (Vanderbilt) showed me the self-calibration results they have from the Hypatia Catalog. The results are beautiful! They have affine translations between labels in one survey into labels in another survey, and de-noised labels for all surveys. It is cool! Much more needs to be done. But a great start, and very promising for answering some of my questions about accuracy and precision.

At my data-group group meeting, each participant had a short time interval to explain a figure they are working on. The range of subjects shown was amazing! And we learned that having everyone talk for a well-defined pre-set short time is better than having a few people talk for an undefined amount of time. That's a win.


CMD as a function of Galactic position

Today, Lauren Anderson (Flatiron) showed me some awesome visualizations of the color-magnitude distribution (CMD) of stars as a function of position in the Galaxy. There are variations! The question is: What part of the variation is due to dust and what part is due to metallicity (or composition) and what is due to star-formation history (or age)? We don't know how to answer these questions! But one thing we did is take some derivatives of the CMD with respect to position. Is this a sensible thing to do? Could we treat the CMD as data and try to fit it with a latent-variable model? That is, what is the right approach to quantifying and interpreting the CMD variations around the Galaxy?


binary stars, spectroscopic parallaxes, planets

Andy Casey (Monash) is in town to work on Gaia DR2 and he has been looking at using the radial-velocity uncertainty (which, in the database, is really an empirical scatter across measurements) to identify binary stars. This is a great idea. I was pitching various ways to calibrate this quantity to make it more reliable and then he reminded me that many binaries have tens of km/s semi-amplitudes! Duh. The signal is super-strong. This is a great #GaiaSprint project!

Christina Eilers (MPIA) and I had success today on spectroscopic parallaxes for stars at the top of the red-giant branch: We are now able to predict absolute luminosities (and therefore parallaxes) with almost 10-percent accuracy! That makes them only slightly worse than red-clump stars, and we think there is more information to exploit in the data. Our method is a bit hacky: We are still using spectroscopic quantities from the APOGEE pipelines and not just the spectra themselves, but it should point the way to a cleaner method soon.

Stars Meeting at Flatiron was a great success. One exciting project in progress is that Ruth Angus (Columbia) is finding relationships between exoplanet occurrence and host-star orbital actions! Now the causal part: Is this because of age or abundances or dynamical interactions? Another is that Ben Montet (Chicago) proposed that we find non-transiting hot-jupiter exoplanets by looking at surface rotation: There is at least weak evidence that stars with hot jupiters spin faster—or appear to. That's exciting as another possible indirect planet-detection technique.


Gaia sprinting

Today was the first day of what's looking like it's going to be a very full week! Andy Casey (Monash) arrived in town, to work on binary stars and self-calibration of stellar parameter pipelines. Natalie Hinkel (Vanderbilt) showed up, to also work on self-calibration, in the context of her Hypatia Catalog. Christina Eilers (MPIA) showed up to work on data-driven approaches to estimating parallaxes from spectroscopy, which is a project we are re-booting from last summer. Alex Malz (NYU) dropped by to discuss issues with inferring redshift distributions from biased photometric redshift measurements. Kate Storey-Fisher (NYU) came to talk about mock catalogs for large-scale structure. And I continued conversations with other group members about their #GaiaSprint projects for next week. By the end of the day, Eilers and I were getting parallax predictions at the 20-ish-percent level, but we need to do a factor of two better!


gravitational clustering, gravitational interferometry

Today Michael Joyce (LPNHE) gave a great talk about analytic and conceptual directions towards understanding nonlinear gravitational growth of structure in the Universe. He focused on the stable-clustering approximation, which dates back to Peebles, is very predictive over a range of scales, and can be used to test simulations. At lunch afterwards, we discussed the great importance of studying gravity analytically, a point made often and well by Roman Scoccimarro (NYU).

Prior to the seminar, Ellie Schwab-Abrams (AMNH) and I discussed self-calibration for pulsar timing arrays, which we think and hope could lead to a new era of gravitational interferometry and enormously increase the sensitivity to long-term gravitational-wave signals. We decided to start by solving the radio-astronomy problem, which has yet to be solved in the literature, because no radio telescope has the problem that the relative velocities of its elements are unknown!