using Gaia correctly; direct imaging

As my loyal reader knows, Christina Eilers (MPIA) and I have been working on a purely linear model for estimating parallax given APOGEE spectroscopy (and Gaia+2MASS+WISE photometry), for very luminous red giants. This model had some pathologies, which caused Hans-Walter Rix (MPIA) to disagree with us about various aspects of implementation. On my vacation, I figured out how to generalize this model to make it a linear predictor for absolute magnitude (or log luminosity or distance modulus) without breaking the nice properties of the model, to wit: We don't do any cuts on the Gaia parallaxes or parallax signal-to-noise, avoiding selection-induced biases. And we use the Gaia likelihood function correctly.
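To make the idea concrete, here is a minimal sketch (not our actual code) of a linear predictor for distance modulus that is fit through a Gaussian likelihood in parallax space, with no cuts on parallax or parallax signal-to-noise. The feature matrix `X`, the simulated numbers, the function names, and the Gauss-Newton optimizer are all assumptions made for illustration.

```python
import numpy as np

def predicted_parallax(theta, X):
    """Parallax (mas) implied by a distance modulus that is linear in the features."""
    mu = X @ theta                     # distance modulus
    return 10.0 ** (2.0 - mu / 5.0)    # d = 10^(mu/5 + 1) pc and parallax = 1000 / d mas

def chi_squared(theta, X, plx_obs, plx_err):
    """-2 ln(likelihood), up to a constant, for Gaussian noise on the parallaxes."""
    resid = (plx_obs - predicted_parallax(theta, X)) / plx_err
    return float(np.sum(resid ** 2))

def fit_theta(X, plx_obs, plx_err, theta0, n_iter=100):
    """Gauss-Newton with step halving (a hypothetical choice of optimizer)."""
    theta = np.asarray(theta0, dtype=float).copy()
    for _ in range(n_iter):
        pred = predicted_parallax(theta, X)
        # Derivative of (pred / err) with respect to theta_j is -(ln 10 / 5) pred X_j / err
        jac = (-(np.log(10.0) / 5.0) * pred / plx_err)[:, None] * X
        resid = (plx_obs - pred) / plx_err
        step, *_ = np.linalg.lstsq(jac, resid, rcond=None)
        chi2_now = chi_squared(theta, X, plx_obs, plx_err)
        scale = 1.0
        while scale > 1e-8:
            trial = theta + scale * step
            if chi_squared(trial, X, plx_obs, plx_err) <= chi2_now:
                theta = trial          # accept only steps that do not increase chi-squared
                break
            scale *= 0.5
    return theta

# Simulated stars (made-up numbers): one constant feature plus one "spectral" feature.
rng = np.random.default_rng(17)
n_stars = 2000
X = np.column_stack([np.ones(n_stars), rng.normal(size=n_stars)])
theta_true = np.array([12.0, 1.0])
plx_err = np.full(n_stars, 0.1)        # mas
plx_obs = predicted_parallax(theta_true, X) + plx_err * rng.normal(size=n_stars)

# Every star is used; noisy parallaxes that scatter negative are kept, not cut.
theta_hat = fit_theta(X, plx_obs, plx_err, theta0=[10.0, 0.0])
```

The point of the design is that the noise model lives in parallax space, where the Gaia uncertainties are (close to) Gaussian, while the model is linear in the quantity we want to predict.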

It worked! We are now predicting distance moduli with a distance precision of better than 10 percent for very luminous giants (more luminous than the red clump). We made some sweet maps of Milky Way disk kinematics (mean velocity and velocity dispersion as a function of location in the disk).

Late in the day, Matthias Samland (MPIA) re-booted our project with Jeroen Bouwman (MPIA) to apply technology like our Kepler CPM calibration to ground-based direct-imaging data from coronagraphs. We reminded ourselves where we are and made very short-term writing goals to write down what we think we are doing.


bad data; crazy model

Today, Jessica Birky (UCSD) found a weird M-type dwarf spectrum in the APOGEE data. We know it is an M dwarf, because Andrew Mann has an optical spectrum of it and physical parameters. But the APOGEE spectrum has features in all the wrong places; it looks nothing like any of the other M dwarfs we have. By the end of the day we started to suspect that it is a redshift issue, where the pipeline is assigning the wrong radial velocity and redshifting the data incorrectly.

I got in a discussion with Adrian Price-Whelan (Princeton) of an insane idea I had (while hiking) about Schwarzschild modeling: We could build a model out of not complete orbits but small orbit segments. We could deliberately make these segments with inconsistent gravitational potentials. And then when we model the data as a sum of segments, it would select the segments that fit the data best locally. Hence: A non-parametric model of the Milky Way potential, built up of locally fitting but globally inconsistent orbit segments! That's interesting. And probably intractable.

Lauren Anderson (Flatiron) showed me some examples of stellar streams where it appears that the Gaia DR2 RR Lyrae lie kinematically in the stream. That is interesting, because the RR Lyrae would deliver distance information for the streams.


angles mixed?

I spent a few days fully off-grid, hiking. During that time I nonetheless thought and dreamed about some astrophysics projects. Not sure if that's healthy! But hey.

In particular, Hans-Walter Rix (MPIA) and I talked out some possible projects with Gaia DR2 that could make good use of action–angle formalism to constrain properties of the Milky Way. For example, we could look at angle uniformity in action boxels. That requires a selection function, but maybe one is forthcoming? For another, we could look at whether angles predict element abundances at fixed actions. If they do, then either the potential is wrong or the populations are kinematically young. And for another, we could look at point symmetries in velocity space (where selection should be simple) of stars in action boxels. Any asymmetries point to dynamically young populations. All projects to discuss with the team on my return to civilization.


bias–variance trade-off, for parallax estimation

It was a very successful day today! Christina Eilers (MPIA) and I performed a set of external validations on our spectroscopic parallax project and it passed them well: our parallax estimates are more precise than Gaia for the red-giant stars we care about, and they seem to be unbiased when we look at the positions of stellar clusters (open and globular). A fight broke out with Hans-Walter Rix (MPIA), who doesn't like that our spectroscopic estimates of parallax sometimes go negative! But we are trying to build something that can be used as a likelihood for distance, so we want it to have the same kind of unbiased properties that the Gaia parallaxes have. That's leading to some friction on the team!

Fundamentally the issue is this: Do you want the best distance estimates you can get? Or do you want a likelihood function that can be multiplied into other likelihoods to obtain better distances given everything you know? If you want the former, then you might take on a lot of bias to get lower variance (more precision). If you want the latter, then you want unbiased likelihood components that can be multiplied together.

Another important distinction is this: Do you want to use many stars in concert to do things like measure the rotation curve or a metallicity gradient? Or do you just want to know an individual star's position? If the former, then you want unbiased likelihood functions that you can combine. If the latter, then you want to take on bias to increase precision.
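Here is a toy illustration of that distinction, with simulated numbers rather than our data. Multiplying Gaussian likelihoods for a shared parallax gives an inverse-variance-weighted mean, and that only works if the individual estimates are unbiased, which means letting them go negative. The "star cluster" below and all of its values are hypothetical.

```python
import numpy as np

def combine_gaussian_likelihoods(means, sigmas):
    """Product of independent Gaussian likelihoods for one shared quantity.

    Returns the mean and width of the (Gaussian) product.
    """
    means = np.asarray(means, dtype=float)
    sigmas = np.asarray(sigmas, dtype=float)
    weights = 1.0 / sigmas ** 2                      # inverse variances
    combined_mean = np.sum(weights * means) / np.sum(weights)
    combined_sigma = 1.0 / np.sqrt(np.sum(weights))
    return combined_mean, combined_sigma

# A distant cluster: every star has the same true parallax, measured noisily.
rng = np.random.default_rng(1)
true_parallax = 0.1                                  # mas (made-up value)
sigma = 0.3                                          # mas, larger than the signal
n_stars = 10000
noisy = true_parallax + sigma * rng.normal(size=n_stars)
sigmas = np.full(n_stars, sigma)

# Unbiased per-star estimates (many are negative) combine to the right answer.
unbiased_mean, unbiased_sigma = combine_gaussian_likelihoods(noisy, sigmas)

# Forcing each estimate to be non-negative makes each one look more sensible,
# but the combination is then biased high.
clipped_mean, _ = combine_gaussian_likelihoods(np.clip(noisy, 0.0, None), sigmas)
```

In this toy, `unbiased_mean` lands close to the true parallax while `clipped_mean` sits well above it, which is the whole argument for tolerating negative parallaxes when the goal is a likelihood.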


Milky Way disk and halo, HabEx, M dwarfs, etc

Ah, back to work again. It is my incredible privilege to work in Heidelberg every summer. Today I spoke with Sara Rezaei Kh (MPIA) and Christina Eilers (MPIA) about projects to use Gaia DR2 to constrain properties of the Milky-Way disk, especially the rotation curve and the dust density as a function of position. That connected to a longer conversation with Lauren Anderson (Flatiron) and Hans-Walter Rix (MPIA) about measuring the properties of stellar populations in boxels of the Milky Way. Boxels in position, or in velocity, or in actions. It also led to some work in which Eilers and I looked at external validation (using open clusters) of our spectroscopic parallaxes.

I also re-started projects on M-type dwarf stars with Jessica Birky (UCSD) who is in HD for the summer. She will write up her results using The Cannon to transfer labels from a small training set fit by Andrew Mann (Columbia) to all of APOGEE if all goes well.

And into town came Daniel Stern (JPL), who gave an incredibly impressive talk about HabEx, the NASA mission concept for the next decadal survey. It is an ambitious mission, but strongly cost controlled. If it is paired with a starshade (an idea I love), it could do amazing exoplanet science. And it really motivates me to get back to thinking about physical optics!

Finally, I spent a couple hours in the back of the room for #StellarHalos18, where I learned about Gaia DR2 projects on the Milky-Way halo. In particular, I learned about the Malhan method for finding streams. It puts high weight on stars with likely co-orbital neighbors, and then uses a by-hand or by-eye step to link them into stream discoveries. Very impressive. Very fast. Very high impact! But a bit too heuristic for my taste; let's automate all the things!


SNe and GRBs

On my first day in Heidelberg, I attended a colloquium talk by Maryam Modjaz (NYU), about exploding stars. She has very nice results on the metallicities of the environments (galaxy hosts) of supernovae and gamma-ray bursts. She can show that the type Ic broad-lined supernovae are different in chemical environments than the type Ic normal supernovae, and she can show that the SNe associated with GRBs are Ic broad-lined. So the non-GRB type Ic broad-lined supernovae are very likely the counterparts of off-axis gamma-ray bursts. The gamma-ray bursts without gamma rays! This is exciting, because it will bolster the model for GRBs and constrain the beaming.


Dr Dun Wang

It was with the greatest pleasure that I participated in the PhD defense today of Dun Wang (NYU), who has been my student these last five years. He has done a remarkable body of work: He has a very good model for the NASA Kepler data, using pixels to predict other pixels. He has a completely novel method for image differencing, where he doesn't need a reference image (and instead uses a time series of images to build a predictive model). And he has a data-driven model for the pointing (as a function of time) and sensitivity map for the last days of the NASA GALEX mission, where the camera was scanned rapidly back and forth across the Galactic Plane.
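For readers who haven't seen the pixels-predict-pixels idea, here is a tiny sketch on simulated data. It is not Wang's model, which is vastly bigger and more careful; the ridge regression, the random-walk systematic, the train-and-test split, and all the names and numbers are assumptions for illustration.

```python
import numpy as np

def fit_pixel_model(predictors, target, ridge=1.0):
    """Ridge regression: linear weights that predict one pixel from other pixels.

    The objective is convex, so the normal equations give the unique optimum.
    """
    lhs = predictors.T @ predictors + ridge * np.eye(predictors.shape[1])
    return np.linalg.solve(lhs, predictors.T @ target)

rng = np.random.default_rng(42)
n_times, n_pixels = 500, 20

# A shared spacecraft systematic (here a random walk) that touches every pixel.
systematics = np.cumsum(rng.normal(size=n_times))
gains = rng.uniform(0.5, 1.5, size=n_pixels)
predictors = np.outer(systematics, gains) + 0.1 * rng.normal(size=(n_times, n_pixels))
target = 0.8 * systematics + 0.1 * rng.normal(size=n_times)

# Train on one part of the time series and predict the held-out part, so the
# model cannot simply memorize the target pixel.
train = np.arange(n_times) < 400
weights = fit_pixel_model(predictors[train], target[train])
residual = target[~train] - predictors[~train] @ weights
```

The residual is what is left of the target pixel after the other pixels have explained the shared systematics; anything astrophysical that is unique to the target pixel survives in it.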

I have many things to say about this work, but here are just a few: Wang's work encouraged me to think about extremely big models! I think his model of the Kepler data has more free parameters than any model of anything, ever (literally close to a trillion). Gotta love convexity! He used his image differencing to discover completely new microlensing events in the K2 Campaign 9 data. He has the first ever ultraviolet maps of the Milky Way disk plane at this depth and resolution. It is a very impressive body of work.

Congratulations Dr Wang. And thank you!


preparing for defense

My only real research today was a session with Dun Wang (NYU) in preparation for his PhD defense. I encouraged him to talk about less, not more: The defense (at least here) doesn't need to be about everything (that would take hours, anyway); it should be about what you learned, and what was most fun.


non-parametric celestial mechanics

Foreman-Mackey (Flatiron) showed me something interesting today: He is doing full sampling of one-planet and two-planet transiting systems in Kepler but using models that have many more than one or two planets. Where many is like four! But he is learning some very interesting things: One is that you can do this, as long as you let planet radii go to zero. Another is that the ease of sampling depends strongly on the eccentricity prior. That isn't surprising in retrospect.

One of the motivations of this, I think, is to get away from computing Bayesian evidence between different multiplicity models: After all, to compute these ratios, you have to sample a large N, so why not just do that one N once and treat the problem as a parameter-estimation problem, rather than an evidence problem? That's dear to my heart. Another angle that I'm interested in is the following: We know that our own Solar System in fact has thousands to millions of planets; can we deal with it in a more non-parametric way?

Crazy idea: Send N to infinity and fix the planet periods (say) and then see if you can sample in the other orbital parameters. Right now that doesn't seem feasible, but it might be the truly non-parametric approach.


#wetton18, day 3

Today was the hack day associated with #wetton18. It was a great day! I had an incredibly limited goal: As I mentioned, I learned yesterday that some Gaia Bp Rp spectra (the low-resolution spectrophotometry) have been released with the Gaia transient alerts. They are uncalibrated, and possibly heavily affected by systematics, but there are many thousands of them! So my goal was to just plot some of these spectra.

What a success this was! Once I realized (and announced) that the project involves scraping data from web pages, Brigitta Sipocz (Cambridge) immediately volunteered to help. She (incredibly quickly) built a tool that scrapes the raw Gaia data from the alerts pages, refactors it into a correctly formatted astropy table, and writes it out as a FITS file, structured so that multiple scrapings can be concatenated into a larger table.

We made the visualization shown in this tweet. That shows a blue star that is fading. Because it is a relatively normal star (that is, not a supernova), maybe we could use it, and others like it, to build some kind of model of the Bp Rp spectra. Our code is here.


#wetton18, day 2

The Wetton Workshop opened today with amazing talks by Udalski and Wyrzykowski about the OGLE project and data. It is truly incredible what has been achieved in this survey, which was designed with a very forward-looking goal of detecting microlensing by compact objects in the dark sector. The project detected all kinds of other expected and unexpected time-domain phenomena. These talks were followed by Alexander Scholz (St Andrews) providing some philosophical basis for looking for and at anomalies in data streams. He gave the good advice (and OGLE is a great example of this) to look at timescales or wavelengths or precisions where no-one has looked before. Hear, hear! (He is also the lead of the WETI project, of which I am a big fan.)

There were too many things that I loved today; I can't list them all here! But one personal highlight was an exciting talk by Thomas Wevers about the Gaia alerts system, which is putting Gaia data on-line in real time when stars vary strongly, or when new sources appear on the sky. It produces a few alerts a day, and the data dump includes the epoch photometry and the raw Bp-Rp low-resolution spectra! This got me extremely excited: I haven't seen any Bp-Rp spectra yet, and there are now thousands online. I resolved to look at them asap. Wevers warned us that the spectra are not calibrated in any sense: Not in wavelength and not photometrically.


#wetton18, day 1

Today was the first day of the Wetton Workshop at Oxford. There were many interesting talks from all over the map, but with a goal of understanding how we make sure that we stay open to unexpected discoveries, even as we make more and more targeted data sets and experiments. One theme that emerged is that of systematics: As you push data harder and harder (in cosmology or exoplanet search or anything else) you become more and more sensitive to the details of your hardware and electronics and selection and so on. This led to a discussion of end-to-end simulation of data sets to understand how hardware issues enter and to see if we understand the hardware.

That's important! But I think there is an equally important aspect to this: If we don't take our data with sufficient heterogeneity, we can't learn certain things. For example, if you take all LSST exposures at 15 seconds, you never test the shutter, never test linearity of the detector, never find out on what time scales the PSF is changing, and so on. For another, if you take all the Euclid imaging survey on a regular grid, you never get cross-calibration information from one part of the detector to another, nor can you find certain kinds of anisotropies in the detector or the point-spread function. If we are going to saturate the bounds, we are going to need to take science data in many, many configurations.

Here are the slides from the public talk I gave at the end of the day. Note my digs at press-release artists' conceptions. I think we should be honest about what we do and don't know!


public-talk slides

I traveled to Oxford today for #wetton18. As part of this meeting, I am giving a public talk in Oxford. I spent the time on the plane I should have spent sleeping making slides. I think the hard thing about a public talk is always level; there is always a diverse audience with very different interests and backgrounds. The talk I made—about finding planets—requires some sophisticated reasoning to make sense. I fear that I have bitten off too much.

One thing I will definitely put into the talk is some material about the limitations of our knowledge in astrophysics. It comes from the point that many things can't be independently confirmed, especially when they are at the limits of our observing capabilities. It is a bit hard to present this without sounding like we don't believe anything. That's a challenge.


Dr Pearson

It was my great pleasure to sit on the PhD defense committee for the successful defense of Sarah Pearson. She wrote a thesis about low-mass galaxies and globular clusters, considering both their interactions with each other, and with the bigger galaxies into which they later fall. She has some nice analyses of the Palomar 5 tidal stream, and what its morphology might tell us about the Milky Way halo and bar. And also nice results on gas bridges and streams around pairs of dwarf galaxies.

I was most interested in her stellar-stream results, including several things I hadn't thought about before: One is that prograde streams are more affected by the bar and spiral arms in the disk than retrograde streams. Another is that we might be able to find globular-cluster streams around other galaxies nearby. That would be incredible! And since (as she showed) you can learn a lot about a galaxy just from the shape of a stream, we might not need to do much more than detect streams around other galaxies to learn a lot. It was a pleasure to serve on the committee, and it is a beautiful body of work.