#DtU17, day two

Today I dropped in on Detecting the Unexpected in Baltimore, to provide a last-minute talk replacement. In the question period of my talk, Tom Loredo (Cornell) got us talking about precision vs accuracy. My position is a hard one: We never have ground truth about things like chemical abundances of stars; every chemical abundance is a latent variable; there is no external information we can use to determine whether our abundance measurements are really accurate. My view is that a model is accurate only inasmuch as it makes correct predictions about qualitatively different data. So we are left with only precision for many of our questions of greatest interest. More on this in some longer form, later.

Highlights (for me; very subjective) of the day's talks were stories about citizen science. Chris Lintott (Oxford) told us about tremendous lessons learned from years of Zooniverse, and the non-trivial connections between how you structure a project and how engaged users will become. He also talked about a long-term vision for partnering machine learning and human actors. He answered very thoughtfully a question about the ethical aspects of crowd-sourcing. Brooke Simmons (UCSD) showed us how easy it is to set up a crowd-sourcing project on Zooniverse; they have built an amazingly simple interface and toolkit. Steven Silverberg (Oklahoma) told us about Disk Detective and Julie Banfield (ANU) told us about Radio Galaxy Zoo. They both have amazing super-users, who have contributed to published papers. In the latter project, they have discovered (somewhat serendipitously) the largest radio galaxy ever found! One take-away from my perspective is that essentially all of the discoveries of the Unexpected have happened in the forums, in the deep social-interaction parts of the citizen-science sites.


galaxy masses; text as data

After a morning working on terminology and notation for the color–magnitude diagram model paper with Lauren Anderson (Flatiron), I went to two seminars. The first was Jeremy Tinker (NYU) talking about the relationship between galaxy stellar mass and dark-matter halo mass as revealed by fitting of number-count and clustering data in large-scale structure simulations. He finds that only models with extremely small scatter (less—maybe far less—than 0.18 dex) are consistent with the data, and that the result is borne out by follow-ups with galaxy–galaxy lensing and other tests. This is very hard to understand within any realistic model for how galaxies form, and constitutes a new puzzle for standard cosmology plus gastrophysics.

In the afternoon there was a very wide-ranging talk by Mark Dredze (JHU) on data-science methods for social science, intervention in health issues, and language encoding. He is interested in taking topic models and either deepening them (to make better features) or else enriching their probabilistic structure. It is all very promising, though these subjects are, despite their extreme mathematical sophistication, in their infancy.


one paragraph per day

[I have been on vacation for a week.]

All I have done in the last week is (fail to) keep up with email (apologies y'all) and write one paragraph per day in the nascent paper with Lauren Anderson (Flatiron) about our data-driven model of the color–magnitude diagram. The challenge is to figure out what to emphasize: the fact that we de-noise the parallaxes, or the fact that we can extend geometric parallaxes to more distant stars, or the fact that we don't need stellar models?


57 elements; research meetings

Today the astro seminar was given by Or Graur (CfA). He spoke about various discoveries he and collaborators have made in type Ia supernovae. For me, the most exciting was the discovery of atomic-mass-57 elements, which he can find by looking at the late-time decay: In the same way that we identify the mass-56 elements by timing supernova decays at intermediate times, he finds the mass-57 elements. The difference is that they show up at much later times (decay times of order years). He pointed out a caveat, which is that the late-time light curve can also be affected by unresolved light echoes. That's interesting and got me thinking (once again) about all the science related to light echoes that might be under the radar right now.
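As a toy illustration of why the mass-57 chain only shows up late (this is not Graur's analysis), here is a minimal sketch of a light curve built from two exponential decays. The half-lives are the laboratory values for 56Co (about 77.2 days) and 57Co (about 271.7 days); the amplitudes `amp56` and `amp57` are hypothetical, and the sketch ignores energy trapping, other decay chains, and light echoes.

```python
import numpy as np

# Laboratory half-lives in days.
HALF_LIFE_CO56 = 77.2
HALF_LIFE_CO57 = 271.7


def late_time_luminosity(t_days, amp56=1.0, amp57=0.02):
    """Toy light curve (arbitrary units): sum of two exponential decays.

    amp56 and amp57 are hypothetical amplitudes, not fitted values.
    Returns (total, mass-56 component, mass-57 component).
    """
    t = np.asarray(t_days, dtype=float)
    l56 = amp56 * 0.5 ** (t / HALF_LIFE_CO56)
    l57 = amp57 * 0.5 ** (t / HALF_LIFE_CO57)
    return l56 + l57, l56, l57


def crossover_time(amp56=1.0, amp57=0.02):
    """Time (days) at which the two components contribute equally.

    Solves amp56 * 2**(-t/T56) = amp57 * 2**(-t/T57) for t.
    """
    rate_diff = 1.0 / HALF_LIFE_CO56 - 1.0 / HALF_LIFE_CO57
    return np.log2(amp56 / amp57) / rate_diff


t_cross = crossover_time()
total, l56, l57 = late_time_luminosity(t_cross)
```

After `t_cross` the slower mass-57 decay dominates the toy light curve, so the decline rate flattens; that change in slope is the kind of signature a timing argument relies on.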

I hosted today my first-ever undergraduate research meeting. I got together undergraduates and pre-PhD students who are interested in doing research, and we discussed the Kepler and APOGEE data. My plan (and remember, I like to fail fast) is to have them work together on overlapping projects, so they all have coding partners but also their own projects. With regular meetings, it can fit into schedules and become something like a class!


stellar twins and stellar age indicators

In the stars group meeting at CCA, Keith Hawkins (Columbia) blew us away with examples of stellar twins, identified with HARPS spectra. They were chosen to have identical derived spectroscopic parameters in three or four labels, but were amazingly identical at signal-to-noise of hundreds. He then showed us some he found in the APOGEE data, using very blunt tools to identify twins. This led to a long discussion of what we could do with twins, and things we expect to find in the data, especially regarding failures of spectroscopic twins to be identical in other respects, and failures of twins identified through means other than spectroscopic to be identical spectroscopically. Lots to do!

This was followed by Ruth Angus (Columbia) walking us through all the age-dating methods we have found for stars. The crowd was pretty unimpressed with many of our age indicators! But they agreed that we should take a self-calibration approach to assemble them and cross-calibrate them. It also interestingly connects to the twins discussion that preceded. Angus and I followed the meeting with a more detailed discussion about our plans, in part so that she can present them in a talk in her near future.


abundance dimensionality, optimized photometric estimators

Kathryn Johnston (Columbia) organized a Local-Group meeting of locals, or a local group of Local Group researchers. There were various discussions of things going on in the neighborhood. Natalie Price-Jones (Toronto) started up a lot of discussion with her work on the dimensionality of chemical-abundance space, working purely with the APOGEE spectral data. That is, they are inferring the dimensionality without explicitly measuring chemical abundances or interpreting the spectra at all. Much of the questioning centered on how they know that the diversity they see is purely or primarily chemical rather than, say, instrumental or stellar nuisances.
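The post does not say which method Price-Jones uses, so the following is only an illustrative stand-in: a minimal sketch that estimates the dimensionality of a set of spectra with principal component analysis, without ever measuring an abundance. The synthetic data, the three latent factors, and the 1 percent variance threshold are all assumptions made for the example.

```python
import numpy as np


def effective_dimension(spectra, min_fraction=0.01):
    """Count principal components that each explain more than
    min_fraction of the total variance (a hypothetical threshold)."""
    centered = spectra - spectra.mean(axis=0)
    singular_values = np.linalg.svd(centered, compute_uv=False)
    fractions = singular_values**2 / np.sum(singular_values**2)
    return int(np.sum(fractions > min_fraction)), fractions


# Synthetic "spectra": 3 latent factors mixed into 100 pixels, plus noise.
rng = np.random.default_rng(0)
n_stars, n_pix, n_latent = 500, 100, 3
latents = rng.standard_normal((n_stars, n_latent))
basis = rng.standard_normal((n_latent, n_pix))
spectra = latents @ basis + 0.05 * rng.standard_normal((n_stars, n_pix))

n_dim, fractions = effective_dimension(spectra)
```

Note that a count like this cannot tell chemical variance from instrumental or stellar-nuisance variance; any direction with enough variance is counted, which is exactly the concern raised in the questioning.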

At lunch time there were amusing things said at the Columbia Astro Dept Pizza Lunch. One was a very nice presentation by Benjamin Pope (Oxford) about how to do precise photometry of saturated stars in the Kepler data. He has developed a method that fully scoops me in one of my unfinished projects: The OWL, in which the pixel weights used in his soft-aperture photometry are found through the optimization of a (very clever, in Pope's case) convex objective function. After the Lunch, we discussed a huge space of generalizations, some in the direction of more complex (but still convex) objectives, and others in the direction of train-and-test to ameliorate over-fitting.
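To make the idea of "pixel weights from a convex objective" concrete, here is a minimal sketch with a deliberately simple stand-in objective: minimize the variance of the weighted light curve subject to the weights summing to one. This is not Pope's objective (the post does not specify it); the quadratic objective, the `ridge` term, and the synthetic three-pixel data are all assumptions.

```python
import numpy as np


def min_variance_weights(pixels, ridge=1e-8):
    """pixels: (n_times, n_pixels) array of pixel time series.

    Returns weights w with sum(w) == 1 that minimize var(pixels @ w).
    The problem is a convex quadratic program with a closed-form solution:
    w is proportional to C^{-1} 1, with C the pixel covariance matrix.
    """
    centered = pixels - pixels.mean(axis=0)
    n_pix = centered.shape[1]
    cov = centered.T @ centered / centered.shape[0] + ridge * np.eye(n_pix)
    ones = np.ones(n_pix)
    unnormalized = np.linalg.solve(cov, ones)
    return unnormalized / (ones @ unnormalized)


# Synthetic data: flux "sloshes" between pixels 0 and 1; pixel 2 is just noisy.
rng = np.random.default_rng(1)
n_times = 400
slosh = 0.1 * rng.standard_normal(n_times)
pixels = np.column_stack([1.0 + slosh, 1.0 - 0.5 * slosh, np.ones(n_times)])
pixels += 0.01 * rng.standard_normal(pixels.shape)

weights = min_variance_weights(pixels)
uniform = np.full(pixels.shape[1], 1.0 / pixels.shape[1])
var_opt = np.var(pixels @ weights)
var_uniform = np.var(pixels @ uniform)
```

A variance objective like this would also suppress real stellar variability, which is one reason a cleverer objective matters. The train-and-test generalization would fit the weights on one subset of cadences and judge them on a held-out subset.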


JWST opportunity

Benjamin Pope (Oxford) arrived in New York today for a few days of visit, to discuss projects of mutual interest, with the hope of starting collaborations that will continue in his (upcoming) postdoc years. One thing we discussed was the JWST Early Release Science proposal call. The idea is to ask for observations that would be immediately scientifically valuable, but also create good archival opportunities for other researchers, and also help the JWST community figure out the best ways to use the spacecraft in its (necessarily) limited lifetime. I am kicking around four ideas, one of which is about photometric redshifts, one of which is about precise time-domain photometry, one of which is about exoplanet transit spectroscopy, and one of which is about crowded-field photometry. The challenge we face is: Although there is tons of time to write a proposal, letters of intent are required in just a few weeks!


nucleosynthesis and stellar ages

Benoit Coté (Victoria & MSU) came to NYU for the day. He gave a great talk about nucleosynthetic models for the origin of the elements. He is building a full pipeline from raw nuclear physics through to cosmological simulations of structure formation, to get it all right. There were many interesting aspects to his talk and our discussions afterwards. One was about the i-process, intermediate between r and s. Another was about how r-process elements (like Eu) put very strong constraints on the rate at which stars form within their gas. Another was about how we have to combine nucleosynthetic chemistry observations with other kinds of observations (of, say, the PDMF, and neutron-star binaries, and so on) to really get a reliable and true picture of the nucleosynthetic story.

Late in the afternoon, I met with Ruth Angus (Columbia) to further discuss our project on cross-calibrating (or really, self-calibrating) all stellar age indicators. We wrote down some probability expressions, sketched a rough design for the code, and discussed how we might structure a Gibbs sampler for this model, which is inherently hierarchical. We also drew a cool chalk-board graphical model (in this tweet), which has overlapping plates, which I am not sure is permitted in PGMs?
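For readers who have not seen one, here is a minimal sketch of a Gibbs sampler for a toy hierarchical self-calibration problem. It is not the model discussed at the chalk-board; everything here is an assumption made for illustration: each star n has a latent (log) age a[n] drawn from N(mu, tau^2), each indicator k reports y[n, k] ~ N(a[n] + b[k], sigma[k]^2) with known sigma and tau, the offset b[0] is pinned to zero to fix the zero point, and mu and b[1:] get flat priors.

```python
import numpy as np


def gibbs_age_calibration(y, sigma, tau, n_iter=2000, seed=0):
    """Gibbs sampler for the toy model described in the lead-in.

    y: (n_stars, n_indicators) noisy age estimates.
    sigma: (n_indicators,) known noise level of each indicator.
    tau: known scatter of the true-age population.
    Returns chains for the offsets b and the population mean mu.
    """
    rng = np.random.default_rng(seed)
    n_stars, n_ind = y.shape
    ivar = 1.0 / sigma**2
    a = y.mean(axis=1)
    b = np.zeros(n_ind)
    mu = a.mean()
    chain_b = np.empty((n_iter, n_ind))
    chain_mu = np.empty(n_iter)
    for i in range(n_iter):
        # Latent ages given offsets and population mean (conjugate Gaussian).
        precision = ivar.sum() + 1.0 / tau**2
        mean = ((y - b) @ ivar + mu / tau**2) / precision
        a = mean + rng.standard_normal(n_stars) / np.sqrt(precision)
        # Offsets given ages; b[0] stays fixed at zero.
        resid_mean = (y - a[:, None]).mean(axis=0)
        noise = rng.standard_normal(n_ind - 1)
        b[1:] = resid_mean[1:] + noise * sigma[1:] / np.sqrt(n_stars)
        # Population mean given ages.
        mu = a.mean() + rng.standard_normal() * tau / np.sqrt(n_stars)
        chain_b[i] = b
        chain_mu[i] = mu
    return chain_b, chain_mu


# Synthetic test: three indicators with different offsets and noise levels.
rng = np.random.default_rng(42)
true_b = np.array([0.0, 0.5, -0.3])
sigma = np.array([0.2, 0.3, 0.4])
a_true = rng.normal(1.0, 0.5, size=200)
y = a_true[:, None] + true_b + sigma * rng.standard_normal((200, 3))

chain_b, chain_mu = gibbs_age_calibration(y, sigma, tau=0.5)
b_est = chain_b[500:].mean(axis=0)  # discard the first 500 draws as burn-in
```

Every conditional here is Gaussian, which is what makes Gibbs sampling convenient; a realistic version with non-Gaussian indicators or unknown noise levels would need other update steps.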


making our own Gaia pipeline

My writing today was in the introduction to the paper Lauren Anderson (Flatiron) and I are writing about the color–magnitude diagram and statistical shrinkage in the Gaia TGAS–2MASS overlap. My view is that the idea behind the project is the same as the fundamental idea behind the Gaia Mission: The astrometry data (the parallaxes) give distances to the nearby stars; these are used to calibrate spectrophotometric models, which deliver distances for the (far more numerous) distant stars. Our goal is to show that this can be done without any involvement of stellar physics or physical models of stellar structure, evolution, or photospheres.
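The word "shrinkage" can be illustrated with the simplest possible case: combining a noisy parallax measurement with a Gaussian prior. In the real project the prior would come from the data-driven color–magnitude diagram model and need not be Gaussian; the function name, the Gaussian prior, and the numbers below are all hypothetical.

```python
import numpy as np


def shrink_parallax(varpi_obs, sigma_obs, varpi_prior, sigma_prior):
    """Posterior mean and standard deviation for a Gaussian likelihood
    times a Gaussian prior (inverse-variance weighting)."""
    ivar = 1.0 / sigma_obs**2 + 1.0 / sigma_prior**2
    mean = (varpi_obs / sigma_obs**2 + varpi_prior / sigma_prior**2) / ivar
    return mean, 1.0 / np.sqrt(ivar)


# Hypothetical star: measured 2.0 +/- 0.3 mas, model prior 2.6 +/- 0.3 mas.
post_mean, post_sigma = shrink_parallax(2.0, 0.3, 2.6, 0.3)
```

The posterior uncertainty is always smaller than either input uncertainty, which is the sense in which the parallaxes are de-noised.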


velocities for APOGEE stars

At stars group meeting, run by Lauren Anderson (Flatiron), new graduate student Jason Cao (NYU) showed us his work on measuring radial velocities for individual-visit APOGEE spectra. He has a method for determining the radial velocity that does not involve interpolating either the data or the model. During his presentation, Jo Bovy (Toronto) pointed out that, actually, the APOGEE team appears to do an interpolation of the data after the one-d spectral extraction. That's unfortunate! But anyway, we have a method that doesn't involve any interpolation which could be used on a survey that doesn't ever do interpolation before or after extraction. And yes, you can extract a spectrum on any wavelength grid you like, from any two-d data you like, without doing interpolation! The group-meeting attendees had many constructive comments for Cao.
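One way to avoid interpolation entirely is to have a model that can be evaluated at arbitrary wavelengths, and to evaluate it at the Doppler-shifted wavelengths of the actual data pixels. The sketch below does that with a single analytic absorption line; it is not Cao's method (the post gives no details). The line parameters, the non-relativistic Doppler formula, and the brute-force velocity grid are all assumptions.

```python
import numpy as np

C_KMS = 299792.458  # speed of light in km/s


def line_model(lam_rest, lam0=16000.0, depth=0.5, width=0.5):
    """Hypothetical continuum-normalized Gaussian absorption line (Angstroms)."""
    return 1.0 - depth * np.exp(-0.5 * ((lam_rest - lam0) / width) ** 2)


def chi2_of_velocity(v_kms, lam_obs, flux, ivar):
    """Evaluate the model at the rest wavelengths of the observed pixels.

    Neither the data nor the model is resampled onto another grid.
    """
    lam_rest = lam_obs / (1.0 + v_kms / C_KMS)
    return np.sum(ivar * (flux - line_model(lam_rest)) ** 2)


def best_velocity(lam_obs, flux, ivar, v_grid):
    """Brute-force chi-squared minimum over a grid of trial velocities."""
    chi2 = np.array([chi2_of_velocity(v, lam_obs, flux, ivar) for v in v_grid])
    return v_grid[np.argmin(chi2)]


# Synthetic single-visit spectrum with a true velocity of 12 km/s.
rng = np.random.default_rng(3)
v_true, noise = 12.0, 0.01
lam_obs = np.linspace(15990.0, 16010.0, 200)
flux = line_model(lam_obs / (1.0 + v_true / C_KMS))
flux = flux + noise * rng.standard_normal(lam_obs.size)
ivar = np.full(lam_obs.size, 1.0 / noise**2)

v_grid = np.arange(-50.0, 50.25, 0.25)
v_best = best_velocity(lam_obs, flux, ivar, v_grid)
```

With a model tabulated on a grid rather than given analytically, some other device would be needed to evaluate it between grid points, so this sketch only covers the easy case.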


tuesday lunch, fundamental physics

I spent the day at Princeton, hosted by Scott Tremaine (IAS). Tuesday lunch is still alive and well in Princeton, though I was shocked to find it happening in the Princeton Physics Department's Jadwin Hall. One beautiful result shown at the lunch was presented by Kento Masuda (Princeton), looking at hot exoplanets with eccentric outer companions. He has two examples that show dramatic transit timing and duration change events, presumably caused by a conjunction near the outer planet's periastron. The data are incredible and he generates a very informative (think: narrow) posterior on the outer planet's properties, despite the fact that the outer planet is not directly observed at all (and has a many-year period).

I spent most of my research time with Price-Whelan (Princeton) and Tremaine, discussing projects on the go. We spent a lot of time talking about whether it will be possible to learn fundamental things about the dark matter by building dynamical models of the stellar motions in the Milky Way. Tremaine came up with lots of reasons to be skeptical! However, if the dark matter doesn't annihilate (and even whether or not it is found in an underground lab), dynamics will be our only real tool. So I am confused. To me, it is much more interesting to model the dynamics of the Milky Way if it will tell us what the dark matter is than if it will tell us nothing more than some details about our contingent collapse and assembly history within a generic dark-matter scenario.

Getting even more philosophical, Tremaine and I discussed the question: What astronomy projects are purely descriptive of the "weather" of the Universe, and what projects get at fundamental physical processes? Even stronger: What astronomy projects might lead to changes to our beliefs about the fundamental physics itself? And how important is that, anyway? Revealing our prejudices, we both wanted to say that the most important areas of astronomy are those that might lead to changes in our beliefs about fundamental physics. But then we both wanted to say that exoplanet science is super-interesting! How to resolve this? Or is there a conflict?


#GaiaMission selection functions?

The only research today was discussion of projects with Daniela Huppenkothen (NYU), Lauren Anderson (Flatiron), and Jo Bovy (Toronto). One subject of conversation was the need for selection functions in analyzing Gaia data, both now and in the future. Bovy is working on a selection function for Gaia DR1 TGAS and we discussed how we might generate a selection function for the final Gaia data release. I have a plan, but it involves making a simulated Gaia mission to get it started.


#JudyFest, day 3

Today was the third and final day of The Galactic Renaissance. Rosie Wyse (JHU) and Branimir Sesar (MPIA) both showed evidence for vertical ripples going outwards in the Milky Way disk. These could plausibly be raised by an encounter with Sagittarius or something similar. However, Sesar argued that the amplitude is too large to be anything reasonable in the Local Group. That suggests that maybe the evidence isn't secure?

Raja GuhaThakurta (UCSC) mentioned the argument that the halo is worth observing because you can see the accretion history, at least in principle. There were talks after his by Sales (UCR), Lee (Vanderbilt), and Bonaca (CfA) on the observed and simulated properties of our halo.

Phil Hopkins (Caltech) and Yves Revaz (EPFL) gave impressive galaxy simulation results. Hopkins's renderings are just the bomb, and we discussed them in some detail afterwards. Hopkins claimed that low-mass galaxies (at least star-forming ones) are always so far out of steady-state, you can never measure their masses using virial or other steady-state indicators. He also brought up the point that the dust in the ISM has different dynamics than the molecular gas, and therefore there might be insane separation of material as stars form. I also discussed that with him afterwards.


#JudyFest, day 2

Today was the second day of The Galactic Renaissance. Two scientific themes of the day were globular-cluster star abundance patterns, and stellar models that account for 3-D and non-local-thermodynamic-equilibrium (NLTE) effects. On the former, it was even suggested by one speaker that the existence of chemical-abundance variations of certain kinds might be part of the definition of a globular cluster! There are some extreme cases, and various claims that the most extreme examples might be the stripped centers of ancient accreted galaxies!

On the stellar modeling front, there were impressive demonstrations from Frebel (MIT), Bergemann (MPIA), and Thygesen (Caltech) that improving the realism of the physical inputs to stellar models improves their precision and their accuracy. Thygesen did a very nice thing of using (relatively cheap) 1-D models to inform functional forms for interpolation across grid points of a (relatively expensive) 3-D model grid. That got me interested in thinking about physics-motivated or physics-constrained interpolation methods, which could have value in lots of domains.
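One simple reading of "let the cheap model inform the interpolation" is to interpolate only the difference between the expensive and cheap models across the coarse grid, and add back the cheap model evaluated exactly at the target point. The sketch below shows that with two made-up functions standing in for the 1-D and 3-D models; it is not Thygesen's method, and both functions and the grid are hypothetical.

```python
import numpy as np


def cheap_model(x):
    """Hypothetical stand-in for a cheap 1-D model, available everywhere."""
    return np.sin(3.0 * x)


def expensive_model(x):
    """Hypothetical stand-in for an expensive 3-D model, known only on a grid."""
    return np.sin(3.0 * x) + 0.1 * x**2


def corrected_interp(x, grid, expensive_on_grid):
    """Linearly interpolate the (expensive - cheap) difference across the
    grid, then add the cheap model evaluated at x."""
    delta = expensive_on_grid - cheap_model(grid)
    return cheap_model(x) + np.interp(x, grid, delta)


grid = np.linspace(0.0, 3.0, 4)       # coarse grid of expensive models
x = np.linspace(0.0, 3.0, 301)        # points where predictions are wanted
truth = expensive_model(x)
on_grid = expensive_model(grid)

err_direct = np.max(np.abs(np.interp(x, grid, on_grid) - truth))
err_corrected = np.max(np.abs(corrected_interp(x, grid, on_grid) - truth))
```

The approach helps whenever the difference between the two models varies more smoothly across the grid than the expensive model itself does, which is true by construction in this toy.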

In a session about Judy's scientific and intellectual life, Steve Shectman (OCIW) described what the world was like in 1967, when Judy Cohen (Caltech) started graduate school. It was a time of optimism, disruption, and violence. This resonated with things I know about Cohen, because she and I used to discuss the historical context of her origins as an astronomer back when I was a graduate student.

Another highlight of the day was a discussion with Kim Venn (Victoria) and Matt Shetrone (Texas) about persistence effects that damage a significant fraction of spectra in a significant fraction of APOGEE exposures. We discussed the trade-offs between correction and avoidance, and what it might take to fix the problem.

Over dinner, I and others delivered tributes to Judy Cohen. She really has had an amazing scientific impact, and also been a wonderful person, and had a big influence on me. She also said nice things about me in her own speech!


#JudyFest, day 1

Today was the first day of the meeting The Galactic Renaissance, a meeting in honor of Judy Cohen (Caltech), who was one of my (three) PhD advisors (with Blandford and Neugebauer). On the plane to the meeting I built a brand-new talk about data-driven models of stars, bringing in stuff we are doing in HARPS and Gaia and connecting it to what we are doing with The Cannon.

One highlight of the meeting was Steve Shectman (OCIW) talking about a new infrared multi-object spectrograph he is designing for Magellan. He talked about some interesting instrument design considerations, which was fitting, because Judy Cohen built (with Bev Oke and a great team) the most highly used instrument on the Keck Telescopes (the LRIS spectrograph, which I used in my PhD work). One point is that all spectrographs are fundamentally trade-offs between spatial and spectral extent, because the total number of pixels is limited. He noted that spectrograph cost and weight are strong functions of the diameter of the collimated beam, which is simultaneously obvious and non-trivial. Finally, he noted that putting an imaging mode into a multi-object spectrograph substantially increases the cost and complexity: It requires that there not be chromatic optics, which imagers hate but spectrographs don't mind at all!

Another highlight was a talk about Solar twins by Jorge Melendez (São Paulo). By using carefully chosen twins, he can measure abundances to better than 0.01 dex. He showed some great data. But even more absurd, he is looking at binary stars, both members of which are themselves solar twins! But then if that isn't absurd enough, he also has binary stars, both members of which are themselves solar twins, and one of which has an exoplanet! Awesome. He mentioned that [Y/Mg] is a (possibly complicated) age indicator, which is relevant to things Ruth Angus (Columbia) and I have been thinking about.