In a short research day, I had a very useful conversation about robot fiber positioners for multi-object spectrographs with Peter Mao (Caltech), who is working on the Prime Focus Spectrograph for Subaru. He gave me a sense of the cost scale, the human effort scale, and the technical precision of such systems. This is critical information for the Letter of Intent that I and various others are putting in for the use of the SDSS hardware after the end of the current survey, SDSS-IV. We would like to go really big with the two APOGEE spectrographs, but if we want to do really large numbers of stars (think: millions or even tens of millions) we need to have robots place the fibers.
Today Boris Leistedt and Michael Troxel (Manchester) came to Simons to hack on a proposal to change the latter years of DES observing strategy. Their argument is that a small amount of u-band imaging (currently DES does none) could have a huge impact on photometric redshifts (particularly bias), which, in turn, could have a huge impact on the accuracy of the convergence mapping and large-scale structure constraints. They spent the day doing complete end-to-end simulations of observing, photometry, data analysis, and parameter estimation. I shouldn't really blog this, because it isn't my research, but it is very impressive!
On the side, Leistedt and I checked in on our project to build a generative model of galaxy photometry, in which the full family of possible spectral energy distributions would be latent variables. Leistedt had a great breakthrough: If the SEDs are drawn from a Gaussian process, then the observables are also drawn from a Gaussian process, because projection onto redshifted bandpasses is a linear operation! He has code that implements this and some toy problems that seem to work, so I am cautiously optimistic.
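As a toy illustration of Leistedt's point (this is not his code): if the latent SED is a draw from a Gaussian process, then any set of bandpass-weighted integrals of it is jointly Gaussian, with a covariance obtained by pushing the kernel through the same linear weights. Everything here—the squared-exponential kernel, the top-hat bandpasses, the wavelength grid—is an assumption made for the sketch:

```python
import numpy as np

rng = np.random.default_rng(0)

# rest-frame wavelength grid on which the latent SED lives
lam_rest = np.linspace(1000.0, 10000.0, 500)
dlam = lam_rest[1] - lam_rest[0]

def kernel(x, y, amp=1.0, scale=300.0):
    """Squared-exponential covariance for the latent SED (assumed form)."""
    return amp * np.exp(-0.5 * ((x[:, None] - y[None, :]) / scale) ** 2)

def bandpass_weights(center_obs, width_obs, z):
    """Linear weights w such that the observed band flux is w @ f(lam_rest).
    A top-hat observed-frame filter maps to a rest-frame interval at redshift
    z; Jacobian factors are absorbed into the quadrature weights for
    simplicity."""
    lo = (center_obs - 0.5 * width_obs) / (1.0 + z)
    hi = (center_obs + 0.5 * width_obs) / (1.0 + z)
    return ((lam_rest >= lo) & (lam_rest <= hi)) * dlam

# two toy bands for one galaxy at redshift z: projection is linear, so the
# band fluxes are themselves a finite-dimensional Gaussian
z = 0.5
W = np.vstack([bandpass_weights(4000.0, 600.0, z),
               bandpass_weights(6000.0, 800.0, z)])
C = W @ kernel(lam_rest, lam_rest) @ W.T          # 2x2 flux covariance
fluxes = rng.multivariate_normal(np.zeros(2), C)  # sample observables directly
```

The point of the sketch is that you never need to instantiate the infinite-dimensional SED: the observables can be sampled (or marginalized) directly in flux space.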
At group meeting today, Michael Troxel (Manchester) showed up, as he and Boris Leistedt are working on proposals to modify DES observing strategy. Troxel showed us his convergence maps from the first-look or engineering data from DES. Incredible! They look high in signal-to-noise and correlate very well with measured large-scale structure along the line of sight. It appears that this kind of mass density (or convergence) mapping is really mature.
After that, Dun Wang summarized a talk from last week by Jason Wright (PSU); this led to a conversation about the alien megastructures and Tabby's star. We discussed projects we might do on this. I asked what we would need to observe in order to be really convinced that this is alien technology. Since this question is very hard to answer, it is not clear that the “alien megastructure” explanation is really a scientific explanation at all. Oh so many good April Fools' projects.
Adrian Price-Whelan and I met at Columbia to discuss various things. We looked at the new paper of Reid et al. on distances, which appeared on Twitter this month as an argument that the Milky Way has spiral structure. Although this paper is not really dishonest—it explains what it did (though with a few lacunae)—it is misleading and wrong in various ways. The most important: It is misleading because it is being used as evidence for spiral structure (its figure 5 is being tweeted around!). But it also shows (in its figure 6) that even if there were no evidence at all for spiral structure in the data, their analysis would find a spiral pattern in the posterior pdf and distance estimators! It is wrong because it (claims to) multiply together posteriors (rather than likelihoods). That is, it violates the rules of probability that I tried to set out clearly here. I try not to use the word "wrong" when talking about other people's work; I don't mean to be harsh! The team on this paper includes some of the best observational astrophysicists in the world. I just mean that if you want to do probabilistic data analysis, you should obey the rules, and clearly state what you can and cannot conclude from the data.
At lunch, Jeno Sokolowski (Columbia) spoke about accreting white dwarfs in orbit around red giant stars. I realized during her talk that we can potentially generate a catalog of enormous numbers of these from our work (with Anna Ho) on LAMOST.
After Andy Casey brought our in-refereeing paper up to AASTeX6, I found myself very unhappy! The version-6 paper formatting makes many changes, and it is not clear how these changes relate either to community issues with AASTeX5 or to standards of typography and typesetting. I figured out ways to punk the formatting back to something that comes close to the recommendations in Bringhurst's The Elements of Typographic Style (my bible); I spent quite a bit of time on that today. Once I figure out how to raise issues on AASTeX6, I will do that, and also release a user-configurable patch package. A piece of unsolicited typographic advice for everyone out there: Obey what Bringhurst advises, unless you have very good reasons to do otherwise!
It was a great day. Melissa Ness's code with informative wavelength priors is working away, and Anna Ho has nearly finished her response to referee and her new paper on LAMOST red-giant ages and masses. I even did a bit on my own referee report, inspired by a pull request from Andy Casey.
I had a brief meeting with Michael Bottom (Caltech), who brought me this paper on coronagraph data analysis. It represents the data from a coronagraph as something low-rank that is fixed in the camera frame plus something arbitrary and sparse. He asked if we could add the point that the sparse component should be fixed in the sky frame. I opined that this was probably a trivial change. Indeed, it is probably a few-line pull request to the code that would vastly improve the results. We agreed to think about it when he has finished his PhD and I am back in NYC.
We have realized that Anna Ho has the most detailed, feature-rich diagram of the temperature–gravity relationship for red-giant stars ever created. She has 450,000 red giants from LAMOST labeled with APOGEE and The Cannon. None of these giants is labeled with supreme precision in either temperature or gravity; however, the sheer numbers make visible very subtle features in the diagram that have (perhaps) never been seen before. Anna made visualizations and a gif of the data, and we debated what to do about it all in the next paper. We are trying to get papers one and two done first.
I performed a code review with Melissa Ness, and we were able to speed up her code by a significant factor. We did it by reducing functionality, of course! But now she can run experiments fast, which is critical in the investigation phase. We made predictions for the figures she should make and what they would look like. She is working on putting far more informative prior beliefs into The Cannon.
Late in the day came the Physics Research Conference (Colloquium). It was given by Maria Zuber (MIT), who talked about the GRAIL mission to map the geoid of the Moon. The mission was amazingly simple: Two satellites in low orbit around the Moon, sending time-codes to each other and back to Earth. Do the math and figure out the gravitational shape of the Moon. She had many geological (or selenological?) results to discuss. In particular, she has an explanation for the strong morphological differences between the front side and the back side of the Moon. I was impressed that the data analysis was not in the slightest probabilistic: They just did the GPS-like thing (GRAIL is very like a miniature GPS) of finding the best solution to explain the data. It made it clear that a better analysis is possible. Not that I'm volunteering!
In a research-packed day, Anna Ho, Melissa Ness, and I met with Kevin Schlaufman (OCIW) to discuss what we should do with tens of thousands of stars, each of which has a dozen or more abundances measured with good precision. He had a million things to suggest! He said that there are physics-of-type-Ia-supernovae questions that could be answered with the distribution of Mn relative to Fe at low metallicity. He said that the Ni to Fe ratio might also tell us things about the deflagration-to-detonation transition. He said that we might look at the relationships between abundances and velocity differences (induced by, say, binary stars) to see if we can say something about chemical abundances and binarity. He also pointed out that there are other low-hanging fruits in the APOGEE data. So. Many. Projects. It was a great conversation and possibly the start of something.
In the late afternoon, Marcia Rieke (Arizona) gave the Astronomy Colloquium, about JWST. It was the first-ever Neugebauer memorial lecture (named after Gerry Neugebauer, one of my PhD advisors); she concentrated on engineering issues, and particularly those that flowed from the Spitzer Space Telescope. The crowd was all interested in the deployment, since it is so insane. After the talk I interviewed Fiona Harrison (Caltech) about NuSTAR, which successfully deployed an immense boom to position its grazing-incidence optics.
I had lunch today with my PhD co-advisor Judy Cohen (Caltech) (who co-advised me with Roger Blandford—my official advisor—and Gerry Neugebauer). We had a wide-ranging conversation, about NSF and NASA funding situations, next-generation space projects, stellar abundance measurements, and even high-contrast imaging. That last subject got Judy to introduce me to Dimitri Mawet (Caltech), who builds coronagraphs. The conversation with Mawet made me seriously regret that I didn't talk about coronagraphs in my seminar yesterday on noise modeling! Mawet and I discussed the design space of coronagraphs that I learned about a few weeks ago from Remi Soummer (STScI). Here are a few random notes from our discussion:
For real coronagraphs, Fresnel-style computations of the field in the camera are not good enough; you really need to simulate the full electromagnetic vector field. That's music to my applied-math ears! Fresnel-style calculations might be good enough for star-shade design. It seems to me that the highly non-convex optimization being done to design coronagraph internals is fairly limited, in the sense that the full space has not been searched in any sense (and it is hard to see how it could be); there might be low-hanging fruit here. Phase-vortex plates might be a good component for new coronagraphs, possibly combined with the other kinds of masks and stops. There is no good technology for adaptively managing stops or phase plates in real time, actively; flexible, actuated mirrors are such good technology that we should use them as much as we can first. There are various codes out there to do the electromagnetic calculations, but none of them (it sounds to me) is using everything it could from an applied-math perspective. I might be wrong on that last point!
At the end of the day, there was a planetary science colloquium by Alex Wolszczan (PSU) about possible uses of radio astronomy to find planets (something I discussed with Dave Spiegel many moons ago). He is looking at low-mass stars, some of which actually pulse (like pulsars!) in the radio. I didn't understand the physics of this, but there is a possibility that magnetized Jupiters might pulse in the radio, and also respond to stellar coronal mass ejections and so forth.
Today was the first day of a week as the Kingsley Visitor at Caltech. My plan is to make progress with Anna Y. Q. Ho and the Cannon team (Ness, Casey, Rix) on various ongoing projects, including a few referee reports, a labeling of LAMOST stars with masses and detailed chemical abundances, and a look at restricting the internals of The Cannon to make its results more physically interpretable. We started with a project brain-storm and down-select, assigning each of the team members tasks for the week.
At lunch-time I joined the theorists-meet-observers lunch, which has existed at Caltech from before I was a graduate student. It was well attended, and we discussed (among other things) exoplanet populations, and what improvements we might get and need in the near future. There were lots of absolutely great questions.
At the end of the day, I gave the Astronomy Tea talk, about noise modeling and exoplanet search in Kepler and K2. I flashed some of Foreman-Mackey's new long-period (single-transit) planet discoveries and inferences in Kepler and (as usual) these results drew the majority of the questions!
Today was Jan Rybizki's last day in NYC. We did indeed find that our calibration offset differences were attributable to different amplitudes of AGB and SNe. That resolution led us to realize that Rybizki should be making fake data from his self-consistent model for me to model with my too-flexible model: Do I get biased answers and can I infer yields? In particular, we can make fake data with one set of yield tables and I can seed with (and put in priors based on) another set of tables. That's on the to-do list. We spent time talking about the future, in which we model the Milky Way with a hierarchical mixture of self-consistent models, permitting adjustments to the data calibration and yield tables.
Jan Rybizki (MPIA) and I continued to work on simple nucleosynthesis models. We have been looking at whether we can learn changes to the supernovae yield tables using the data, but of course we don't trust the data completely: We think there might be significant issues with the calibration of the chemical abundances. Today I included such offsets in my model and fit for them. The conservative idea is to put very weak priors on the offsets, and much more informative priors on the yield adjustments, to force the system to “fix the data” before it starts to fix the theory.
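A cartoon of that prior structure (the widths here are invented for illustration, not the values in our model): the same-sized deviation costs far more as a yield adjustment than as a calibration offset, which is what forces the fit to "fix the data" first.

```python
import numpy as np

def log_prior(offsets, yield_adjustments, offset_sigma=0.5, yield_sigma=0.05):
    """Weak Gaussian prior on calibration offsets (wide sigma) and a much
    more informative prior on yield-table adjustments (narrow sigma).
    The widths are illustrative placeholders."""
    lp = -0.5 * np.sum((offsets / offset_sigma) ** 2)
    lp -= 0.5 * np.sum((yield_adjustments / yield_sigma) ** 2)
    return lp

# a 0.1-dex deviation is penalized 100x more heavily as a yield adjustment
# than as a calibration offset (ratio of the squared sigmas)
mild = log_prior(np.array([0.1]), np.array([0.0]))
heavy = log_prior(np.array([0.0]), np.array([0.1]))
```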
We compared offset conclusions, and we get slightly different results, but the difference looks attributable to the fact that Rybizki insists that the relative contributions of supernovae and AGB stars be realizable in a nucleosynthetic model and I don't! I decided that I need to include priors on the amplitudes of these chemical sources.
Today was a good day for coding. After Jan Rybizki (MPIA) spoke about his nucleosynthesis code and projects at group meeting, we got back to coding. I found a pernicious copy() bug, by which I mean a bug that comes from assigning a variable to another variable (in Python) when I should be copying the latter. In conversation with Hans-Walter Rix we set the scope for two early papers. One scope is to show that new yield tables for supernovae and AGB stars are better than old yield tables. We can do this, even marginalizing out calibration offsets in the APOGEE abundances. Another scope is to show that one-zone models—even a mixture of one-zone models—cannot produce the heterogeneity of stellar abundances that we see. Even a scenario in which different stars see different fractions of the relevant supernovae wouldn't create the heterogeneity we see! We need stochastic yields or new nucleosynthetic pathways.
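For the record, the bug pattern was the classic numpy aliasing trap; a minimal reproduction (with made-up values):

```python
import numpy as np

yields = np.array([1.0, 2.0, 3.0])

# the bug: plain assignment makes an alias, not a copy
adjusted = yields
adjusted += 0.5      # the in-place op silently mutates `yields` too
print(yields[0])     # 1.5, not the 1.0 we wanted to keep

# the fix: take an explicit copy before modifying
yields = np.array([1.0, 2.0, 3.0])
adjusted = yields.copy()
adjusted += 0.5
print(yields[0])     # 1.0, original preserved
```

Note that `adjusted = yields + 0.5` (a non-in-place operation) would also have been safe; it is the combination of aliasing and in-place arithmetic that bites.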
Jan Rybizki (MPIA) is in town this week, to sprint on nucleosynthetic models of the APOGEE chemical abundances that Andy Casey, Melissa Ness, and I generated with The Cannon. Rybizki has a nucleosynthesis model that takes an IMF, a SFR, infall, outflow, and so on, and computes the detailed abundances of all stars, using the latest and greatest knowledge about supernovae and AGB yields. His model is single-zone, and always homogeneously mixed, so it produces (always) a one-dimensional track through chemical-abundance space. Yesterday, while I was swanning around, he was showing that he can use the APOGEE abundances and his model to learn adjustments to the supernovae yield tables, or adjustments to the data zeropoints, and that he can get pretty good fits to the bulk (or really I should say mean trends) of the APOGEE red-clump-star data, with this very simple one-zone model.
We called Andy Casey in the morning, to discuss the projects available to us and get his feedback. During this conversation, we discussed the following point: Because there are only a few types of supernovae (and AGB) and because Rybizki's models are deterministic and well-mixed, in fact no set of one-zone models (not even a large mixture of them) can span much of abundance space. That is, if the one-zone models with deterministic yields and good mixing are close to true, then all stars should live in a low-dimensional subspace of the abundance space (convolved with noise deviations). If the stars live in a fairly high-dimensional subspace (and they appear to, and most of the literature says they do), then either there must be stochasticity to the yields, or else there must be separation of elements or bad mixing. You can't solve the problem (just) by going to the low-number-of-supernovae limit, because of the deterministic nature of the yields and the small number of supernovae types: You need real diversity to explain an interesting abundance space. This might be our low-hanging fruit, although we have got a lot of ideas now!
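The dimensionality argument above is easy to check numerically: if every star's abundances are a mixture of the same few deterministic yield vectors, the data matrix is low-rank up to noise, and its singular values collapse after the number of sources. A toy sketch (all numbers invented):

```python
import numpy as np

rng = np.random.default_rng(42)
n_stars, n_elements, n_sources = 1000, 15, 3

# fixed, deterministic yield vectors for a few nucleosynthetic sources
yields = rng.normal(size=(n_sources, n_elements))

# each star is a nonnegative mix of the same few sources (one-zone-like),
# plus small observational noise
weights = rng.uniform(size=(n_stars, n_sources))
abundances = weights @ yields + 0.01 * rng.normal(size=(n_stars, n_elements))

# singular values collapse after n_sources: the stars live in a
# low-dimensional subspace of the 15-dimensional abundance space
s = np.linalg.svd(abundances - abundances.mean(axis=0), compute_uv=False)
```

In this toy, only the first `n_sources` singular values are large; everything beyond is at the noise floor. A genuinely high-dimensional observed abundance space is what would rule this family of models out.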
In related news, I wrote code today, for my project of building a data-driven nucleosynthesis model.
I started the day working on the title and abstract of the mature draft that Alex Malz is getting ready for submission. It is on the responsible use of probabilistic redshifts. As I often say, it is the title and abstract that are the most important parts of a paper. We don't usually spend enough time on either. I had things to say about both in this case.
In the middle of the day, Kat Deck (Caltech) gave a great talk at NYU about exoplanet dynamics. She spent half her time on her very clever analyses (and simplifications of) planet–planet interactions as revealed by transit timing variations. She has a canonical transformation that takes the problem to action–angle coordinates. The cool thing is that the variations are an aperiodicity, and in action–angle coordinates everything is completely periodic, so the canonical transformation transforms away the variations! She spent the second half of her talk on long-term orbital stability of planetary systems, showing some examples that are close to the limits of what's expected (from resonance overlap arguments) to be stable. If pairs of planets are found at similar periods and with large enough masses, Deck can show that they have to be trapped in a resonance. Her talk was a great mix of theory and phenomena, in the rich subject area of exoplanets. She also made some comments at the end about using these kinds of techniques to understand gravitational wave systems.
Late in the day I visited Ellie Schwab (CUNY) and Kelle Cruz (CUNY) to discuss Schwab's project to measure the flaring rates of low-mass stars as a function of effective temperature. David Rodriguez (AMNH) also helped us out. We discussed a possible likelihood function that makes use of the magic of Gaussians to make everything tractable and fast.
In a low-research day, I had a short call with Andy Casey about various things Cannon-related. I pitched the very simple project of looking at how our results degrade with spectroscopic resolution and signal-to-noise. We have done signal-to-noise tests, but we have never degraded the spectroscopic resolution, which ought to be very informative. There is folklore that you can't do anything at resolutions less than 20,000 (or 30,000, or 100,000, etc.). Is there a resolution below which we can't extract abundances? Or do things degrade smoothly?
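The degradation step itself is cheap: convolve each spectrum with a Gaussian line-spread function whose width is the quadrature difference of the input and output resolution elements. A sketch assuming a uniform wavelength grid (this is not The Cannon's code):

```python
import numpy as np

def degrade_resolution(wave, flux, r_in, r_out):
    """Convolve a spectrum from resolution r_in down to r_out with a
    Gaussian kernel; assumes a uniform wavelength grid."""
    assert r_out < r_in
    lam0 = np.mean(wave)
    # quadrature difference of the two resolution elements (FWHM, in Angstrom)
    fwhm = lam0 * np.sqrt(1.0 / r_out ** 2 - 1.0 / r_in ** 2)
    sigma_pix = fwhm / (2.355 * (wave[1] - wave[0]))
    x = np.arange(-int(5 * sigma_pix), int(5 * sigma_pix) + 1)
    kern = np.exp(-0.5 * (x / sigma_pix) ** 2)
    kern /= kern.sum()
    return np.convolve(flux, kern, mode="same")

# toy spectrum: flat continuum plus one narrow absorption line
wave = np.linspace(15000.0, 15100.0, 2000)
flux = 1.0 - 0.5 * np.exp(-0.5 * ((wave - 15050.0) / 0.2) ** 2)
low = degrade_resolution(wave, flux, r_in=22500.0, r_out=5000.0)
```

The narrow line gets much shallower (and wider) at low resolution while the continuum is untouched, which is exactly the information loss the proposed experiment would quantify. (Edge pixels are distorted by the zero-padding in `np.convolve`; real code would pad or trim.)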
Another issue with the current versions of The Cannon is that the abundances are not strictly interpretable as pure abundances: Because we let the system learn whatever relationships it wants, it can use (say) titanium lines to help estimate the (say) magnesium abundance. In general it might do this if there is an empirical covariance between titanium and magnesium in the training set (which there will be, in general). So we have to be careful how we interpret its output. Melissa Ness is working on a solution to this, which is to censor the wavelengths available to some (or all) elements to those wavelengths that we know (from atomic physics) are conceivably relevant. This will lead to much more interpretable results. If the censoring is correct, it should also lead to better results!
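The censoring idea amounts to a binary mask that zeroes a label's per-pixel coefficients outside its allowed wavelength windows; a schematic sketch with invented windows (not Ness's implementation):

```python
import numpy as np

n_pixels = 100
labels = ["Teff", "Mg", "Ti"]

# hypothetical censoring windows: pixel indices each label is allowed to use
windows = {"Teff": np.arange(n_pixels),   # a global label may use every pixel
           "Mg":   np.arange(10, 20),     # only pixels near known Mg lines
           "Ti":   np.arange(60, 75)}     # only pixels near known Ti lines

# censor[l, p] = 1 if label l may influence pixel p, else 0
censor = np.zeros((len(labels), n_pixels))
for i, name in enumerate(labels):
    censor[i, windows[name]] = 1.0

# at training time the per-pixel linear coefficients get multiplied by the
# mask, so (say) Ti pixels cannot inform the Mg label and vice versa
coeffs = np.random.default_rng(1).normal(size=(len(labels), n_pixels))
coeffs *= censor
```

With the mask in place, the empirical Mg–Ti covariance of the training set can no longer leak between labels through pixels that atomic physics says are irrelevant.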