UnDisLo, day 1

Foreman-Mackey and I drove down to an undisclosed location on Monterey Bay to work with Charlie Conroy, Dan Weisz, and Ben Johnson (all UCSC). On the way down we discussed my new optimized photometry program. Once we arrived, we got to planning the week of hacking. We decided to focus on a few areas of mutual interest involving non-trivial data analysis. One area is combining spectroscopic and photometric information on galaxies and stars, where the spectroscopy is less reliable but far larger in total bytes. We have ideas about this. Another is learning a population distribution from noisy measurements. We have done this for exoplanets and photometric quasars and so on; Foreman-Mackey and I want to build general tools. Another area is learning the dependence of average (mean) spectra of galaxies on intrinsic properties like luminosity, redshift, metallicity, and velocity dispersion; Conroy has done great work in this area with blunt tools. We can help sharpen those. Should be a fun week!



I am on quasi-vacation this week (just keeping up with email); hence no posts. But today I crashed a meeting in Napa Valley hosted by Wechsler (KIPAC), Conroy (UCSC), and others. I saw just a few talks, but they were excellent: Jeremiah Murphy (UFl) on supernova explosions, Conroy on abundance anomalies in globular clusters, Blanton (NYU) on photometry, Finkbeiner (CfA) on photometric calibration, and Sarah Tuttle (UT) on the HETDEX spectrograph hardware. Great stuff.

Murphy showed us that there are crazy neutrino dynamics in the first fraction of a second in a supernova explosion; in particular there should be stellar oscillations imprinted on the neutrino signal! Conroy showed that there are light-element vs heavy-element abundance anti-correlations in essentially all globular clusters, and indications that some stars are very over-rich in helium. There is no good explanation. Blanton went carefully through the properties of astronomical imaging and photometry, for two hours. I loved it, and at the end, Kollmeier (OCIW) said she wanted more! Finkbeiner showed that PanSTARRS and SDSS have great, precise, consistent photometry, and the calibration is all, entirely, self-calibration. This strongly justifies things I said at AAS this year. Tuttle talked about trade-offs in hardware design. The mass production of spectrographs for HETDEX is a huge engineering challenge.


red giants as clocks

Lars Bildsten (KITP) was in town and gave two talks today. In the first, he talked about super-luminous supernovae, and how they might be powered by the spin-down of the degenerate remnant, when spin-down times and diffusion times become comparable. In the second, he talked about making precise inferences about giant stars from Kepler and CoRoT photometry. The photometry shows normal modes and mode splittings, which are sensitive to the run of density in the giants; this in turn constrains what fraction of the star has burned to helium. There is a lot of interesting unexplained phenomenology related to the spin of the stellar core, which remains a puzzle. There was much more in the talk as well, but one thing that caught my interest is that some of the modes are exceedingly high in quality factor or coherence. That is, giants look like very good clocks. A discussion broke out at the end about whether or not we could use these clocks to constrain, detect, or measure gravitational radiation. Each star is much worse than a radio pulsar, but there are far, far more of them available for use. Airplane project!


probabilistic halo mass inference

In a low-research day, at lunch, Kilian Walsh pitched to Fadely and me a project to infer galaxy host halo masses from galaxy positions and redshifts. We discussed some of the issues and previous work. I am out of the loop, so I don't know the current literature. But I am sure there is interesting work that can be done, and it would be fun to combine galaxy kinematic information with weak lensing, strong lensing, x-ray, and SZ effect data.


permitted kernel functions, cosmology therewith

I spent a while at the group meeting of applied mathematician Leslie Greengard (NYU, Simons Foundation), telling the group how cosmology is done, and then how it might be done if we had some awesome math foo. In part we got on to how you could make a non-parametric kernel function for a Gaussian Process for the matter density field at late times, given that you need to stay non-negative definite. Oh wait, I mean positive semi-definite. Oh the things you learn! Anyway, it turns out that this is not really a solved problem and possibly a project was born. Hope so! I would love to recreate our discovery of the baryon acoustic feature with proper inference. At the group meeting, Foreman-Mackey and I had an "aha moment" about Ambikasaran et al.'s method for solving and taking the determinants of kernel matrices (Siva Ambikasaran (NYU) was in attendance), and then spent the post-group-meeting lunch in part quizzing Mike O'Neil (NYU) about how to structure our code to work fast in the three-dimensional case (the cosmology case).
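
One standard way to stay positive semi-definite is to parameterize the power spectrum instead of the kernel: by Bochner's theorem, a stationary kernel whose Fourier transform is non-negative is automatically valid. Here is a minimal sketch of that idea, assuming a periodic one-dimensional grid for simplicity (the cosmology case is three-dimensional). The function name and the log-power parameterization are illustrative assumptions, not anything decided at the meeting.

```python
import numpy as np

def kernel_from_log_power(log_power):
    """Circulant kernel matrix on a periodic 1-D grid (an illustrative assumption).

    The free parameters are the log power at each discrete frequency, so the
    power is positive for any parameter values.
    """
    power = np.exp(np.asarray(log_power, dtype=float))
    n = power.size
    # Correlation function = inverse Fourier transform of the power spectrum.
    # Taking the real part symmetrizes the power between +k and -k.
    xi = np.fft.ifft(power).real
    # Entry (i, j) depends only on the (periodic) lag i - j.
    lag = (np.arange(n)[:, None] - np.arange(n)[None, :]) % n
    return xi[lag]

rng = np.random.default_rng(42)
K = kernel_from_log_power(rng.normal(size=64))
min_eig = np.linalg.eigvalsh(K).min()
print("symmetric:", np.allclose(K, K.T), " smallest eigenvalue:", min_eig)
```

Any setting of the parameters gives a usable kernel here, which is the property a non-parametric kernel model needs; the hard part (not addressed in this sketch) is doing the same thing with enough flexibility and speed in three dimensions.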


fit all your streams, gamma-Earth

I spoke with Kathryn Johnston's group by phone for a long time at midday, about the meeting last week at Oxford. I opined that "the competition" is going to stick with integrable orbits for a while, so we can occupy the niche of more general potentials and orbit families. We discussed at some length the disagreement between Sanders (Oxford) and Bovy about how and why streams are different from orbits. Towards the end of that meeting, we discussed Price-Whelan's PhD projects, in which he wants a balance of theory and real-data inference. I argued strongly that Price-Whelan should follow the Branimir Sesar (MPIA) "plan", which is to fit all the known streams and use those fits to figure out what observations are most crucial to do next. Plus maybe some theory.

In the afternoon, Foreman-Mackey and I discussed figures and content for his "gamma-Earth" paper (not "eta-Earth" but "gamma-Earth"). We decided to choose a fiducial model, work that through completely, and show all the other things we know as adjustments to that fiducial model. We also discussed how to show everything on one big figure (which would be great, for talks and the paper). Foreman-Mackey told me that the Tremaine papers on planet occurrence get the likelihood function for the variable-rate Poisson problem correct (including overall normalization); our only "advances" relative to the Tremaine papers are that we have a more flexible functional form for the rate function and its prior, and we fully account for the observational uncertainties (which basically no-one knows how to do at this point).
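
For reference, the variable-rate (inhomogeneous) Poisson log-likelihood has the standard form: the sum of the log rate at each observed event, minus the integral of the rate over the observed domain. Below is a minimal one-dimensional sketch, assuming perfectly measured events and ignoring detection efficiency; the function names and the flat test rate are hypothetical, and this is not the code for the paper.

```python
import numpy as np

def poisson_process_lnlike(rate_fn, events, lo, hi, n_grid=2001):
    """ln L = sum_i ln rate(x_i) - integral of rate(x) dx over [lo, hi]."""
    grid = np.linspace(lo, hi, n_grid)
    rate = rate_fn(grid)
    # Trapezoid-rule estimate of the expected number of events in [lo, hi].
    expected = 0.5 * np.sum((rate[1:] + rate[:-1]) * np.diff(grid))
    events = np.asarray(events, dtype=float)
    return np.sum(np.log(rate_fn(events))) - expected

def flat_rate(x):
    # Hypothetical rate function: 3 events per unit x, everywhere.
    return 3.0 * np.ones_like(x)

events = np.array([0.1, 0.7, 1.2, 1.9])
lnl = poisson_process_lnlike(flat_rate, events, 0.0, 2.0)
print(lnl, 4 * np.log(3.0) - 6.0)  # the two numbers should agree
```

The integral term is what penalizes a model for predicting events that were not seen. Accounting for observational uncertainties means replacing the rate evaluated at each event with an integral over that event's uncertain true position, which this sketch does not attempt.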


probabilistic grammar, massive graviton

In a low-research day, I saw two absolutely excellent seminars. The first was Alexander Rush (MIT, Columbia) talking about methods for finding the optimal parsing or syntactical structure for a natural-language sentence using Lagrangian relaxation. The point is that the number of parsings is combinatorially large, so you have to do clever things to find good ones. He also looked at machine translation, which is a closely related problem. At the end of his talk he discussed extraction of structured information from unstructured text, which might be applicable to the scientific literature.

Over lunch, Sergei Dubovsky (NYU) spoke about massive graviton theories and the recent BICEP2 results. He started by explaining that there are non-pathological gravity modifications in which the graviton is massive in its tensor effects, but doesn't get messed up in its scalar and vector effects. This means you have no change to the "force law" as it were (nor the black-hole solutions nor the cosmological world model) but you modify gravitational radiation. He then said two amazing things: The first is that the BICEP2 result, if it holds up, will put the strongest ever bound on the graviton mass, because it means that gravitational radiation propagated a significant fraction of a Hubble length. The second is that the BICEP2 data are better fit by a model with a tiny but nonzero graviton mass than by the standard massless theory. That's insane! But of course early days and much skepticism about the data, let alone the theory. Great talks today!



In the morning, Juna Kollmeier (OCIW) gave a great talk on the intergalactic radiation fields (called "metagalactic" for reasons I don't understand). She has found a serious conflict between what is computed by any reasonable sum of sources, what is inferred from the outskirts of galaxies, and what is needed for local IGM studies. One possible resolution, which she was not particularly endorsing, is heating from dark-matter decay or annihilation. Neal Weiner (NYU) loved that idea, for obvious reasons. During the talk, several good project ideas came up, some of them related to the kinds of things Schiminovich has been thinking about, and some related to SDSS-IV MaNGA data. Kollmeier convinced us that a next-generation experiment will just see the IGM!

After lunch, Bob Kirshner (CfA) gave a nice talk about how much more precise supernova cosmology might become if we could switch to (or include) rest-frame near-infrared imaging. He endorsed WFIRST pretty strongly! He also agreed explicitly that getting more SNe is not valuable unless there are associated precision or redshift-distribution improvements. That is, the SNe are systematics-limited; hence his concentration on infrared data, where precision is improved.

Late in the afternoon, Vakili sketched out a fully probabilistic approach to interpolating the point-spread function in imaging between observed stars (to, for example, galaxies being used in a weak-lensing study). Again with the Gaussian Processes. They are so damned useful!
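
To make the idea concrete, here is a minimal sketch of Gaussian Process interpolation of a single PSF parameter (say, a width) from star positions to galaxy positions. Everything specific is an assumption for illustration: the squared-exponential kernel, the hand-set hyperparameters, the constant mean, and the fake data. It is not Vakili's method, which would presumably infer the hyperparameters and handle the full PSF.

```python
import numpy as np

def sq_exp_kernel(xa, xb, amp, scale):
    """Squared-exponential kernel between two sets of 2-D positions."""
    d2 = np.sum((xa[:, None, :] - xb[None, :, :]) ** 2, axis=-1)
    return amp ** 2 * np.exp(-0.5 * d2 / scale ** 2)

def gp_predict(x_star, y_star, x_new, amp=1.0, scale=0.3, noise=0.01):
    """Posterior mean and covariance of the PSF parameter at x_new."""
    mu = y_star.mean()  # crude constant mean function (an assumption)
    K = sq_exp_kernel(x_star, x_star, amp, scale) + noise ** 2 * np.eye(len(x_star))
    Ks = sq_exp_kernel(x_new, x_star, amp, scale)
    Kss = sq_exp_kernel(x_new, x_new, amp, scale)
    mean = mu + Ks @ np.linalg.solve(K, y_star - mu)
    cov = Kss - Ks @ np.linalg.solve(K, Ks.T)
    return mean, cov

def true_width(x):
    # Fake, smoothly varying PSF width across the unit-square image.
    return 1.5 + 0.1 * np.sin(3.0 * x[:, 0]) + 0.1 * np.cos(2.0 * x[:, 1])

rng = np.random.default_rng(0)
x_star = rng.uniform(0.0, 1.0, size=(50, 2))            # star positions
y_star = true_width(x_star) + 0.01 * rng.normal(size=50)  # noisy measurements
x_gal = rng.uniform(0.2, 0.8, size=(5, 2))              # galaxy positions
mean, cov = gp_predict(x_star, y_star, x_gal)
print(mean - true_width(x_gal), np.sqrt(np.diag(cov)))
```

The payoff of the probabilistic version is the covariance: the uncertainty on the PSF at each galaxy, and its correlation between galaxies, can be propagated into the weak-lensing measurement.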


smoothness priors

Foreman-Mackey and I had a long discussion about how to normalize smoothness priors. That is, if you just "regularize" a fit using differences between bin heights (think: making a smooth histogram), it is hard to compute analytically the resulting implicit prior. In the end we decided to use a proper Gaussian Process prior on our histogram bin heights, because then at least the normalization is a determinant, and we can now compute those super fast. In general: If you can solve a problem with a mature technology or else invent something yourself, you should use the mature technology! In this case, that's Gaussian Processes.
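
A minimal sketch of what "the normalization is a determinant" means in practice: the log of a properly normalized Gaussian Process prior on the bin heights, computed with a Cholesky factorization. The squared-exponential kernel, the hyperparameter values, and the function name are assumptions for illustration, and the dense Cholesky here is the slow textbook route, not the fast solver.

```python
import numpy as np

def gp_log_prior(heights, centers, mean=0.0, amp=1.0, scale=1.5, jitter=1e-8):
    """Normalized log prior: -0.5 * (r^T K^-1 r + ln det K + n ln 2 pi)."""
    r = np.asarray(heights, dtype=float) - mean
    centers = np.asarray(centers, dtype=float)
    d = centers[:, None] - centers[None, :]
    K = amp ** 2 * np.exp(-0.5 * d ** 2 / scale ** 2) + jitter * np.eye(r.size)
    L = np.linalg.cholesky(K)            # K = L L^T
    alpha = np.linalg.solve(L, r)        # so r^T K^-1 r = alpha . alpha
    log_det = 2.0 * np.sum(np.log(np.diag(L)))
    return -0.5 * (alpha @ alpha + log_det + r.size * np.log(2.0 * np.pi))

centers = np.arange(10.0)                       # bin centers
smooth = 0.5 * np.sin(centers / 3.0)            # slowly varying heights
rough = np.where(centers % 2 == 0, 1.0, -1.0)   # bin-to-bin zig-zag
lp_smooth = gp_log_prior(smooth, centers)
lp_rough = gp_log_prior(rough, centers)
print(lp_smooth, lp_rough)  # the smooth histogram should be far more probable
```

Because the prior is normalized, its value can be compared across different choices of `amp` and `scale`, which is exactly what an ad hoc difference penalty does not let you do. For real histograms one might put the prior on the log of the bin heights to keep them positive; that is a further assumption not made above.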


NRFG, day three

We did informal discussion and wrap-up at the workshop today. In that discussion, we tried to focus on next steps for dynamics and inference, in the context of Gaia's upcoming early data release. In some ways the clearest conclusion from this discussion came from Sesar (MPIA), who said that we should analyze all the data we have on each Milky Way stream, to find out both what it tells us about the MW potential, and also what new data (better distances, more radial velocities, and so on) would bring us. That would permit us to plan the next round of observing proposals and surveys.

The last agenda item was Bovy showing us all galpy. Binney (Oxford) and Rix both agreed that we should be building public code bases in this style. Bovy's code is beautifully documented and decorated with tutorials.

In the car to the airport, Rix, Schlafly (MPIA), and I discussed the three-dimensional dust map. I opined that we might be able to apply a spatial prior to the map in a post-processing step, and Schlafly agreed in principle. I promised to look at the question on the flight home. If it works, it is great for my interim-sampling, importance-sampling brand!


NRFG, day two

Today was streams day at the Heidelberg–Oxford meetup. Sanders (Oxford) and Bovy showed their tidal-stream-modeling machineries, with emphases on the relationship between action-space and angle-space structure, or really frequency-space and angle-space structure. They both work only in integrable potentials, which creates one of the opportunities that Price-Whelan and I might exploit. That said, Sanders and Bovy have both developed some great computational simplicities that make their methods far faster than ours. For example, they can compute actions in any potential fast, angles pretty fast, and use affine approximations to local transformations to speed up integration over "true" phase-space positions. Bovy argued that Sanders's result from last year on streams not following orbits needs adjustment, when you consider the full frequency distribution hiding in every section of the stream. Sesar showed beautiful data on the Orphan Stream and advertised new work on RR Lyrae stars. He showed pretty convincingly that the Orphan Stream just ends abruptly at a location in which we could easily still observe it. Odd!

Schlafly (MPIA) and Sale (Oxford) showed work on three-dimensional dust mapping, which is essential both for understanding the stars and also for providing dust as a new tracer of the potential. Sale is working on non-parametrics with Gaussian Processes, like Bailer-Jones (MPIA) and Hanson (MPIA), while Schlafly is more old-school with independent angular pixels. That said, Schlafly has a complete map! We all emphasized the value for Schlafly of publishing not just the three-dimensional map, but also an easy-to-use tool for querying it.

Sormani (Oxford) made novel use of the "earth-mover distance" to compare features in images (models of the (l, v) distribution of gas in the Galactic Center and also data). The day ended with Martig (MPIA) showing beautiful galaxy simulations to investigate out-of-plane disk structure. It looks like the Monoceros-type stuff seen at the outskirts of the Milky Way might be quite typical.
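
For intuition about the earth-mover distance: in one dimension it reduces to the summed absolute difference between the two cumulative distributions. The sketch below covers only that one-dimensional case, for two histograms on the same uniform grid; the function name is hypothetical. Sormani's comparison was of two-dimensional (l, v) images, which requires solving a genuine transport problem and is not what this code does.

```python
import numpy as np

def emd_1d(p, q, dx=1.0):
    """Earth-mover distance between two 1-D histograms on the same uniform grid."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    # Normalize so both histograms hold the same total amount of "earth".
    p = p / p.sum()
    q = q / q.sum()
    # The running surplus of p over q is the mass that must cross each bin edge.
    return dx * np.sum(np.abs(np.cumsum(p - q)))

print(emd_1d([1, 0, 0], [0, 0, 1]))        # all mass moved two bins
print(emd_1d([0, 1, 0, 0], [0, 0, 1, 0]))  # all mass moved one bin
```

Unlike a pixel-by-pixel chi-squared, this distance grows with how far a feature is displaced, which is what makes it attractive for comparing a model feature that sits in slightly the wrong place.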


NRFG, day one

Today was the start of the second annual Not Ready for Gaia workshop, which brings together the Galactic dynamics groups of Binney (Oxford) and Rix. We are in beautiful Oxford, staying at Merton College. I came as part of Rix's party. Many interesting ideas swung by; here are a few highlights for me:

Binney and Posti (Oxford) talked about equilibrium models for the Milky Way disk and halo. Binney claimed that any reasonable axisymmetric models will be well approximated by integrable axisymmetric models and argued strongly for using integrable models, at least as a starting point. Posti gave some examples.

Evans (Cambridge) argued that we should be using far more flexible models for gravitational potentials, arguing for expansions around simple models. He is spending some time finding simple models that support easy-to-calculate orthogonal bases for perturbations. Both he and Binney emphasized that a good expansion is one in which not too many terms are required for good representation of the data. That's a tall order!

I got into a fight with Evans, Belokurov (Cambridge), and Koposov (Cambridge) about their fitting of the Sagittarius stream. They constrain the stream using a few summary statistics of the data, not the full data. I claimed that their model was bad because it couldn't possibly go through all the data! However, by the end of the day they more-or-less convinced me that maybe the model does go close to the data. Still awaiting a very simple-to-make plot!

Schoenrich (Oxford) and Sanders (Oxford) both discussed the extended distribution function in phase space coordinates plus chemical and age coordinates. An argument broke out during Schoenrich's talk about whether you can "deconvolve" the radial migration process to find out the original birth radius of anything, and whether or not that would be useful. I enjoyed that! Rix pointed to evidence from Mitschang and collaborators that chemical tagging might not work; I have comments on that work which I hope to write up soon.

At the end of the day, Schoenrich showed what might be extremely good metallicity determinations using photometry. He can demonstrate precision, but not yet accuracy. Very interesting to watch; he is building on early work by Ivezic back in the day.


Dr Wu and Dr Duffell

Ronin Wu (CEA and Tokyo) came through NYU today, and told us about Herschel spectroscopy of M83. She finds that the spatially resolved and integrated spectra of star-forming galaxies seem to be saying slightly different things. She also made us think that the nucleus of M83 might be like a little mini-ULIRG.

At the end of the day, Paul Duffell (NYU) defended his PhD thesis. He had a huge crowd for the seminar and it was a great show, just as his thesis was a great read. He has moved supersonic numerical fluid simulation into some new regimes, with adaptive, moving meshes. His most exciting results (to me) are on the phenomenology of binary black holes, and (separately) gap-opening in protoplanetary disks. In both cases, he can make novel predictions for new observations. Gruzinov (NYU) and I encouraged him to make those predictions! Congratulations to Dr Paul Duffell on a great piece of work and welcome to the community of scholars (as it were).


Spergel and BICEP2

David Spergel (Princeton) came into town and talked about CMB cosmology. He spent a good fraction of his talk throwing doubt on the new, hot BICEP2 / Keck result on B-mode polarization on large scales. He is concerned that the experiments don't seem to satisfy the standard null tests. He laid out a set of requests, which are easy to satisfy. He noted that six new experiments can confirm the result if it is real, so we will know extremely soon. One amusing thing about Spergel's talk was the repeated point (obvious, but often overlooked) that because all CMB experiments are observing the same, single sky, they ought to agree to better than one-sigma, especially on large scales where cosmic variance dominates.


language parsing, photometric redshift templates

Brendan O'Connor (CMU) gave a talk late morning about natural language models for data science in the foreign policy and politics domains. He showed nice results based on subject–verb–object parsing of news stories. He also looked at some Twitter data, showing an analysis of the emergence and propagation of the new Twitter "words" (or acronyms) "idk" and "af". After the talk, at lunch, I complained that natural language processing does not do a good job of understanding sentences and doesn't even give probabilistic results when uncertain. Yann LeCun (NYU) opined that understanding sentences is "AI complete", which is a phrase I need to use more often.

In the afternoon, Gabe Brammer (STScI) appeared and we talked about photometric redshifts. He is tweaking the model spectra using the data, and I suggested we go further in that direction. We gave each other homework on the subject.