I spent my day today at the IBM T. J. Watson Research Center in Yorktown Heights, NY, hosted by Siyuan Lu (IBM). I had great discussions with the Physical Analytics team, and got in some quality time with Bruce Elmegreen (IBM), with whom I overlap on inferences about the initial mass function. I spoke about exoplanet search and population inference in my talk. The highlight of the trip was a visit to the Watson group, where I watched them talk to Watson, but we also looked into the data center, which contains the Watson that won on Jeopardy!. We made some plans to teach Watson some things about the known exoplanets; he is an expert in dealing with structured data.
Vicki Kaspi (McGill) gave the Physics Colloquium talk today. She compared the fastest-known millisecond pulsar (which her group discovered) to the fastest commercial blenders in spin period. The pulsar wins, but it wins far more in surface speed: The surface of a millisecond pulsar is moving at a significant fraction (like 0.1) of the speed of light! She talked about the uses of pulsars for precision measurement and testing of general relativity. It is just incredible that nature delivers us these clocks! I got interested during the talk in the spin constraints on the equation of state: We often see constraints on the equation of state from mass measurements, but there must be equally compelling limits from the spin: If you are spinning such that your surface is moving at or even near the sound speed in the material, I think (or I have an intuition) that everything goes to hell fast.
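The surface-speed claim is easy to check on the back of an envelope, assuming a (typical, but here hypothetical) 10 km neutron-star radius and the roughly 1.4 ms spin period of the fastest known pulsar:

```python
import math

# Back-of-the-envelope: equatorial surface speed of a millisecond pulsar.
# Assumed numbers: 10 km radius (typical), 1.4 ms spin period (~716 Hz,
# the fastest known spin).
radius_m = 10.0e3        # neutron-star radius (assumed)
period_s = 1.4e-3        # spin period (fastest known pulsar)
c = 2.998e8              # speed of light (m / s)

surface_speed = 2.0 * math.pi * radius_m / period_s
print(surface_speed / c)  # ~0.15
```

So the equator is moving at roughly 0.15 c under these assumptions, consistent with the "like 0.1" figure from the talk.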
At group meeting today Angus talked about her attempts to reproduce the asteroseismological measurements in the literature from Kepler short-cadence data. I think there is something missing, because we don't observe all the modes as clearly as they do. Our real goal is not just to reproduce the results of course; we discussed our advantages over existing methods: We have a more realistic generative model of the data; we can do multiple frequencies simultaneously; we can handle not just non-uniform time sampling but also non-uniform exposure times (which matters for high frequencies), and we can take in extremely non-trivial noise models (including ones that do detrending). I am sure we have a project and a paper, but we don't understand our best scope yet.
Just before lunch, Kyunghyun Cho (Montreal) gave a Computer Science colloquium about deep learning and translation. His system is structured as an encoder, a "meaning representation", and then a decoder. All three components are interesting, but the representation in the middle is a model system for holding semantic or linguistic structure. Very interesting! His system performs well. But the most interesting things in his talk were about other kinds of problems that can be cast as machine translation: Translating images into captions, for example, or translating brain images into sentences that the subject is currently reading! Cho's implication was that mind reading is just around the corner...
On the way to #astrohackny I learned that I can write usefully on the subway on my mobile phone, which possibly justifies its immense cost (in dollars and in valuable personal attention). At the meeting I pitched my proposal for foreground source separation in the Planck imaging. Price-Whelan pointed out that the incredibly flexible model I am proposing could be a model for absolutely anything in any domain. It is also massively degenerate. Undaunted, I proposed that we perform a first experiment using fake data. The idea is to generate data using a physical model and then fit it with this flexible data-driven model, and see what happens. We had various ideas about generating the data. I also spent some time interviewing Colin Hill (Columbia) about the Planck data analysis and data products.
At Columbia Pizza Lunch, there was much arguing about binary black holes, which is one of my favorite subjects. This paper (in Nature, so it is suspect out of the gate) gives indirect evidence of a binary quasar based on a very peculiar light curve. The authors looked at 250,000 quasars to find this beauty, which makes me wonder if the result is a fluke. A lot can happen when you take 250,000 draws from a stochastic process! Worth thinking about and checking.
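The look-elsewhere worry here is easy to illustrate with a toy (the numbers below are illustrative only, not a model of quasar variability): even in pure Gaussian noise, the most extreme of 250,000 independent draws is routinely a 4-to-5-sigma event.

```python
import numpy as np

rng = np.random.default_rng(7)

# With 250,000 independent tries, ~4.5-sigma "discoveries" are routine
# even when there is nothing but noise.
draws = rng.normal(size=250_000)
print(draws.max())  # typically between 4 and 5
```

A peculiar light curve selected as the single best of 250,000 deserves the same skepticism as that maximum.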
[Lull in posting because of vacation in Quebec. Slept one (exceedingly cold) night in an actual, real-life igloo!]
At lunchtime in the NYU Center for Data Science there was a great talk by Daniela Huppenkothen about x-ray and gamma-ray astrophysics, for non-astronomers. She talked about imaging, spectroscopy, and time series, with a focus on the latter. She did a great job explaining the differences between astrophysics and other data-science domains. At the end there were good questions from (among others) neuroscientist Bijan Pesaran, who (comparing perhaps to his own domain) was interested in non-trivial time correlations among photon events. After all, neurons are all about non-trivial time correlations in spike trains.
Earlier in the day, Foreman-Mackey and I spoke about K2 projects and exoplanet population projects. The plan is to try some likelihood-free inference; we spent some time talking about technical details. In likelihood-free inference (ABC) one performs repeated simulations of the data; there are fundamental parameters, and then there are (usually) also random-number draws. We might want to sample separately in these, in some Gibbs-like way. Thanks to Brewer for getting us thinking along these lines.
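For concreteness, here is a minimal rejection-ABC sketch on a toy problem (inferring a Gaussian mean), not the actual exoplanet pipeline: each trial draws a fundamental parameter from the prior, draws fresh random numbers for the simulation, and keeps the parameter only if a simulated summary statistic lands close to the observed one.

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy "observed" data and summary statistic (a Gaussian mean).
data = rng.normal(2.0, 1.0, size=200)
obs_summary = np.mean(data)

def simulate(theta, rng):
    # Forward model: fundamental parameter theta plus fresh random draws.
    return rng.normal(theta, 1.0, size=200)

# Rejection ABC: keep parameter draws whose simulated summary lands
# within a tolerance of the observed one.
accepted = []
for _ in range(20000):
    theta = rng.uniform(-5.0, 5.0)   # draw from the prior
    sim = simulate(theta, rng)       # re-draw the random numbers too
    if abs(np.mean(sim) - obs_summary) < 0.05:
        accepted.append(theta)

print(np.mean(accepted))  # near the true value of 2.0
```

The Gibbs-like idea from the conversation would correspond to holding the random-number draws fixed while updating the fundamental parameters, and vice versa, rather than re-drawing everything on every trial as this naive sketch does.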
Kendrick Smith (Perimeter) gave the astrophysics seminar today. He talked about non-Gaussianity. Various impressive things in his talk. One was the fact that they can measure, at high signal-to-noise, even four-point functions in the Planck data, and confidently see expected deviations from the Gaussian. This may be the only data set in the physical sciences in which a four-point function can be measured at very high precision. Another impressive thing is that, in the context of various inflationary scenarios (in which small non-Gaussian effects arise), he can evaluate the probability of a non-Gaussian initial condition, and therefore use MCMC to generate a realization. He is thinking in terms of initializing n-body simulations, but this is also critical for CMB data analysis: It means, potentially, that he could do full, non-approximate inference of inflationary parameters given the CMB data. It was a great talk, filled with good ideas about using very good mathematical physics to tractably connect theory to observations.
At group meeting, Sanderson asked us about potential statistics we could be using to see the effect of dark-matter substructure on old accreted stellar debris in the Milky Way. She is thinking about hot, old structures, not cold, young streams. In the afternoon, Malz and I talked about redshift likelihoods and photometric redshifts, and about the overlap between what we have been thinking about and HETDEX, with the thought that he might become a collaborator on that project, given his past work there.
I had a great lunch today with the graduate students at Penn State. What a great group of students, many of whom are doing things that are sophisticated along directions of computation, hardware, and inference, all of which I love! After lunch I had many conversations, one highlight of which was Alex Hagen (PSU) and I ranting about "upper limits" (argh), and another of which was Runnoe (PSU) showing me very exciting results on possible black-hole–black-hole binaries. She has a set of systems where the broad line is shifted far from the narrow lines, and appears to have a relative acceleration over time. That's exactly what we were looking for with Decarli and Tsalmantza a couple of years ago!
Late in the day I gave my seminar, which was my new one about data-driven models and The Cannon. The questions were great; PSU is a place where the crowd knows all about hardware, statistics, and data analysis! A theme of the questions and answers was that we should be thinking about spectral representations that respect our beliefs about how spectra are causally generated (by absorption lines of finite line-spread function) rather than the straight wavelength-pixel domain. After the seminar, the conversation over dinner with faculty ranged around the modern research university and its effective love-hate relationship between faculty and administrators!
Today was my first of two days at Penn State, hosted by Caryl Gronwall (PSU). I had many interesting conversations, too many to mention in detail. Some highlights were the following:
I had lunch with the HETDEX team, who updated me on the project status, and described some of the data properties. We discussed ways you might extract small signals from the enormous numbers of sky fibers that they will have. One point of philosophy: Imaging provides very good photometric measurements; spectroscopy usually does not. Why the difference? It is primarily because images have lots of blank regions where sky can be estimated and very precisely removed, and the point-spread function can be observed. HETDEX will have spectra in a big grid on the sky, so it will have both of these properties, but in a spectrograph. It might end up making some of the most precise spectrophotometric measurements ever!
Bastien (PSU) and I talked about her photometric-variability method for estimating stellar gravities. We made a prediction that the variability amplitude (which varies non-linearly with logg) might vary linearly with g. She promised to test that and get back to me.
Brandt (PSU) and I discussed the frightening situation with funding, hardware building, projects, and discovery in astronomy. How do we make sure we bring up the next generation of instrument builders if we can't keep funding hardware teams through the standard channels? I have some worries about the future of the profession: Once grant acceptance rates go below some value, the culture (and tenure success rate) might change dramatically, and the profession might lose important skills and people and ideas. We also talked about data-driven models of quasars!
Foreman-Mackey gave the brown-bag talk today. He described the method by which he and Ben Montet (Harvard) and others (including me) have found 20-ish new exoplanets in the K2 data. He is writing up the paper now, and fast (I hope).
The key technology is fitting a model for the systematics simultaneously with fitting the exoplanet transit model, for both search and characterization. This is to be contrasted with "fitting and subtracting" a systematics model prior to search. Fit-and-subtract is very prone to over-fitting; most such systems avoid over-fitting by severely restricting the freedom of the systematics model. If you fit the systematics and exoplanet simultaneously, the systematics will not "over-fit" or reduce the amplitude of the (already weak) exoplanet signals, even if the systematics are given a huge amount of freedom (as we give them). If, furthermore, you marginalize out the systematics (as we do), the method is very conservative with respect to the systematics model and search should be close to optimal (inasmuch as your systematics model is a good model for the data).
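A linear toy version of the simultaneous-fit point, with a made-up polynomial systematics basis and a box-shaped transit template (all numbers here are illustrative, and this is far simpler than the actual K2 machinery):

```python
import numpy as np

rng = np.random.default_rng(0)

# Fake light curve: smooth systematics plus a shallow box transit.
t = np.linspace(0, 10, 500)
transit = -np.where((t > 4.8) & (t < 5.2), 1.0, 0.0)  # box template, depth 1
true_depth = 1e-3
systematics = 0.01 * np.sin(0.7 * t) + 0.005 * t
flux = 1.0 + systematics + true_depth * transit + rng.normal(0, 1e-4, t.size)

# Flexible (made-up) systematics basis: polynomials in time.
B = np.vander(t / t.max(), 10)

# Simultaneous fit: the transit template is just one more column, so the
# systematics basis cannot silently absorb the transit signal.
A = np.column_stack([B, transit])
coeffs, *_ = np.linalg.lstsq(A, flux, rcond=None)
print(coeffs[-1])  # recovered depth, near 1e-3
```

With a much more flexible basis, fit-and-subtract (regressing out the systematics first, then searching the residuals) would absorb part of the dip and bias the depth low; the simultaneous fit does not have that failure mode.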
The upshot is that they find many exoplanets, multiplying the known yield from K2 by a factor of five, and finding some habitable-zone candidates. Also many eclipsing binaries. Very exciting stuff!
It was great to have John Johnson (Harvard), Andrew Vanderburg (Harvard), Ben Montet (Harvard), and Ruth Angus (Oxford & Harvard) all at CampHogg group meeting today! Each of these brought results to discuss. Vanderburg blew us away with a two-planet system discovered in the K2 data that is in a 3:2 resonance, but so precisely lined up with our line of sight that the transits are almost perfectly coincident in time! Incredible; so unlikely that we discussed the possibility that one of the bodies is an artificial planet placed there by the alien technologists to send us a signal! Johnson showed us a binary-star gravitational lens that is so close that all components can be spectroscopically monitored to compare lens-based inferences with radial-velocity inferences. Perfect agreement!
After show-and-tell, Montet and Foreman-Mackey discussed the state of their K2 search-and-characterization work, and the scope of the first paper, which they spent the rest of the day working on. One thing they mentioned was a brilliant idea from Tim Morton (Princeton) to look for what are known as "astrophysical" false positives: Apply their exoplanet transit depth measurement method not just to the brightness of the star, but also to the x and y position measurements of the star: If the transit appears (gets a finite "depth") in the position measurements, then it is probably a blend with a background star. Beautiful idea.
Along those same lines, we discussed the relationship between the systematics removal of Dun Wang and Foreman-Mackey (using stars to model stars) and that of Vanderburg (using centroid measurements to model stars) and how they are different and the same. I made my counter-intuitive point that the centroid of a star might be encoded at higher signal-to-noise in its brightness than in its actual, direct centroid measurements! This is related to my Kepler-thermometer project idea.
We spent the rest of the day hacking, on exoplanets, writing, and asteroseismology.
John Johnson and Ben Montet arrived from Harvard today and Tim Morton arrived from Princeton for serious hacking, and Johnson gave our physics colloquium. He spoke about how exoplanets are found and characterized, and the point that if you want to characterize the exoplanets precisely, you must also characterize the stars precisely. Amen! We recognized at the end of the day that we should be talking about The Cannon but didn't get a chance to connect on that. Tomorrow is hacking day, with his group (Exolab) and my group (CampHogg) all in one room!
At group meeting, Huppenkothen introduced us to methods for sampling "doubly intractable" Bayesian inference problems. The problem (and solution) in question is a variable-rate Poisson problem, where you have Poisson-distributed objects (like photons) arriving according to a mean rate that is varying with time, where that rate function is drawn from another process, in this case a Gaussian Process (taken through a function to make it non-negative). The best methods at present involve instantiating a lot of additional latent variables and then doing something like Gibbs sampling in the joint distribution of the parameters you care about and the newly introduced latent variables. We didn't understand everything about these complicated methods, but one of the authors, Iain Murray (Edinburgh), will be visiting the group next month, so we plan to make him talk.
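We did at least understand the forward model. Here is a minimal sketch of simulating a variable-rate (inhomogeneous) Poisson process by thinning, with a made-up rate function standing in for a link-transformed Gaussian Process draw:

```python
import numpy as np

rng = np.random.default_rng(1)

def rate(t):
    # Made-up non-negative rate function, standing in for a GP draw
    # pushed through a link function (e.g. an exponential).
    return 5.0 + 4.0 * np.sin(t)

# Thinning: simulate a homogeneous Poisson process at the maximum rate,
# then keep each candidate event with probability rate(t) / rate_max.
rate_max, T = 9.0, 100.0
n = rng.poisson(rate_max * T)
candidates = np.sort(rng.uniform(0.0, T, size=n))
keep = rng.uniform(size=n) < rate(candidates) / rate_max
events = candidates[keep]

print(events.size)  # near the integral of rate(t), here ~500
```

The rejected candidates are exactly the kind of extra latent variables the doubly-intractable samplers instantiate: conditioning on them makes the otherwise-intractable likelihood tractable.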
Angus arrived for a few days of hacking and we talked about our super-Nyquist asteroseismology projects. We also started email conversations with the authors of this paper and this paper, both of which are impressive for their pedagogical presentation (as well as their results).
Today was the first day of Tuesday-morning hacking (as #astrohackny) up at Columbia, with a large crew, led by Adrian Price-Whelan. We discussed what we want to accomplish, what format to adopt, and what data-sets we might play with. The idea is to get work done, learn things, and hack as a group. Last year we did something similar as #nycastroml. Of the data sets we chose to make our principal, agreed-upon hacking data sets, I am most excited about the Planck data. Unfortunately, I am going to miss the next two meetings, but I have high hopes! We start by introducing the data sets and then we will move to pitching, refining, and executing projects on those data sets. We will also spend time getting our own personal projects done; it is partially a "parallel working" time.
On the train back downtown, I had a great conversation with Huppenkothen, Walsh, Vakili, and Ryan about scientific programming and programming languages. We discussed the conditions under which it would make sense to change languages (for example to Julia, the new language of hipness). I argued that there will never be a time during which it is obvious what language to be working in, especially if your data analysis is in any way cutting edge. I also argued that performance matters, even if you are only going to run your code once: The development cycle is unbearable when code is slow!
As my loyal reader knows, I am getting all interested in measuring stellar oscillations—asteroseismology—in data whose integration times (or sampling intervals) are too long. For example, G dwarfs have oscillation periods in the 5-minute range, whereas the Kepler data are (by and large) 30-min exposures on 30-min centers. The Kepler data are typical for astronomy, but perhaps not typical examples for "Nyquist sampling" problems, in part because the exposures are integrations (or projections or finite-time averages) rather than samples of the stellar time series that we care about, and in part because the finite-diameter spacecraft orbit makes the periodic-in-spacecraft-time sampling aperiodic in barycentric time.
The integration point hurts me (it attenuates the amplitude of the super-Nyquist signal) but the aperiodicity helps. I discovered today, however, that I don't even need the aperiodicity: All I need is a good model (causal model, I probably should say) of how the data are generated by the stellar signal. I find that if I properly model the integration time, I can see the short-period signals in the data (or at least in Kepler-like fake data). This isn't surprising; it is like "side-band" frequency information in the standard Nyquist case. The key idea behind all this is that we are not ever going to take a Fourier transform or a Lomb-Scargle periodogram; these tools give you the frequencies in the integration-time-convolved stellar signal. We (Angus, Foreman-Mackey, and I) are going to model the stellar signal prior to convolution with the exposure time window.
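A toy version of the experiment, with made-up numbers (a 320 s oscillation observed in 1800 s integrations on 1800 s centers, so well above the naive Nyquist frequency): the finite exposure attenuates the sinusoid by a sinc factor, and a fit that includes that factor recovers the true amplitude where a naive fit does not.

```python
import numpy as np

rng = np.random.default_rng(3)

# Made-up super-Nyquist setup: 320 s period, 1800 s exposures and centers.
period, dt = 320.0, 1800.0
f = 1.0 / period
t = dt * np.arange(500) + 0.5 * dt   # exposure mid-times
A_true = 1.0

# The exact finite-time average of a sinusoid over each exposure is the
# mid-time sinusoid attenuated by sinc(f * dt).
atten = np.sinc(f * dt)              # numpy: sinc(x) = sin(pi x) / (pi x)
flux = A_true * atten * np.sin(2 * np.pi * f * t) + rng.normal(0, 0.01, t.size)

def fit_amplitude(basis_scale):
    # Linear fit of sin/cos at the known frequency. basis_scale = 1
    # ignores the integration; basis_scale = atten models it.
    A = basis_scale * np.column_stack(
        [np.sin(2 * np.pi * f * t), np.cos(2 * np.pi * f * t)])
    coeffs, *_ = np.linalg.lstsq(A, flux, rcond=None)
    return np.hypot(*coeffs)

print(fit_amplitude(1.0))    # naive: badly attenuated amplitude
print(fit_amplitude(atten))  # integration-aware: recovers ~1.0
```

The signal survives the long exposures; it is only attenuated, and a generative model that includes the exposure-time convolution gets the pre-convolution amplitude back, which is exactly why we refuse to work with the Fourier transform of the integrated data.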