Today Foreman-Mackey and I arrived in Tübingen to work with Schölkopf. On arrival, we got Dun Wang on the phone, because our trip to MPI-IS is designed to make huge progress on Wang's recalibration of the Kepler satellite detector pixels, using the variations that are found in common across stars. The way Schölkopf likes to say it is that we are capitalizing on the causal structure of the problem: If stars (or, really, pixels illuminated by stars) co-vary, it must be because of the telescope, since the stars are causally disconnected. The goal of our work on this is to increase the sensitivity of the satellite to exoplanet transits.
We opened the day with two questions: The first was about why, despite this causal argument, we seem to be able to over-fit or fit out stellar variability. We are being careful with the data (using a train-and-test framework) to ensure that no information about the short-term variability of the star near any putative transit is leaking into the training of the predictive model. My position is that our set of predictor stars might span the full basis of anything a star can do, so some linear combination of them can reproduce the target star's own variability even without any leakage. We are using thousands of stars as features!
The second question was about why, in our residuals, there seems to be some trace of the spacecraft variability. We don't know that for sure, but just at an intuitive visual level it looks like the fitting process is not only failing to remove the spacecraft signal but actually increasing the calibration "noise". We started Wang on tests of various hypotheses, and put Foreman-Mackey on trying models that are far more flexible than Wang's purely linear model.
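For concreteness, here is a minimal sketch of the kind of train-and-test linear prediction discussed above. It is not Wang's code. The function name `predict_excluding_window`, the ridge regularization, the window size, and the synthetic data are all assumptions made for illustration. The idea is to predict the target star at one time from the other stars at that same time, with weights trained only on times far from the test time, so that a transit there cannot leak into the fit.

```python
import numpy as np

def predict_excluding_window(target, predictors, test_idx, half_window, ridge=1e-3):
    """Predict target[test_idx] as a linear combination of the other stars.

    Hypothetical helper: the weights are trained only on times farther than
    half_window from test_idx, so nothing near a putative transit is used.
    """
    times = np.arange(target.shape[0])
    train = np.abs(times - test_idx) > half_window
    A = predictors[train]          # (n_train, n_stars) design matrix
    y = target[train]
    n_stars = A.shape[1]
    # Ridge-regularized normal equations (the ridge term is an assumption;
    # with many stars as features, some regularization keeps the solve stable).
    weights = np.linalg.solve(A.T @ A + ridge * np.eye(n_stars), A.T @ y)
    return predictors[test_idx] @ weights

# Synthetic data (assumed): every star sees the same spacecraft systematics,
# scaled by its own gain, plus a little independent noise.
rng = np.random.default_rng(0)
n_times, n_stars = 500, 50
t = np.arange(n_times)
systematics = np.sin(2 * np.pi * t / 100.0) + 0.5 * np.cos(2 * np.pi * t / 37.0)
gains = rng.uniform(0.5, 1.5, size=n_stars)
predictors = np.outer(systematics, gains) + 1e-4 * rng.standard_normal((n_times, n_stars))

# Target star: the same systematics plus an injected transit-like dip.
depth = 0.01
target = 0.8 * systematics + 1e-4 * rng.standard_normal(n_times)
target[250:256] -= depth

test_idx = 252
prediction = predict_excluding_window(target, predictors, test_idx, half_window=20)
residual = target[test_idx] - prediction
print(residual)  # should sit near the injected dip if the systematics are removed
```

In this toy setup the predictor stars carry only the shared systematics, so the dip survives in the residual. The over-fitting worry is the case where the predictor stars also span the target's intrinsic variability, which this sketch does not model.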