I worked on Gaussian Processes with Foreman-Mackey today. In our formulation of the "search Kepler data for transiting planet candidates" problem, what is usually called "the model" of the transit sets (in our formulation) the mean of a Gaussian Process, and the photon noise, stellar variability, and spacecraft-induced variability is all subsumed into a covariance function that sets the variance tensor of that Process. The nice thing about this formulation is that the estimation or inference is just what you would do with linear chi-squared inference (optimization or sampling or whatever you like), but just with a much more complicated covariance matrix. In particular, the covariance matrix is now nothing like diagonal.
We are off the usual path of Gaussian Processes, but only because we have a non-trivial function for the mean of the Process. That isn't much of a change, but it is rarely mentioned in the usual texts and descriptions of GPs. I think the fundamental reason for this is that GPs are usually used as incredibly flexible models for barely understood data. In typical astronomical (and physical) problems, there are parts of the problem that are barely understood and require flexible models, but other parts that are very well understood and require pretty rigid, physically motivated models. In our use of GPs, we get both at once: The mean function is set by our physical model of transiting exoplanet; the variance function is set to capture as well as possible all the variability that we don't (officially) care about.