2011-06-04

philosophy

On the plane home from La Palma, I read this paper (PDF) by Gelman and Shalizi on the philosophy behind statistics. They, despite being Bayesians, argue that since we don't literally believe any of our models to be True, with a capital T (and I argued the same here), the literal probability interpretation of Bayesian reasoning is flawed. They argue that all the important moments in statistics occur at the points of model checking, model investigation, and the choice of which models to compute in the first place. Gelman and Shalizi's paper was a great post-meeting read; now, enough philosophy and back to work!

4 comments:

  1. Thanks for this: Pr(T(d_rep)>T(d)|x,H) is going to provide a lot of comfort and joy, I predict (see the sketch after these comments)! Also, it's true: while the Bayes factor is clearly the "right thing" (or rather, the only available thing) to compute when you have two and only two models, that situation does seem unrealistic in practice. And yes, it does make me uneasy enunciating Pr(H|d)... More backtracking and rethinking to come on Twitter!

  2. That was an interesting read. Thanks for the link. I'm not sure where I stand in these debates, because I tend to agree with everyone after hearing what they say. I think a lot of these issues can be clarified by distinguishing between pragmatism and principle.

    Model checking as advocated in this paper is clearly useful and easy, but it is on the pragmatic side. There's lots to question about it. Why should we compare the data to the posterior predictive distribution p(d'|d) and not the prior predictive p(d)? How do we choose the test statistic to test for outlieriness, and why are we back to p-values again? After all, isn't this just a high-dimensional version of having some prior p(x) and then learning that x_true was actually somewhere where p(x) was low? We know how to deal with that, right?

    All of this seems like an ad-hoc way to detect the fact that your model (whatever hypothesis space and prior probabilities you actually write down [I consider p(H), p(theta|H), and p(D|theta,H) all to be prior probabilities]) is often a crap representation of your actual prior beliefs; i.e., the probabilities you've plugged into your Jaynes-robot aren't ones you would actually agree with if you elaborated their consequences.

    So the in-principle solution, then, would be to calculate the consequences of your model and make sure you agree with all of them. If you've established that, then once you restrict to a single data set, you'll agree with the conclusions because you already agreed with them. :-)

  3. I suspect Cosma Shalizi might be a bit amused to be described as a "Bayesian", given his tendency to be somewhat skeptical of Bayesian statistics. But I certainly agree that it's an interesting read.

  4. Peter: Good point! My apologies to Shalizi.

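For concreteness, here is a minimal sketch of the posterior predictive check referred to in the first comment, the quantity Pr(T(d_rep) > T(d) | d, H) computed by simulation. The toy Gaussian model, the choice of maximum-datum test statistic, and all variable names are my own illustrative assumptions, not anything prescribed by the paper or the commenters.

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy "observed" data: 50 draws from a unit-variance Gaussian with unknown mean.
d = rng.normal(0.3, 1.0, size=50)
n = len(d)

# Posterior samples for the mean mu under a flat prior: Normal(mean(d), 1/sqrt(n)).
mu_post = rng.normal(d.mean(), 1.0 / np.sqrt(n), size=5000)

# Test statistic T(.): the largest datum, a crude check for outliers.
def T(data):
    return np.max(data)

# Draw replicated data sets d_rep from the posterior predictive and compare T values.
T_obs = T(d)
T_rep = np.array([T(rng.normal(mu, 1.0, size=n)) for mu in mu_post])

# Posterior predictive p-value: Pr(T(d_rep) > T(d) | d, H).
p_value = np.mean(T_rep > T_obs)
print(f"posterior predictive p-value: {p_value:.3f}")
```

A p-value near 0 or 1 would flag the chosen statistic as poorly reproduced by the model; a middling value is the "comfort and joy" the first comment anticipates. The second comment's worries apply directly here: the conclusion depends on which test statistic you pick and on using the posterior (rather than prior) predictive distribution.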