Various tasks involved in the re-start of the academic year took out my research time today. But I did have a productive conversation with Alex Malz (NYU) about his current projects and priorities. One question that Malz asked is: Imagine you have various Bayesian inference methods or systems, each of which performs some (say) Bayesian classification task. Each inference outputs probabilities over classes. How can you tell which inference method is the best? That's a hard problem! If you have fake data, you could ask which puts the highest probabilities on the true answers. Or you could ask which does best when used in Bayesian decision theory, with some actions (decisions) and some utilities, or a bag of actors with different utilities. After all, different kinds of mistakes cost different actors different amounts! But then how do you tell which inference is best on real (astronomical) data, where you don't know what the true answer is? Is there any strategy? Something about predicting new data? Or is there something clever? I am out of my league here.
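To make the fake-data part of that concrete, here is a toy sketch (the two "methods", the utility matrix, and all the numbers are invented for illustration): two imaginary classifiers that return class probabilities, scored first by the mean log probability they assign to the true class, and then by the utility they realize when their probabilities are pushed through Bayes-optimal decisions under a made-up utility matrix.

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy setup: N fake objects, K classes, and two imaginary "inference
# methods" that each return an (N, K) matrix of class probabilities.
N, K = 1000, 3
truth = rng.integers(0, K, size=N)

def sharp_method(truth, rng):
    # Confident: puts 0.9 on a single class, usually (not always) the true one.
    probs = np.full((truth.size, K), 0.05)
    hit = rng.random(truth.size) < 0.8
    guess = np.where(hit, truth, rng.integers(0, K, size=truth.size))
    probs[np.arange(truth.size), guess] = 0.9
    return probs

def broad_method(truth, rng):
    # Hedging: puts 0.5 on the true class and spreads the rest evenly.
    probs = np.full((truth.size, K), 0.5 / (K - 1))
    probs[np.arange(truth.size), truth] = 0.5
    return probs

methods = {"sharp": sharp_method(truth, rng), "broad": broad_method(truth, rng)}

# Criterion 1: mean log probability assigned to the true class (log score).
for name, probs in methods.items():
    logscore = np.mean(np.log(probs[np.arange(N), truth]))
    print(f"{name}: mean log score = {logscore:.3f}")

# Criterion 2: mean utility realized when each method's probabilities are
# used to pick the action that maximizes posterior expected utility, under
# a made-up utility[action, true_class] matrix.
utility = np.array([[ 1.0, -5.0, -1.0],
                    [-1.0,  1.0, -1.0],
                    [-0.5, -0.5,  0.5]])
for name, probs in methods.items():
    actions = np.argmax(probs @ utility.T, axis=1)
    print(f"{name}: mean realized utility = {utility[actions, truth].mean():.3f}")
```

The log score rewards well-calibrated confidence, while the utility criterion can rank the same two methods differently, which is the point about different mistakes costing different actors different amounts.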
An 'inference method' is just a model, so isn't this just a job for Bayesian model comparison?
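For what that would look like in the simplest possible case, here is a toy sketch of Bayesian model comparison (the binomial data set and both models are invented for illustration): two models with analytic marginal likelihoods, compared by their Bayes factor.

```python
import numpy as np
from scipy.special import betaln, gammaln

# Invented data: k successes in n trials.
n, k = 40, 29
log_binom = gammaln(n + 1) - gammaln(k + 1) - gammaln(n - k + 1)

# Model 1: success probability fixed at 0.5.
log_evidence_1 = log_binom + n * np.log(0.5)

# Model 2: success probability ~ Uniform(0, 1); integrating the binomial
# likelihood over that prior gives a Beta function analytically.
log_evidence_2 = log_binom + betaln(k + 1, n - k + 1)

print(f"log evidence, fixed-0.5 model: {log_evidence_1:.2f}")
print(f"log evidence, uniform-prior model: {log_evidence_2:.2f}")
print(f"Bayes factor (model 2 / model 1): {np.exp(log_evidence_2 - log_evidence_1):.1f}")
```

On real data the marginal likelihoods would rarely be analytic, but the comparison would have the same form.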
Or if, like Gelman, you're not a fan of model comparison (and I've been coming around to this point for all sorts of practical reasons), can't you do some kind of posterior predictive checking?
I admit I haven't thought about these issues in the context of classification, so it may be more complicated when you've got a series of categorical (discrete) parameters.
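As a concrete picture of what a posterior predictive check buys you when no true labels exist, here is a minimal sketch (the "real" data, both stand-in posteriors, and the test statistic are all invented): draw replicated data from each method's posterior predictive, compute a test statistic on each replication, and ask how often it is at least as extreme as the observed value; a p-value piled up near 0 or 1 flags a method whose model cannot reproduce the data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Invented "real" data with no known truth.
y = 1.0 + 2.0 * rng.standard_t(df=3, size=200)

# Stand-ins for two methods' posterior samples of (mu, sigma) under a
# Gaussian model for y; in practice these would come from the actual
# inference methods being compared.
post_a = {"mu": rng.normal(np.mean(y), 0.15, size=2000),
          "sigma": np.abs(rng.normal(np.std(y), 0.15, size=2000))}
post_b = {"mu": rng.normal(np.mean(y), 0.15, size=2000),
          "sigma": np.abs(rng.normal(0.5 * np.std(y), 0.15, size=2000))}

def ppc_pvalue(post, y, rng, stat=lambda d: np.max(np.abs(d - np.mean(d)))):
    # Fraction of posterior predictive replications whose test statistic is
    # at least as extreme as the observed one.
    t_obs = stat(y)
    t_rep = np.array([stat(rng.normal(mu, sigma, size=y.size))
                      for mu, sigma in zip(post["mu"], post["sigma"])])
    return np.mean(t_rep >= t_obs)

for name, post in [("method A", post_a), ("method B", post_b)]:
    print(f"{name}: posterior predictive p-value = {ppc_pvalue(post, y, rng):.3f}")
```

The check never needs the true classes; it only asks whether each method's model could plausibly have generated the data it was fit to.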
I think this is correct. And to make the decision, you (probably) have to integrate utilities.