2011-07-22

post 1500, hierarchical classification

I spent much of the day working on toy examples for, and the presentation for, my MPIA Hauskolloquium on classification. I argued that marginalized likelihood is the way to go (before applying your utility, that is, your long-term future discounted free cash flow model), but that for it to work well you need to learn the relevant priors from the data you are classifying. That is, if you are working at the bleeding edge (as you should be), the most informative data set you have (about, say, star–galaxy classification at 29.5 magnitude) is the data set you are using; if it isn't: change data sets! Put another way: any labeled data you have to use for classification (or any priors you have) are based on much smaller, or much worse, data sets. So for my seminar I argued that you should just learn the priors hierarchically as you go. I demonstrated that this program works extremely well, in realistic demos that involve fitting with wrong and incomplete models (as we do in the real world).
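To make the idea concrete, here is a minimal sketch in Python (my own toy construction, not the actual demos from the talk): two classes, star and galaxy, with fixed and deliberately slightly wrong per-class likelihoods, and a class prior that is learned hierarchically from the very data set being classified, via expectation-maximization. All feature names, distributions, and numbers below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)

# Simulate a faint sample: 20% stars, 80% galaxies, observed
# through a single noisy "size" feature (purely illustrative).
n = 5000
true_is_star = rng.random(n) < 0.2
size = np.where(true_is_star,
                rng.normal(0.0, 0.3, n),   # stars: unresolved
                rng.normal(1.0, 0.5, n))   # galaxies: extended

def gaussian(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

# Per-class likelihoods p(size | class); mis-specified widths on
# purpose, to mimic fitting with wrong and incomplete models.
like_star = gaussian(size, 0.0, 0.4)
like_gal = gaussian(size, 1.0, 0.4)

# Hierarchical step: learn the prior P(star) from these data by EM,
# instead of importing it from a smaller, shallower labeled set.
p_star = 0.5  # deliberately wrong starting guess
for _ in range(200):
    # E-step: posterior probability that each object is a star,
    # using the marginalized (mixture) likelihood.
    post_star = (p_star * like_star
                 / (p_star * like_star + (1.0 - p_star) * like_gal))
    # M-step: update the prior to the mean responsibility.
    p_star = post_star.mean()

print(f"learned prior P(star) = {p_star:.3f} (truth here: 0.2)")
print(f"fraction with posterior > 0.5: {(post_star > 0.5).mean():.3f}")
```

With the prior learned in place, the per-object class probabilities come straight out of the marginalized likelihood; no external labeled set at this depth is required.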

[This is post 1500. And this exercise of daily blogging is still a valuable (to me) part of my practice.]
