supervised vs unsupervised

At lunch (we found a Thai place where the food was actually hot enough to satisfy Muandet), CampHogg discussed with Marshall the differences between, and meanings of, supervised and unsupervised methods in machine learning. In supervised learning, you (usually) want to obtain a probability for a label given the data, by looking at many labeled instances (the training data). This can proceed without producing anything like a generative model for the data. I argued that, while supervised methods have roles to play in astronomy, no important high-level problem is really a supervised problem, for several reasons. One is that science, at a high level, rarely involves label transfer or classification; classification is usually done in service of something more important. Another is that the main goal is to make discoveries, not to classify instances into known, already-discovered classes. Another is that science really is about creating causal, generative models of the data, while supervised learning explicitly deprecates that (in many cases, though not all, I should say; XDQSO and extreme deconvolution are counter-examples).
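
To make the distinction in the paragraph above concrete, here is a minimal sketch on a one-dimensional toy problem. It is not XDQSO, extreme deconvolution, or anything we actually ran; every function name, the toy data, and the choice of one Gaussian per class are assumptions made purely for illustration. The first model (logistic regression) only ever learns a probability for the label given the data. The second fits a density for the data given each label, so it can also generate fake data, and it gets the label probability through Bayes' rule.

```python
import math
import random


def make_toy_data(n_per_class=200, seed=0):
    """Hypothetical labeled training set: two overlapping 1-D Gaussian classes."""
    rng = random.Random(seed)
    xs = [rng.gauss(-1.0, 1.0) for _ in range(n_per_class)]
    xs += [rng.gauss(2.0, 1.0) for _ in range(n_per_class)]
    ys = [0] * n_per_class + [1] * n_per_class
    return xs, ys


def fit_logistic(xs, ys, steps=2000, lr=0.1):
    """Discriminative: gradient descent on the logistic loss for p(label=1 | x)."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad_w, grad_b = 0.0, 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))
            grad_w += (p - y) * x
            grad_b += p - y
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


def p_label_discriminative(x, w, b):
    """p(label=1 | x) from the logistic model; there is no model of x itself."""
    return 1.0 / (1.0 + math.exp(-(w * x + b)))


def fit_gaussians(xs, ys):
    """Generative: per-class (mean, variance, prior), i.e. p(x | label) p(label)."""
    params = {}
    for label in (0, 1):
        vals = [x for x, y in zip(xs, ys) if y == label]
        mean = sum(vals) / len(vals)
        var = sum((v - mean) ** 2 for v in vals) / len(vals)
        params[label] = (mean, var, len(vals) / len(xs))
    return params


def gaussian_pdf(x, mean, var):
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def p_label_generative(x, params):
    """p(label=1 | x) obtained from the generative model via Bayes' rule."""
    joint = {k: prior * gaussian_pdf(x, m, v) for k, (m, v, prior) in params.items()}
    return joint[1] / (joint[0] + joint[1])


def sample_generative(params, rng):
    """Draw a synthetic (x, label) pair; the logistic model cannot do this."""
    label = 1 if rng.random() < params[1][2] else 0
    mean, var, _ = params[label]
    return rng.gauss(mean, math.sqrt(var)), label


if __name__ == "__main__":
    xs, ys = make_toy_data()
    w, b = fit_logistic(xs, ys)
    params = fit_gaussians(xs, ys)
    print("discriminative p(1 | x=0.5):", p_label_discriminative(0.5, w, b))
    print("generative     p(1 | x=0.5):", p_label_generative(0.5, params))
    print("synthetic draw:", sample_generative(params, random.Random(1)))
```

On a problem this easy the two approaches should give similar label probabilities. The difference is in what else you get: only the second one has anything to say about what the data themselves ought to look like.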

Very late in the day I wrote text for what I hope will be the Astronomical Journal write-up of the image-combination and rank-statistics project we did for NIPS (which we finished so close to the deadline last week).
