2019-07-12

adversarial attacks on machine-learning methods

Today, in a surprise visit, Bernhard Schölkopf (MPI-IS) appeared in Heidelberg. We discussed many things, including his beautiful pictures of the total eclipse in Chile last week. But one thing that has been a theme of conversation with Schölkopf since we first met is this: Should we build models that go from latent variables or labels to the data space, or should we build models that go from the data to the label space? I am a big believer—on intuitive grounds, really—in the former: In physics contexts, we think of the data as being generated from the labels. Schölkopf had a great idea for bolstering my intuition today:

A lot has been learned about machine learning by attacking classifiers with adversarial attacks. (And indeed, on a separate thread, Kate Storey-Fisher (NYU) and I are attacking cosmological analyses with adversarial attacks.) These adversarial attacks exploit the respects in which deep-learning methods over-fit, producing inputs that are absurdly mis-classified. Such attacks work when a machine-learning method is used to provide a function that goes from the data (which are huge-dimensional) to the labels (which are very low-dimensional). When the model goes from labels to data (it is generative), or from latents to data (same), these adversarial attacks cannot be constructed.
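To make the data-to-labels failure mode concrete, here is a minimal sketch of a fast-gradient-sign-style attack on a toy linear classifier in plain NumPy. Everything in it (the synthetic data, the function name adversarial_example, the perturbation budget eps) is my own illustrative choice, not anything from the conversation above; it only shows the mechanism: a tiny nudge per coordinate of the huge-dimensional input, chosen adversarially, can flip the low-dimensional output.

```python
import numpy as np

# Toy, fully synthetic demonstration (not from the post): a fast-gradient-sign-style
# attack on a logistic-regression classifier.  All names and numbers are illustrative.

rng = np.random.default_rng(0)

n, d = 200, 1000                      # few examples, huge-dimensional data space
w_true = rng.normal(size=d)
X = rng.normal(size=(n, d))           # the "data"
y = (X @ w_true > 0).astype(float)    # the "labels"

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Fit the data -> label direction: logistic regression by gradient descent.
w = np.zeros(d)
for _ in range(200):
    p = sigmoid(X @ w)
    w -= 0.1 * (X.T @ (p - y)) / n

def adversarial_example(x, label, eps=0.1):
    """Perturb x by eps * sign(d loss / d x): a tiny step in every coordinate,
    all aligned against the classifier's weights."""
    grad_x = (sigmoid(x @ w) - label) * w   # gradient of the logistic loss w.r.t. x
    return x + eps * np.sign(grad_x)

X_adv = np.array([adversarial_example(x, label) for x, label in zip(X, y)])
clean_acc = np.mean((sigmoid(X @ w) > 0.5) == y.astype(bool))
adv_acc = np.mean((sigmoid(X_adv @ w) > 0.5) == y.astype(bool))
print("accuracy on clean data     :", clean_acc)
print("accuracy on perturbed data :", adv_acc)
print("max per-coordinate change  :", np.max(np.abs(X_adv - X)))
```

The point of the sketch is the asymmetry of dimensions: the attacker gets to choose a thousand tiny, individually negligible nudges, all conspiring against a single scalar output.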

We should attack some of the astronomical applications of machine learning with such attacks! Will it work? I bet it has to; I certainly hope so! The paper I want to write would show that when you are using ML to transform your data into labels, it is over-fitting (in at least some respects), but when you are using ML to transform labels into your data, you can't over-fit in the same ways. This all connects to the idea (yes, I am like a broken record) that you should match your methods to the structure of your problem.
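Here is the other half of the contrast, in the same toy spirit: if instead the label generates the data through a forward model and is inferred by maximizing the likelihood, then a perturbation with the same per-coordinate budget can only move the inferred label by a bounded, smooth amount. Again, the forward model, the names (forward_model, infer_theta), and the numbers are my own illustrative choices, a minimal sketch assuming a linear Gaussian model, not anything from the post itself.

```python
import numpy as np

# Toy sketch (not from the post): the label -> data direction as a forward model
# x = theta * a + noise, with the "label" theta inferred by maximum likelihood.

rng = np.random.default_rng(1)
d = 1000
a = rng.normal(size=d)       # known response of the data to the label
sigma = 0.5                  # Gaussian noise level

def forward_model(theta):
    """Generative direction: label -> expected data."""
    return theta * a

def infer_theta(x):
    """Maximum-likelihood label under the Gaussian forward model."""
    return (a @ x) / (a @ a)

theta_true = 2.0
x = forward_model(theta_true) + sigma * rng.normal(size=d)

# Apply the worst-case perturbation with the same per-coordinate budget eps
# as in the classifier attack: the estimate shifts by at most
# eps * ||a||_1 / ||a||_2**2, a bounded, smooth response -- no sudden flip.
eps = 0.1
delta = eps * np.sign(a)     # the most damaging perturbation for this estimator
print("theta_hat (clean)     :", infer_theta(x))
print("theta_hat (perturbed) :", infer_theta(x + delta))
print("worst-case shift bound:", eps * np.sum(np.abs(a)) / (a @ a))
```

At least in this linear toy, the contrast is stark: the classifier's output can be flipped outright by an eps-sized perturbation, while the forward-model estimate responds smoothly and by a predictable, bounded amount, because the inference is tied to the structure of how the data are generated.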
