2019-09-03

finding adversarial examples

One of my projects this Fall (with Soledad Villar) is to show that large classes of machine-learning methods used in astronomy are susceptible to adversarial attacks, while others are not. This relates to things like the over-fitting, generalizability, and interpretability of the different kinds of methods. Now what would constitute a good example problem for adversarial attacks in astronomy? One would be the classification of galaxy images into elliptical and spiral, say. But I don't actually think that is a very good use of machine learning in astronomy! A better use of machine learning is converting stellar spectra into temperatures, surface gravities, and chemical abundances.

If we work in this domain, we have two challenges. The first is to re-write the concept of an adversarial attack in terms of a regression (most of the literature is about classification). The second is to define large families of directions in the data space that cannot possibly be of physical importance, so that we have some kind of algorithmic definition of adversarial. The issue is that most of these attacks in machine learning depend on a very heuristic idea of what's what: The authors look at the images and say “yikes”. But we want to find these attacks more-or-less algorithmically. I have ideas (like capitalizing on either the bandwidth of the spectrograph or else the continuum parts of the spectra), but I'd like to have more of a theory for this.
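
To make the regression version concrete, here is a minimal sketch of one possible formulation, not a settled definition. It assumes a purely linear model (label = w · spectrum + b), a toy continuum-normalized spectrum, and a hypothetical boolean mask marking "continuum" pixels that we are treating as physically uninformative. The weights, the mask, the perturbation budget `eps`, and the function names are all made up for illustration. The attack asks: among perturbations no larger than `eps` per pixel and confined to the masked pixels, which one moves the predicted label the most?

```python
import numpy as np


def predict(w, b, x):
    """Toy linear regression: predicted label from a spectrum x."""
    return w @ x + b


def linear_regression_attack(w, mask, eps):
    """Largest-shift perturbation for a linear model under a per-pixel budget.

    Only pixels where mask is True may be changed, each by at most eps.
    For a linear model the maximizing perturbation is eps * sign(w) on
    the allowed pixels and zero elsewhere.
    """
    w = np.asarray(w, dtype=float)
    mask = np.asarray(mask, dtype=bool)
    return np.where(mask, eps * np.sign(w), 0.0)


rng = np.random.default_rng(42)
n_pix = 200

# Hypothetical trained weights and offset (random here, for illustration only).
w = rng.normal(size=n_pix)
b = 10.0

# Toy continuum-normalized spectrum: flux near 1 with small noise.
x = 1.0 + 0.01 * rng.normal(size=n_pix)

# Assumption: pixels 80..119 hold a spectral line; everything else is continuum.
mask = np.ones(n_pix, dtype=bool)
mask[80:120] = False

eps = 0.005  # per-pixel budget, chosen arbitrarily
delta = linear_regression_attack(w, mask, eps)

# How far the predicted label moves under the attack.
shift = predict(w, b, x + delta) - predict(w, b, x)
print(f"label shift from continuum-only attack: {shift:.4f}")
```

In this linear case the shift is just `eps` times the sum of the absolute weights on the allowed pixels, so a model that puts a lot of weight on continuum pixels is easy to attack along directions that shouldn't matter physically. For a nonlinear model one would presumably replace `sign(w)` with the sign of the gradient of the output with respect to the input, but that only gives a first-order approximation to the worst case.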
