Based on conversations with Soledad Villar, Teresa Huang, Zach Martin, Greg Scanlon, and Eva Wang (all NYU), I worked today on establishing criteria for a successful adversarial attack against a regression in the natural sciences (like astronomy). The idea is that you add a small, irrelevant perturbation u to your data x, and the inferred labels y change by an unexpectedly large amount. To be more specific:
- The squared L2 norm u.u of the vector u should be equal to a small number Q.
- The vector u should be orthogonal to v, your expectation of the gradient dy/dx of the labels with respect to the data.
- The change in the inferred labels at x+u relative to x should be much larger than you would get for a same-length move in the v direction! (See the sketch after this list.)
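Here is a minimal numpy sketch of these three criteria, purely my illustration: the regression f, the data point x, the expected gradient v, and the budget Q are all hypothetical stand-ins for a real trained model and real data. It builds u from the component of the true local gradient that is orthogonal to v (the obvious worst case to first order) and then compares the label change along u to a same-length move along v.

```python
import numpy as np

rng = np.random.default_rng(42)
d = 8          # dimension of the data vector x (made up)
Q = 1e-6       # small squared length: u.u = Q (made up)

# Toy nonlinear regression standing in for a trained model.
W1 = rng.normal(size=(d, d))
w2 = rng.normal(size=d)

def f(x):
    return np.tanh(W1 @ x) @ w2

x = rng.normal(size=d)   # the data point under attack

# True local gradient dy/dx, estimated by central finite differences.
eps = 1e-5
g = np.array([(f(x + eps * e) - f(x - eps * e)) / (2 * eps)
              for e in np.eye(d)])

# v is your *expectation* of the gradient; here, deliberately a noisy
# version of g, since the attack exploits any such mismatch.
v = g + rng.normal(size=d) * np.linalg.norm(g)
vhat = v / np.linalg.norm(v)

# Build u: the part of the true gradient orthogonal to v, rescaled so
# that u.u = Q. By construction u satisfies criteria 1 and 2.
u = g - (g @ vhat) * vhat
u *= np.sqrt(Q) / np.linalg.norm(u)

# Criterion 3: compare the label change along u with a same-length
# move along the expected-gradient direction v.
delta_u = abs(f(x + u) - f(x))
delta_v = abs(f(x + np.sqrt(Q) * vhat) - f(x))
print(f"u.u = {u @ u:.1e}   u.v = {u @ v:.1e}")
print(f"|dy| along u: {delta_u:.3e}   along v: {delta_v:.3e}")
```

The attack counts as successful when the first printed change dwarfs the second; in a real setting one would search over all u satisfying the two constraints (for instance by maximizing |f(x+u) - f(x)|) rather than using this one-shot first-order construction.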