2020-09-23

linear regression sprint

Today Teresa Huang (JHU), Soledad Villar (JHU), and I met for a sprint on a possible paper for the AISTATS conference submission deadline. We are thinking about pulling together results we have on double descent (the phenomenon that models can predict well when they have many more data than parameters, or many more parameters than data, but in between, unless you are careful, they can suck) and results on adversarial attacks against regressions.
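
To make the phenomenon concrete, here is a minimal numpy sketch (not our actual experiments; the sizes, noise level, and random-feature setup are made up for illustration). It fits minimum-norm least squares with a growing number of included features p, and the test error peaks where p crosses the number of data n:

```python
import numpy as np

rng = np.random.default_rng(17)

# Toy setup (hypothetical numbers): a latent linear signal in d features,
# observed with noise; we fit using only the first p features.
n_train, n_test, d = 32, 1000, 64
beta = rng.normal(size=d)
X_train = rng.normal(size=(n_train, d))
X_test = rng.normal(size=(n_test, d))
y_train = X_train @ beta + 0.5 * rng.normal(size=n_train)
y_test = X_test @ beta

for p in [8, 16, 24, 31, 32, 33, 48, 64]:
    # lstsq returns the minimum-norm solution when p > n_train,
    # which is what makes the overparametrized side behave well.
    coef, *_ = np.linalg.lstsq(X_train[:, :p], y_train, rcond=None)
    mse = np.mean((X_test[:, :p] @ coef - y_test) ** 2)
    print(f"p = {p:3d}  test MSE = {mse:10.3f}")
```

The printout shows the characteristic shape: error drops as p grows, blows up near p = n, and comes back down in the interpolating regime.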

I have a fantasy that we can unify a bunch of things in the literature, but I'm not sure. I ended up drawing a figure on my board that might be relevant: the out-of-sample prediction error, the condition number of the data matrix (read off from its SVD), and the susceptibility to “data-poisoning attacks” all depend on the number of data n and the number of parameters p in related ways. Our main conclusion is that regularization (or dimensionality reduction, which amounts to the same thing) is critical.
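
As an illustration of that conclusion (again a toy sketch with arbitrary ridge strengths, not anything from the paper draft): in the dangerous n ≈ p regime the data matrix is badly conditioned, and even a small ridge term tames the fitted coefficients.

```python
import numpy as np

rng = np.random.default_rng(17)
n, p = 32, 32  # the dangerous regime: n comparable to p
X = rng.normal(size=(n, p))
y = X @ rng.normal(size=p) + 0.5 * rng.normal(size=n)

# Condition number of the data matrix: ratio of its extreme singular values.
s = np.linalg.svd(X, compute_uv=False)
print(f"condition number of X: {s[0] / s[-1]:.1f}")

for lam in [0.0, 1e-3, 1e-1, 1.0]:
    # Ridge regression: (X^T X + lam I)^{-1} X^T y; lam = 0 is plain least squares.
    coef = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)
    print(f"lambda = {lam:6.3f}  ||coef|| = {np.linalg.norm(coef):10.2f}")
```

The coefficient norm at lam = 0 is huge (that norm is roughly what a data-poisoning attacker exploits); it shrinks fast as lam grows, which is the sense in which regularization is critical here.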
