2020-10-29

big, huge linear regressions

I spoke (remotely) at CCA today about linear regression (fitting linear models for the purposes of prediction) in the regime where the model has a huge number of parameters. Yes, huge: more than the number of data points! It turns out that even though such a model can thread the data perfectly (your chi-squared will be exactly zero), it can still make good predictions for held-out data. That surprised the crowd, which, in turn, surprised me: Many in this crowd use Gaussian processes and deep learning, both of which have exactly these properties. They have more parameters than data, they can fit any training data perfectly, and yet they still make good, non-trivial predictions on held-out data.
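To make that concrete, here is a minimal NumPy sketch of the idea. It is not the example from the talk: the toy function `truth`, the Fourier feature basis, the frequency-dependent `weights`, the `width` parameter, the noise level, and the random seed are all assumptions made for illustration. It builds a design matrix with 101 parameters for 20 data points, takes the minimum-norm least-squares solution (which `np.linalg.lstsq` returns for underdetermined systems), and then compares the training chi-squared to the error on held-out points.

```python
import numpy as np

rng = np.random.default_rng(17)  # arbitrary seed, for reproducibility only


def truth(x):
    """Hypothetical smooth function that generates the toy data."""
    return np.sin(2 * np.pi * x) + 0.5 * np.cos(6 * np.pi * x)


def design_matrix(x, n_freq=50, width=5.0):
    """Weighted Fourier features: 1 constant + 2 * n_freq sinusoids.

    The weights shrink the high-frequency features, so the minimum-norm
    solution prefers smooth functions. This weighting is an assumption
    of the sketch, not something specified in the post.
    """
    k = np.arange(1, n_freq + 1)
    weights = 1.0 / (1.0 + (k / width) ** 2)
    phase = 2 * np.pi * np.outer(x, k)
    return np.hstack([
        np.ones((len(x), 1)),
        weights * np.cos(phase),
        weights * np.sin(phase),
    ])


# Training set: 20 noisy data points, far fewer than the 101 parameters.
n_train = 20
x_train = rng.uniform(0.0, 1.0, size=n_train)
y_train = truth(x_train) + 0.05 * rng.normal(size=n_train)
X_train = design_matrix(x_train)

# For an underdetermined system, lstsq returns the minimum-norm solution
# among all parameter vectors that fit the training data.
beta, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)

# Chi-squared on the training data, taking unit uncertainties for simplicity.
chi2 = np.sum((y_train - X_train @ beta) ** 2)

# Held-out points drawn from the same interval.
x_test = rng.uniform(0.0, 1.0, size=200)
y_test = truth(x_test)
rms_test = np.sqrt(np.mean((y_test - design_matrix(x_test) @ beta) ** 2))

print(f"data points: {X_train.shape[0]}, parameters: {X_train.shape[1]}")
print(f"training chi-squared: {chi2:.3e}")
print(f"held-out RMS error:   {rms_test:.3e}")
print(f"held-out RMS of truth about its mean: {np.std(y_test):.3e}")
```

The training chi-squared should come out at zero to floating-point precision. How good the held-out predictions are depends on the feature weighting: the minimum-norm choice among the infinitely many perfect fits is what does the regularizing here, so changing `width` or dropping the weights changes which interpolating function you get.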

My slides are here. Should I write something about all this?
