2021-06-24

k near n?

Teresa Huang (JHU) has a nice paper (with Villar and me) that shows the risk and regularization of linear regression involving PCA. We discussed it more today, in particular whether we can say more about the regime in which the PCA dimensionality reduction (to k dimensions) doesn't do much (because k is close to the number of data points n). We think we can, because the Marchenko-Pastur distribution of eigenvalues is so skewed: cutting off even one small eigenvalue (k=n-1) can be useful!
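
Here is a minimal numerical sketch of that intuition; it is not the calculation from the paper. It assumes a Gaussian random design with slightly more features than data points, uses the SVD of the uncentered data matrix as a stand-in for PCA, and compares the regression coefficients at k=n and k=n-1. The sizes, the noise level, and the names `beta_true` and `pcr_coefficients` are all made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(17)
n, p = 200, 220  # hypothetical sizes: n data points, p features (p slightly > n)

# Assumed toy model: iid Gaussian features, linear signal plus noise.
X = rng.standard_normal((n, p))
beta_true = rng.standard_normal(p) / np.sqrt(p)
y = X @ beta_true + 0.5 * rng.standard_normal(n)

# Thin SVD of the (uncentered) data; singular values are sorted descending.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
eigvals = s**2 / p  # eigenvalues of X X^T / p

def pcr_coefficients(k):
    """Least-squares coefficients using only the top-k components."""
    return Vt[:k].T @ ((U[:, :k].T @ y) / s[:k])

norm_full = np.linalg.norm(pcr_coefficients(n))     # k = n (no reduction)
norm_cut = np.linalg.norm(pcr_coefficients(n - 1))  # k = n - 1

print("smallest / median eigenvalue:", eigvals[-1] / np.median(eigvals))
print("coefficient norm, k = n:    ", norm_full)
print("coefficient norm, k = n - 1:", norm_cut)
```

The coefficient norm can only go down when a component is dropped, because each component contributes a non-negative term that scales inversely with its eigenvalue. How much it goes down depends on how small the smallest eigenvalue is, which is where the skewness of the eigenvalue distribution comes in.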
