I never had the guts to post my polemic against principal components analysis anywhere (it was mentioned in passing here previously). In thinking about it over the last year I have come up with various alternatives that are better, more appropriate, and justifiable from a generative-modeling perspective. The simplest of these is the bilinear model: Treat each data point as being close to a linear combination of component spectra, and optimize both the coefficients and the component spectra. The objective can be chi-squared, so the fit properly accounts for the measurement errors (after all, in any scientific application you want to minimize chi-squared, not the unweighted mean squared error that PCA minimizes). This whole idea is re-re-re-discovery, even for me. It is the technique used in Blanton's kcorrect.
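To make the bilinear model concrete, here is a minimal sketch in Python with NumPy. It is an illustration under stated assumptions, not the kcorrect code. The names `fit_bilinear`, `chi_squared`, `Y`, `ivar`, `A`, and `G` are hypothetical, and the optimizer (alternating weighted least squares) is just one simple choice the post does not specify. It assumes the data sit in an N-by-M array `Y` (N objects, M spectral pixels) with per-pixel inverse variances `ivar`, and that the errors are independent and Gaussian.

```python
import numpy as np


def chi_squared(Y, ivar, A, G):
    """Chi-squared of the bilinear model Y ~ A @ G, weighted by inverse variance."""
    return float(np.sum(ivar * (Y - A @ G) ** 2))


def fit_bilinear(Y, ivar, K, n_iter=50, seed=0):
    """Fit Y (N x M) as A (N x K) times G (K x M) by alternating least squares.

    Each half-step is an exact weighted linear least-squares solve, so
    chi-squared cannot increase from one iteration to the next.
    """
    rng = np.random.default_rng(seed)
    N, M = Y.shape
    A = rng.normal(size=(N, K))  # coefficients, one row per object
    G = rng.normal(size=(K, M))  # component spectra, one row per component
    history = []
    for _ in range(n_iter):
        # Coefficient step: with the components fixed, solve for each object.
        for i in range(N):
            Gw = G * ivar[i]  # components weighted by this object's ivar
            A[i] = np.linalg.lstsq(Gw @ G.T, Gw @ Y[i], rcond=None)[0]
        # Component step: with the coefficients fixed, solve for each pixel.
        for j in range(M):
            Aw = A * ivar[:, j][:, None]  # coefficients weighted by this pixel's ivar
            G[:, j] = np.linalg.lstsq(Aw.T @ A, Aw.T @ Y[:, j], rcond=None)[0]
        history.append(chi_squared(Y, ivar, A, G))
    return A, G, history
```

Setting `ivar` to zero for a pixel drops it from the fit, so missing data cost nothing extra. Note that the factorization is only determined up to an invertible K-by-K mixing of the components, so the rows of `G` are not unique the way ordered PCA eigenvectors are; only the product `A @ G` is constrained by the data.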