2020-08-04

derivatives in generative and discriminative models

A discriminative model is a method for finding a function of your features (like your spectrum) that delivers a prediction or expectation for your labels (like the temperature and metallicity of the star). A generative model is a method for finding a function of your labels that delivers a prediction or expectation for your features. In the discriminative case, you can take a derivative of the labels wrt the data. In the generative case, you can take a derivative of the data wrt the labels. How are these two derivatives related?
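
In code, the two directions look like this (a schematic sketch; the model forms, weights, and label names are invented for illustration):

```python
import numpy as np

def discriminative_model(features):
    """features (e.g., spectrum pixels) -> predicted labels (e.g., Teff, [Fe/H])."""
    W = np.array([[0.1, 0.0, 0.2],    # made-up linear read-out weights
                  [0.0, -0.4, 0.1]])
    return W @ features               # 3 features in, 2 labels out

def generative_model(labels):
    """labels (e.g., Teff, [Fe/H]) -> predicted features (e.g., spectrum pixels)."""
    teff, feh = labels
    return np.array([1.0 + 0.1 * teff,          # pixel depending on Teff only
                     2.0 - 0.4 * feh,           # pixel depending on [Fe/H] only
                     0.2 * teff + 0.1 * feh])   # pixel depending on both
```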

In the one-dimensional calculus context, these two derivatives are just reciprocals of each other. But in the case of multivariate labels and multivariate features, it isn't so simple: The derivatives are now Jacobian matrices, and they generally aren't even square, so there is no literal matrix inverse. Worse, some of the features might not depend at all on the labels, so the corresponding generative derivatives vanish; that doesn't make the corresponding discriminative derivatives infinite!
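
To make that concrete (with a made-up Jacobian): give the generative derivative a row of zeros for a feature that ignores the labels, and the one-dimensional intuition breaks down immediately.

```python
import numpy as np

# Made-up generative Jacobian d(features)/d(labels): 3 features, 2 labels.
# The third row is zero: that feature does not depend on the labels at all.
J_gen = np.array([[0.5,  0.0],
                  [0.1, -0.3],
                  [0.0,  0.0]])

# One-dimensional intuition fails twice here: J_gen is not square, so it has
# no matrix inverse, and elementwise reciprocals blow up on the zero row.
print(1.0 / J_gen)  # contains inf (numpy emits a divide-by-zero warning)
```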

The answer is the pseudo-inverse: You can convert a generative model into a discriminative model (locally, or to linear order) by taking a first-order Taylor expansion and then performing linear least-squares inference. That delivers a discriminative model (locally) and thus the other derivative you might want. The pseudo-inverse is exactly the operator that appears in linear least squares. In numpy, the pseudo-inverse of X is np.linalg.solve(X.T @ X, X.T) (when X has full column rank), or the operator embodied in np.linalg.lstsq() or np.linalg.pinv().
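
Here is a minimal sketch of that recipe in numpy, assuming an invented nonlinear generative model g (two labels in, four features out, with one feature that ignores the labels entirely): linearize g around fiducial labels with a numerical Jacobian, then map a feature perturbation back to a label perturbation by least squares, which is the pseudo-inverse in action.

```python
import numpy as np

# Invented nonlinear generative model: 2 labels -> 4 features.
def g(labels):
    a, b = labels
    return np.array([np.exp(-a),   # feature depending on the first label
                     a * b,        # feature depending on both labels
                     np.sin(b),    # feature depending on the second label
                     1.0])         # feature that ignores the labels entirely

def jacobian(f, x, eps=1e-6):
    """Numerical Jacobian df/dx by central differences."""
    x = np.asarray(x, dtype=float)
    f0 = f(x)
    J = np.zeros((f0.size, x.size))
    for i in range(x.size):
        dx = np.zeros_like(x)
        dx[i] = eps
        J[:, i] = (f(x + dx) - f(x - dx)) / (2.0 * eps)
    return J

labels0 = np.array([1.0, 0.5])  # fiducial labels to linearize around
J = jacobian(g, labels0)        # generative derivative d(features)/d(labels)

# To linear order, d_features = J @ d_labels; invert that by least squares:
d_features = np.array([0.01, -0.02, 0.005, 0.0])  # a small feature perturbation
d_labels, *_ = np.linalg.lstsq(J, d_features, rcond=None)

# Equivalently, the discriminative derivative d(labels)/d(features) is the
# pseudo-inverse of J (this form assumes J has full column rank):
J_disc = np.linalg.solve(J.T @ J, J.T)
assert np.allclose(J_disc @ d_features, d_labels)
```

Note that the constant fourth feature contributes a zero row to J; the least-squares step simply gives it zero weight, as promised above.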
