Blanton, Guantun Zhu (NYU), and I discussed the possibility of writing a paper on the archetypes system we set up for PRIMUS. The paper would split the archetype finding and optimization out of any PRIMUS data paper, because the method has much wider applicability. The idea is to model a distribution of d-dimensional data by a set of delta functions in the d-dimensional space, with the set chosen to be the minimal set that adequately represents every data point. The nice thing is that you can choose whatever operation you want to decide what represents what, so it can handle any kind of crazy degeneracies, missing data, or marginalization over nuisance parameters (think calibration, or extinction). The difficulty is that the search for the minimal set of archetypes is hard (in the technical algorithmic sense of the term), but Roweis cast the problem for us as a binary programming task, which is incredibly well handled by any number of open-source and commercial packages. For PRIMUS we used the IBM CPLEX code, which was astoundingly fast.
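
To make the binary-programming framing concrete, here is a toy sketch, not the PRIMUS code. It assumes the candidate archetypes are the data points themselves and that "represents" means "lies within a Euclidean tolerance"; the names `represents`, `coverage_matrix`, `minimal_archetypes`, and `tol` are all made up for illustration. The binary program is: choose x_j in {0, 1} for each candidate j to minimize the sum of the x_j, subject to every data point i being represented by at least one chosen candidate. The sketch solves it by brute-force enumeration, which only works for a handful of points; a real problem would go to an integer-programming solver such as CPLEX.

```python
from itertools import combinations


def represents(candidate, point, tol):
    """Hypothetical criterion: candidate represents point if it lies within tol.

    Any boolean test could go here (a chi-squared cut, a fit that
    marginalizes over nuisance parameters, a test that skips missing data).
    """
    dist2 = sum((c - p) ** 2 for c, p in zip(candidate, point))
    return dist2 <= tol ** 2


def coverage_matrix(points, candidates, tol):
    """A[i][j] is True when candidate j represents data point i."""
    return [[represents(c, p, tol) for c in candidates] for p in points]


def minimal_archetypes(A):
    """Smallest set of columns of A such that every row has a True entry.

    This is the binary program
        minimize sum_j x_j  subject to  sum_j A[i][j] * x_j >= 1 for all i,
    solved here by trying subsets in order of increasing size.
    Returns None if some point is represented by no candidate at all.
    """
    n_points = len(A)
    n_candidates = len(A[0]) if A else 0
    for k in range(n_candidates + 1):
        for chosen in combinations(range(n_candidates), k):
            if all(any(A[i][j] for j in chosen) for i in range(n_points)):
                return list(chosen)
    return None


# Toy data: two clumps in d = 2 dimensions.
points = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0), (5.0, 5.0), (5.5, 5.0)]
A = coverage_matrix(points, points, tol=0.6)
archetypes = minimal_archetypes(A)
print(archetypes)  # [1, 3]
```

In this toy case two archetypes suffice: the middle point of the first clump and one point of the second. The enumeration cost grows exponentially with the number of candidates, which is exactly why handing the same objective and constraints to a dedicated solver matters.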
