2019-05-03

Dr Alex Malz!

Today it was my great pleasure to participate in the PhD defense of my student Alex Malz (NYU). His dissertation is about probabilistic models for next-generation cosmology surveys (think LSST but also Euclid and so on). He showed that it is not trivial to store, vet, or use probabilistic information coming from these surveys, using photometric-redshift outputs as a proxy: The surveys expect to produce probabilistic information about redshift for the galaxies they observe. What do you need to know about these probabilistic outputs in order to use them? It turns out that the requirements are demanding and hard to meet. A few random comments:

On the vetting point: Malz showed with an adversarial attack that the ways cosmologists were comparing photometric-redshift probability outputs across different codes were very limited: His fake code that just always returned the prior pdf did as well on almost all metrics as the best codes.
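To see why a prior-returning code can look good, here is a minimal sketch with one metric of this general kind: the uniformity of the probability integral transform (PIT), that is, the value of each galaxy's reported CDF at its true redshift. The choice of PIT, the gamma-distributed toy population, and all names below are my assumptions for illustration, not necessarily what the dissertation used.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy population: true redshifts drawn from a gamma distribution.
n_gal = 5000
z_true = rng.gamma(shape=3.0, scale=0.3, size=n_gal)

# The "fake code": it ignores the photometry entirely and returns the same
# pdf for every galaxy, namely the population prior, represented here by the
# empirical CDF of an independent training sample.
z_train = np.sort(rng.gamma(shape=3.0, scale=0.3, size=20000))

def prior_cdf(z):
    """Empirical CDF of the training redshifts, evaluated at z."""
    return np.searchsorted(z_train, z, side="right") / z_train.size

# PIT value for each galaxy: the reported CDF evaluated at the true redshift.
pit = prior_cdf(z_true)

# Kolmogorov-Smirnov distance between the PIT values and Uniform(0, 1).
pit_sorted = np.sort(pit)
ecdf_hi = np.arange(1, n_gal + 1) / n_gal
ecdf_lo = np.arange(0, n_gal) / n_gal
ks_stat = max(np.max(ecdf_hi - pit_sorted), np.max(pit_sorted - ecdf_lo))

print(f"KS distance of PIT from uniform: {ks_stat:.4f}")
```

The PIT values come out close to uniform, so a test of this kind cannot distinguish the prior-returning code from a well-calibrated code that actually uses the data.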

On the requirements point: Malz showed that you need to know all the input assumptions and priors of any method in order to be able to use its output, especially if its output consists of posterior information. That is, you really want likelihood information, but no methods currently output that (and many couldn't even generate it, because they aren't structured as traditional inferences).

On the storage point: Malz showed that quantiles are far better than samples for storing a pdf! The results are very strong. But the hilarious thing is that the LSST database permits up to 200 floating-point numbers for storage of the pdf, when in fact the photometric redshifts will be based on only six photometric measurements! So, just like in many other surveys that I care about, the LSST Catalog will represent a data expansion, not a data reduction. Hahaha!
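A toy comparison gives the flavor of the storage result. This is a sketch under my own assumptions, not the dissertation's experiment: the bimodal pdf, the budget of 20 stored numbers, and the maximum-CDF-error metric are all made up for illustration.

```python
import math
import numpy as np

def true_cdf(z):
    """CDF of a hypothetical bimodal redshift pdf (two-Gaussian mixture)."""
    def norm_cdf(x, mu, sigma):
        return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))
    return np.array([0.7 * norm_cdf(x, 0.5, 0.05) + 0.3 * norm_cdf(x, 1.2, 0.1)
                     for x in np.atleast_1d(z)])

n_stored = 20                         # storage budget: 20 floats per galaxy
grid = np.linspace(0.2, 1.8, 16001)   # fine grid covering essentially all the mass
cdf_grid = true_cdf(grid)

# Option 1: store quantiles, i.e. the redshifts at evenly spaced CDF levels,
# found by inverting the tabulated CDF with linear interpolation.
levels = (np.arange(n_stored) + 0.5) / n_stored
quantiles = np.interp(levels, cdf_grid, grid)

# Option 2: store the same number of random samples (inverse-transform sampling).
rng = np.random.default_rng(42)
samples = np.sort(np.interp(rng.uniform(size=n_stored), cdf_grid, grid))

# Reconstruct a CDF from each stored representation.
cdf_from_quantiles = np.interp(grid, quantiles, levels, left=0.0, right=1.0)
cdf_from_samples = np.searchsorted(samples, grid, side="right") / n_stored

# Compare each reconstruction to the truth by maximum absolute CDF error.
err_quantiles = np.max(np.abs(cdf_from_quantiles - cdf_grid))
err_samples = np.max(np.abs(cdf_from_samples - cdf_grid))

print(f"max CDF error, quantiles: {err_quantiles:.4f}")
print(f"max CDF error, samples:   {err_samples:.4f}")
```

At the same number of stored floats, the quantiles pin down the CDF exactly at evenly spaced levels, while the samples carry shot noise on top of their discreteness.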

It was a great talk, and in support of a great dissertation. And a great day.
