Today was the first day of the 2019 Math+X Symposium on Inverse Problems and Deep Learning in Space Exploration, which is a meeting to bring together mathematicians and domain scientists to discuss problems of mutual interest. I learned a huge amount today! I can't summarize the whole day, so here are just a few things ringing in my brain afterwards:
Sara Seager (MIT) and I both talked about how machine learning helps us in astrophysics. She focused more on using machine learning to speed computation or interpolate or emulate expensive atmospheric retrieval models for exoplanet atmospheres. I focused more on the use of machine learning to model nuisances or structured noise or foregrounds or backgrounds in complex data (focusing on stars).
Taco Cohen (Amsterdam) showed a theory of how to make fully, locally gauge-invariant (what I would call “coordinate free”) deep-learning models. And he gave some examples. Although he implied that the continuous versions of these models are very expensive and impractical, the discretized versions might have great applications in the physical sciences, which we believe truly are gauge-invariant! In some sense he has built a superset of all physical laws. I'd be interested in applying these to things like CMB and 21-cm foregrounds.
Jitendra Malik (Berkeley) gave a nice talk about generative models moving beyond GANs, where he is concerned (like me) with what's called “mode collapse”: the problem that the generator can beat the discriminator without producing data that are fully representative of all kinds of real data. He even name-checked the birthday paradox (my favorite of the statistical paradoxes!) as a method for identifying mode collapse. Afterwards Kyle Cranmer (NYU) and I discussed with Malik and various others the possibility that deep generative models could play a role in implicit or likelihood-free inference.
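The birthday-paradox idea here is roughly: if a batch of s samples from a generator typically contains a near-duplicate pair, then (by the same counting as the classic birthday problem) the generator's effective support size is on the order of s squared. Here is a minimal toy sketch of that test; the toy generator, the distance threshold, and the function names are all my own illustrative choices, not anything from the talk:

```python
import numpy as np

def duplicate_pairs(samples, threshold):
    """Count pairs of samples closer than `threshold` in Euclidean distance."""
    n = len(samples)
    count = 0
    for i in range(n):
        for j in range(i + 1, n):
            if np.linalg.norm(samples[i] - samples[j]) < threshold:
                count += 1
    return count

def birthday_support_estimate(batch_size):
    """If a batch of size s usually contains a near-duplicate pair, the
    birthday-paradox heuristic puts the support size at roughly s**2."""
    return batch_size ** 2

# A toy "generator" with only 20 distinct modes -- mode collapse by construction.
rng = np.random.default_rng(0)
modes = rng.normal(size=(20, 8))

def toy_generator(s):
    # Pick a mode at random and add tiny noise, so same-mode draws are near-duplicates.
    return modes[rng.integers(0, 20, size=s)] + 1e-3 * rng.normal(size=(s, 8))

# A batch of 30 draws from 20 modes should contain many near-duplicate pairs,
# flagging a support size far smaller than the nominal data space allows.
batch = toy_generator(30)
print(duplicate_pairs(batch, threshold=0.1))
print(birthday_support_estimate(30))
```

In a real setting the Euclidean distance on raw samples would be replaced by something perceptually meaningful (for images, people typically inspect the closest pairs by eye), but the counting logic is the same.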
There were many other amazing results, including finding seismic precursors to landslides (Seydoux) and using deep models to control adaptive optics (Nousiainen) and analyses of why deep learning models (which have unimaginable capacity) aren't insanely over-fitting (Zdeborová). On that last point the short answer is: No one knows! But it is really solidly true. My intuition is that it has something to do with the differences between being convex in the parameter space and being convex in the data space. Not that I'm saying anything is either of those!