[Posting has been slow because internet at AISTATS was nearly nonexistent.] At AISTATS days one and two I learned a great deal. There were interesting contributions on automatically adapting MCMC proposal distributions, on computing marginalized likelihoods using densities of states, on model-selection criteria like AIC and so on, and on topological structures in data sets.
In addition I spent some time talking to people about stochastic gradient methods, in which you optimize a function by taking random data subsamples (think, perhaps, images coming from the telescope), computing for each subsample the gradient of the objective function, and taking a small step towards the best fit. These are methods for optimizing models on enormous data sets. It turns out that there is a lot of freedom (and a lot of heuristic ideas) about how to set the step sizes. In my nascent lucky imaging project, I am finding that I can get very different results and performance by choosing very different step sizes.
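The basic loop is simple enough to sketch. Here is a minimal toy version in numpy, on a made-up linear least-squares problem (nothing to do with the actual lucky imaging code); the point is that the `step_size` argument is a free parameter, and different choices give different results:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in objective: linear least squares on synthetic data.
n, d = 1000, 5
X = rng.normal(size=(n, d))
true_w = rng.normal(size=d)
y = X @ true_w + 0.1 * rng.normal(size=n)

def sgd(step_size, n_steps=2000, batch=10):
    """Plain stochastic gradient on the mean squared error."""
    w = np.zeros(d)
    for _ in range(n_steps):
        idx = rng.integers(0, n, size=batch)    # random data subsample
        resid = X[idx] @ w - y[idx]
        grad = 2.0 * X[idx].T @ resid / batch   # gradient on the subsample only
        w -= step_size * grad                   # small step towards the best fit
    return w

# Very different step sizes: all heuristic, all give different behavior.
for eta in (0.001, 0.01, 0.1):
    w = sgd(eta)
    print(eta, np.linalg.norm(w - true_w))
```

Too small a step and you barely move in the allotted iterations; too large and the iterates bounce around the optimum at a noise level set by the step size. Hence the heuristics (decaying schedules, averaging, and so on).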
Various posters and talks at the meeting reminded me that there are versions of sparse coding that are convex. That is, you can do matrix factorizations like Tsalmantza and my HMF method, but with additional constraints that try to set most of the coefficients to zero, and you can do that without breaking the convexity of the coefficient-fitting step (the a-step in our paper on the subject). This could be very valuable for our next-generation projects.
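The standard trick is an L1 penalty on the coefficients: the penalized coefficient fit stays convex, and it can be solved by simple proximal-gradient (soft-thresholding) iterations. Here is an illustrative sketch with a made-up dictionary `D` standing in for the basis spectra; this is not the HMF code, just the convex a-step idea:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical setup: dictionary D (think: basis spectra) and one data vector y
# generated from a genuinely sparse coefficient vector.
m, k = 50, 20
D = rng.normal(size=(m, k))
a_true = np.zeros(k)
a_true[:3] = rng.normal(size=3)
y = D @ a_true + 0.01 * rng.normal(size=m)

def ista(D, y, lam=1.0, n_iter=500):
    """Proximal gradient (ISTA) for the L1-penalized coefficient step:
       minimize ||y - D a||^2 + lam * ||a||_1, which is convex in a."""
    L = 2.0 * np.linalg.norm(D, 2) ** 2     # Lipschitz constant of the gradient
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = 2.0 * D.T @ (D @ a - y)      # gradient of the quadratic term
        z = a - grad / L
        # soft-threshold: this is what drives most coefficients to exactly zero
        a = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)
    return a

a_hat = ista(D, y)
print(np.sum(np.abs(a_hat) > 1e-6), "nonzero coefficients out of", k)
```

Because the penalized objective is convex, any local optimizer finds the global solution for the coefficients; the non-convexity of the factorization as a whole lives only in alternating between the coefficient step and the basis step.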