Dilip Krishnan (NYU) came over to the CCPP to school us cosmologists in optimization today. The conversation (in the lounge) was wide-ranging, but the most interesting parts of it (from my perspective) were about stochastic-gradient methods in sparse problems.
A stochastic-gradient method (or online method) is one that permits you to look sequentially at parts of your data (not all the data at once) but nonetheless optimize global parameters that affect all of the data. The idea is: You have so much data, you can't fit it all on disk, so you just sample a bit of it, update your parameters according to what it says, throw away the bit of data, and grab the next bit. Iterate to convergence. Foreman-Mackey and I have been investigating this in the context of The Thresher.
A sparse problem (in my parlance) is one in which most parameters only affect a small fraction of the data, and most data are only affected by a small fraction of the parameters. It is not obvious to me that the stochastic-gradient methods will work (or work well) on sparse problems, at least not without modification.
I promised Krishnan that I would deliver to him an example of a realistic, convex, sparse problem in astronomy. Convexity ensures that we have no disagreements about what constitutes success in optimization. It will provide a sandbox for investigating some of these issues.