I spent the morning working on Tsalmantza's and my HMF method paper. I signed off, and in the afternoon she finished and submitted it. I feel pretty good about that.
NYU undergraduate David Mykytyn has been helping Lang and me get the Tractor working on big galaxies, with the thought that we could model big galaxies with the same machinery that we are using to model small ones (and by big I mean angularly big). He got everything working in record time, but the model is so simple that it is having trouble dealing with all the real-world craziness at the centers of real, huge galaxies (think dust lanes etc.). This is all on the path to a new Atlas.
During a many-hour delay at Madrid airport gate U58, I re-hashed the CMML workshop for a group of New-York-bound machine learners who had been in other workshops but who (today) had nothing but time (thanks to a mechanical issue with the airplane). I then tried to do some of the math that Marshall and I had been waving around yesterday. I think we might have some conceptual issues. The plan is sound, way sound, but we might not be able to transfer probabilistic information with quite the fidelity we had in mind yesterday.
One thing I didn't mention about the awesome day yesterday is that Marshall and I scoped out two papers on PanSTARRS-like data. One involves making a super-simple catalog from the imaging, a catalog made up of a mixture of Gaussians (and nothing else, not even PSF determination). Marshall has the intuition that this catalog—not even really a catalog but more like a lossy but responsible description of the mean properties of the imaging—could be used to transmit to end users far more information than is transferred by traditional astronomical catalogs. We were on a roller coaster of good news, bad news until we cracked it in a nasty cafe across the street from the Granada bus station.
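To make the idea concrete: in this scheme the "catalog" is nothing but a list of Gaussian components, and anyone holding the catalog can re-render a model image from it. Here is a minimal sketch (the component values and 64-pixel grid are illustrative, not from the actual scoped paper):

```python
import numpy as np

# A "catalog" here is just a list of Gaussian components:
# (total flux, mean position, 2x2 pixel covariance).
catalog = [
    (100.0, np.array([12.0, 20.0]), np.diag([2.0, 2.0])),
    (40.0, np.array([40.0, 35.0]), np.array([[6.0, 2.0], [2.0, 4.0]])),
]

def render(catalog, shape):
    """Render the lossy model image implied by the catalog."""
    yy, xx = np.mgrid[0:shape[0], 0:shape[1]]
    pix = np.stack([yy.ravel(), xx.ravel()], axis=1).astype(float)
    image = np.zeros(pix.shape[0])
    for flux, mean, cov in catalog:
        diff = pix - mean
        icov = np.linalg.inv(cov)
        # Per-pixel quadratic form (x - mu)^T C^-1 (x - mu).
        chi2 = np.einsum("ni,ij,nj->n", diff, icov, diff)
        norm = flux / (2.0 * np.pi * np.sqrt(np.linalg.det(cov)))
        image += norm * np.exp(-0.5 * chi2)
    return image.reshape(shape)

model = render(catalog, (64, 64))
print(model.sum())  # close to the total catalog flux of 140
```

The point of the sketch is that the catalog is a generative description: the pixel-level model is recoverable from a handful of numbers, which is what makes it a lossy but responsible compression of the imaging.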
Today was the first day of the workshops at NIPS, and the day of the Cosmology Meets Machine Learning workshop, organized by a group led by Michael Hirsch (UCL and MPI Tübingen). What a day it was! The talks, by astronomers doing cosmology with sophisticated machine-learning tools, were edutaining, with (among others) Lupton doing his best to pretend to be curmudgeonly (okay, he does have a point that some of the stuff I say is not all that practical), Starck showing amazing decompositions of Planck-like maps, and Refregier doing his best to alarm us about the difficulty of the cosmological weak-lensing problem. In between these talks were shorts by the poster presenters, all good and all high-bandwidth in their four-minute spots. A standout for me was Kaisey Mandel and his hierarchical probabilistic model for type-Ia SNe, making the cosmological constraints more precise by hierarchically learning the priors over the nuisance parameters you need to marginalize out if you want to do things right!
While many left to ski, Marshall declared the afternoon break to be an un-workshop, in which workshop topics self-assembled and self-organized. This evolved into two big un-workshops: one on probabilistic graphical models, with Iain Murray doing the heavy lifting, and one on blind deconvolution, with Hirsch throwing down. Hirsch showed some devastating results in blind and non-blind deconvolution, including (in the style of Rob Fergus) outrageous ability to compensate for bad hardware or bad photography. Outrageous.
Despite all that, it was the PGM workshop with Murray that—and I am not exaggerating here—was possibly the most educational ninety minutes of my post-graduate-school life. After some introductory remarks by Murray, we (as a group) tried to build a PGM for Refregier's and Bridle's weak-lensing programs. Marshall insisted we use the notation that is common in the field and keep it simple, Murray insisted that we do things that are not blatantly wrong, Stefan Harmeling provided philosophy and background, especially about the relationship between generative modeling and probabilistic modeling, Lupton tried to stay as curmudgeonly as he could, and at the end, Murray broke it all down. It wasn't just science; it was like we were starring in an HBO special about science. We realized that PGMs are very valuable for de-bugging your thinking, structuring the elements of your code, and, of course, making sure you write down not-wrong probability expressions. Aawww Yeah!
At the end of the day, Marshall moderated a (huge) panel, which covered a lot of ground. The crazy thing is that we had some important points of consensus, including but not limited to the following: (1) As a pair of overlapping communities, our best area of overlap is in structured, physics-informed probabilistic modeling. Many cosmologists are stuck on problems like these, and many machine learners have good technology (things like sparse methods, online and stochastic methods, and sampling foo). Neil Lawrence pointed out that the machine learners got their Bayes from astronomers Gauss and Laplace. Now the astronomers are asking for it back. (2) We should be setting up some simple challenges and toy problems. These make it easy to draw machine learners into the field, and help us boil our issues down to the key ideas and problems. That's Murray's big point.
Hirsch, Bridle, Marshall, Murray, and everyone else: Thank you. Absolutely cannot understand why Sam Roweis wasn't there for it. I never really will.
I went to (most of) the NIPS 2011 talks today (my first day at NIPS). Unlike the AAS meetings, the talks are very highly vetted (getting a talk at NIPS is harder—statistically speaking—than getting a prize fellowship in astronomy) and there are no parallel sessions, even though the meeting is almost as large as the AAS (NIPS is 1400; AAS winter is 2000-ish). One standout talk was by Laurent on the strange encoding of olfactory information in insects (and, apparently, humans, which are similar in this respect). There is a part of the olfactory system that looks like a sparse coding of the input, which looks (to my eyes) to be a very inefficient use of neurons. Another was by Feldman on
coresets, which are data subsamples (imagine that you have way too many data points to fit in RAM) plus associated weights, chosen such that the weighted sum of log-likelihoods of the coreset points is an epsilon-close approximation to the full-data log-likelihood (or other additive objective function). This concept could be useful for astrophysics; it reminds me of my projects with Tsalmantza on archetypes.
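The coreset idea is simple enough to sketch in a few lines. Below is a toy version under my own illustrative setup (a fixed one-dimensional Gaussian likelihood and a crude importance-sampling "sensitivity", not Feldman's actual construction): we keep a weighted subsample such that the weighted sum of per-point log-likelihoods is an unbiased, low-error estimate of the full-data sum.

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy "too big for RAM" data set: 100,000 draws from a unit Gaussian.
x = rng.normal(size=100_000)

def loglike(x, mu=0.0, sigma=1.0):
    # Per-point Gaussian log-likelihood.
    return -0.5 * ((x - mu) / sigma) ** 2 - np.log(sigma * np.sqrt(2.0 * np.pi))

full = loglike(x).sum()  # the expensive full-data objective

# Build a coreset: pick m points with probability q_i proportional to a
# crude sensitivity (here 1 + x^2, over-representing influential tail
# points), and weight each chosen point by 1 / (m * q_i) so that the
# weighted sum of log-likelihoods is an unbiased estimate of `full`.
q = 1.0 + x ** 2
q /= q.sum()
m = 2000
idx = rng.choice(x.size, size=m, replace=True, p=q)
weights = 1.0 / (m * q[idx])

coreset_est = (weights * loglike(x[idx])).sum()
rel_err = abs(coreset_est - full) / abs(full)
print(rel_err)  # typically well under one percent with these settings
```

So fifty times fewer points reproduce the objective to sub-percent accuracy here; the real coreset literature replaces my ad hoc sensitivity with choices that give epsilon guarantees over all parameter values, not just one.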
On the bus to Sierra Nevada in the afternoon, Marshall and I tried to scope out our "next paper". I put that in quotation marks because we don't have a very good track record of finishing projects! We are going to do something that involves image modeling and approximations thereto that can be performed on catalog (database) quantities.
I have been reading about graphical models for a few days here and there. So on the plane (to NIPS 2011) I tried to draw a graphical model for a joint model of the density field and cosmological parameters, supported by both large-scale structure and weak lensing. I think I am close. Marshall and I, and anyone who can stand it, are going to unconference on Friday and try to work it out.
Guangtun Zhu (JHU) stopped by this week (welcome back!) and came for a chat about modeling spectra, in particular quasar spectra that might have absorption-line systems redward of Lyman alpha. Most catalogs of absorption systems involve significant hand-vetting; we discussed methods to ameliorate or eliminate the by-hand aspects of the problem. Guess what I advocated? Justified probabilistic modeling of the spectra, plus some sensible priors.
It is not clear it counts as research, since I did it purely for fun, but I wrote a recursive code to build this kind of space-filling curve. It is slow, but it works for any image (array) that is 2^n×2^n. If it has any research application, it is in image compression, but I officially don't care about that subject (this paper notwithstanding; it isn't about image compression; it is about information!).
I am so proud of this; check out how scale-free it is: The top-left 4×4 block has the same two-dimensional pattern for the ones digit as the whole chart has for the sixteens digit, but with the
pixels being sixteen-cell blocks.
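For the record, here is a minimal sketch of how such a recursive construction can go. This is a standard Hilbert-style recursion (not necessarily my actual code): each doubling step stamps four transformed copies of the previous grid, and that copying is exactly what produces the scale-free digit pattern described above.

```python
def hilbert(n):
    """Return a 2^n x 2^n grid whose entries are positions along a
    Hilbert-style space-filling curve (start bottom-left, end bottom-right)."""
    if n == 0:
        return [[0]]
    prev = hilbert(n - 1)
    s = len(prev)
    area = s * s
    new = [[0] * (2 * s) for _ in range(2 * s)]
    for r in range(s):
        for c in range(s):
            v = prev[r][c]
            # Lower-left quadrant: anti-transposed copy, indices 0..area-1.
            new[s + (s - 1 - c)][s - 1 - r] = v
            # Upper-left quadrant: straight copy, offset by area.
            new[r][c] = area + v
            # Upper-right quadrant: straight copy, offset by 2*area.
            new[r][s + c] = 2 * area + v
            # Lower-right quadrant: transposed copy, offset by 3*area.
            new[s + c][s + r] = 3 * area + v
    return new

for row in hilbert(2):
    print(row)
```

Because each quadrant is a whole transformed copy of the previous order, the low digits (base 4, or base 16 after two doublings) repeat the same spatial pattern at every scale, which is the property celebrated above.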
Price-Whelan returned to NYU for a day (from distant Columbia, where he is now a grad student) to discuss methodologies for some of the projects he is thinking about with Palomar Transient Factory imaging data. There are lots of possibilities; like most big projects, PTF creates more possible projects than there are collaborators to do them; this is why astrophysics is a great field! I argued that he should use the things we have been thinking about with respect to image combination and source detection to detect—and then analyze—the sources below the single-exposure detection limit. Any project PTF has done at high signal-to-noise can be done for far more sources if you are willing to work at lower signal-to-noise (and at the image level rather than the catalog level).
As my loyal reader knows, I am strongly against modeling non-integrable systems with integrable potentials. And guess what: No galaxy in any kind of realistic dark-matter mass distribution could ever have an integrable potential! During a meeting today (and yesterday also, which perhaps doesn't count according to the implicit parts of The Rules), Fadely, Foreman-Mackey, and I talked about how to generalize things like Schwarzschild modeling to non-integrable potentials and proper probabilistic inference. Glenn van de Ven (MPIA) does these things for a living, so we wrote him email and he sent back some great suggestions for how to proceed. Now the question is: Should we implement something, even if it means reinventing a few wheels? It didn't take us long to decide: Yes! Ah, Camp Hogg prefers engineering to science these days. One set of ideas I am excited about is how to write down a prior over phase-space distribution functions when you don't have an integrable-potential
action–angle language in which to write it. I love that hairball of problems.
After I used my five fingers to put my foot squarely in my mouth on the APOGEE Collaboration mailing list (yes folks, apparently you should be careful with that
Reply All button), I was treated late in the day to a couple of excellent talks about NYU efforts that border on astrobiology; not astrobiology per se but close: Paul Chaikin (NYU Physics) talked about his work with Pine and Seamans to make artificial structures out of DNA or small particles painted with DNA or the like that can self-assemble and then self-replicate. Bud Mishra (NYU Math and Biology) talked about simulations of the interactions that lead to multi-cellularity, with the thought that multi-cellularity might be inexpensive and generic in the path from first life forms to complex ones.
One of Mishra's most gripping (ha ha) points was about the number of fingers we have: Is five a random accident or is it something you would expect to see on other worlds? He pointed to a nice math result of his that shows that for frictionless fingers to hold any arbitrary solid object under load, you need between 7 and 12 fingers. Given that fingers come at a cost, and that there is friction, five might be very close to optimal for a wide range of utility functions! That's an idea I will be thinking about for a while. I also got him to say a bit about higher dimensions: He said my problem wasn't crazy because an articulated object is like a solid object in higher dimensions!
Douglas Brenner (AMNH) dropped by for a discussion of spectrum extraction, related to the coronagraphy stuff we have been working on. I pointed him to this instant classic, a paper much discussed a few years ago in this forum (for example here). At lunch, Muna talked about software development for stellar population synthesis, which could have a big impact in the large-spectroscopic-survey world we are currently in. In the afternoon, Foreman-Mackey produced color-color plots for stars in his re-calibration (self-calibration) of SDSS Stripe 82 imaging. They are looking very good.
At the end of yesterday, there was a discussion of astrobiology at NYU, and then this morning I had email from Rix about a probability calculation for p(habitable|data) where "habitable" is the bit "yes this exoplanet is habitable" or "no it isn't" and "data" is whatever you have measured about an exoplanetary system. He was inspired by this paper. We discussed briefly by email; one of the issues (of course) is that many of the parameters that might determine habitability do not even indirectly affect the observable (at present day) properties of the exoplanet. That's a nice problem; it reminds me of my conversations about extraterrestrial life: Prior information informs the current debate much more than the data do; it is a playground for Bayesian reasoners.
In an all-talk day, I talked with Lang about next steps on our various slowly moving projects; I talked with Sjoert van Velzen (Amsterdam) and Foreman-Mackey about large surveys; I talked with Fadely yet again about the scope for paper zero on classification. This was a highlight: This time I think we have done it! The scope will be showing that hierarchical modeling beats likelihood tests no matter how bad the models are. In the afternoon, Nissanke (Caltech) gave a seminar about gravitational-radiation detection, particularly in the light of coincident electromagnetic signals. At the end, discussion devolved into the insane data-secrecy policies of LIGO, which I abhor. Occupy LIGO!