At morning coffee at OCIW, David Law (UCLA) told us that he has evidence—from the morphology of the Sagittarius stream, mainly—that the Milky Way halo could be triaxial. This is pretty intriguing, because triaxiality is the generic prediction for dark-matter halos, as I understand it (substructure notwithstanding). He reminded me of the paradox that the Sagittarius stream must—in any model—be very old, which goes against the idea that galaxies are generically building up their mass through mergers.
Juna Kollmeier (OCIW) and I spent some time today talking about the ways that high-velocity (think nearly unbound or unbound) stars could be used to constrain properties of the Galaxy. Of course it got me thinking all Bayesian; it has some interesting similarities to the Oort problem, but with the added interesting and very useful (for inference) complication that the orbital times can exceed the stellar lifetimes. If you assume that stars are born near the center of the Galaxy, this could probably be made into a very interesting kind of inference.
After a day spent writing (yet more on my fitting paper) and arguing (with various people at OCIW, where I am spending these two weeks), Mario Mateo (Michigan) gave a nice and thought-provoking talk about dwarf galaxies, focusing on dark-matter content and host halos.
Spent the long weekend writing on the line-fitting and reading on Bayesian methods. The defense of Bayesian relative to sampling theory or frequentism in the MacKay book is far better than that in the Jaynes book, because MacKay uses much better frequentist examples, cites specific sources for the frequentist solutions he gives, and tries to find some common ground. He points out (and I agree) that frequentist p-values and chi-squared tests are very good for discovering problems.
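A chi-squared test of the kind MacKay endorses is easy to run; here is a minimal sketch in Python (all data and numbers are invented for illustration, and scipy is assumed for the tail probability):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Fake data: y = 2x + 1 with unit Gaussian noise.
x = np.linspace(0., 10., 30)
y = 2. * x + 1. + rng.normal(0., 1., size=x.size)
sigma = np.ones_like(y)  # known per-point uncertainties

# Weighted least-squares straight-line fit.
A = np.vander(x, 2)  # columns are [x, 1]
coeffs, *_ = np.linalg.lstsq(A / sigma[:, None], y / sigma, rcond=None)

# Frequentist model check: chi-squared and its p-value.
chi2 = np.sum(((y - A @ coeffs) / sigma) ** 2)
dof = x.size - 2  # data points minus fitted parameters
p_value = stats.chi2.sf(chi2, dof)
print(f"chi2 = {chi2:.1f} for {dof} dof, p = {p_value:.3f}")
```

A p-value near zero flags a problem with the model or the stated uncertainties; a healthy fit has chi-squared per degree of freedom near unity. That is exactly the "discovering problems" role the test plays well.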
I wrote words about bootstrap and jackknife in my nascent line-fitting document. Bovy made some fake data for demos.
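For concreteness, here is a minimal sketch (not Bovy's demo code; the fake data and all the choices are mine) of bootstrap and jackknife uncertainty estimates for a straight-line slope:

```python
import numpy as np

rng = np.random.default_rng(0)

# Fake data with true slope 2 and unit Gaussian noise.
x = np.linspace(0., 10., 50)
y = 2. * x + 1. + rng.normal(0., 1., size=x.size)

def fit_slope(x, y):
    """Least-squares slope of a straight-line fit."""
    coeffs, *_ = np.linalg.lstsq(np.vander(x, 2), y, rcond=None)
    return coeffs[0]

# Bootstrap: refit many data sets resampled with replacement.
boot = []
for _ in range(1000):
    i = rng.integers(0, x.size, size=x.size)
    boot.append(fit_slope(x[i], y[i]))
boot_err = np.std(boot)

# Jackknife: refit leaving out one point at a time, then apply the
# standard (n - 1) / n variance scaling.
jack = np.array([fit_slope(np.delete(x, i), np.delete(y, i))
                 for i in range(x.size)])
jack_err = np.sqrt((x.size - 1.) / x.size
                   * np.sum((jack - jack.mean()) ** 2))

slope = fit_slope(x, y)
print(f"slope = {slope:.3f} +/- {boot_err:.3f} (bootstrap), "
      f"+/- {jack_err:.3f} (jackknife)")
```

Both are frequentist resampling estimates of the same sampling variance, so on well-behaved data like these they should roughly agree.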
I read the Navarro et al paper on Arcturus, the moving group that is in the Solar Neighborhood but lagging the Solar motion (and the Local Standard of Rest) by 100 km/s. Navarro et al base their conclusions—that the stream is a disrupting galaxy that has contributed stars to the disk—on a number of things which are not at all consistent with Bovy and my reconstruction of the local velocity field. That is, most of the stars that Navarro et al associate with the moving group are not part of the moving group found in our analysis, though they certainly are lagging stars, and the group we find has a velocity dispersion way too small to be the remnant of anything massive. Then again, we do find a large population of lagging stars outside the Arcturus group, many of which may be interesting.
Aukosh Jagannath (NYU) and I worked on his code to generate perturbed cold tidal streams, perturbed away from their orbits by passing substructure. I am hoping we can, in some sense, image the substructure with these streams, if they are abundant enough. We worked on generating visualizations to convince us that the signal is there, in principle.
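As a toy version of the effect, one can kick stream particles with the impulse approximation. This sketch is my own illustration (not Jagannath's code), with all numbers invented, and it makes the simplifying assumption that each particle's instantaneous separation from the perturber is its impact parameter:

```python
import numpy as np

# Impulse approximation: a point mass M passing a star at impact
# parameter b with relative speed v delivers a velocity kick of
# magnitude dv = 2 G M / (b v), directed here toward the perturber.
# Illustrative units: kpc, km/s, Msun, with G to match.
G = 4.301e-6  # kpc (km/s)^2 / Msun

def impulse_kick(stream_xyz, perturber_xyz, v_rel, M):
    """Velocity kicks (km/s) for stream particles from one flyby."""
    b_vec = perturber_xyz - stream_xyz           # vectors to perturber
    b = np.linalg.norm(b_vec, axis=1)            # impact parameters
    dv_mag = 2. * G * M / (b * v_rel)            # kick magnitudes
    return dv_mag[:, None] * b_vec / b[:, None]  # kicks toward perturber

# A cold stream: particles strung along a 2 kpc line, with a 1e7 Msun
# subhalo passing 0.1 kpc from its midpoint at 200 km/s.
stream = np.zeros((100, 3))
stream[:, 0] = np.linspace(-1., 1., 100)
perturber = np.array([0., 0.1, 0.])
dv = impulse_kick(stream, perturber, v_rel=200., M=1e7)
print(np.linalg.norm(dv, axis=1).max())  # a few km/s near closest approach
```

The kick falls off with distance along the stream, so a single flyby imprints a localized velocity (and eventually density) feature; that localization is what makes imaging the substructure conceivable.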
Adrian Price-Whelan (NYU) and I worked on the figures and text for our short paper on the bandwidth with which images should be communicated and saved—or the smallest difference in pixel values that should be permitted by the representation. We find that there should be one or two bits spanning the noise, and that is all. All of the information, on sky level, variance, source positions, and even on sources that lie below the noise level (!) is preserved even when there are only two bits spanning the noise. This has implications for experimental or instrument design, and especially space missions. I don't think our results are really new; many of them exist in the stochastic resonance literature.
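The basic point can be demonstrated in a few lines. This toy (mine, not the paper's code) quantizes fake sky-limited pixels with a half-sigma step, so only a couple of bits span the noise, and checks that the sky level survives:

```python
import numpy as np

rng = np.random.default_rng(1)

# Fake sky-limited pixels: sky level 100.7 counts, noise sigma = 1.
sky, sigma = 100.7, 1.0
pixels = rng.normal(sky, sigma, size=100_000)

# Quantize to steps of half a sigma, so roughly two bits span the
# noise; the noise itself dithers the signal across the levels.
step = sigma / 2.
quantized = np.round(pixels / step) * step

# The sky level (and, very nearly, the variance) survive the coarse
# representation.
print(pixels.mean(), quantized.mean(), quantized.std())
```

The dithering role of the noise is the stochastic-resonance connection: because the noise randomizes which quantization level each pixel lands on, averaging many pixels recovers the underlying level far more precisely than the step size.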
In my writing on fitting a straight line to data (!), I confronted today the issue that Bayesian analyses return distributions, not answers, and the investigator is forced to decide how to present those results. I am a strong supporter of returning a sampling from the posterior distribution, but that is expensive when your model has thousands of parameters! The standard practice is to return the maximum a posteriori values, which is the quasi-Bayesian answer to maximum likelihood. Not sure how I feel about that; I think it may only make sense at high signal-to-noise, which, as I have commented before, only holds when it doesn't matter what you do.
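A toy example of the choice, with everything invented for illustration: grid a one-parameter posterior, then compare the maximum a posteriori value to quantiles of posterior samples:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy problem: infer a mean mu from ten noisy points (known unit
# noise, flat prior), by brute-force gridding of the posterior.
data = rng.normal(3.0, 1.0, size=10)
mu_grid = np.linspace(0., 6., 6001)
log_post = -0.5 * np.sum((data[None, :] - mu_grid[:, None]) ** 2, axis=1)
post = np.exp(log_post - log_post.max())
post /= post.sum()

# Option 1: report the maximum a posteriori value, a single number.
mu_map = mu_grid[np.argmax(post)]

# Option 2: report a sampling from the posterior, summarized here by
# its 16th, 50th, and 84th percentiles.
samples = rng.choice(mu_grid, size=10_000, p=post)
lo, med, hi = np.percentile(samples, [16., 50., 84.])

print(f"MAP = {mu_map:.3f}; sampling: "
      f"{med:.3f} (+{hi - med:.3f} / -{med - lo:.3f})")
```

In this symmetric, high-information case the two summaries agree, which is the point: the MAP value only tells you something the sampling doesn't when the posterior is narrow and well behaved, and then almost any summary would do.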
Charlie Conroy (Princeton) came and spoke about modeling galaxy spectra as linear combinations of stellar populations. He is working towards marginalizing over unknowns and propagating errors, and then finding observations that reduce the magnitudes of the ranges for the unknowns. He shows very convincingly that stellar masses (the simplest thing you might want to get out of such a model for a galaxy) are not secure at the factor-of-two level, because of things like late stages of stellar evolution (which are brief, random, and luminous in many cases).
[Out sick for a few days; hence no posts.]
Bovy and I spent a lot of today discussing distribution functions, in the context of Rix's (really Reid's) masers and the Solar System. This was partially inspired by Tremaine sending us an alternative method to the inference problem we solved before April Fools' Day. The issue of interest is that of modeling distribution functions when you don't know what model space to use. In Bayesian inference there are ways of giving literally infinite freedom to the distribution function and nonetheless getting a reasonable posterior distribution, fundamentally because the method returns not an answer but a distribution. That said, it is not clear anything in this area is practically relevant. One place it might be is Reid's masers, where the paucity of data makes good inference important, and makes complicated methods tractable.
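One concrete way to give a distribution function effectively infinite freedom is a Dirichlet-process mixture. The following sketch (my illustration, not the method Tremaine sent, and with all parameters invented) draws one random distribution function from a truncated stick-breaking prior:

```python
import numpy as np

rng = np.random.default_rng(3)

# Stick-breaking construction of a Dirichlet-process mixture: an
# (in principle) infinitely flexible prior over distribution
# functions, truncated at K components for computation.
alpha, K = 2.0, 50                  # concentration; truncation level
v = rng.beta(1., alpha, size=K)     # stick-breaking fractions
weights = v * np.cumprod(np.concatenate(([1.], 1. - v[:-1])))
weights /= weights.sum()            # renormalize after truncation

# Each component is a Gaussian in one velocity coordinate, with
# parameters drawn from an (illustrative) base distribution.
means = rng.normal(0., 30., size=K)    # km/s
widths = rng.uniform(5., 15., size=K)  # km/s

def df(v_los):
    """Evaluate this randomly drawn distribution function."""
    z = (v_los[None, :] - means[:, None]) / widths[:, None]
    comps = np.exp(-0.5 * z ** 2) / (np.sqrt(2. * np.pi) * widths[:, None])
    return np.sum(weights[:, None] * comps, axis=0)

grid = np.linspace(-150., 150., 301)
integral = np.sum(df(grid)) * (grid[1] - grid[0])
print(integral)  # close to 1: each draw is a normalized density
```

Because the posterior averages over draws like this rather than picking one, the model space is huge yet the inference stays proper; with only a handful of masers, the bookkeeping is also still tractable.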
At group meeting today, Guangtun Zhu (NYU) started us on a discussion about morphological and kinematic selection of a pure sample of non-disk-dominated ellipticals using only SDSS data. This is not a trivial problem, and has been solved in an unsatisfying and relatively ad hoc manner up to now. If he solves it, there is a lot that can be done with his sample.
On the weekend I finished reading Jaynes, Probability Theory, which is an extremely long and detailed polemic about probability, inference, and decision theory. It was great!
I had many problems with the book, like its dismissal of alternate positions and its unfair treatment of competing methods in each situation. It had too much bashing of other people. It also had some dumb things to say about quantum mechanics (did Jaynes never hear about Bell's inequalities?). But overall the book was extremely useful to me, clarifying issues like noise and uncertainty, the subjectivity or objectivity of priors, and the places where Bayesian methods involve ad hoc input versus where they are mechanistic and driven by the rules of probability theory.
Jaynes takes a strong position that probabilities are always descriptions of your knowledge: there might be a definite fact of the matter or not, but what you know about it is probabilistic. He then refuses to use the word "probability" for anything else! This is a bit crazy, but it appeals to my extreme positions on consistency. So almost anything you or I would call a "probability distribution", Jaynes would call a "frequency distribution". All this reminds me of my careful use of the word "error". As my mentor Neugebauer used to say about our error bars: "They aren't errors! If they were errors, we would correct them. They are uncertainties."
In the morning I started to think about my lectures for the IMPRS. I think my first will be on fitting a straight line to data. It turns out I have hours of material on this! I pity the students.
In the afternoon, Glenn van de Ven (IAS) came by to discuss dynamical modeling with Bovy and me. van de Ven has beautiful results on external galaxies and on the globular cluster omega Cen, modeled by what's known as the Schwarzschild method. In the globular cluster, for which he has similar kinds of data to what we have for the Galactic Center, he has evidence for several different dynamical components in phase space; we wondered whether those might be related to the fact that omega Cen also has several different stellar populations.
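The Schwarzschild method is, at heart, a nonnegative superposition of orbits fit to data. Here is a toy sketch of that core step (all matrices and weights invented; real modeling builds the orbit library by integrating orbits in a trial potential), using nonnegative least squares from scipy:

```python
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(4)

# Toy orbit library: each column is one orbit's contribution to a set
# of observed bins (surface density, kinematics, ...).
n_bins, n_orbits = 20, 50
library = rng.uniform(0., 1., size=(n_bins, n_orbits))

# Fake observations built from a known sparse set of orbit weights.
true_w = np.zeros(n_orbits)
true_w[[3, 17, 31]] = [0.5, 1.2, 0.8]
observed = library @ true_w

# Schwarzschild step: find nonnegative orbit weights that reproduce
# the observations.
weights, resid = nnls(library, observed)
print(resid)  # near zero for these noiseless fake data
```

Distinct clumps of recovered orbit weights in integral space are what one would read as distinct dynamical components, like the ones van de Ven sees in omega Cen.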
Today Bovy took and passed (of course) his candidacy exam. He told us about dynamical inference in the Solar System, the Galactic Center, and the local Galactic neighborhood. Congratulations!
Immediately after Bovy's talk we had a special seminar about early Fermi/GLAST results on the gamma-ray sky. The short summary is that they are very sensitive to point sources, and see many new sources; they have found a population of radio-quiet gamma-ray millisecond pulsars; they don't see any very good evidence of dark-matter annihilation, but they do have one resolved source with no Galactic counterpart. This they are not yet releasing, but it looks intriguing.
I don't write papers that begin with the word "Let". However, today Roweis convinced Bovy and me that our joint paper on the method we use to reconstruct distribution functions from noisy and incomplete data should be closer to that style, if we are going to publish it in the stats literature. We also discussed many other things with Roweis, including operations research algorithms, decision theory, and matrix math. It is always great when we get a visit from the master.
In my infinitesimal research time today, Bovy and I discussed the dynamical measurements of the black hole at the Galactic Center. It appears that the received wisdom is that by far the best measurement of its mass comes from the one closest star that has completed a full orbit in the 15-or-so years in which it has been observed, and all other measures are deprecated in favor of that one. It also appears that the central mass estimate has crept upwards throughout that time, so there are unresolved discrepancies among different data sets. Joint analyses of all the available data have not been attempted, to our knowledge.
Lang and I, working towards the Astrometry.net main paper, built an SDSS-based quad index and verification system (using the command-line tools of the Astrometry.net codebase), optimized to calibrate (blind) images as small as those produced by the Hubble Space Telescope. We succeeded in calibrating some images.
We spent the evening—into the wee hours—working on computational photography (treating the output of the Astrometry.net digital camera as scientific data). That's Friday night at Astrometry.net world headquarters! While we hacked, Stumm was being wined and dined by the museum set at the American Association of Museums Annual Meeting for his work on web-2.0 museum interactions with Flickr and Astrometry.net.