On an extremely social day, James Bullock (UCI) convinced us that there is no trivial solution to the Milky Way substructure problem: There don't seem to be any galaxies in the 20 to 50 km/s subhalos, while there are plenty in the 10 to 20 km/s ones. What's up with that? Ross Fadely and I discussed generalizations of our photometric noise model, because we are getting low probability for every star or galaxy model we try on the COSMOS data. In the afternoon, Eric Ford (Florida) gave a free-form, audience-driven seminar on Kepler and its results, which led to some great in-talk and post-talk discussions. We all agreed that all stars have planets (because given Kepler's limitations so far, 30 percent = 100 percent). We disagreed on what the point was of finding planets in the habitable zone. My plan: Call whoever lives there and ask them what the dark matter is.
Eric Ford (Florida) showed up today, and we spent a good chunk of the day arguing about possible improvements to exoplanet data analysis. We slowly built up a suggestion—possibly a very impractical one—for modeling the intensity variations with position and time on the exoplanet-hosting star surface. These variations make the lightcurve vary, and also (because the exoplanet blocks a tiny part of the variable surface) the depth of the eclipses. Bovy and I have a Gaussian Processes formalism that might conceivably take care of these issues, but the probable levels of required computation are a little scary.
Argh, time is annoying if you care about it at the sub-second level! Lang and I spent a chunk of the day converting from GALEX spacecraft clock time (which is more-or-less synchronized to GPS time) to barycentric time (which is the time you would be keeping if you were a fictional Newtonian observer at the Solar System Barycenter). This is necessary because we care about eclipse timing at the sub-second level, and the light-travel time from Earth to the Barycenter is minutes. The light-travel time from the center of the Sun (Heliocentric frame) to the Barycenter is seconds. We spent a bit of our time scripting queries to the JPL Horizons ephemeris service, which was successful but challenging.
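The geometric heart of that conversion is just the Rømer delay: project the observer's barycentric position onto the line of sight to the source and divide by the speed of light. Here is a minimal Newtonian sketch, assuming you have already queried the observer's position (e.g. from Horizons) and have a unit vector toward the source; it ignores the relativistic (Einstein, Shapiro) terms, which matter below the second level.

```python
# Minimal sketch of the Romer-delay part of a barycentric time correction.
# Assumes the observer's position relative to the Solar System Barycenter
# is already in hand (e.g. from JPL Horizons), in km, and that we have a
# unit vector toward the source. Relativistic terms are ignored.

C_KM_S = 299792.458  # speed of light in km/s

def barycentric_correction_s(obs_pos_km, source_unit_vec):
    """Seconds to ADD to the observed arrival time to refer it to the
    Barycenter: +(r . n) / c, where r is the observer's barycentric
    position and n is the unit vector toward the source."""
    dot = sum(r * n for r, n in zip(obs_pos_km, source_unit_vec))
    return dot / C_KM_S

# Toy check: observer 1 AU from the Barycenter, source along the same axis.
AU_KM = 149597870.7
delay = barycentric_correction_s((AU_KM, 0.0, 0.0), (1.0, 0.0, 0.0))
print(delay)  # the one-AU light-travel time, roughly eight minutes
```

The sign convention is the usual one: an observer displaced toward the source receives the photons early, so the correction is added to refer the arrival time to the Barycenter.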
Lang and I did the final pair-coding work on our final list of eclipsing ultraviolet stars in the GALEX time stream. We added one last noise-model hypothesis (my loyal reader knows that we are making a pure list by competing our positive detections against alternative
noise or false-positive hypotheses with likelihood tests) and thereby pared down the list to a few sources that are all either real, or not real for reasons knowable only to those (like Schiminovich) more familiar with the spacecraft.
Steve Boughn (Haverford) gave a very controversial and very thought-provoking talk about the fact that the graviton is, essentially, undetectable. A nice point, and concerning for the testability of any non-trivial ideas in quantum gravity.
Lang, Stumm, and I did final rankings of our Google Summer of Code applications. It was hard work; we got lots of great applications. This might not count as research (by The Rules at right), but it is a really great activity: Google pays for coders to contribute to Astrometry.net! If you are open-source (and run on a shoestring, like we are) get with the program.
In our transit search in the GALEX data, we turn up large numbers of false positives. Some of these are other kinds of stellar variability and some are various kinds of detector or spacecraft issues. Normally we would cut these with some kind of heuristic filtering (as in
look at this one; it doesn't look right), but Lang and I have been trying to filter them by creating a quantitative hypothesis or model for each one, and then doing likelihood (or posterior probability) ratios to rule them out more objectively or quantitatively. It has been working well; in fact with the sample as large as it is, it is faster to write the relevant code than it is to make all the relevant judgements. It doesn't remove judgement, of course, because which models we make depends on which artifacts we see or anticipate.
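The mechanics of that filtering are simple: score the data under each hypothesis with a log-likelihood and keep a candidate only if the eclipse model beats every alternative by a comfortable margin. A toy sketch, with Gaussian errors and made-up model names (the real hypotheses are spacecraft-specific):

```python
import math

def gaussian_loglike(data, model, sigma):
    """Log-likelihood of the data given a model, with known Gaussian errors."""
    return sum(-0.5 * ((d - m) / sigma) ** 2
               - math.log(sigma * math.sqrt(2.0 * math.pi))
               for d, m in zip(data, model))

def keep_candidate(data, sigma, eclipse_model, alternative_models, threshold=5.0):
    """Keep the source only if the eclipse hypothesis beats every
    alternative (flat lightcurve, detector glitch, ...) by a margin
    in log-likelihood. All models are lists of predicted fluxes."""
    ll_eclipse = gaussian_loglike(data, eclipse_model, sigma)
    return all(ll_eclipse - gaussian_loglike(data, alt, sigma) > threshold
               for alt in alternative_models)

# Toy lightcurve: flux dips from 1.0 to 0.5 for three samples mid-stream.
data = [1.0, 1.0, 0.5, 0.5, 0.5, 1.0, 1.0]
eclipse = [1.0, 1.0, 0.5, 0.5, 0.5, 1.0, 1.0]  # matching eclipse model
flat = [0.8] * 7                               # constant-flux alternative
print(keep_candidate(data, 0.05, eclipse, [flat]))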
Hou and Goodman are working with me on exoplanet fitting. We had our weekly meeting today, where we discussed his radial-velocity oscillation model—a model for confusing non-exoplanet sources in the data stream—and then moved on to the question of whether we could beat the Kepler team at its own game. I have some ideas, but I don't think we could beat them without quite a bit of hard, hard work.
Rebecca Bernstein (UCSC) spoke about globular cluster chemical abundances. Scott Tremaine (IAS) spoke about exoplanet orbit inclinations. It was a great day. Tremaine also challenged me on some Bayesianisms—as he is wont to do—and I told him about my new view that really you should be publishing your likelihood function not your posterior PDF. And I mean the function, not the maximum-likelihood point and some interval around it.
Rebecca Bernstein (UCSC) showed up for a two-day visit. We spent a tiny bit of time talking about modeling spectrographs; she agreed with my position that you can't do the best possible data analysis unless you can synthesize your read-out images. She also noted that we know a heck of a lot about what can be happening inside the spectrograph. That information should come into the model if you want to get as precise as you can.
Again, no research. But I did read the NASA 2010 Science Plan, which includes stuff about Pu-238 production. In the short term, asking for the US to re-start Pu-238 production is good for interplanetary exploration, but if it is not combined with unilateral disarmament (and it isn't), then in the long term it is very, very bad for our security and our position in the international community. Won't we get more science done in the long term if we disarm first, and produce Pu-238 after?
[I think it violates The Rules to talk about politics here; I promise not to do that more than, say, once a year.]
I have no research to report today; I was in DC serving our nation on a NASA committee. But I did run into my grad-school friend Tom Murphy (UCSD), with whom I chatted about precision measurement. He is one of the world's experts, since he tracks the Moon at millimeter precision (though not accuracy, for interesting reasons).
Lang and I saw the bat symbol shining in the night sky: Schiminovich is at MDM and needs targets! We worked on re-running our transit-detector with improved models: Now we want to remove false positives by competing the eclipse hypothesis against other, more general hypotheses. We pair-coded in all available time today.
Ross Fadely came up for the day to discuss our star–galaxy separation project with Willman. We discussed how it is possible that we can train a star–galaxy separation system without a training set. We can, because we think we have a set of spectra that span anything a star can be, and similarly for galaxies. We will succeed inasmuch as we are right. We are only using our
training set for quality assurance and testing.
In the morning session I talked about modeling, including our Comet 17P/Holmes project (which got some press here and here; my dotastronomy viewgraphs are here). A highlight for me of the talks was Geert Barentsen (Armagh Observatory) talking about human observing of meteoroids. He showed a ridiculous distribution of meteoric material around the Earth; the detail was beautiful. At the end of his talk he showed experimentally that some meteoroids can be detected by Twitter searches!
We had an afternoon unconference, with so many good things I couldn't decide what to do. In the end I went to the mash-up discussion, which evolved into a discussion of funding, not surprisingly given the bad state of things for so many interesting projects right now (Jill Tarter of SETI noted that the Allen Telescope Array may be forced to shut down this year). We decided to look at crowd-sourcing some long-term funding propaganda.
There were nice talks in the morning showing off some great and useful astrophysics-related engineering. One highlight for me was Thomas Robitaille (Harvard) showing off new ADS-related awesomeness. He mentioned the point that interfacing with ADS through command-line tools improves repeatability. Amen to that! Another highlight for me was Thomas Boch (CDS) showing off the next generation of insane CDS tools.
[Note added later: The next day we won a runner-up prize in the Hack Day awards. Some of the submissions were incredible; one of them got press coverage. My favorite hack was a home-built pen-casting system (draw and record voice and drawing in real time to tell a story).]
I arrived in Oxford today for the first day of the dotAstronomy meeting. I arrived too late today for the formal talks (these are only in the morning) but in the afternoon there was an absolutely great, wide-ranging, and detailed discussion of the technical, social, and career issues of huge data and big inference and machine-learning problems in astrophysics. The day ended with another wide-ranging conversation in the pub, especially about what we are all going to do in tomorrow's hack day.
In Lang's and my April Fools' post, arXiv/1103.6038, we made use of an affine-invariant MCMC sampler proposed by Goodman and Weare, which has very good autocorrelation properties. That is, it greatly reduces the number of likelihood calls you need to make to reach good results. Today, I worked a bit on (and then signed off on) Fengji Hou's paper on a general implementation of this sampler and its application to exoplanet discovery.
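The core of the Goodman–Weare sampler is the stretch move: update each walker along the line through a randomly chosen other walker, with a stretch factor drawn from g(z) ∝ 1/√z on [1/a, a]. The following is a minimal one-dimensional sketch of the mechanics, not the production code from Hou's paper:

```python
import math
import random

def stretch_move_sample(log_prob, nwalkers=20, nsteps=2000, a=2.0, seed=42):
    """Minimal Goodman & Weare (2010) stretch-move sampler for a 1-D
    target density. A sketch for illustration only: each walker is
    updated against one randomly chosen complementary walker."""
    rng = random.Random(seed)
    walkers = [rng.gauss(0.0, 0.1) for _ in range(nwalkers)]
    logps = [log_prob(x) for x in walkers]
    chain = []
    for _ in range(nsteps):
        for i in range(nwalkers):
            j = rng.randrange(nwalkers - 1)
            if j >= i:
                j += 1  # pick a complementary walker j != i
            # Inverse-CDF draw of z with density g(z) ~ 1/sqrt(z) on [1/a, a].
            z = (1.0 + (a - 1.0) * rng.random()) ** 2 / a
            proposal = walkers[j] + z * (walkers[i] - walkers[j])
            logp_new = log_prob(proposal)
            # Acceptance probability includes z**(ndim - 1); ndim = 1 here,
            # so that factor is unity and only the density ratio remains.
            if math.log(rng.random()) < (logp_new - logps[i]):
                walkers[i], logps[i] = proposal, logp_new
        chain.extend(walkers)
    return chain

# Sample a unit Gaussian and check the first two moments roughly recover.
samples = stretch_move_sample(lambda x: -0.5 * x * x)[10000:]  # drop burn-in
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
print(mean, var)
```

The affine invariance comes from the proposal being built only out of the walkers themselves, so the sampler performs identically on any affine transformation of the target.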
(By the way, we tried to come out first on April 1, submitting 48 seconds after the deadline. There were seven papers in—or cross-listed in—astro-ph in the first 8 seconds after the deadline. You gotta be fast.)
Ed Turner (Princeton) spent the day at NYU; we spoke about many things, from statistics to reductionism. He gave a nice seminar about abiogenesis—the emergence of life on the lifeless early Earth—which happened very early after the last surface-melting impacts. This implies, at some level, that abiogenesis at Earth conditions ought to be fairly likely. But that is complicated by anthropic-like issues and the single-number statistics involved. He applies Bayes' theorem and some uninformative priors to show that the data (or datum, really) does imply, weakly, a high rate of abiogenesis at Earth conditions. Turner's talk was followed by discussions with Bovy and Foreman-Mackey about projects.
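The shape of that argument can be shown with a toy computation: put a broad log-uniform prior on the abiogenesis rate, condition on the single datum that life appeared early, and compare prior to posterior. The numbers below are illustrative assumptions of mine, not Turner's:

```python
import math

# Toy version of the single-datum argument. Prior: log-uniform on the
# abiogenesis rate lam (events per Gyr) over 1e-3 .. 1e3. Datum: life
# arose within t_obs of the Earth becoming habitable, so the likelihood
# is P(at least one event by t_obs) = 1 - exp(-lam * t_obs).
t_obs = 0.5  # Gyr; an assumed illustrative value
log_lams = [-3.0 + 6.0 * i / 400 for i in range(401)]  # grid in log10(lam)
prior = [1.0 / len(log_lams)] * len(log_lams)          # log-uniform prior

likelihood = [1.0 - math.exp(-(10.0 ** L) * t_obs) for L in log_lams]
unnorm = [p * l for p, l in zip(prior, likelihood)]
Z = sum(unnorm)
posterior = [u / Z for u in unnorm]

prior_mean = sum(L * p for L, p in zip(log_lams, prior))
post_mean = sum(L * p for L, p in zip(log_lams, posterior))
# The posterior mean of log10(lam) sits above the prior mean, but the
# posterior's upper tail just follows the prior (the likelihood saturates
# at 1), which is why the inference for a high rate is only weak.
print(prior_mean, post_mean)
```

This is exactly the "weakly implies" behavior: one datum can down-weight very low rates, but it cannot distinguish among all the rates high enough to make early life nearly certain.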