2012-03-19

Zolotov

Zolotov (Hebrew) returned to NYU for a day of catching up. She is working on, among other things, the effect of dramatic baryonic evolution (star formation and supernovae) on the dark-matter halos of substructure within galaxies. She finds bigger effects than you might imagine, because although the baryons are a small fraction of each subhalo, they can undergo rapid evolution at very small scales and high densities. I pitched to her the idea of testing in her super-realistic simulations the strong dependence of scale height and scale length on metallicity and alpha-abundance that Bovy, Rix, and others (including me) find in the Milky Way disk (at least in the Solar Neighborhood). She ought to be able to see exactly the same effect in her simulations.

Procrastination on a large number of pressing items has led me to write a long document about probability calculus. I spent a lot of the weekend working on it, with a short diversion to Lang's place, where we talked about decision theory, among many other things. I am not sure it counts as research, but look for it on the arXiv sometime this semester.

2012-03-16

uncertainties are parameters too

The day started with a call with Rory Holmes about our nearly-finished paper on self-calibration of imaging surveys. One of the things we discussed was journal choice. Not an easy one. We also agreed to move to git and cloud-based code hosting.

Later in the day, Fouesneau (UW), Weisz (UW), and I worked on issues of completeness, cluster membership posterior probabilities, and radial-profile fitting. On the latter point, Fouesneau pointed out that the errors in the measurements of the radial profiles (which are photometric) are probably under-estimated because they are generated by shot noise not in the number of photons, but in the number of stars, which have a huge dynamic range in brightness. He doesn't trust the uncertainties that are reported to him. Not having the ability to re-do the error analysis, we discussed the various things that the uncertainties could depend on, among the data outputs and model inputs we do have. Once we had that written down, we realized that we could just parameterize the dependence of the uncertainties on the measured and model quantities, and fit for them. We pair-coded that in Fouesneau's sandbox and it worked! So we might be treating uncertainties in the way they deserve. It reminded me of conversations I have had in the past with Brendon Brewer (UCSB).
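The idea of fitting for the uncertainties can be sketched in a few lines. This is only an illustration under loud assumptions, not the code from Fouesneau's sandbox: the straight-line model, the single multiplicative inflation factor, and all names (`neg_log_like`, `ln_scale`, `sigma_reported`) are hypothetical stand-ins for the real radial-profile model and its noise parameterization.

```python
import numpy as np
from scipy.optimize import minimize

def neg_log_like(params, x, y, sigma_reported):
    """Gaussian negative log-likelihood with a fitted noise scale.

    params = (slope, intercept, ln_scale); all three names are illustrative.
    """
    slope, intercept, ln_scale = params
    # Assumed noise model: true sigma = scale * reported sigma.
    var = (np.exp(ln_scale) * sigma_reported) ** 2
    resid = y - (slope * x + intercept)
    # The log(var) term penalizes inflating the uncertainties without limit.
    return 0.5 * np.sum(resid ** 2 / var + np.log(2.0 * np.pi * var))

# Fake data in which the reported uncertainties are too small by a factor of 3.
rng = np.random.default_rng(42)
x = np.linspace(0.0, 10.0, 500)
sigma_reported = rng.uniform(0.1, 0.3, size=x.size)
y = 2.0 * x + 1.0 + 3.0 * sigma_reported * rng.normal(size=x.size)

# Start the line parameters at an ordinary least-squares fit.
slope0, intercept0 = np.polyfit(x, y, 1)
result = minimize(neg_log_like, x0=[slope0, intercept0, 0.1],
                  args=(x, y, sigma_reported), method="Nelder-Mead",
                  options={"maxiter": 5000, "xatol": 1e-6, "fatol": 1e-6})
scale_hat = np.exp(result.x[2])
print("inferred error inflation factor:", scale_hat)
```

A more realistic version would let the variance depend on other measured and model quantities (for example an additive term that scales with the model prediction); the structure of the likelihood stays the same.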

2012-03-15

young clusters

I continued working on various aspects of the young clusters in M31 with Fouesneau (UW), Gordon (STScI), and Weisz (UW). We discussed at length how the mixture model works, with a mixing that is position-dependent. We figured out, conceptually, how the cluster properties in the mixture model are constrained strongly by stars that fit the cluster well, and not by stars that don't. We agreed that we need more pedagogical stuff on mixture models, for the paper, for talks, and for the world. I worked with Fouesneau to debug some code that made it seem like emcee was not working; in the end emcee was not the problem. We had a great Korean lunch on 32nd Street before sending Gordon off at Penn Station, and then spent the walk home re-discussing completeness. We decided, tentatively, to do a naive completeness estimation and just note the limitations. Doing the Right Thing (tm) is beyond our current scope, and probably won't change our results very much (for reasons we can quantitatively argue).
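As a pedagogical sketch of the position-dependent mixture, consider a two-component model (cluster plus field) in which the prior cluster fraction falls with distance from the cluster center. The profile shape and the names `cluster_fraction`, `amplitude`, and `r_core` are assumptions made for illustration only; they are not taken from the PHAT analysis.

```python
import numpy as np

def cluster_fraction(r, amplitude, r_core):
    """Hypothetical position-dependent mixing fraction, falling with radius."""
    return amplitude / (1.0 + (r / r_core) ** 2)

def membership_probability(r, like_cluster, like_field, amplitude, r_core):
    """Posterior probability that each star belongs to the cluster."""
    frac = cluster_fraction(r, amplitude, r_core)
    num = frac * like_cluster
    return num / (num + (1.0 - frac) * like_field)

# Three stars at increasing radius, each fit equally well by cluster and field.
r = np.array([0.0, 1.0, 5.0])
like_cluster = np.array([1.0, 1.0, 1.0])
like_field = np.array([1.0, 1.0, 1.0])
p = membership_probability(r, like_cluster, like_field, amplitude=0.8, r_core=1.0)
print(p)

# A star that the cluster model fits badly gets a membership close to zero.
p_bad = membership_probability(np.array([0.0]), np.array([1e-6]),
                               np.array([1.0]), amplitude=0.8, r_core=1.0)
print(p_bad)
```

In a mixture likelihood of this form, the derivative of each star's log-likelihood with respect to a cluster parameter is weighted by that star's membership probability, which is one way to see why stars that fit the cluster well dominate the constraints.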

2012-03-13

completeness

Lang arrived for the day, and after a re-cap of what we are trying to do with the PHAT data, the issues of completeness came up. We had a lively discussion with Fouesneau (UW), Gordon (STScI), and Weisz (UW) of how we should measure the completeness and how we should use those measurements. We realized that there are so many subtleties that a paper could be written on this alone. Completeness is hard to measure and easy to use wrongly! Coding continued, and in the evening, Lang and I re-discussed our first papers from the Tractor. I promised to email Lang minutes of that chat.

2012-03-12

young, PHAT clusters

A chunk of the PHAT team—Dalcanton (UW), Fouesneau (UW), Gordon (STScI), Weisz (UW)—arrived today to talk about stellar SED fitting and propagation of uncertainties therein to quantities and studies of interest. (For those of you who do not read and remember absolutely everything I write here, the PHAT project is a Dalcanton-PI six-band imaging survey over a large fraction of the M31 disk to create a catalog of tens of millions of stars and do a lot of the science you can do with that.) Today we made plans for the week-long sprint, which appear to be to use the outputs of SED fitting for every star on a big parameter grid and inputs from a model of the stellar population in a cluster, plus some crazy integration, to build a marginalized likelihood for the cluster parameters. Now everyone is coding like crazy. I think we surprised ourselves when we decided we would work in IDL and not Python. Crazy, but pragmatic.
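The marginalization step can be sketched schematically (in Python here for brevity, even though the sprint code is in IDL). The sketch assumes each star's SED-fitting likelihood has been tabulated on a common parameter grid, and that the cluster model supplies normalized prior weights on the same grid for one setting of the cluster parameters; the function and array names are hypothetical.

```python
import numpy as np

def log_marginal_likelihood(log_like_grid, log_prior_grid):
    """Log marginalized likelihood for one setting of the cluster parameters.

    log_like_grid  : (n_stars, n_grid) per-star log-likelihoods on the grid
    log_prior_grid : (n_grid,) log prior weights from the cluster model,
                     assumed normalized to sum to one in linear space
    """
    a = log_like_grid + log_prior_grid[None, :]
    # Log-sum-exp over the grid, done stably, then a product over stars.
    a_max = a.max(axis=1, keepdims=True)
    per_star = a_max[:, 0] + np.log(np.exp(a - a_max).sum(axis=1))
    return per_star.sum()

# Tiny example: two stars, two grid points.
like = np.array([[0.2, 0.8],
                 [0.5, 0.5]])
prior = np.array([0.25, 0.75])
value = log_marginal_likelihood(np.log(like), np.log(prior))
print(value)
```

Evaluating this function over a set of candidate cluster parameters (each with its own prior weights on the grid) gives the marginalized likelihood as a function of those parameters.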

2012-03-10

informal scientific writing

On the plane home, I answered a detailed question from Jannuzi (NOAO), in the form of a few-page typeset document, started work on an answer to a detailed question from Abate (NOAO), and wrote a few pages about probabilistic reasoning for the next installment in the Data Analysis Recipes series. My conversation yesterday with Jannuzi convinced me that it would be useful to review the basic operations available in measure theory (probability calculus). Most of the things we do here at Camp Hogg in data analysis boil down to simple applications of simple operations in measure theory; it isn't hard, but it is powerful, in the sense that there are a few rules that, once you learn them, make every novel probabilistic inference straightforwardly discoverable. I also want to advertise the dimensional-analysis way of thinking about probability theory, which has been very useful to me (and Rix too, I think).
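For concreteness, here is a small sketch of the kind of basic operations meant here (marginalization, conditioning, and Bayes's rule), applied to a made-up discrete joint distribution. The numbers and variable names are purely illustrative and are not drawn from the planned document.

```python
import numpy as np

# Made-up joint distribution p(a, b); rows index a, columns index b.
joint = np.array([[0.1, 0.2],
                  [0.3, 0.4]])

# Marginalization: sum out the variable you do not care about.
p_a = joint.sum(axis=1)
p_b = joint.sum(axis=0)

# Conditioning (product rule rearranged): p(b | a) = p(a, b) / p(a).
p_b_given_a = joint / p_a[:, None]
p_a_given_b = joint / p_b[None, :]

# Bayes's rule: p(a | b) = p(b | a) p(a) / p(b).
bayes = p_b_given_a * p_a[:, None] / p_b[None, :]
print(np.allclose(bayes, p_a_given_b))
```

For continuous variables the sums become integrals and the densities carry units (a density for x has units of 1/x), which is where the dimensional-analysis way of checking expressions comes in.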

2012-03-09

photometric redshifts

I spent the morning in two conversations about photometric redshifts, one with Buell Jannuzi (NOAO), who is using them to measure evolution of clustering of massive galaxies, and one with Alexandra Abate (Arizona), who is using them (and anything else she can find) to estimate the redshift distribution of weak-lensing galaxies in future LSST data. Abate and I figured out that extremely low signal-to-noise spectroscopy could in principle be very decisive, because for objects that have a wide range of redshifts permitted by photometric redshifts, there is usually a strong dependence of inferred redshift on inferred SED (and therefore predicted emission lines). I promised both Jannuzi and Abate that I would write notes on the airplane home.

In the afternoon, Mario Juric (NOAO) and I talked about all things LSST, including how we can go beyond catalogs to data products that contain more probabilistic and noise-model information. As my loyal reader knows, I think that if all LSST produces is a catalog and some images, it will not achieve many of its most valuable goals.

2012-03-08

Tucson firehose

After my Colloquium here at Steward Observatory plus NOAO, Buell Jannuzi (NOAO) told me that I had, in a one-hour talk, perfectly simulated an AAS session: My talk consisted of five ten-minute talks, each of which was intriguing and depressing in a different way! I took that as a compliment.

I also had nice conversations with Todd Lauer (NOAO) about image processing and data modeling in general. We discussed, among many other things, what the relationship might be between a model of a full set of overlapping images and a co-add of deconvolved images (with suitable priors, I presume). These two things might look very similar, if the deconvolution is light and the variations among the images are not too large. We also discussed when it makes sense to do the Right Thing (tm) when the Simplest Thing (tm) is much easier to understand and use.

At dinner I spoke with many of the graduate students, which was a pleasure. I learned that Ken Wong (Steward, of PRIMUS fame, among other things) has executed a pie-in-the-sky idea of Ann Zabludoff's (Steward): Find lines of sight in SDSS that are highly likely to have high magnification from lensing by superimposed galaxy groups and clusters. This is not trivial because the relationship (at group and cluster scale) between dark matter and galaxies is not trivial.

2012-03-07

how many stars have planets?

Foreman-Mackey and I interacted with Eric Ford (Florida) about some of the issues in inferring the total planet population from the tiny, tiny fraction that produce observable transits. Ford sent us two great, long, detailed emails filled with ideas and advice. It is very interesting to me how email communication is such an important part of the scientific process; Ford's emails to me are always so good they should really count as publications on his CV!

We are interested in these problems in part because they are very good examples for using probabilistic graphical models in astrophysics, but also in part because there are some versions of these questions that might be easy to answer right away, with data straight from the literature. It gets hard if we really have to build a model of the Kepler selection function, but we are thinking about problems we can do without doing that (at least at first). We also have the GALEX eclipses I have been blogging about; these also lead to interesting problems, although because we have almost no period information, the inferences we can do are limited.
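A very rough sketch of one of the "easy" versions of the problem: for a circular orbit and a small planet, the geometric probability that a randomly oriented system transits is approximately the stellar radius divided by the orbital semi-major axis, so each detected transiting planet stands in for roughly one over that probability planets in total. The sketch below assumes perfect detection of every transiting system (that is, it ignores the Kepler selection function entirely); the function names are hypothetical.

```python
import numpy as np

R_SUN_IN_AU = 0.00465047  # solar radius in astronomical units

def transit_probability(r_star_rsun, a_au):
    """Approximate geometric transit probability, R_star / a (circular orbit)."""
    return np.clip(r_star_rsun * R_SUN_IN_AU / a_au, 0.0, 1.0)

def estimated_total(r_star_rsun, a_au):
    """Inverse-probability-weighted estimate of the underlying planet count."""
    return np.sum(1.0 / transit_probability(r_star_rsun, a_au))

# Three hypothetical transiting planets around Sun-like stars.
r_star = np.array([1.0, 1.0, 1.0])
a = np.array([0.05, 0.1, 0.5])
print(transit_probability(r_star, a))
print(estimated_total(r_star, a))
```

The steep rise of the correction with semi-major axis shows why the observed transiting systems are such a tiny fraction of the total, and why the answer becomes sensitive to the selection function at long periods.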

One interesting thing Ford pointed us to is emerging hints that multiple-planet systems violate the null hypothesis of being built by multiple independent draws from the one-planet systems. Obviously that null hypothesis must be wrong for dynamical reasons, but apparently it is already strongly violated in the data in hand. My intuition says that is something worth checking out.

2012-03-06

the electromagnetic fields in cameras

I spent a very enjoyable hour in a coffee shop with Fergus and applied mathematicians Leslie Greengard, Charlie Epstein, and Mike O'Neil. We discussed the possibility that Fergus and I might do better on our coronagraph problems by doing real-live modeling of the scalar or vector wave equations in a physically realizable device. I have an intuition that we will, but I also have a pretty good sense that we can't really model the full electromagnetic field on every surface; that's insane. Some great ideas came up in the discussion. Greengard pointed out that the convolutions (think Green functions) I am doing numerically can be much, much faster with FFT-like techniques (and Greengard should know). Epstein started out by saying "five-meter mirror and micron wavelengths; geometric optics isn't good enough for you?", which is a pretty fair comment, and then followed that by saying that there is a next order to the geometric optics approximation. It is a well-defined limit, after all, so what we think of as being geometric optics (that beautiful theory) is really just the first term in an expansion. That's cool, although the implication is that the next term in the expansion is ugly ("not in the textbooks", as Greengard said). Epstein also noted that inference of the phase from an intensity image is probably not a good idea. The conversation was great; but unfortunately it didn't convince me to give up on this crazy idea. Tonight I have to polish up my code and send it to Greengard for re-factoring to non-stupidity.
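Greengard's point about FFT-like techniques can be illustrated with the simplest possible case, a one-dimensional linear convolution. This is a generic sketch, not the coronagraph code: a direct convolution costs of order N times M operations, while zero-padding and multiplying in Fourier space costs of order N log N.

```python
import numpy as np

def convolve_direct(f, g):
    """Direct linear convolution; cost scales as len(f) * len(g)."""
    return np.convolve(f, g)

def convolve_fft(f, g):
    """Same linear convolution via the FFT; cost scales as n log n."""
    n = len(f) + len(g) - 1  # zero-pad so the circular wrap-around vanishes
    return np.fft.irfft(np.fft.rfft(f, n) * np.fft.rfft(g, n), n)

rng = np.random.default_rng(0)
f = rng.normal(size=1000)   # stand-in for a field sampled on a surface
g = rng.normal(size=300)    # stand-in for a sampled kernel
print(np.allclose(convolve_direct(f, g), convolve_fft(f, g)))
```

The two functions agree to floating-point precision; the difference is only in how the cost grows with the number of samples.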

One paradox still remains in my mind after it all, and it is this: Heuristically (yes, very heuristically) there are trillions of wavelength-squared cells on the entrance aperture (or primary mirror), but heuristically (yes, yes) there are only hundreds or thousands of speckles on the focal plane that get significant illumination. So doesn't this mean that we don't have to model the entrance aperture at full wavelength resolution to precisely model any real speckle pattern? And isn't that somehow odd?
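The "trillions" can be checked with back-of-the-envelope arithmetic, using the five-meter mirror and micron wavelength from Epstein's comment. The sketch assumes a square aperture for simplicity; a circular one changes the count by a factor of pi/4.

```python
mirror_diameter_m = 5.0   # "five-meter mirror"
wavelength_m = 1.0e-6     # "micron wavelengths"

# Number of wavelength-by-wavelength cells across a square aperture (assumed).
n_cells = (mirror_diameter_m / wavelength_m) ** 2
print(f"{n_cells:.1e} wavelength-squared cells")
```

That is tens of trillions of cells, against the hundreds or thousands of significantly illuminated speckles on the focal plane.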

2012-03-05

the Higgs, stellar modeling

Neil Weiner gave a great brown-bag (all chalk) talk about the Higgs at lunch today. He started from scratch: He started at the Lagrangian in field theory and ended up saying what the implications are for new physics (especially supersymmetry-like theories) of various different Higgs masses! And all in 50 minutes. It was a masterpiece, and hilarious to boot. Ask him for his joke about string theorists.

Before and after that, Hou showed us the results of fitting a stochastic stellar oscillation model to K-giant radial velocity data. We are hoping to show that including a physical model for surface variations will improve the results of exoplanet fitting. We can show this for fake data; now onto real data.

2012-03-04

probabilistic modeling of eclipses

Here is a GALEX eclipse with 64 posterior samples from a probabilistic model superimposed.

The star looks slightly variable out of eclipse because there is a varying background superimposed on the stellar light; this is part of the model too if you are doing proper Poisson likelihoods (as we are). Thanks to Foreman-Mackey for the awesome emcee sampler and Dave Spiegel (IAS) for some great (because it is so simple) transit-modeling code.

Not surprisingly, there is some degeneracy between the inferred impact parameter and the inferred ratio of radii of the star and companion. But it looks like we will be able to say things about the companion radii.
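A minimal sketch of a Poisson likelihood with a varying background, assuming a simple box-shaped eclipse in place of a real transit model (so it is not Spiegel's code, and it has no impact parameter or limb darkening). All function and parameter names are hypothetical.

```python
import numpy as np
from scipy.special import gammaln

def box_eclipse(t, t0, duration, depth):
    """Relative stellar flux: 1 out of eclipse, 1 - depth inside it."""
    in_eclipse = np.abs(t - t0) < 0.5 * duration
    return np.where(in_eclipse, 1.0 - depth, 1.0)

def poisson_log_like(counts, t, star_rate, background, t0, duration, depth):
    """Poisson log-likelihood of photon counts per time bin.

    background holds the expected background counts in each bin and may vary
    with time, which is why the star can look variable out of eclipse.
    """
    expected = star_rate * box_eclipse(t, t0, duration, depth) + background
    return np.sum(counts * np.log(expected) - expected - gammaln(counts + 1.0))

# One out-of-eclipse bin: expected counts are 2 (star) + 1 (background) = 3.
value = poisson_log_like(np.array([3.0]), np.array([0.0]), 2.0,
                         np.array([1.0]), t0=5.0, duration=1.0, depth=0.5)
print(value)
```

A function like this can be handed to a sampler such as emcee as the log-probability (after adding log-priors). With a physical transit shape in place of the box, the depth and the ingress and egress shape jointly constrain the impact parameter and radius ratio, which is where the degeneracy described above enters.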

2012-03-02

measuring bright galaxies, foundations of math

A few weeks ago I reported on my safari to Philosophy. Today, in the Astro seminar at NYU, Tim Maudlin (NYU Philosophy) went on safari to Physics. He works on the question of why mathematics is useful in describing the physical world, and this has led him to the basis of mathematics, or really the basis of the mathematics that is used in physics. He finds (remarkably to me) that if he builds topology (which is really the continuity structure of space or spacetime) on the properties of one-dimensional fundamental objects rather than open sets or neighborhoods, he gets some aspects of the causal structure of spacetime for free. We think (often, informally) of the causal structure as coming from the metric, but Maudlin finds that it can come in far earlier than that if we replace or transpose (in some sense) the foundations of topology. Crazy stuff, and a very lively seminar. My kind of Friday afternoon. Lunch was pretty hilarious too; Maudlin has at his fingertips many paradoxes that get at controversies about probability and information, related to the anthropic principle and the like.

In the morning, Mykytyn showed me that he can fit the intensity images of big galaxies, even in the presence of bright stars, even when those galaxies span different fields taken on different nights, and subsets of the images come from different photometric bandpasses. We are very close to having a system that can re-measure all the (very, very) bright galaxies in SDSS. That could have big impact.

2012-03-01

talk at UMass

I spent the day at UMass Amherst where I chatted with various people and gave a seminar on finding the dark matter with snapshots of tracers in phase space. I talked a lot about Gaia data, but at dinner afterwards someone pointed out that I never said what Gaia actually is! After my recent trips and interactions I am starting to forget that here in the US no one knows about this great mission.