The only real research I did today was discuss two-dimensional measures of three-dimensional galaxy clustering with Antara Basu-Zych (Columbia). She measures what we usually call w(theta); I measure what we usually call wp(rp); we figured out the relation between these, which depends on the angular diameter distance and the volume per solid angle (that is, an integral of the volume element over redshift). The main subtlety is to decide whether you want your wp(rp) in comoving or proper units; this can get confusing because almost all volume calculations that are commonly done are done in comoving, whereas transverse calculations tend to be proper.
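To make the comoving-versus-proper bookkeeping concrete, here is a minimal sketch that converts an angle into a transverse separation in both conventions. It assumes a flat Lambda-CDM cosmology with illustrative parameters (H0 = 70 km/s/Mpc, Omega_m = 0.3) that are not taken from our actual analysis, and the function names are hypothetical.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s


def comoving_distance_mpc(z, h0=70.0, omega_m=0.3, steps=10000):
    """Line-of-sight comoving distance in Mpc, assuming flat Lambda-CDM."""
    omega_l = 1.0 - omega_m
    dz = z / steps
    total = 0.0
    for i in range(steps):
        zm = (i + 0.5) * dz  # midpoint rule for the integral of dz / E(z)
        total += dz / math.sqrt(omega_m * (1.0 + zm) ** 3 + omega_l)
    return (C_KM_S / h0) * total


def transverse_separations_mpc(theta_rad, z, **cosmo):
    """Return (proper, comoving) transverse separation for a small angle."""
    d_c = comoving_distance_mpc(z, **cosmo)
    d_a = d_c / (1.0 + z)  # angular diameter distance in a flat universe
    r_proper = d_a * theta_rad
    r_comoving = d_c * theta_rad  # equals (1 + z) * r_proper
    return r_proper, r_comoving


theta = math.radians(1.0 / 60.0)  # one arcminute
r_prop, r_com = transverse_separations_mpc(theta, z=1.0)
print(r_prop, r_com)
```

The two answers differ by exactly a factor of (1 + z), which is the factor that gets lost when a comoving volume calculation is combined with a proper transverse one.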
Barron cornered me and we finished the blind date draft. Blind date is the project in which we determine the date at which a photographic plate was taken by comparing the positions of stars to those in a catalog with proper motions.
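The core idea can be sketched as a one-parameter least-squares fit. This is a toy version, not the method in the draft: it assumes the plate is already astrometrically calibrated, that positions are tangent-plane offsets in arcseconds, that all stars have equal positional uncertainties, and that proper motions are in arcseconds per year. The function name and the synthetic stars are hypothetical.

```python
def estimate_epoch(catalog, observed, catalog_epoch):
    """Least-squares estimate of the epoch of a set of observed positions.

    catalog:  list of (x0, y0, mu_x, mu_y) at catalog_epoch
    observed: list of (x, y) measured on the plate, same order as catalog
    Model: x = x0 + mu_x * (t - catalog_epoch), and likewise for y.
    """
    num = 0.0
    den = 0.0
    for (x0, y0, mu_x, mu_y), (x, y) in zip(catalog, observed):
        num += mu_x * (x - x0) + mu_y * (y - y0)
        den += mu_x * mu_x + mu_y * mu_y
    if den == 0.0:
        raise ValueError("need at least one star with nonzero proper motion")
    return catalog_epoch + num / den


# Synthetic check: three made-up stars, catalog at 2000.0, plate at 1950.0.
catalog = [
    (0.0, 0.0, 0.10, -0.05),
    (100.0, 50.0, -0.20, 0.30),
    (-40.0, 75.0, 0.05, 0.02),
]
true_epoch = 1950.0
dt = true_epoch - 2000.0
observed = [(x0 + mx * dt, y0 + my * dt) for (x0, y0, mx, my) in catalog]
epoch = estimate_epoch(catalog, observed, catalog_epoch=2000.0)
print(epoch)
```

Because the model is linear in the time offset, the minimum of the summed squared residuals has a closed form; the stars with the largest proper motions carry most of the weight.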
Reinhard Genzel spoke at Pizza Lunch about fully resolved galaxy kinematics at a redshift of 2. He argued that he could see disks in formation and evidence for secular creation of bulges. I am not sure I agreed with all the conclusions, but the data (spatially resolved infrared spectroscopy from VLT) were incredible.
The NSF proposal went in.
The proposal did not turn out to be nearly done, and Fergus and I rewrote it over the weekend and today. But we must submit tomorrow, so the end is in sight.
I spent a bit of time following up some of the z-band-only sources in the SDSS Southern Stripe. Most of them are spurious in one way or another. The dominant source of spurious sources is artificial satellites and other fast-moving objects. In particular, some of the strangest sources Lang had found so far turned out to be individual blinks from a blinking artificial satellite.
Rob Fergus and I spent Thursday and Friday writing an NSF proposal that is interdisciplinary between computer science and physics. In the process, we worked out a large number of ways in which our two research programs overlap technically and intellectually, in the areas of image processing, geometric hashing, fast lookup, indexing into enormous data sets, and operation of image-search web services.
Moustakas gave the group meeting talk at NYU on Friday. He spoke about galaxy evolution as inferred from chemical abundances.
Wu, Schiminovich, and I prepared our lower-priority AORs for our statistical Spitzer program.
Zolotov and I were reminded of the principle—espoused by Mierle—that in a proper coding environment you spend much more time and code on testing your implementation than on the implementation itself. We worked out a framework for testing her halo shape-measuring code.
Zolotov and I worked on measuring the shape of the stellar halo, by making statistics based on the positions of halo stars. Unfortunately, all the methods in the literature are extremely ad hoc. None of them optimizes a well-justified scalar objective, which means that none of them is a good measure of anything!
In the morning, Roweis gave our group meeting, with a nice talk about methods for approximating very large, non-square matrices. He included a method for approximating the rows with a small number of archetypes, subject to various kinds of constraints; we are thinking about doing this to make a possible version 3.0 of kcorrect.
In the afternoon, Lang and I looked up our very high proper-motion red stars in the 2MASS catalog, meaning we added code to our image-modeling code that automatically looks up every single source we analyze in 2MASS. (In fact, since Lang and I built this system with integrated data analysis, web pages, and database, the 2MASS look-ups happen every single time you reload each object's web page!) The brighter fast-moving stars are there, at positions that confirm the motions we measure.
Roweis gave the NYU Computer Science Colloquium, and he spent about 3/4 of it on Astrometry.net. The talk was great, as were the questions at the end; they were totally different from the ones I get when I talk about the project.
Lang and I have found some high proper-motion, very red stars in the SDSS Southern Stripe. Zolotov and I spent some time talking about what they might be useful for, and what follow-up to do. They appear to be halo stars, because they have small parallaxes, which means they are moving at very high linear velocities; they may constrain the total mass of the Milky Way.
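The step from "small parallax" to "high linear velocity" is just the standard relation v_t = 4.74 km/s × (proper motion in arcsec/yr) × (distance in pc), with distance = 1 / parallax. A quick sketch, with made-up numbers that are not our measurements:

```python
KM_S_PER_ARCSEC_YR_PC = 4.74047  # 1 AU/yr expressed in km/s


def transverse_velocity_km_s(pm_arcsec_yr, parallax_arcsec):
    """Transverse velocity from proper motion and parallax."""
    if parallax_arcsec <= 0.0:
        raise ValueError("parallax must be positive")
    distance_pc = 1.0 / parallax_arcsec
    return KM_S_PER_ARCSEC_YR_PC * pm_arcsec_yr * distance_pc


# Hypothetical star: 0.2 arcsec/yr proper motion, 2 mas parallax (500 pc).
v = transverse_velocity_km_s(0.2, 0.002)
print(v)  # about 474 km/s
```

If the parallax is only an upper limit, the same calculation gives a lower limit on the transverse velocity, which is why small parallaxes plus large proper motions point to the halo.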
Zolotov passed her candidacy exam today, after a very nice presentation on issues with understanding the formation of the Milky Way, theoretically and observationally.
Lang and I finished the parallax (and proper motion) code, fitting the multi-epoch images from the SDSS Southern Stripe. It works. Time to write. I hope we confirm my high proper-motion, very red objects.
I plotted the UV–optical colors of the minor planets we have found with GALEX. They are very, very red; redder even than the Sun by quite a bit. No surprise there. Are there any minor planets in there that are previously unknown? If so, we get to name them, right?
In a pair-coding session (using Skype and Unix screen), Dustin and I figured out that the numerical Python chi-squared optimizers are very brittle and unstable. After expressing a lot of astonishment that there are no absolutely hands-off, rock-solid, never-fail optimizers out there, we switched to a C-based Levenberg-Marquardt package that Mierle recommends. Now we have chi-squared fits that work, at least, but it took several days of trial and error, first to figure out which package to use and second to tune its parameters so that it works reliably on the problem at hand. This doesn't seem right. Surely there is an engineering team out there that can solve this problem once and for all?
I spent most of the productive part of the day confirming that our statistical counterpart associations between GALEX and SDSS match the detailed figures published by the GALEX team here and here. They do, although we have more outliers, in part because we are being too liberal with the input catalogs. The thought of figuring out and applying conservative cuts drove me to outlining the GALEX but not SDSS paper and improving the output of the code I wrote to automatically compare the GALEX orphans against the Minor Planet Checker.
After getting annoyed with some of the output of my statistical counterpart association stuff, I went back to looking at the no-SDSS GALEX sources, especially the minor planets. I hope to have a full list by the end of the day tomorrow.
I spent what could have been tourism time in beautiful Halifax, and all my time in airports, on airplanes, and in ground transportation facing the degeneracies with which I have been plagued. I decided that the real prior we have is that we expect the angular distribution of true matches to be monotonically decreasing to larger angular radius, and we expect it to decrease in particular ways at very small scales and at large scales. I figured out how to enforce this in my expectation-maximization code, spent ages writing and debugging that code, and it now works!
In an unusually boring entry for YouTube, even by YouTube's low standards, the FIREBall team has posted a video of the guide camera data that shows that the balloon-borne instrument did not have stable pointing. Roweis ripped the video to PNGs, Lang fired them all into Astrometry.net and annotated them, and posted a video response! This is the first use of our technology to put meta-data onto video content.
In other news, I spent a very nice Friday in Halifax, where I gave my open-source sky survey talk, and caught up with Marcin Sawicki, whom I got to know when we did the blind test of photometric redshifts.
In my project to match GALEX and SDSS sources, I am modeling the distribution of all close coincidences (on the sky) between GALEX and SDSS catalog entries as a sum of two distributions, one for false matches and one for true matches. These two distributions are differently constrained; for example in the false match distribution the GALEX and SDSS properties should be separable, and the angular distribution should be consistent with little or no
Unfortunately, as I mentioned a few days ago, the true match distribution has the freedom, in what I am currently doing, to mimic the false distribution (though the false match distribution does not have the freedom to mimic the true distribution). I experimented with hacky ways to deal with this today, and decided that I must figure out and take a principled approach.
I got most of the method section written for the GALEX–SDSS paper, and also verified that my hacked-up expectation-maximization optimization methodology always increases the objective function (that's good!).
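For the record, here is a toy version of the kind of check I mean. It is not the model in the paper: it assumes the false matches are uniform on the sky (so their separations go as 2r/R² out to a search radius R) and swaps in a simple Rayleigh distribution of width sigma for the true matches, where the real code uses a more flexible, monotonically decreasing form. All names and numbers are hypothetical.

```python
import math
import random


def log_likelihood(seps, frac_true, sigma, r_max):
    """Log-likelihood of separations under the two-component mixture."""
    total = 0.0
    for r in seps:
        p_true = (r / sigma**2) * math.exp(-0.5 * (r / sigma) ** 2)
        p_false = 2.0 * r / r_max**2
        total += math.log(frac_true * p_true + (1.0 - frac_true) * p_false)
    return total


def fit_match_mixture(seps, r_max, frac_true=0.5, sigma=1.0, n_iter=100):
    """EM for the true-match fraction and the Rayleigh width sigma."""
    history = [log_likelihood(seps, frac_true, sigma, r_max)]
    for _ in range(n_iter):
        # E-step: posterior probability that each pair is a true match.
        weights = []
        for r in seps:
            p_true = frac_true * (r / sigma**2) * math.exp(-0.5 * (r / sigma) ** 2)
            p_false = (1.0 - frac_true) * 2.0 * r / r_max**2
            weights.append(p_true / (p_true + p_false))
        # M-step: closed-form updates for the fraction and the width.
        w_sum = sum(weights)
        frac_true = w_sum / len(seps)
        sigma = math.sqrt(
            sum(w * r * r for w, r in zip(weights, seps)) / (2.0 * w_sum)
        )
        history.append(log_likelihood(seps, frac_true, sigma, r_max))
    return frac_true, sigma, history


# Synthetic data: 30 percent true matches with sigma = 0.5, search radius 5.
random.seed(42)
r_max = 5.0
seps = []
for _ in range(2000):
    if random.random() < 0.3:
        seps.append(0.5 * math.sqrt(-2.0 * math.log(1.0 - random.random())))
    else:
        seps.append(r_max * math.sqrt(1.0 - random.random()))

frac, sigma, history = fit_match_mixture(seps, r_max)
monotonic = all(b >= a - 1e-6 for a, b in zip(history, history[1:]))
print(frac, sigma, monotonic)
```

The `history` list is the useful diagnostic: for a correct EM implementation the log-likelihood should never decrease from one iteration to the next, so any drop beyond round-off signals a bug in the E-step or M-step.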