I finally completed the response to the referee on the Masjedi et al paper on the mass growth of LRGs through merging (major and minor). The task took ages, despite a very straightforward and constructive referee report. I celebrated.
Barron, LeCun, and I complexified our project on modeling galaxy images with three-dimensional galaxy models by considering the case of absorbing dust. We decided that we either have to severely restrict the possible dust geometries or else move to a full
ray tracing or computer graphics approach. I prefer the latter, but Barron is (rightly) concerned about speed. The long-term goal of the project is for the computer to simultaneously classify and model all galaxies, and also choose optimal or natural, data-driven parameterizations of the model space.
Zolotov and I spent some time discussing ways to investigate the dark matter distribution and assembly history for the Milky Way using observations of stars. The approach, with Willman, is to perform observational experiments on simulations of Milky-Way-like galaxies taken from cosmological simulations, in the hope of finding connections between observables and fundamentals that are robust and useful. In the short term, my small job in this project is to repeat, on the simulations, the analysis performed by Eric Bell (and us) in our paper on quantifying substructure in the Milky Way halo, but with the advantage that unlike with real observations, I know exactly the distribution of stars and dark matter in six-dimensional phase space.
At lunch I described Bovy, Moustakas, and my tentative detection of dust absorption in galaxy clusters using background luminous red galaxies. In the afternoon, Barron and I discussed how to flexibly include dust in our three-dimensional models of galaxies.
Art Congdon (Rutgers) gave a nice group-meeting talk in which he showed analytical approaches to understanding the effect of substructure on image magnification ratios and time delays. It is important to have an analytical structure based on perturbation theory, because otherwise you never know whether you are arguing for substructure on the basis of an incomplete search of model space.
Raphael Bousso (Berkeley) surprised me in the Physics Colloquium by saying some things about the anthropic principle that were not wrong. This is surprisingly rare. My principal objection to the principle is that there is no
functional we can apply to a theory and determine the existence of or density of
observers. I have secondary objections relating to whether it is observers we mean at all! But most talks we have had at NYU make an even more basic mistake, which is that of not clearly distinguishing the need for observers from the observational fact that there are observers of the human type in our Universe. If you use the anthropic principle, ask yourself this: How does your
anthropic constraint on your theory differ from an observational constraint that the observed Universe contains, say, galaxies or structure or carbon or metals or stars or people?
The fact that the Universe contains observers of our type is an observation, not a principle. The fact that any observed Universe must contain observers may qualify as a principle.
In related news, Bousso also gave a very nice argument about the cosmological constant problem, which I had not appreciated. You might think that lambda (plus vacuum energy corrections) gets set to exactly zero at the big bang by some requirement on cosmological initial conditions. That's a good idea! But then at the electroweak phase transition, the vacuum energy density drops by an amount that is some 50 or 60 orders of magnitude larger than the currently observed value for lambda. So if it is a fine-tuning that is performed by the Universe, it must be fine-tuned before electroweak to a value that makes it very close to zero after electroweak. Nice demolition, that! He had various other good arguments for the problem being a very serious problem, and proceeded to use these issues to motivate an anthropic selection from the string landscape.
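Bousso's argument can be put on the back of an envelope; the energy scales below are my own order-of-magnitude assumptions, not his exact figures:

```latex
\[
\rho_\Lambda^{\rm obs} \sim (10^{-3}\,{\rm eV})^4,
\qquad
|\Delta\rho_{\rm EW}| \sim (100\,{\rm GeV})^4,
\]
\[
\frac{|\Delta\rho_{\rm EW}|}{\rho_\Lambda^{\rm obs}}
\sim \left(\frac{10^{11}\,{\rm eV}}{10^{-3}\,{\rm eV}}\right)^4
= 10^{56},
\]
```

so any cancellation imposed at the big bang must anticipate, to better than one part in roughly 10^56, a jump in vacuum energy that does not happen until the electroweak transition.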
I spent the afternoon in-between meetings trying to plan my sabbatical next semester. It is important that the entire time go to research.
With the end of the semester, it has been two relatively research-free days, I am embarrassed to report. I did spend some time talking with LeCun and Barron about our galaxy modeling project. Blanton and I had some discussions with the deans and provost's office and sponsored programs about meeting the obligations incurred as we join SDSS-3. And Cedric Deffayet (Paris) and Spencer Chang (NYU) confirmed my suspicions that the
Casimir Effect has a conventional explanation that does not involve vacuum energy density.
In a miracle of sorts, the good people at NASA have (partially) funded the Astrometry.net project for the purposes of creating a multi-wavelength catalog out of the data taken by various NASA missions. I spent some time today figuring out how this project meshes with the short-term goals of the Astrometry.net project and working out the first steps.
Dan McIntosh (UMass) gave a nice group meeting talk on galaxy mergers in SDSS as a function of mass, mass ratio, and environment. During the talk I think we all realized that measures of the merger rate are in some sense more precise measures of galaxy evolution than measures of the differences between different redshifts, but they are not necessarily more accurate because they involve substantial uncertainties in their interpretation.
Finally I returned to my summer project of distinguishing brown dwarfs from high-redshift quasars via proper motions. The faithful reader will recall that the method is to use proper motions determined in multi-epoch imaging in which the source of interest is not significantly detected at any single epoch. I spent the morning remembering what I had done and re-running the code on new data provided by Jester (many months ago).
Today Barron demonstrated to me a prototype system that can take an archival scanned plate (even of poor quality) and figure out the photographic emulsion with which it was taken. He uses the brightness ranking of the sources in the image, which is not completely trivial given the strong saturation in scanned photographic plates. We are close to our evil plan of being able to reconstruct all calibration meta-data for any image in any state of archival laxity.
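Barron's actual system is more sophisticated, but the rank-based idea can be sketched in a few lines. Everything here — the function names and the catalog-flux inputs — is my own hypothetical; the key point is that a brightness *ranking* survives any monotonic (even saturating) plate response:

```python
import numpy as np

def rank_correlation(a, b):
    """Spearman-style rank correlation, using only numpy: correlate the
    ranks of the two sequences rather than the values themselves."""
    ra = np.argsort(np.argsort(a))
    rb = np.argsort(np.argsort(b))
    return np.corrcoef(ra, rb)[0, 1]

def best_bandpass(observed_flux, catalog_fluxes):
    """Toy sketch: pick the candidate bandpass whose catalog fluxes best
    reproduce the brightness ranking of the matched sources in the image.
    Ranks are robust to the unknown, nonlinear, saturated response of a
    photographic plate, as long as that response is monotonic."""
    scores = {band: rank_correlation(observed_flux, flux)
              for band, flux in catalog_fluxes.items()}
    return max(scores, key=scores.get)
```
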
I re-wrote the part of Masjedi's recent paper in which we describe how we integrated the correlation functions, because it turns out that we didn't describe it quite correctly. What we did is the right thing in the presence of possible software issues at very small scales, but it is certainly not what you would do if you weren't worried about the software.
Barron and I spent the afternoon discussing his project to measure the dates at which images were taken. To do the best job, he must measure the best centroids; up to now he has been using Robert Lupton's (Princeton) algorithm implemented in the SDSS image processing code. This algorithm fits a set of one-dimensional parabolae; it is a fast numerical approximation to fitting the peak of each source with a two-dimensional parabolic surface. Jon's new centroiding code actually does the full-up parabolic surface fit to the peak of each source, and it appears to return centers that are better than the Lupton code, although at the (significant) expense of compute time.
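For concreteness, here is a minimal sketch of the full paraboloid fit — a 3x3 stamp around the peak pixel and plain least squares. Barron's code surely differs in the details; this is just the shape of the idea:

```python
import numpy as np

def quadratic_peak_centroid(stamp):
    """Fit z = a + b*x + c*y + d*x^2 + e*x*y + f*y^2 to a 3x3 stamp
    centered on the brightest pixel, and return the stationary point of
    the fitted surface: the sub-pixel centroid offset from the center."""
    ys, xs = np.mgrid[-1:2, -1:2]
    x = xs.ravel().astype(float)
    y = ys.ravel().astype(float)
    A = np.column_stack([np.ones(9), x, y, x * x, x * y, y * y])
    a, b, c, d, e, f = np.linalg.lstsq(A, stamp.ravel(), rcond=None)[0]
    # The gradient of the quadratic vanishes at the peak:
    # solve [[2d, e], [e, 2f]] @ (dx, dy) = (-b, -c).
    H = np.array([[2 * d, e], [e, 2 * f]])
    dx, dy = np.linalg.solve(H, [-b, -c])
    return dx, dy
```

The one-dimensional Lupton approximation drops the cross term `e*x*y`, which is what makes it fast and also what the full surface fit buys back.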
I spent the day at CITA today. The morning was spent with talks on inflation, and they nearly convinced me that it might make some testable predictions! In the afternoon, Rocky Kolb (Chicago) and Simon White (MPA) faced off in the
Dark Energy Smackdown. In the end there was no smacking, because they agreed. But I think they were being too conciliatory: Although we all agree that both large dark energy projects and chaotic astronomy are good parts of our future, I do think there are deep disagreements about how to balance these and how to protect ourselves from the secretive culture of high-energy experiment.
I spent the day at Perimeter, where there was a full schedule of talks about cosmological observations. There were many highlights; indeed all the talks were great. In particular, Olivier Doré (CITA) showed that simple inflation-inspired non-gaussianity can lead to very strong scale-dependent bias, even at large scales where we think that locality ensures linear bias. This may provide new ways to demonstrate gaussianity in the near term with galaxy surveys.
Gil Holder (McGill) told us about the brave new world of CMB, now that the primary anisotropy is nailed. He even gave a conceivable methodology for working out the z=1100 initial conditions for a large volume of the Universe using cross-correlations of CMB polarization and 21cm emission.
Dan Holz (LANL) showed us how black hole mergers are unprecedented
standard sirens for cosmology.
Mike Kesden (CITA) used simple symmetries including parity and exchange symmetries to put constraints on the possible outcomes of (no-hair) black-hole—black-hole collisions; he has produced some wonderful results with symmetries alone. He can predict the outcomes of numerical simulations (which are ridiculously expensive). Bravo!
I spent the morning closing tickets, opening tickets, adjusting web pages, and generally tidying and cleaning up in Astrometry.net. There is lots more to do. We have had huge numbers of requests for code and web access after we appeared on space.com (disclaimer: I never said that I was the
leader of this project).
Today I attended an excellent candidacy exam by Ronnie Jansson (NYU), who is interested in reconstructing a three-dimensional model of the magnetic field of the Milky Way using Faraday Rotation, synchrotron radiation, stellar polarization, and Zeeman splitting.
I hate to mention galaxy classification here, because that phrase is loaded, and most astronomers associate classification with what I would call
classical morphological classification, which I believe has been a very unproductive method in astronomy (despite its incredibly successful adoption and widespread use). But that polemic is the subject of another post!
Today, Yann LeCun (NYU), Barron, and I discussed the possibility of performing an objective galaxy classification not in the space of 2-d galaxy images, but in a 3-d
shape space, with the 2-d images constraining the shapes and setting the diversity of 3-d types. This is like an inverse of the computer graphics problem: it is not "here are the 3-d objects; what is the scene?" but rather "here is the scene; what are the 3-d objects?" The key to solving the problem is the hypothesis that the great diversity of galaxies emerges from a small number of (parameterized) types, plus viewing peculiarities. Will it work? We shall see.
Barron and I spent part of the afternoon discussing various approximations by which we can estimate the uncertainty in a measure of a star's centroid. In principle this is easy, when the centroid is some kind of first moment. However, astronomers have learned that first moments do not deliver the most accurate centroids. The best centroids come from the fitting of paraboloidal
tops to star
peaks in the image, and the errors are determined by propagation through the fit. However, this propagation is non-trivial to perform correctly; it is even non-trivial to perform it approximately!
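One blunt but honest approximation is Monte Carlo: re-fit the paraboloid many times under synthetic pixel noise and report the scatter of the resulting centroids. A sketch, assuming a 3x3 stamp and the same quadratic-surface fit as the centroiding itself (all names here are mine, not Barron's):

```python
import numpy as np

def centroid_error_mc(stamp, sigma, n_trials=1000, rng=None):
    """Monte Carlo centroid uncertainty: add gaussian pixel noise of rms
    `sigma` to the 3x3 stamp, re-fit the paraboloid, and return the
    standard deviation of the refit centroids in x and y."""
    rng = np.random.default_rng(rng)
    ys, xs = np.mgrid[-1:2, -1:2]
    x = xs.ravel().astype(float)
    y = ys.ravel().astype(float)
    A = np.column_stack([np.ones(9), x, y, x * x, x * y, y * y])
    centers = []
    for _ in range(n_trials):
        z = stamp.ravel() + rng.normal(0.0, sigma, 9)
        a, b, c, d, e, f = np.linalg.lstsq(A, z, rcond=None)[0]
        H = np.array([[2 * d, e], [e, 2 * f]])
        centers.append(np.linalg.solve(H, [-b, -c]))
    return np.std(np.array(centers), axis=0)  # (sigma_x, sigma_y)
```

Analytic propagation through the nonlinear map from fit coefficients to the stationary point is the hard part this sidesteps.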
Michael Joyce (Paris) gave a great group-meeting talk in which he showed that he can perform exact analyses of discrete N-body simulations for finite particle density, in the linear regime, and explain the output of numerical codes. He focused on differences between numerical simulations and the
real world that arise from discreteness. All this matters for precision cosmology.
In the afternoon, Ed Bertschinger (MIT) calculated the growth of structure in some non-Einstein gravity theories, exploiting the nice property that each (sufficiently large) fluctuation in the Universe can be treated as its own little FRW Universe. He showed some beautiful exact solutions in a suite of departures from GR.
In the early morning I worked on my tessellations for our Spitzer proposal. I had to reverse engineer the IRS peak-up star flux criteria, because I have to make my observation plans automatically with code, not manually with the (nice) observation planning tool.
I spent the day at Astrophysics 2020 at STScI. I was inserted into the extrasolar planet session, which was not totally appropriate, but I was cool with it, because I learned a huge amount. In particular, I learned that a lot of the figures of merit for extrasolar Earth-like planet detection go like the telescope diameter to the fourth power, and we are talking about telescopes in space, of course! But I also learned that there are dozens of transiting planets, some of which have both primary and secondary eclipses measured, and some of which are only tens of Earth masses in size. After the meeting I have no doubt that we are going to find Earth-like planets soon; I agree with the speakers that it is imperative that it be a top NASA priority moving forward.
I read a few papers today on the build-up of galaxy masses through mergers. The (very constructive) referee on our recent Masjedi et al paper pointed us at some new papers on the subject that find very small amounts of evolution since redshifts around unity. Given cosmic variance and sample sizes, plus uncertainties in K and evolution corrections, nothing is nearly as precise as our direct measure of the accretion rate, so they only confirm our result at the order-of-magnitude level.
My respite from NSF proposals was a nice seminar by Tom Duke (UCL) about how the ear hears. He emphasized the amazing range of the ear: a factor of 1000 in frequency, and a factor of 10^12 in loudness. The faintest sounds are heard just at the thermal limit, that is, when the energy per cycle is about kT. He showed that the
hair cells that do the heavy lifting are very dissipative objects but driven nonlinearly so as to act like incredibly sensitive oscillators, and that they transmit their information to the auditory nerve in the form of pulse distributions, or really the time delay distribution of delays between pulses. Incredible, all around.
I have been breaking my rules about posting! This is just to protect my reader from the terrifying fact that I have spent every spare second working on my NSF proposals. The only exception was Big Data Lunch, an event I co-organized at NYU, in which Rob Fergus (CS) gave a short seminar about his attempt to organize and classify a set of 80 million images harvested from the web. His results are impressive: He has so many images that he can use brute-force techniques like least-squares to find similar images! The discussion, among about 30 people from around the University, was very lively.
I am at the SDSS meeting at Fermilab. I spoke about how we are going about looking at the huge galaxies—galaxies comparable to or larger than an individual SDSS field. We have to build mosaics; building mosaics is non-trivial if you want to preserve all the information in the data. For me, this is part of our Gunn Atlas project.
There were incredibly impressive talks from the Supernovae team (working with the multiple epochs in the SDSS Southern Stripe that my reader knows well). They are taking systematics much more seriously than I have ever seen before. This is a good thing about getting particle experimentalists involved in precision cosmology. The scariest systematic (presented by Kessler of Fermilab) was a near-degeneracy between dust absorption, the intrinsic scatter in supernovae colors, and the world model.
Related to that, Holtzman (NMSU) gave a remarkable talk about how they do the time-variable supernova photometry; it involves making a complete fit to all of the pixels in all of the images at all epochs! This is not unlike the proper-motion stuff I was doing this summer, but much, much more sophisticated.
There were remarkable talks all day; two others that stood out were about clear detections of the MgII-absorber–galaxy cross-correlation function by Nestor (Cambridge) and the scale and success and potential of the galaxyzoo.com project by Thomas (Portsmouth). This latter project has some similarities in spirit to Astrometry.net, although they have much less technology under the hood.
Barron and I spent the morning arguing about his project to automatically determine the date at which an image is taken, using only the information in the image pixels themselves. Most people (okay, not most
people, but maybe most astronomers) laugh when you even suggest this, but it turns out that for typical historical plates, there often is enough information in the moving stars in the image to pinpoint the date to within years. It is straightforward to define a scalar to optimize; the optimum gives you the date. It is non-trivial to determine the uncertainty on that estimate of the date. By the end of the day, Barron had a promising estimate of the error bar. Time to write!
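In schematic form, the scalar-to-optimize idea looks like this. It is a toy version — flat-sky coordinates, a brute-force grid of trial epochs, and variable names of my own invention — not Barron's code:

```python
import numpy as np

def best_epoch(obs_ra, obs_dec, cat_ra, cat_dec, pm_ra, pm_dec,
               cat_epoch=2000.0, trial_epochs=np.arange(1900, 2001, 0.25)):
    """For each trial epoch, propagate the catalog positions along their
    proper motions and score the match to the observed positions with a
    sum of squared residuals; return the epoch that minimizes it."""
    best = None
    for t in trial_epochs:
        dt = t - cat_epoch
        pred_ra = cat_ra + pm_ra * dt
        pred_dec = cat_dec + pm_dec * dt
        score = np.sum((obs_ra - pred_ra) ** 2 + (obs_dec - pred_dec) ** 2)
        if best is None or score < best[1]:
            best = (t, score)
    return best[0]
```

The hard part, as the entry says, is not finding the optimum but putting an honest error bar on it: the curvature of this scalar near the minimum is only part of that story.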
I spent the evening frantically preparing materials to talk about huge galaxies in the Sloan Digital Sky Survey at the collaboration meeting at Fermilab tomorrow. Thanks, Blanton!
Yesterday and today, Barron and I worked on responding to the referee on the USNO-B paper. We resubmitted. It was a pleasure to update the paper according to constructive referee comments. The paper is improved.
Barron and I discussed several other blind astronomy image meta-data projects. We have shown some success in determining the date at which an image was taken, using the USNO-B-tabulated proper motions of the stars in the image; we are writing that up. We believe we can also determine the bandpass of an image, using the brightness ranking of USNO-B stars of different colors.
In related news, Roweis, Yann LeCun (NYU), Barron, and I may start a new project to perform morphological classification of galaxies. Both of my readers will be stunned, because they know that I am adamantly against morphological classification! However, we have a very new idea, which is to build models of galaxies in three dimensions, and try to explain all galaxy image data with a small number of three-dimensional models plus Euler angles. That is, the project is to create a fundamentally three-dimensional generative model of all images of galaxies. That would be new, and not be subject to my usual wrath.
Ryan Scranton (Google) was in town this morning, and he, Blanton, and I talked about how to make enormous amounts of high dynamic range imaging viewable by a human with a normal monitor. Think about viewing the whole terapixel SDSS dataset on a normal desktop monitor. This requires panning and zooming, of course. I think we all ended up agreeing that it also requires different stretches at different "zoom levels" in order for the imaging to be fully exploited. We hope to demo some ideas soon.
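A toy version of the idea (not anything we demoed): bin the pixels for each zoom level, as a tile server would, and let the softening parameter of an asinh stretch track the binning so that faint, averaged-down structure stays visible when zoomed out. The scaling rule here is a guess, purely for illustration:

```python
import numpy as np

def asinh_stretch(img, softening):
    """Map intensities through an asinh stretch: roughly linear for
    faint pixels, roughly logarithmic for bright ones."""
    out = np.arcsinh(img / softening)
    return out / out.max()

def tile_for_zoom(img, zoom, base_softening=1.0):
    """Bin the image by 2**zoom in each direction, then stretch with a
    softening that shrinks with the binning (an assumed rule, not a
    tested recipe) so the compressed dynamic range is re-expanded."""
    k = 2 ** zoom
    h, w = img.shape
    binned = img[:h - h % k, :w - w % k].reshape(h // k, k, w // k, k).mean(axis=(1, 3))
    return asinh_stretch(binned, base_softening / k)
```
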
In the afternoon I worked on my ADASS proceedings and some proposals.
Daniel Eisenstein (Arizona) was in for the day and we discussed SDSS-III, a project to follow the current phase of the Sloan Digital Sky Survey. NYU's main interest in SDSS-III is its project to measure the baryon acoustic feature at redshifts of 2/3 and 2.5 or so using luminous red galaxies and quasar absorption lines respectively. We are intending to contribute resources to the project in exchange for institutional participation and we will have some responsibility for the data analysis, data serving and sharing, archiving, and preservation. We spent most of our time working on what is needed for planning and budgeting, so it wasn't exactly a scientific morning. But that's research too, I guess!
In the afternoon I had a short but useful conversation with my colleague Scoccimarro about how best to measure the BAF given nonlinear evolution in the growth of structure at the scales of interest. He and Eisenstein disagree sharply on how to do this; I am confused because they are my two gurus on these subjects.
I spent a great day at the Institute for Advanced Study in Princeton. Peebles and I worked on our critical review of observations relating to CDM on galaxy scales in a borrowed office. We were interrupted by a great talk by Alice Shapley (Princeton) on gas-phase metallicities as a function of redshift, including all the complex differences between star-formation regions at high redshift and low. We were also interrupted by the inimitable Tuesday Lunch (now Bahcall Lunch) which continues in the traditional way. I spoke about Astrometry.net, which generated a surprising amount of questions given that most of the audience is concerned with much more theoretical matters.
Barron and I continued a multi-day conversation about how to use the 2MASS Catalog to verify our work on removing spurious sources from the USNO-B Catalog and to analyze our false-negative rate. It is not trivial, because many of the spurious sources are very close to non-spurious sources, which often match a 2MASS entry.
Gruzinov, Berlind (visiting from Vanderbilt), and I spent some time discussing how to use statistical correlations to associate sources across very disparate bandpasses. I would say more, but the result in question is under embargo. I don't like it when the private journals and collaborations embargo results. Aren't we all scientists, living in a world of ideas? Who ever benefited from a scientific embargo? I bet it wasn't a scientist!
After a very thought-provoking Computer Science Colloquium about Ask.com, I had a lunch meeting with NYU's Rob Fergus in Computer Science, with whom I have significant research overlap. He and I discussed image combination and recognition of information in images, fields in which both of us currently work (with very different angles).
In the afternoon, Louie Strigari (Irvine) gave a nice astro seminar on using very small objects, and dark-matter properties on small scales, to test the fundamental properties of the dark sector. He argued that current data on the latter are not constraining, but that the dwarf galaxies are potentially very interesting.
Wu and I went up to Columbia to meet with Schiminovich and his group to discuss a possible spectroscopic observing proposal for the final cryogenic observing cycle of the Spitzer Space Telescope. The meeting was productive and we all have marching orders.
After my triumphant blog post yesterday, when I thought my research day was over, I went to a great talk by Bruce Knuteson (MIT) who spoke about extremely general data analysis plans for finding departures from the standard model in LEP and LHC data. It relates to a number of things we have been thinking about with astronomical data.
The gods are smiling, because I got two friendly, constructive referee reports in the same week. When's the last time that happened? I spent time this morning working on the report on Masjedi et al; it is only 07:30 and I have already put in some solid research today!
Barron showed up today and we worked on getting the USNO-B paper ready for resubmission, after a very prompt (thanks, anonymous referee!) referee report from the AJ. We also discussed his system for determination of image dates, which involves precise astrometric calibration to a set of minutely different astrometric catalogs (each set to a different epoch), and decided that we should write it up now, with the thought that writing it up will focus the issues.
Mierle and I spent a few minutes discussing astrometric
tweak at the end of our new, weekly Astrometry.net telecon; getting the output of Astrometry.net to be precise enough for use—unmodified—by scientific users is at the top of our priorities for the coming months. Mierle and I are convinced that a ransac-like approach will work, though neither of us has demonstrated this with functional code. Roweis and Lang are not convinced. Mierle put a bunch of tickets in our trac system to help get a ransac testbed working.
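For the record, the ransac idea in cartoon form: hypothesize a model from a minimal random sample of correspondences, count inliers, keep the best consensus set, refit. The sketch below is a translation-only toy of my own; the real tweak problem involves a full astrometric transformation:

```python
import numpy as np

def ransac_shift(src, dst, n_iter=200, tol=0.5, rng=None):
    """Find the translation taking `src` points onto `dst` points when
    many of the claimed correspondences are outliers.  Repeatedly
    hypothesize a shift from one random correspondence, count how many
    points agree within `tol`, and refit to the largest inlier set."""
    rng = np.random.default_rng(rng)
    best_inliers = np.zeros(len(src), dtype=bool)
    for _ in range(n_iter):
        i = rng.integers(len(src))
        shift = dst[i] - src[i]                      # minimal-sample model
        resid = np.linalg.norm(dst - (src + shift), axis=1)
        inliers = resid < tol
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # Refit the shift on the consensus set.
    return (dst[best_inliers] - src[best_inliers]).mean(axis=0)
```

The appeal for tweak is exactly this robustness: a few catastrophically wrong matches cannot drag the solution.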
Barron demonstrated that he can determine approximately the date an image was taken, using the proper motions of the moving stars. His test was a real image of the Beehive cluster we found on the web. This is great: Not only can we calibrate the astrometry of your image based on the image pixels alone, we can also calibrate the clock! The precision is going to depend on the field size and the proper motion distribution that happens to exist in the field of view.
Group meeting was Blanton telling us what he learned in Spain about massive spectroscopy projects underway in the next few years. The Astro seminar was Alison Farmer (Harvard) describing a plasma-physics solution to some oddities about Saturn's rotation and radio emission.
Albion Lawrence (Brandeis) gave the high energy seminar on the
meaning of the 10 dimensions of string theory. It seems that the true dimensionality is debatable, because it depends how you determine the dimensionality of spacetime, empirically. When dimensions are large or infinite it is easy, but small, finite, wrapped dimensions can appear as dimensions of spacetime or as extra particle states or as non-trivial topology in manifolds of lower dimensionality. Insane!
I barely put pen to paper—and when I did it was on merging galaxies—but a lot happened today. At group meeting, Zolotov spoke about her project to measure the properties of dark-matter halos using stellar tracers, either stellar tracers of the potential or of the configuration. Wu spoke about her project to look at the star-formation histories of galaxies as a function of their group environment. Both of these projects have a lot of different places to go.
The Computer Science Colloquium was by Ronen Basri (Weizmann), speaking about fast lookup of nearest neighbors in databases, but where the data objects are linear subspaces and the queries are points. His talk included some great example problems, but also some beautiful linear algebra. I spent some time after the talk thinking about the linear subspace issue; it is a great one. I don't think it has anything to do with Astrometry.net. After the talk I spoke with NYU CS's Yann LeCun and new faculty member Rob Fergus, both of whom have interests that partially overlap mine.
The Astro Seminar was by Bob Rutledge (McGill) and on the physical properties of neutron stars. By improving the modeling of their emission, he has greatly improved measurements of the mass–radius relation and can plausibly rule out some reasonable equations of state. If Constellation X never flies, Rutledge may have the last word on the subject!
I am not sure that
prioritizing is a word, but I spent most of my research time today prioritizing the near-term activities of Astrometry.net. We need to get funding, and we need to have a clear research focus for those funding proposals. We need to interface with various viewer and archive and distributed computing systems. We need to work through a whole bunch of new data. And we need to facilitate new science and data discovery.
I attended a very nice seminar today by Aaron Chou (NYU) about his work on an experiment looking for oscillations between photons and dark particles like axions or equivalent. He has a null result, which rules out some experimental claims. I was interested in the work because it might be superseded by astronomical measurements of transparency, one of which I am in the process of making right now.
Gorski (JPL) gave a nice talk on the Planck mission, drawing particular attention to comparisons with COBE and WMAP in design, survey strategy, and data analysis. He showed surprising evidence for anisotropy in the WMAP maps, although he certainly did not advocate that our Universe is anisotropic! He has been working on high-end data analysis for the Planck mission with supercomputing facilities.
During the rest of the day, I generated a lot of tickets (bug reports) on the Astrometry.net management system related to funding (grrr) and our upcoming USNO-B
clean data release. Lang and I had a short phonecon with the VAMP team regarding meta-data standards. I also discussed observational handles on the dark-matter component of the Milky Way with Zolotov.
To violate my rules (see column at right), I will say that I spent Wednesday on teaching and administrivia, Thursday on an unplanned vacation, and the weekend and today on grants administration! I have almost forgotten what astrophysics is about!
On the other hand, on Friday I spent a beautiful day in Princeton, where there was a lovely memorial and reception in memory of Bohdan Paczynski, whom I admired not just for his astrophysics contributions but for all sorts of reasons that don't count as research. I realized that Astrometry.net is nicely aligned with Paczynski's calls for opening up new observational capabilities, especially those that might lead to serendipitous discovery and time-domain astrophysics (lensing, supernovae, asteroids, etc.).
Before the memorial, Peebles and I spent a few hours arguing about ways to use the stellar and baryonic components of galaxies to infer the fundamental properties of the dark-matter concentrations in which they reside, as those fundamental properties are predicted by the CDM model. We spoke much and accomplished little, but it gave our writing on the subject a kick in the pants.
I also attended a Princeton gravity group talk by Battefeld (Princeton) on the possibility that primordial magnetic fields in galaxies arise from cosmic string interactions with cosmic plasma. Insane? You might think so, but in fact there are very few mechanisms out there that have a good shot of working to make the seeds that galactic dynamos could amplify to the observed levels at the present day.
Some beautiful visualization talks were the highlight of day 2 for me. I am suspicious of visualization for creating scientific results, although I think good visualization is essential in vetting data and procedures. On the former point I think I differ from most of the ADASS crowd.
Speaking of which, we were strongly advised to interface Astrometry.net with World Wide Telescope (when Microsoft releases the product and an API) and the VirGO interface (which is built on Stellarium, a nice open-source planetarium). I hope we will do both in short order, along with Google Sky.
I gave my talk at the ADASS meeting today. I was followed by Warren Hack (STScI), who demonstrated a nice, robust, hybrid scheme for aligning overlapping astronomical images precisely, even when those images contain very few (or no!) point sources, time-variable and moving sources, and cosmic rays. It could be a great back end for Astrometry.net.
Before Hack and me, there were talks about data preservation and archives, which emphasized the complexities and difficulties of preserving not just the data and meta data but an understanding of what it all means, which is currently preserved in people's heads. Bob Hanisch (STScI) gave a great talk about the VO which went beyond—and made specific—some ideas I have been floating around in one of my polemics, involving the relationships among data and code and the papers based on those data and that code. He agreed with me that incremental adoption of this kind of radical association will be fraught with cultural and political difficulties.
Today was the reception and pre-meeting workshops for the annual ADASS meeting. The crowd is a good one because we can fully geek out and not feel like we are being too technical! I spent the day working on my viewgraphs for my presentation (on Astrometry.net, of course). At the reception I spent a lot of time talking with Emmanuel Bertin and the Terapix group. We have a lot in common. The Terapix group has an incredible track record of producing software that fills a need and works.
Two nice seminars today: The first was Jonathan Zrake (NYU) talking about Chandra observations of Farrar's putative UHECR source to identify possible interesting candidate objects. The second was Juna Kollmeier (OCIW) talking about using measurements of the IGM to constrain the feeding of galaxies with gas and thereby make the link between the simple physics of the early universe and the complex end processes of galaxy formation.
Inspired by Sheldon's submission of this paper, I did literature research today on the use of statistical (averaged) weak lensing in constraining fundamental parameters of the dark sector. I decided that statistical weak lensing has shown us that dark matter halos exist around galaxies and groups and clusters with the masses, sizes, and radial profiles one would expect, at least on average. It is not a precise tool—because of the averaging—but it does bolster the dominant paradigm.
My research time this weekend was spent closely reading and commenting on Maller's draft of a paper about empirical determination of inclination corrections; that is, the effects of internal absorption by dust in galaxies, which can be detected by observing similar galaxies at different inclinations to the line of sight. Maller's approach is unusual, because he simply tries to find the corrections such that the distribution of corrected galaxy properties (and more than just the spectrum can be affected) is independent of axis ratio (the observable most closely related to inclination).
By the time Barron left today, we had accomplished all of our goals: We submitted the paper to AJ (it will appear as arXiv:0709.2358 on Monday; it is embargoed until then), we worked out some new computer-sciencey projects, and we worked out some new astronomy-ey projects. No time to celebrate, back to the grind!
Barron and I played around with some ideas in two-dimensional density estimation with the idea being that we might be able to identify some of the spurious entries in the USNO-B Catalog that are not due to diffraction spikes and reflection halos. Barron coded up two ideas and both looked like they worked well. We aren't going to hold up our paper on USNO-B for this enhancement, but we might insert it into the Astrometry.net pipeline (which benefits enormously from a very clean USNO-B Catalog).
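As an illustration of the flavor of such a test (a toy sketch of mine, not Barron's code; all numbers are invented): spurious catalog entries cluster on the sky, so cells whose source counts far exceed the Poisson expectation for a uniform catalog get flagged.

```python
import math
import random

random.seed(2)

# Toy catalog: uniformly distributed "real" sources plus a tight clump of
# spurious entries (spurious USNO-B detections cluster near bright stars).
sources = [(random.uniform(0, 10), random.uniform(0, 10)) for _ in range(500)]
sources += [(5 + random.gauss(0, 0.1), 5 + random.gauss(0, 0.1))
            for _ in range(60)]

# Crude density estimate: count sources in cells, then flag cells whose
# counts sit far above the Poisson expectation for a uniform catalog.
cell = 0.5
ncell = int(10 / cell)
counts = {}
for x, y in sources:
    key = (min(int(x / cell), ncell - 1), min(int(y / cell), ncell - 1))
    counts[key] = counts.get(key, 0) + 1

mean = len(sources) / ncell ** 2  # expected count per cell if uniform
flagged = [k for k, n in counts.items() if n > mean + 5 * math.sqrt(mean)]
print(flagged)
```

The flagged cells are exactly the ones covering the injected clump; a real implementation would of course use a smoother density estimate than a hard grid.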
Barron is in town and we worked on finishing up the USNO-B cleaning paper. We also had a phone conversation with Roweis, Lang, and Mierle about the long-term goals of astrometry.net and how the short-term activities might take us there.
I spent some time discussing the imaging in Google Sky with the Sky team and with Blanton and Sheldon, because the Sky team would like us to contribute some substantially-nicer SDSS imaging than they already have. I'm pleased—we do have the best images—but I don't relish the work it might involve. We have also been discussing connections between astrometry.net and Sky. In the terrifying future, astrometry.net could drive traffic to Sky and Sky could allow browsing and visualization of astrometry.net's results. That would be fun.
Lars Bildsten (UCSB) was at NYU today, so much of the day was spent talking about nuclear astrophysics, including supernovae and the structure and cooling of white dwarfs. In his talk, Bildsten showed good empirical evidence that there are at least two different types of type Ia supernovae; this goes some way towards resolving a long-standing discrepancy between the observed Ia rate and iron abundance in galaxy clusters, and also the evolution in rate density as a function of redshift.
Moustakas and I sat down with Jo Bovy (NYU) to discuss his short-term project of measuring the transparency of galaxy clusters. We are interested in two things: (1) can we measure the amount of dust in the clusters, and does it make sense given the plasma temperature and the metallicity, and (2) can we rule out a class of naive axion models for the dark matter, which generically cause wavelength-dependent transparency variations in regions of high magnetic field? This all might sound crazy, but there hasn't been much work on the transparency of the Universe since the 80s, and the data have gotten much better. I expect us to put very strong limits and rule out some theories.
Or win the Nobel Prize!
I spent a bit of time improving my treatment of the SDSS point-spread function in my project on faint-source proper motions. Because I am working at the faint, low signal-to-noise end, I don't need my PSF to be perfect. But I am making it less of a hack.
I spent some time researching the novelty of my faint-source proper motion work. The idea is so simple, I am sure it is not new. I couldn't find—even with help from the experts—any published work that measures proper motions below the individual-epoch detection limits. But I am not done: I only checked the stars literature and not the Solar System literature. It may be standard operating procedure for those looking for new Solar System objects.
I rearranged what I have written on dark-matter halos for Peebles's and my review of galaxies in CDM. I also worked on the conditions of use for the Astrometry.net data files (indices). These indices require conditions because they are built from data with conditions. Also we want to make sure that our code users play nice.
One of the things we like to brag about with astrometry.net is that it can be used to recover data that have been lost or badly archived (that is, have been tagged with incorrect meta-data). We recovered SDSS run 2301, which is an engineering run of no particular import except that the astrometry for the run was never correctly determined (probably because the telescope was confused about where it was pointed at the start of the run). Today, some SDSS collaborators contacted me about SDSS run 6895, which goes through globular cluster M71. I spent the few research minutes I got today trying to massage the data into astrometry.net.
Note added next morning: When I got the first field of the SDSS data into JPEG form and got it into the system, it solved fast, and our system even said that the field contains M71.
[Figure: red stars are our object detections; green stars are USNO-B sources. M71 is in the upper-right corner.]
My loyal readers may have noticed that I have spent most of the last month writing rather than doing. It is annoying, but it ain't science if it ain't published! I finished a first draft of the results section of the USNO-B Catalog cleaning paper, making a complete first draft of the whole paper for the first time. Barron will visit in September for us to finish and submit it!
I encouraged Zolotov to go with a test-driven development style in her work on the shape of the Milky Way and simulated galaxy stellar halos. When it is important to be not wrong, test-driven is very good. I have learned this the hard way!
At group meeting, Eyal Kazin (NYU) told us about measurements of the redshift-space, projected two-dimensional, and comoving-space correlation functions of LRGs in the current SDSS sample. He has a beautiful detection of the baryon acoustic feature even in the projected function! This means that it has great signal-to-noise in the overall data. After Kazin, I spoke about my statistical work on proper motions.
In the afternoon, much work was done (largely by email) on the language and writing of the USNO-B cleaning paper among Lang, Barron, and me. We made good progress on language and presentation issues.
I finished the Masjedi, Hogg, and Blanton paper on the accretion rate for luminous red galaxies and submitted it to the Astrophysical Journal. It has a nice punchline: The accretion rate, even when integrating over a factor of 50 in merger mass ratios, is less than a few percent per Gyr, substantially lower than that predicted in CDM simulations, and much lower than many morphological measures of the merger or accretion rate. It will appear tomorrow as arXiv:0708.3240.
With funding proposals and the beginning of term looming, I didn't get much time for research yesterday and today, but what I got was spent working through the literature on galaxy merging for the LRG accretion rate paper, which is so close to being done!
The research I did on my week-long vacation off the grid was to draft a section for Peebles's and my synthesis of galaxy evolution in the cold-dark-matter paradigm. The section I drafted was on the predictions and observations of the dark-matter halos (or concentrations, virialized or otherwise) around galaxies and groups of galaxies. I think that weak and strong lensing, along with sensitive measurements of intragroup light, might put serious constraints on the cold dark matter model at small scales, because, depending on some issues of how baryons populate the halos, each of these provides some handle on the distribution of dark matter—radial distribution, large-scale anisotropy, and small-scale substructure—which is (apparently) a robust prediction of the model.
I spent time yesterday (traveling for family reasons) and today working on incorporating non-trivial point-spread-function estimates into the proper-motion measurement code. I fear that the gaussian approximation to the point-spread function is affecting my results, at least slightly.
I also spent some time reading and commenting on a very nice draft paper by Sheldon on the mass-to-light ratios (as a function of scale) of virialized structures in the Universe, based on a comparison of cluster–galaxy cross-correlations and cluster–mass cross-correlations (based on statistical weak lensing).
My loyal readers will both be sad to learn that I go out of all internet contact for a week starting Monday, so there may not be much here.
I spent the morning re-reading papers about merger rates, to improve the discussion in our merger-rate estimate paper. I was reminded of how stark the difference is between studies that estimate the merger rate morphologically (ie, by identifying probable merging galaxies by tidal features or asymmetries in their appearances) and studies that estimate it by close pairs. The morphological studies all get much higher rates, which says to me that either morphological asymmetries are raised by very small accretion events, or else that they last many dynamical times. The close-pair rates have the virtue that—if done correctly—they can provide a strict upper limit on the merger rate, so the fact that the close-pair studies come in lower means that the morphological studies are biased high.
In among many non-research activities today, I managed to have a long and useful argument with Blanton and visiting student Abate (UCL) about constrained realizations, the local velocity field, and velocity correlation functions. I also spent some time measuring the proper motions of objects in the SDSS Southern Stripe with known proper motions.
Stop the presses! No, seriously, I showed today that I can measure the faint, very red sources in the SDSS Southern Stripe—sources so red that they must either be brown dwarfs or high-redshift quasars—with enough accuracy that I can separate them with some reliability on the basis of proper motion alone. The separation is not perfect, because there are some faint brown dwarfs with very small proper motions. However, the most widely separated brown dwarfs are also the most interesting, because they have the highest probabilities of being very low luminosity and therefore old and cold.
In other news, Lang, Mierle, and Roweis worked all night and got the first astrometry.net paper submitted.
I spent most of my time yesterday (forgot to post!) and today continuing to add functionality to and debug the proper motion code. I decided that debugging requires that I have code to visually compare the stack of a set of epochs at zero proper motion and at some candidate non-zero proper motion, to understand, functionally, why I get some (clearly wrong) proper motions for some sources in real data, while not in the artificial data. I expect neighboring objects and noise issues (eg, underestimated noise) are affecting me. On the former (neighbors), the right thing to do is to simultaneously fit for all sources in the field!
The time not spent debugging was spent helping Lang, Mierle, Blanton, and Roweis finish the astrometry.net submission to Science. They are intent on submitting today. It is a lofty goal, and perhaps an achievable one.
I performed a massive re-organization of the faint-source proper-motion code, to make it less redundant and easier to test and analyze. I improved the use of the SDSS data, though I have yet to build in good estimates of the individual-field point-spread functions. Once that is done, I can try to reproduce some known proper motions in the SDSS data and write it up.
Yesterday the high-z quasar team on SDSS released to the collaboration six new quasars in the SDSS Southern Stripe. I completed my proper-motion code and used it to measure their proper motions. I find that the quasars have no proper motions at my precision! It remains to be seen if I have the precision to separate them from brown dwarfs at the same magnitude.
I pushed my faint-source proper motion measurement experiments to the signal-to-noise limits. Indeed, the system breaks down not when the signal-to-noise per epoch is low, but when the combined signal-to-noise in the data from all epochs is low. It works all the way down to a signal-to-noise around 4 in the combined data set, even when the per-image signal-to-noise is much less than 1.
I tried to figure out how accurately one can possibly measure a proper motion, given a heterogeneous data set. I have an approximate expression, and it seems to agree with my code (ie, my code seems to hew close to the best possible errors), but I don't yet have a proof. One interesting consequence of this exercise is that the proper motion accuracy does not depend explicitly on the number of images you take, only their time span (measured appropriately) and the total signal-to-noise with which you have detected the source. For this reason, if you are concerned to beat down systematics with redundancy, go to town: Take many short exposures at many epochs rather than a few long ones. Of course this is all provided that you are gutsy enough to measure the proper motions below the individual-epoch detection limits!
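To make the claim concrete, here is the back-of-the-envelope version (my sketch, not a proof, assuming a straight-line model $x(t) = x_0 + \mu\,t$ and independent gaussian centroid errors $\sigma_i$ at each epoch). Weighted least squares gives

```latex
\mathrm{Var}(\mu) \;=\; \left[\sum_i w_i\,(t_i - \bar{t})^2\right]^{-1},
\qquad
w_i \equiv \sigma_i^{-2},
\qquad
\bar{t} \equiv \frac{\sum_i w_i\,t_i}{\sum_i w_i}\;.
```

Since the centroid error scales like $\sigma_i \approx \mathrm{FWHM}/(S/N)_i$, this becomes

```latex
\sigma_\mu \;\approx\; \frac{\mathrm{FWHM}}{(S/N)_{\mathrm{tot}}\,\Delta t_{\mathrm{rms}}},
\qquad
(S/N)_{\mathrm{tot}}^2 = \sum_i (S/N)_i^2,
\qquad
\Delta t_{\mathrm{rms}}^2 = \frac{\sum_i w_i\,(t_i - \bar{t})^2}{\sum_i w_i}\;,
```

which depends only on the total signal-to-noise and the (appropriately weighted) time baseline, with no explicit dependence on the number of images.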
Eric Bell (MPIA) and I spent some time talking about making precise measures of the merger rate. This was inspired by my finishing up the Masjedi et al paper on the growth of LRGs from accretion of satellites. The main uncertainty, of course, is the merging time-scale. We use a time-scale based on dynamical friction, but that is only good to factor-of-two at best. Will we ever have a reliable statistical measure of the growth of galaxies by merging? Perhaps if we have models that build realistic galaxies in a cosmological context.
Bell and I also discussed the project I was discussing a while back with Darren Croton (Berkeley): If galaxies at the massive, red end of the red sequence grow by merging, then the color-magnitude relation should flatten out there. That is, you can't maintain a linear relationship between color and luminosity if galaxies are merging prodigiously. This project requires good photometry at the bright end. Right now, Blanton and I are among the few on SDSS who can provide it.
On Friday I measured the proper motion of a high-redshift quasar, one that is not visible in any individual SDSS epoch, but visible in the stacked image, by fitting simultaneously all the pixels in all the individual images, and then marginalizing over the unknowns (flux and position). I got zero, to within the errors. Woo-hoo!
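For intuition, here is a one-dimensional toy version of that fit (a sketch of mine, not the actual code; all the numbers are invented). A matched filter is evaluated on a grid of trial positions and motions; maximizing it profiles out the linear flux amplitude at each trial, which stands in for the marginalization.

```python
import math
import random

random.seed(42)

# Toy demo: a source only marginally present at any single epoch is
# recovered by fitting position + proper motion jointly across all
# epochs (1-D pixel rows for clarity).
n_epochs, n_pix = 10, 64
psf_sigma = 2.0                  # PSF width in pixels
true_x0, true_mu = 20.0, 1.5     # start position, motion in pixels/epoch
amp = 1.5                        # peak flux; per-pixel noise sigma is 1

def psf(dx):
    return math.exp(-0.5 * (dx / psf_sigma) ** 2)

# Simulate the noisy epochs.
data = []
for t in range(n_epochs):
    center = true_x0 + true_mu * t
    data.append([amp * psf(x - center) + random.gauss(0.0, 1.0)
                 for x in range(n_pix)])

# Matched filter over trial (x0, mu): the best flux amplitude is linear,
# so maximizing the filtered sum over the grid finds the joint solution.
norm = math.sqrt(n_epochs * sum(psf(x) ** 2 for x in range(-10, 11)))
best = (-1e9, None, None)
for x0 in [i * 0.5 for i in range(20, 60)]:       # trial start positions
    for mu in [j * 0.25 for j in range(0, 13)]:   # trial proper motions
        s = sum(data[t][x] * psf(x - (x0 + mu * t))
                for t in range(n_epochs) for x in range(n_pix))
        snr = s / norm
        if snr > best[0]:
            best = (snr, x0, mu)

best_snr, best_x0, best_mu = best
print(best_snr, best_x0, best_mu)
```

The recovered motion lands near the true value even though no single epoch shows a convincing detection; a real implementation would work in two dimensions and marginalize properly rather than profiling.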
Continued work on my RANSAC and EM-based astrometry tweak code. It is proceeding by test-driven development techniques, so what I did today was define the calling sequence and unit tests for the part that does the fitting of positions in the image to positions in the catalog.
In between work supporting our alpha testers and writing other people's papers, I thought a bit more about detecting faint source proper motions. There are really four regimes: slow, in which stars move less than the PSF over the duration of the survey (total epoch range); medium, in which stars move substantially more than the PSF, but less than the size of individual images; fast, in which the sources move a distance comparable to the size of any individual epoch image between epochs, and many times this over the duration of the survey; and streaked, in which the sources make trails even in individual images, and probably have orbits that are non-trivially curved over the duration of the survey. Each of these regimes requires different statistical equipment and techniques for optimal detection and measurement. I realized I need to re-read papers by Lepine.
In the role of Rix and Jester's graduate student, which I have taken on for the summer, Jester assigned me the project of obtaining the proper motion of UKIDSS z=5.86 QSO ULAS J020332.38+001229.2, which is not visible in any individual SDSS epoch, but is visible as a z-band-only object in the combined image from many epochs, as it is in the SDSS Southern Stripe.
Today was the first day of Galaxy Growth in a Dark Universe here in Heidelberg. I could only go to part of the day, but I did see convincing evidence from Kriek and Kodama that there is some kind of red sequence in place at redshifts 2<z<3. Of course the old galaxies at these epochs in fact look like K+A or post-starburst galaxies!
Our web service astrometry.net works by rapidly generating large numbers of hypotheses and then attempting to verify them using a statistical test; as soon as one generated hypothesis verifies, its job is done. I spent the day working on a similar methodology for detecting—and measuring the proper motions of—extremely faint sources in multi-epoch astronomical imaging.
Wu and I had a long chat today with Frank van den Bosch (MPIA), who is working on galaxy environments with his high-test group catalog constructed from the SDSS data. His results are all in general agreement with ours, including especially that all environment effects come in at the group scale or smaller, and that a lot of what is happening is driven by the differences between central and satellite galaxies in the groups (where by these terms we usually mean highest stellar mass and lower stellar mass, observationally, since group centers are notoriously hard to determine).
What is new about van den Bosch's work is that he finds that almost all of the information in the environment relations—once you make the central–satellite split—comes from mass segregation among the satellites! The higher mass satellites are more concentrated towards the centers of the groups, and the more massive groups have more massive satellites. Since star-formation rate and everything else is strongly related to galaxy mass, these mass effects drive most of the action. This is remarkable, but not in conflict with our results, and the starting point for some new observational experiments.
Of course, like with our work, the van den Bosch results are easy to misinterpret, because they involve controlling for multiple variables and looking at dependences on remaining variables, which is a subtle business. Also, vdB showed that some of the environment effect conflicts in the literature come from the use of different simple scalar statistics to describe the complex variation of distribution functions.
This is all further motivation for writing a review of galaxy environments that assembles and synthesizes this!
I had a vision this morning, and followed that by figuring out a strategy for detecting, in multi-epoch imaging, sources that are below the detection limit at any individual epoch, and which are moving too fast to be detected in the stacked image, where all epochs have been combined. Jester describes this as "detecting and measuring objects that are not detectable."
This problem is harder than the problem of just measuring the proper motion of a source you already know to be there (mentioned in earlier posts), as when the proper motion is small (so the source is visible in the stacked image). Indeed, even the PanSTARRS people (to my knowledge) have no strategy for this hard (but important, given local brown dwarfs) problem. Most methods you can imagine involve testing enormous numbers of hypotheses, which clearly costs you signal-to-noise. My method—a variant on RANSAC—limits these combinatoric costs. Now to see if it works!
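To illustrate the flavor of such a generate-and-verify loop (this is my own RANSAC-style sketch, not the actual method; the catalogs and numbers are invented): hypothesize a linear track from a random pair of low-threshold detections at different epochs, then verify it against the remaining epochs.

```python
import random

random.seed(1)

# Low-threshold per-epoch detections as (epoch, position) pairs. A faint
# moving source (x = 5 + 2*t) is detected at only a few epochs, buried
# among spurious noise detections.
true_x0, true_mu = 5.0, 2.0
detections = [(t, true_x0 + true_mu * t + random.gauss(0, 0.2))
              for t in (0, 2, 5, 6, 9)]
detections += [(random.randrange(10), random.uniform(0, 60))
               for _ in range(15)]          # spurious noise detections

def count_inliers(x0, mu, tol=0.8):
    hits = set()
    for t, x in detections:
        if abs(x - (x0 + mu * t)) < tol:
            hits.add(t)                     # at most one hit per epoch
    return len(hits)

# RANSAC loop: generate a track hypothesis from a random pair of
# detections at different epochs, then verify against everything else.
best = (0, 0.0, 0.0)
for _ in range(2000):
    (t1, x1), (t2, x2) = random.sample(detections, 2)
    if t1 == t2:
        continue
    mu = (x2 - x1) / (t2 - t1)
    x0 = x1 - mu * t1
    n = count_inliers(x0, mu)
    if n > best[0]:
        best = (n, x0, mu)

n_in, fit_x0, fit_mu = best
print(n_in, fit_x0, fit_mu)
```

Only pairs are ever enumerated, so the combinatoric cost stays quadratic in the detection count rather than exponential in the number of epochs; the verification step is what supplies the statistical power.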
Discussed with Jester the issues of measuring and detecting faint sources in multi-epoch imaging, especially sources too faint to be detected well at any individual epoch. It seems that these problems are not currently solved, although they are relatively simple statistics problems.
Discussed with Romeel Davé (Arizona) some issues of stellar baryons and dark matter in massive collapsed objects. If massive objects (like clusters and groups of galaxies) have formed through many mergers, and if some of the major mergers follow star formation (as we think they do), then in general it is hard to avoid having the sum total stellar mass distribution (including galaxies and intragroup light) trace the dark matter (in azimuthal average). That's a serious prediction! Testing it may require sharpening some blunt tools. But Sheldon, Blanton, and I have started down this path for other reasons.
I spent time working out how we might challenge galaxy formation in CDM by considering normal galaxy halos. The simulations make several non-trivial predictions, although all are hard to test: They predict non-spherical halos, with substantial triaxiality. They predict small-scale structure including concentrated substructure in real space and also velocity space. They predict that central galaxies at the centers of halos have different formation histories and properties than satellites. Each of these predictions has different kinds of uncertainty associated with it, and relates to other predictions, so directly getting at fundamental CDM properties is challenging. But I think it may be possible with a combination of Milky Way and Milky Way halo kinematics and structure, weak lensing on distant galaxies, clustering studies and studies of groups, and models of galaxy merger and accretion events.
I also continued work on the Masjedi paper.
For Masjedi's paper I worked on the text on K-correcting the imaging subsamples. My loyal reader will recall that Masjedi's project involves cross-correlating the imaging galaxies fainter than LRGs with spectroscopic LRGs. In general, we don't know the redshifts of the imaging galaxies, but we do want a real-space correlation function. His K-correction method is very clever: We K-correct each imaging galaxy in each imaging–spectroscopic pair of galaxies using the spectroscopic galaxy's redshift. Then we K-correct differently when that same imaging galaxy appears in a pair with a different spectroscopic galaxy. It all comes out in the wash, because the excess pairs at small scales that survive the correlation analysis are all true pairs (statistically) and so we subtract away all the wrong K-corrections and are left with only the correct ones (statistically).
Did that make sense?
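In code form, the bookkeeping looks something like this toy sketch (the catalogs, the one-dimensional geometry, and the K-correction formula are all stand-ins of mine, not the real analysis):

```python
import math
import random

random.seed(3)

# Toy catalogs (positions in Mpc on a line, for simplicity).
spec_galaxies = [{"x": random.uniform(0, 100), "z": random.uniform(0.2, 0.5)}
                 for _ in range(20)]
imaging_galaxies = [{"x": random.uniform(0, 100),
                     "mag": random.uniform(18, 21)}
                    for _ in range(200)]

def k_correct(mag, z):
    # toy stand-in for a real K-correction (an assumption, not the
    # paper's actual form, which depends on the galaxy SED)
    return mag - 2.5 * math.log10(1.0 + z)

# Each imaging galaxy is K-corrected at the redshift of the spectroscopic
# galaxy it is paired with, so the SAME imaging galaxy carries different
# corrected magnitudes in different pairs. Chance pairs (wrong redshift,
# hence wrong correction) subtract off statistically in the excess-pair
# counts, leaving only the correctly corrected true pairs.
r_max = 5.0
pairs = [(i_spec, i_img, k_correct(img["mag"], spec["z"]))
         for i_spec, spec in enumerate(spec_galaxies)
         for i_img, img in enumerate(imaging_galaxies)
         if abs(spec["x"] - img["x"]) < r_max]
print(len(pairs))
```

The point of the sketch is the pair loop: the correction is a property of the pair, not of the imaging galaxy.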
There are now several diffuse stellar overdensities (and at least one reported underdensity) in the stellar halo. By consensus, these provide evidence for the build-up of the halo by accretion of smaller objects. I am skeptical, but Rix and I discussed a possible project to analyze the kinematics of one of them—the Hercules-Aquila cloud, which does seem to have a small velocity dispersion.
I continued working on text in the Masjedi paper.
Wrote figure captions in the Barron & Stumm paper, and annotated equations in the Masjedi paper.
Jester and I discussed the problem of the determination of proper motions of sources in multi-epoch imaging, even when the sources are too faint to be detected at any single epoch. It is a beautiful statistics problem, and highly relevant to the SDSS Southern Stripe (where there are many epochs), PanSTARRS, and LSST.
I spent today writing papers for others. Blanton and I have a strict rule for the group that no-one ever writes text for a paper on which he or she will not be first author. But I am violating this rule all summer, as I finish up the paper with Barron and Stumm on cleaning USNO-B and the paper by Masjedi et al on the accretion rate onto LRG primaries. Barron, Stumm, and Masjedi are all now working at Novartis, Microsoft, and Goldman Sachs, so writing astronomy papers is below their respective pay grades. I spent much of today working on the Masjedi paper, particularly the data section.
There was a lively discussion over lunch with Bell about by-eye morphologies, with me arguing that by-eye results are (a) not repeatable and not objective, and (b) in this day and age, rarely useful for physics. Astronomy maybe, but not physics!
I worked on the USNO-B Catalog cleaning paper this morning, mainly shortening it.
Halo subdwarfs were discovered via a reduced proper motion diagram this afternoon. I (embarrassed) got a lesson on the subject from Wikipedia.
Rix and I had the terrible realization that most of the 9 epochs of SDSS data that overlap Milky Way globular cluster Palomar 5 are in fact Apache Wheel data taken at low angular resolution for calibration purposes. So we will have to go to historical or HST data to perform the proper motion measurement. We made some inquiries about both.
We had some conversations with Sebastian Jester (MPIA) about finding quasars and halo giants by their low proper motions. He obtained the proper motion catalog developed in the SDSS multiply-imaged southern equatorial stripe and is looking at that now.
I worked on the galaxy evolution synthesis this morning. I think I decided that predictions of the galaxy luminosity function are too dependent on ad-hoc models of star formation and feedback for the luminosity function itself to provide a good test of the CDM model. In principle, the observed luminosity function and the theoretical mass function from CDM (for DM halos) produces a prediction for the Tully–Fisher relation (see recent work by Blanton and Geha) or the fundamental plane, but it seems that even this is loose in the sense that a departure or violation would be viewed by very few as evidence against the fundamental assumptions of CDM. It would be viewed as evidence that dark matter and stars can have different velocity dispersions (which would by no means present any paradox). Maybe with the addition of weak lensing (galaxy–galaxy) constraints it would become a falsifiable prediction of the fundamental CDM model.
On the trains connecting Paris to Heidelberg, I wrote a very specialized (but therefore fast and efficient) expectation maximization code for use in astrometry tweak. I wrote the unit tests first and then wrote and debugged my code until it satisfied the unit tests.
I spent part of the day working through Peebles's latest proto-draft of our synthesis of the evidence for and against the cold-dark-matter paradigm. Peebles nicely laid out (in a philosophical introduction) the difference between adaptation and falsification. These are two approaches when one has a dominant model and a lot of phenomenology that is only tied to that model through indirect links, as with evolution in biology and cosmology in physics. Adapters try to learn about the indirect links by adjusting them until there is agreement between the fundamental theory and the data; adapters believe the fundamental model. We, of course, are falsifiers: we believe that you make most progress in this situation by trying to construct empirical tests that are capable of falsifying the fundamental hypotheses.
My only research today (Fete de la Musique here in Europe) was some thought and a conversation with Roweis about scaling up our astrometry.net interface to an industrial scale, possibly with some of the existing image management web services already in place at Google. We already have a way to use Flickr, which is being worked out by Stumm. The challenge is to make this all work with a user who has FITS or RAW files or equivalents, but we think there are some straightforward solutions.
NYU undergraduate Bob Ma accomplished something nice today. He measured the distances of a set of HII regions from the center of M51 (literally with a ruler, working on the HST image), and also their angles relative to a reference angle (ie, their polar coordinates). What did he find? That the angle depends linearly on radius. This is exactly what you expect if the spiral arm pattern in M51 was begun as an m=2 distortion that wrapped up under the action of a flat rotation curve. What raised the m=2 distortion? The interaction with the companion, of course. The slope of the line on the radius–angle plot gives you the time since the raising of the distortion. This is all part of my evil plan to demonstrate that spiral patterns are transient, and to identify their causes, individually, in all sufficiently well-observed galaxies.
I spent the early morning hours learning how the GALEX and 2MASS missions do their astrometry. They are both impressive, both possibly better than any standard astrometric catalog, though both have possible remaining systematics. It is hard to tell from the documentation of either mission what systematics have been explored to date. Certainly the people working on these projects did not see themselves as producing astrometric standards, even though that is exactly what they have produced, beautifully.
I spoke about our efforts to clean up the USNO-B Catalog using computer vision techniques at Rix's group meeting at MPIA yesterday, and worked a bit on the paper. Today was spent in transit to Paris. Gotta love traveling by train at a rate of 1 km every 10 seconds!
Sergey Koposov (with great wizardry) obtained quickly a sample of halo M-giants and their SDSS-USNO proper motions. A very large fraction of halo M giants live in the observed Milky Way substructure, presumably because the substructure is higher in metallicity than the general halo population (and higher metallicity giants are redder/cooler than typical halo K giants). The histograms of proper motions for these halo giants are not promising for statistical proper motion work, although Rix and I have no fear.
I spent time discussing with Rix and then reading Sergey Koposov's (MPIA) draft paper on the luminosity distribution of Milky Way satellites. It is a nice paper because he re-discovers all of the satellites automatically, and can therefore do an objective completeness analysis as a function of luminosity, size, and distance. His results still depend on the (unknown) radial distribution; I argued that he can determine that, with significant uncertainty, from his data directly, or in conjunction with the luminosity distribution.
Spent what little time I had on the weekend writing figure captions and incorporating figures into our USNO-B paper.
I arrived in Heidelberg today and spent some quality time after dinner catching up with Rix on proper motions measured statistically.
I got a lesson today from Mierle on "test-driven development" in which you write unit tests for each code function before you write the code function, you run all your tests each time you commit new code changes, and you thereby never write wrong code. Of course, you need your unit tests to be powerful!
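In miniature, with an invented example function (not from any of our code), the style looks like this: the unit test is written first and pins down the behavior, and the function is then written (and rewritten) until it passes.

```python
import math

# Test first: these assertions define what the function must do.
def test_angular_separation():
    # same point -> zero separation
    assert abs(angular_separation(10.0, 20.0, 10.0, 20.0)) < 1e-12
    # one degree apart in declination -> one degree of separation
    assert abs(angular_separation(0.0, 0.0, 0.0, 1.0) - 1.0) < 1e-9

# Function second, written to satisfy the test above.
def angular_separation(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula)."""
    r1, d1, r2, d2 = (math.radians(v) for v in (ra1, dec1, ra2, dec2))
    a = (math.sin((d2 - d1) / 2) ** 2
         + math.cos(d1) * math.cos(d2) * math.sin((r2 - r1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(a)))

test_angular_separation()
print("tests pass")
```

The discipline is in running the whole test suite on every commit; the tests only protect you if they are powerful enough to catch the bugs you actually write.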
[I was out Tuesday and Wednesday on travel.]
Spent a great day with Ben Weiner (Arizona) visiting. We discussed recent published and unpublished work on outflows from star-forming galaxies at redshifts 0.5 to 1.5. It looks like these are confirming the general view that feedback, and maybe AGN feedback, is important to shaping galaxies.
Weiner, Moustakas, Blanton, and I spent some time discussing measurements of the [O II] 3727 luminosity function, which I measured years ago. This could be done much better now, with far more galaxies and much better redshift coverage. Whenever this comes up, we ask "what's the point?", but Weiner and Moustakas both gave very convincing arguments that it is worthwhile. Now would someone please measure it?
I have been having the realization lately that in almost all cases in which we astronomers use iterated sigma clipping, we should be fitting a mixture-of-gaussians model with something like the "expectation maximization" algorithm. It has all the advantages of sigma clipping, plus a clear interpretation in terms of Bayesian statistics. I spent the morning working on that.
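As a toy example of the replacement I have in mind (a two-component gaussian mixture fit by EM; the data, initialization, and iteration count are all invented for illustration):

```python
import math
import random

random.seed(0)

# Data: a gaussian "core" plus broad contamination -- the situation in
# which astronomers usually reach for iterated sigma clipping.
data = ([random.gauss(5.0, 1.0) for _ in range(200)]
        + [random.gauss(0.0, 15.0) for _ in range(40)])

def gauss_pdf(x, mu, sigma):
    return (math.exp(-0.5 * ((x - mu) / sigma) ** 2)
            / (sigma * math.sqrt(2 * math.pi)))

# Initialize: narrow component at the median, broad component at the mean.
mu = [sorted(data)[len(data) // 2], sum(data) / len(data)]
sigma = [2.0, 20.0]
frac = [0.5, 0.5]
for _ in range(100):
    # E step: responsibility of the narrow component for each point
    r0 = []
    for x in data:
        p0 = frac[0] * gauss_pdf(x, mu[0], sigma[0])
        p1 = frac[1] * gauss_pdf(x, mu[1], sigma[1])
        r0.append(p0 / (p0 + p1))
    # M step: responsibility-weighted means, widths, mixing fractions
    for k, r in ((0, r0), (1, [1.0 - v for v in r0])):
        w = sum(r)
        mu[k] = sum(ri * x for ri, x in zip(r, data)) / w
        sigma[k] = math.sqrt(sum(ri * (x - mu[k]) ** 2
                                 for ri, x in zip(r, data)) / w)
        frac[k] = w / len(data)

print(mu[0], sigma[0], frac[0])
```

Unlike sigma clipping, nothing is ever discarded: every point gets a soft membership probability, and the core mean and width come out with a clean likelihood interpretation.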
I spent my morning nerd-out working on Peebles's and my synthesis of galaxy evolution observations. I was reminded of my philosophical position on all this, with which I have no doubt bored my legions of blog fans before:
The primary goal of observational astrophysics—as distinct from, say, pure astronomy—ought to be to rule out physical models. Note that all important physical experiments have the property that they ruled out—or could have ruled out—one of the fundamental, dominant theories of their day. In physical cosmology, we have a fundamental, dominant theory for the growth of structure (CDM). Our role as observers is not to bolster this model, or find ad-hoc parameters we can add to the model that make it consistent with the data. Our role is to perform experiments that have the power, even in the face of uncertainties (about how galaxies form, for example), to rule out or substantially modify the fundamental assumptions of this theory. If an experiment does not have the power to rule out the theory, then it can hardly be said to provide substantial support when the results end up in agreement! Thus my primary experimental design criterion is that my observations be capable of falsifying CDM, even after marginalization over unknowns.
I worked on astrometric tweak today—the code that does precise astrometric calibration following rough calibration by our blind system at astrometry.net. Mierle convinced us at the astrometry.net meeting that we should be using the ransac algorithm for tweak, and I have become even more convinced since then. Ransac is much better than the astronomer's usual tool of iterative sigma-clipping, especially when (as Mierle advocates) the inlier/outlier decision is made by fitting a model to the residuals that consists of a gaussian core of inliers and a flat distribution of outliers, superimposed.
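A minimal version of that inlier/outlier model (my sketch, with an assumed starting core width, not the tweak code itself) looks like this:

```python
import math
import random

def fit_core_plus_flat(residuals, field_radius, n_iter=30):
    """EM fit of matching residuals to a zero-mean gaussian core of inliers
    plus a flat distribution of outliers over [-field_radius, field_radius]."""
    var = (0.1 * field_radius) ** 2     # assumed starting core width
    frac_in = 0.5
    flat_density = 1.0 / (2.0 * field_radius)
    for _ in range(n_iter):
        # E step: per-point inlier probabilities
        resp = []
        for r in residuals:
            p_in = (frac_in * math.exp(-0.5 * r * r / var)
                    / math.sqrt(2.0 * math.pi * var))
            p_out = (1.0 - frac_in) * flat_density
            resp.append(p_in / (p_in + p_out))
        # M step: update the core variance and the inlier fraction
        n_in = sum(resp)
        var = max(sum(w * r * r for w, r in zip(resp, residuals)) / n_in, 1e-12)
        frac_in = n_in / len(residuals)
    return var, frac_in, resp

# usage: 80 genuine matches with 0.01 scatter, 20 spurious matches
random.seed(2)
residuals = ([random.gauss(0.0, 0.01) for _ in range(80)]
             + [random.uniform(-1.0, 1.0) for _ in range(20)])
var, frac_in, resp = fit_core_plus_flat(residuals, field_radius=1.0)
inliers = [r for r, w in zip(residuals, resp) if w > 0.5]
```

The inlier/outlier decision then thresholds the responsibilities rather than clipping at some arbitrary number of sigma.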
Jon Barron (UofT, Novartis) is in town and we worked on refining and writing up our work on cleaning the USNO-B1.0 catalog of spurious sources caused by diffraction spikes and reflection halos in the plate imaging. We have a very conservative system with very few free parameters.
I also fixed some bugs I had introduced into the very fast and reliable pipeline that takes an image (any image) and returns the x,y coordinates of all the stars.
Jim Peebles (Princeton) spent the day at NYU to work on a synthesis of galaxy evolution observations and predictions. We are trying to write a document that draws out tensions with the dominant (CDM) paradigm, and advocates new observations and new theoretical work, in the service of understanding galaxy evolution and the dynamics of the dark sector in the context of the standard model of cosmology (which is extremely well tested on large scales—scales much larger than galaxies). We ended up spending a lot of time talking about massive central black holes, whose abundances, masses, and locations in galaxies all are very constraining on the hierarchical picture of structure formation. If galaxies grow by merging, and the pre-merger galaxies contain black holes, then in general there ought to be non-central and ejected massive black holes. None have been observed, to my knowledge.
I spoke at group meeting about using statistical methods to measure proper motions for substructures at amplitudes below the measurement uncertainty for any individual star, and the possible application to the Sagittarius stream. In the afternoon, Zolotov and I worked on Zolotov's AAS poster, which goes up next week.
I spent the morning facilitating Moustakas's use of astrometry.net in a science project. He is stacking U-band images into a mosaic. We only needed to adjust two things. (1) We needed to insert a cosmic-ray rejection step into his pipeline. We hacked something together but also thought about how we might insert that into astrometry.net as a standard option. (2) We had to split his multi-extension FITS files into individual images. This is clearly not optimal, but the multi-extension files require a lot of technical infrastructure (tying together the tangent points for multiple arrays on the same focal plane, and fitting or solving multiple images simultaneously) that so far has been off the critical path.
In the early morning, I not only fixed all the bugs I created yesterday, but I also worked out some fundamental issues with doing source detection in images with very limited dynamic range, like jpegs off the web and scans of photographic plates. These issues are obvious, but non-trivial: Stars are subject to strong non-linearities, and at the bright end it is the size of the source that is related to flux, not the peak value in the image (which has saturated). Of course these issues are known. What is not known is what to do, in general, in data of which you have little or no knowledge, and when extended sources can be as prevalent as stars. These are the conditions under which astrometry.net operates! Blanton helped me come up with some ideas.
In the late morning and afternoon, Kallivayalil and I agreed to focus on getting a proper motion for Pal 5. This is a project that is hard, but not impossible (we hope), and finite. The first step is to gather all the HST data, survey data, plate data, and random ground-based data that we can find, and to turn those data into precise coordinate lists.
I spent the morning breaking software that has worked for months, by attempting to track down bugs in our simpleXY object-finding and measuring software for astronomical images. I failed to find the bugs, and left the software worse than when I started!
In the afternoon, Nitya Kallivayalil (Harvard) came into town and we discussed the issues involved in measuring statistical proper motions. She is the world's expert, because she has measured the proper motions of the LMC and SMC by comparing stars to quasars in HST images separated by two years. Now the question is: Can we do much more with heterogeneous data (which are worse than HST data) separated by much longer baselines? The issues are severe, especially when we think of the holy grails of the Sagittarius stream and other Milky Way substructure.
Zolotov and I worked on her visualizations of the SDSS data on the Sagittarius stream. As far as we can tell, the current models don't agree with the observations, and the observations in different kinds of stars (which have different systematic issues) don't obviously agree with one another. Zolotov is trying to put together an AAS poster that illustrates all this in a useful way.
Very early in the morning I did surgery on astrometry.net's awesome simpleXY code, which takes as input any astronomical image and returns a reasonable list of sources with x,y positions in the image and approximate fluxes. The code is incredibly simple (hence the name), incredibly fast, has very few free parameters, and essentially always works. It is Blanton's handiwork. It also produces very stable measures of object positions, thanks to astrometry-fu we inherited from the Sloan Digital Sky Survey. What I did was to make the code even more simple. Now I ought to write it up!
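The actual simpleXY code is Blanton's; purely for illustration, the common skeleton of any detector of this kind (threshold, then keep 3x3 local maxima) looks something like this:

```python
def find_peaks(image, threshold):
    """Return (x, y, flux) for pixels above threshold that are maxima of
    their 3x3 neighborhood. Image border pixels are skipped for brevity."""
    ny, nx = len(image), len(image[0])
    peaks = []
    for y in range(1, ny - 1):
        for x in range(1, nx - 1):
            v = image[y][x]
            if v < threshold:
                continue
            neighborhood = [image[y + dy][x + dx]
                            for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            if v >= max(neighborhood):
                peaks.append((x, y, v))
    return peaks

# usage: one bright pixel in a 5x5 image
image = [[0.0] * 5 for _ in range(5)]
image[2][2] = 10.0
image[2][3] = 3.0
peaks = find_peaks(image, threshold=5.0)  # → [(2, 2, 10.0)]
```

The real code adds the things that make positions stable: background estimation, smoothing matched to the PSF, and sub-pixel centroiding.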
I spent the morning reading and digesting a nice paper by Storkey et al. It contains a lot of content that is very analogous to what we are doing with Stumm and Barron on cleaning USNO-B1.0 of diffraction and reflection artifacts. The main difference is that our method is highly specialized to USNO-B1.0, which makes it less generally interesting, but much more sensitive to subtle, small, faint, or lightly populated artificial features.
While Barron soldiered on with enhancements to the automatic procedures for diffraction-spike and reflection-halo spurious-source detection, Stumm and I indulged in some mission creep: We looked at methods for identifying dense, linear features in the sky distribution of USNO-B1.0 sources that are not caused by diffraction spikes. Some of these features are caused by edge-on galaxies; others by incredibly unlikely coincidences of artifacts in multiple bands (inclusion in USNO-B1.0 required detection in multiple bands); and others by incredibly weird plates we don't understand. We didn't have much success with Hough Transform techniques, but we theorized a ransac-like approach that is promising.
Barron, Stumm, and I made plots of the colors and magnitudes of stars identified as spurious in the USNO-B1.0 astrometric catalog on the basis that they form morphological crosses and circles centered on the bright stars. Interestingly, they have magnitudes and colors that are not unreasonable, so they could not be identified as spurious on a photometric basis. I presume this is why they made it into the catalog in the first place!
On Friday, Zaritsky (on sabbatical here at NYU) gave a nice talk on methods for determining the sizes of disks. It turns out, perhaps not surprisingly, that they go out a long way. In discussion at the end, it emerged that the "thin disk" test of CDM merger histories might be made stronger by looking at disks at large radius, since larger radii will be more susceptible to gravitational perturbations, and will also extend further into the substructure-filled dark-matter halo.
Today I did some research on issues with the USNO-B1.0 astrometric catalog. UofT undergrads Stumm and Barron are visiting next week to get a paper drafted on their use of computer vision techniques to improve the catalog. We are finding that a combination of computer vision and astronomical techniques can very reliably clean the catalog of a large subset of the non-real sources (which amount to a couple of percent of the 10⁹ entries in the catalog).
I have been traveling for the last two days, hence the lack of posts. I re-learned how to do SDSS CAS queries, to obtain a complete list of the science-grade images used to construct SDSS DR5. We are running astrometry.net on all of them to measure statistics (and build knowledge of the sky, of course). I also worked on various strategies for tweak. I have a very robust one (though it is still vapor-ware) that iteratively fits not the WCS mapping, but the residuals in the current best-fit mapping, and iterates until the residuals have no power at the scales at which we are fitting.
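In one dimension, with a linear correction standing in for the WCS polynomial terms, the residual-fitting idea looks like this (a toy sketch of the strategy, not the vapor-ware itself):

```python
def fit_line(x, r):
    """Ordinary least squares for r ≈ a + b*x."""
    n = len(x)
    sx, sr = sum(x), sum(r)
    sxx = sum(xi * xi for xi in x)
    sxr = sum(xi * ri for xi, ri in zip(x, r))
    b = (n * sxr - sx * sr) / (n * sxx - sx * sx)
    a = (sr - b * sx) / n
    return a, b

def tweak(x, y, model, n_iter=5):
    """Fit a low-order correction to the current residuals (not a refit of
    the full mapping), fold it into the running correction, and iterate
    until the residuals carry no power at the scale being fit."""
    a_tot, b_tot = 0.0, 0.0
    for _ in range(n_iter):
        resid = [yi - (model(xi) + a_tot + b_tot * xi)
                 for xi, yi in zip(x, y)]
        a, b = fit_line(x, resid)
        a_tot += a
        b_tot += b
    return a_tot, b_tot

# usage: a rough model (slope 2.9) of data generated with y = 2 + 3x;
# the correction converges to offset 2.0 and slope 0.1
x = [float(i) for i in range(10)]
y = [2.0 + 3.0 * xi for xi in x]
a_tot, b_tot = tweak(x, y, model=lambda t: 2.9 * t)
```

The robustness comes from fitting the (small) residuals rather than re-deriving the (large) mapping from scratch at every iteration.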
One of our functional tests of astrometry.net is a run through an immense amount of SDSS data, looking to see what percentage we solve (>99 percent) and what percentage solve as false positives (we have never had one, so <3×10⁻⁵). I diagnosed a failure today and found this (small cutout shown below). It is an engineering-grade SDSS image, where the PSF is double-peaked, and different stars have different peaks dominant (some left, some right). Now that's what I call bad data. Can we be blamed for failing to solve that?
In research time stolen from exam preparation and proctoring, I worked on trivial matters related to astrometry.net, including web pages, administration of the alpha test, meeting minutes, filing tickets, etc. We got the following in an email from alpha tester Chris Kochanek (OSU), who has been using the system to great effect:
When I describe this [astrometry.net], all the observers want the code to install now. Congrats to all involved! That improved my day.
In other news, UofT undergrads Jon Barron and Christopher Stumm have been looking at cleaning up the USNO-B1.0 catalog. Here is a plot of a healpix pixel of the sky, in which they have plotted the USNO-B1.0 catalog entries that have colors that aren't consistent with being (correctly measured) stars.
Woah did we work hard! I didn't have any time to even read email, let alone post. In what follows, recall (or learn) that astrometry.net solves the astrometry for an image blind by the following steps: It uses quads of stars to generate large numbers of hypotheses about pointing, rotation, and scale. It attempts to verify those hypotheses using a likelihood ratio (correct vs random lucky hit). Hopefully one verifies. It then tweaks the verified astrometric WCS to something precise.
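Schematically, the verify step compares two hypotheses given the matched-star counts; the production likelihood is more careful than this, but a binomial toy version (with an assumed match probability under the correct-alignment hypothesis) conveys the idea:

```python
import math

def verify_log_odds(n_test, n_match, p_random, p_match_if_correct=0.9):
    """Toy likelihood-ratio verification (a sketch, not the production code).

    Each of n_test reference stars either has a counterpart within the
    match tolerance or not; compare the binomial likelihood of n_match
    successes under "correct alignment" (assumed success rate
    p_match_if_correct) versus "random lucky alignment" (success rate
    p_random, the chance a random position lands near some star).
    """
    n_miss = n_test - n_match
    log_correct = (n_match * math.log(p_match_if_correct)
                   + n_miss * math.log(1.0 - p_match_if_correct))
    log_random = (n_match * math.log(p_random)
                  + n_miss * math.log(1.0 - p_random))
    return log_correct - log_random

# a pointing that matches 80 of 100 stars verifies; 2 of 100 does not
high = verify_log_odds(100, 80, p_random=0.01)
low = verify_log_odds(100, 2, p_random=0.01)
```

A hypothesis verifies when the log odds exceed some conservative threshold; setting that threshold high is what keeps the false-positive rate so low.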
On day 2, we worked very hard on tweak; we agreed on a scalar and an algorithm/methodology, and Mierle started hacking. Tweak is probably the biggest gap between where astrometry.net is and where it needs to be with its alpha testers. We discussed the functionality for professional and amateur users, and came up with some ideas for a more modular system to give users more flexibility. We also made some breakthroughs understanding some false positives (almost none of which are our fault, it turns out) and looked at the awesome assemblage of astrometric footprints produced by David Warde-Farley. Full-team dinner was delicious.
On day 3, Lang convinced us that he has a much better verify than the current one, and we worked out the math and implementation. Lang implemented. We also talked with Christopher Stumm and Jon Barron about their automatic detection of diffraction spike "false stars" in the USNO-B1.0 catalog and how to evaluate their success using astronomical techniques. A paper about cleaning USNO-B1.0 has begun. Mierle continued to hack. I went home for much-needed sleep!
Spent time on the weekend refining the argument about merger rates and small-scale clustering. It relates to my long-term goal of advocating considerations of continuity in the interpretation of cosmological data. I also prepared for Masjedi's thesis defense, which is on this subject.
I wrote the following abstract, for a paper I will probably never write! Note how confidently I can tell you the conclusion before doing the research.
Abstract: Any galaxy–galaxy merger event must be preceded by a period in which the two pre-merger galaxies formed a close pair, therefore any estimate of the merger rate puts a constraint on some galaxy–galaxy cross- or auto-correlation function at small scales. Because the timescales for merging are not known exactly, and because galaxy correlations arise from processes not always related to merging, these constraints are not precise; nonetheless, when made with conservative assumptions, they are still very informative. Here we review published galaxy–galaxy merger rates and use them to put conservative constraints on various galaxy–galaxy cross-correlation functions, with different assumptions about merging timescales. We find that many merger rates make easily falsified predictions for galaxy correlations. Present-day measurements of galaxy clustering on small scales are only consistent with the lowest published merger rates.
I spent a small amount of time today working on the possibility of evolving astrometry.net into a precision astrometry system, a system that can measure proper motions and parallaxes, and that can build standards catalogs; right now it just does rough work.
I spent the morning at Columbia, discussing GALEX, galaxy environments, and related matters with Wu and Schiminovich, and describing the current state of astrometry.net at the relaxed "pizza lunch". Also Schiminovich showed me his lab, which is cool.
Mondays are my worst days, but I did get time to closely read the conclusion to Masjedi's most important thesis chapter and the concluding chapter to the thesis as a whole. He has shown that the growth of luminous red galaxies by merging is very slow at the present day (less than 2 percent, on average, of growth per Gyr), and dominated by dry mergers with galaxies more luminous than L-star. This is not inconsistent with either theoretical or observational bounds on the merger rate, but it is definitely lower than most other estimates. Masjedi's measurements are unique in that they are very firm upper limits; he has basically made the maximal reasonable assumptions about the mergers; any more conservative or realistic assumptions would lower his inferred rate.
The PRIMUS project has had the bad judgement to put some of my code in the critical path for extracting 1-d spectra from the 2-d images. These images are amazing; they contain huge numbers of tiny spectra! Our philosophy is that all parts of spectral extraction must be automated, including finding the spectra on the image. My code to do this hit its limits this month; I spent today analyzing and fixing it.
I spent a lot of Thursday (sorry, forgot to post) and today interacting with our new alpha testers for astrometry.net. The alpha users are certainly finding a lot of issues with the service, which is great. Internally, we are debating what new features are highest priority; different alpha users want very different things.
I also spent a lot of time working through draft chapters of Masjedi's thesis, which is a pleasure.
Jim Condon (NRAO) gave the seminar today; he showed us that he can (in principle, and probably in practice) measure the Hubble Constant to few-percent accuracy using maser galaxies. If he can, he will nail things like flatness and the nature of dark energy. This is nicely complementary to the baryon acoustic feature stuff we do here at NYU.
With Zolotov I pitched some projects to undergrads yesterday, including a project to look at galactic substructure with NYU undergraduate Kathy Zhang and a project on the formation of disk-galaxy spiral structure with NYU undergraduate Bob Ma.
Research time today was spent reading closely Masjedi's nearly complete dissertation (he defends his PhD in 12 days), and working with Zolotov on the research techniques that will lead to the start of a dissertation. Zolotov is having fun learning about SQL. Masjedi is having (perhaps less?) fun summarizing four years of hard work.
Pizagno got me back to thinking about our mosaic images of SDSS galaxies, because he is making stellar mass maps. He requires correct inverse variance maps, and we were creating pretty good, but not perfect, maps. (We record our uncertainties in images in units of inverse variance, not traditional root-variance; this is because the inverse variance is what you use for weighted means, and also because then badly measured pixels get values in the uncertainty map of zero, not infinity.) I made them much closer to correct today.
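The payoff of the inverse-variance convention is that the weighted mean becomes a clean sum with no special cases for bad pixels; a minimal sketch:

```python
def weighted_mean(values, ivars):
    """Inverse-variance-weighted mean of a set of measurements.

    Bad measurements carry ivar = 0 and drop out of the sums naturally;
    no infinities ever appear.
    """
    wsum = sum(ivars)
    if wsum == 0.0:
        raise ValueError("no good measurements")
    mean = sum(v * w for v, w in zip(values, ivars)) / wsum
    # the inverse variance of the weighted mean is just the sum of the ivars
    return mean, wsum

# usage: the third pixel is bad (ivar = 0) and is ignored
mean, ivar = weighted_mean([10.0, 12.0, 999.0], [1.0, 1.0, 0.0])  # → (11.0, 2.0)
```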
At the afternoon astro seminar, Ann Zabludoff (Arizona but on sabbatical at NYU) told us about galaxy groups at intermediate redshift, in the context of lensing (where there are groups contributing significantly to many of the strong lens systems, something I used to work on back in graduate school), and in the context of the baryon census, where she finds a large fraction of the baryons in stars in the intergroup but intragalactic space (about thirty percent of the stars, and between a few and tens of percent of the total baryonic content, depending on mass).
We decided that we can begin to go alpha with astrometry.net next week!
Wu and I discussed her GALEX-SDSS-Spitzer multi-wavelength study of very low-luminosity galaxies. They have a huge range in properties, from objects with specific star-formation rates that are among the highest I have ever seen, to post-starburst objects, and they have anomalous 8-micron properties. They are very low redshift, so they are generally resolved in GALEX and have non-trivial morphologies.
Roweis and I spent an inordinate amount of time on the phone (thank you unlimited calling plans) working through the astrometric tweak code. Despite our initial skepticism, we decided that almost everything works. Recompile and test, let's go alpha!
I also spent quite a bit of time going through one of Masjedi's thesis chapters, in which he shows at very high confidence that luminous red galaxies are not distributed in their dark matter halos according to an NFW-like or Moore-like radial profile, but something with a much steeper inner profile. This result is at very high significance, and either means that baryons have a lot of influence on the dark matter distribution, or else that galaxies separate dynamically from the dark matter for dissipational reasons. The latter is more likely, but, quantitatively, both effects are interesting.
Friday was all talk, with Tim McKay (Michigan), Sheldon, and me discussing SDSS measurements of intercluster light in the morning. At group meeting, McKay spoke about his very rapid follow-up of GRBs with ROTSE-III, and Blanton spoke about the Tully-Fisher relation for S0 ("ess zero") galaxies (about which he learned in New Zealand). At lunch, Schiminovich and I looked at some PRIMUS spectra of UV-luminous galaxies; I am not sure he was impressed! Before the seminar, Pizagno and Tatjana Vavilkin (Stony Brook) discussed with me more physical approaches to creating mass images out of the SDSS data, using stellar population synthesis images. At our weekly astro seminar, McKay worked through cosmology with optically (I think he meant visually) detected galaxy clusters, from soup to nuts.
This weekend, I blew large amounts of testbed data into the pre-alpha astrometry.net engine. All of the images I sent were examples of images that previously failed to solve. This time, almost all of them solved, so I think we are close. Most of these testbed images came from our future alpha users; I learned that some of them gave us some pretty mean images to solve!
Zolotov and I discussed how we should go about locating the Sagittarius stream in the four-dimensional space of sky position, magnitude, and velocity. This involves making a very large number of plots and browsing them in some sensible way, or else four-dimensional visualization!
Wu and I discussed bad redshifts in SDSS. One of her low luminosity galaxies may have a wrong redshift, which is incredibly rare in the SDSS data. That would be fun! We also discussed backups, since Wu nearly lost all of her work in a minor computer failure here.
The astrometry.net team started discussions of how to archive, serve, back up, and protect our data. None of our current systems scale, even in the short term.
Spent some time arguing with Sheldon about photometric redshifts. He has photometric redshifts that are somewhat degenerate (redshift vs galaxy spectral type), possibly biased, and with large errors, for a very large number of SDSS galaxies. How to determine the true redshift distribution given these data? The right answer is not just to histogram the determined redshifts, but to do something more conservative, given that each individual redshift can be significantly wrong, but there is nonetheless a lot of aggregate information. It is a very nice problem, ill-posed as all good problems are.
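One "more conservative" approach (my sketch of the idea, assuming a known gaussian photo-z error of width sigma, which is a big simplification of Sheldon's problem) is to forward-model the observed redshifts and estimate the true distribution by EM, Lucy-Richardson style, rather than histogramming the point estimates:

```python
import math
import random

def deconvolve_nz(z_phot, z_grid, sigma, n_iter=100):
    """Estimate the true redshift histogram from noisy photometric
    redshifts by EM (Lucy-Richardson-style) iterations, assuming each
    observed z is the true z plus gaussian noise of known width sigma."""
    nbin = len(z_grid)
    f = [1.0 / nbin] * nbin   # flat starting guess for the true distribution
    # (unnormalized) likelihood of each observed z given each true-z bin
    like = [[math.exp(-0.5 * ((zp - zt) / sigma) ** 2) for zt in z_grid]
            for zp in z_phot]
    for _ in range(n_iter):
        new_f = [0.0] * nbin
        for row in like:
            norm = sum(fj * lj for fj, lj in zip(f, row))
            for j in range(nbin):
                new_f[j] += f[j] * row[j] / norm
        f = [nf / len(z_phot) for nf in new_f]
    return f

# usage: every galaxy is truly at z = 0.5, observed with 0.1 scatter;
# the estimated distribution re-concentrates at the 0.5 bin
random.seed(3)
z_phot = [random.gauss(0.5, 0.1) for _ in range(400)]
z_grid = [0.1 * k for k in range(1, 10)]
f = deconvolve_nz(z_phot, z_grid, sigma=0.1)
```

This uses the aggregate information in all the redshifts at once, which is exactly what the naive histogram of point estimates throws away; the real problem also needs the type degeneracy and the biases folded into the likelihood.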
Tuesday was a big day for planning the After Sloan 2 surveys, which are projects to discover extra-solar planets, to take imaging and spectra of stars in the Galaxy, and to measure the baryon acoustic feature at redshifts of about 0.3, 0.6, and 2.0, all with the 2.5-m telescope at Apache Point Observatory (the telescope used for the SDSS). My principal involvement is likely to be in the acoustic feature project, though I am also very interested in the stellar spectroscopy. Because the planning for the survey is largely at the quasi-proposal-writing level, this might not qualify as bloggable research (see rules at right), but the scientific issues that arise in the description of this project are deep and interesting.
One of the things I am most excited about is the enormous range of possible ancillary science that can be done with the baryon acoustic feature data. These data will include spectra for many hundreds of thousands of luminous red galaxies, densely sampling an enormous volume to redshift of 0.7 or so, and spectra for hundreds of thousands of quasars at redshifts around 2.5. The challenge (though it isn't that challenging!) is to design highly informative measures of galaxy evolution that can be executed with these data even when the project is fully optimized for baryon acoustic feature science (as it will and should be).
Roweis and I spent some time working on the tweak phase of astrometry, which follows rough determination of pointing, rotation, and scale, and which refines a close solution to a precise solution.
We also both worked on the text of paper zero, which we are drafting with a computer-science focus and in the flashy style of Science or Nature. We hope to finish and submit while we are in alpha. We still aren't in alpha, but we are very, very close (and our alpha testers are getting, well, testy).
It was clusters all the time on Friday, as Moustakas's collaborators on the ESO Distant Cluster Survey have been in town all week working on Spitzer data. They discussed some of their results, past and future, at group meeting. In the afternoon, Markevich (CfA) told us about physical measurements of astrophysical plasma that are possible in clusters, particularly in clusters like the famous bullet cluster where there are fast-moving substructures creating shocks.
Zolotov and I visited Kathryn Johnston (Columbia) this afternoon to talk Sagittarius stream. Johnston convinced us that what we have (in the public SDSS spectroscopy) may be interesting and constraining on models. We also discussed with Sanjib Sharma (Columbia) some of his very novel methods for finding substructure in N-dimensional data sets, using information-based metrics.