Today I had the great pleasure to serve on the PhD defense committee for Steven Mohammed (Columbia). Mohammed worked on the post-GALEX-mission GALEX data in the Milky Way plane. We took the data specially as part of the Caltech/Columbia takeover of the mission at the end of the NASA mission lifetime. Mohammed and my former student Dun Wang built a pipeline to reduce the data and produce catalogs. And Mohammed has done work on photometric metallicities and the abundances in the Milky Way disk using the data. It is a beautiful data set and a beautiful set of results.
2021-09-28
2018-06-27
Dr Dun Wang
It was with the greatest pleasure that I participated in the PhD defense today of Dun Wang (NYU), who has been my student these last five years. He has done a remarkable body of work: He has a very good model for the NASA Kepler data, using pixels to predict other pixels. He has a completely novel method for image differencing, where he doesn't need a reference image (and instead uses a time series of images to build a predictive model). And he has a data-driven model for the pointing (as a function of time) and sensitivity map for the last days of the NASA GALEX mission, where the camera was scanned rapidly back and forth across the Galactic Plane.
I have many things to say about this work, but here are just a few: Wang's work encouraged me to think about extremely big models! I think his model of the Kepler data has more free parameters than any model of anything, ever (literally close to a trillion). Gotta love convexity! He used his image differencing to discover completely new microlensing events in the K2 Campaign 9 data. He has the first ever ultraviolet maps of the Milky Way disk plane at this depth and resolution. It is a very impressive body of work.
Congratulations Dr Wang. And thank you!
2018-04-26
#GaiaDR2 zero-day workshop, day 2
It was a little harder to get up this morning after yesterday's 13-hour day, but I still made it in early for the second day of the Gaia DR2 zero-day workshop. We had about 70 yesterday and still maybe 50 today; the room was at capacity and we had people all over the 3rd floor of the (very generous) Flatiron Institute.
Dustin Lang (Toronto) coined the name "BetterTogether" for a project that Megan Bedell (Flatiron) and I started to find all the comoving pairs that can be confidently identified in the data. This kind of work isn't new: Semyeong Oh (Princeton) had big impact with her comoving-pair work in Gaia DR1. But what's new is the idea of using the co-moving-ness to betterize the parallaxes of both stars, and in particular the less luminous (and hence noisier) star. So pairs that are WD-MS or MS-RGB are most valuable! This project builds conceptually on work I did with Morgan Fouesneau (MPIA) and Hans-Walter Rix (MPIA) in the TGAS–PanSTARRS overlap.
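To make the "betterize" idea concrete, here is a minimal sketch, assuming (as a simplification) that the two stars in a comoving pair sit at exactly the same distance and have independent Gaussian parallax errors. Under those assumptions the best shared parallax is the inverse-variance-weighted mean. The function name and the numbers are hypothetical, and this is not the BetterTogether code.

```python
def combine_parallaxes(varpi_a, sigma_a, varpi_b, sigma_b):
    """Inverse-variance-weighted mean parallax and its uncertainty.

    Assumes both stars share one true parallax and have independent
    Gaussian errors; a real pair has a small, nonzero distance difference.
    """
    w_a = 1.0 / sigma_a ** 2
    w_b = 1.0 / sigma_b ** 2
    varpi = (w_a * varpi_a + w_b * varpi_b) / (w_a + w_b)
    sigma = (w_a + w_b) ** -0.5
    return varpi, sigma


# Made-up example: a bright star with a precise parallax and a faint
# companion with a noisy one (both in mas).
varpi, sigma = combine_parallaxes(10.00, 0.05, 9.40, 0.50)
```

The combined uncertainty is never worse than that of the better-measured star, so the faint star inherits nearly all of the bright star's precision.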
The issue is that you can't trivially look at every pair in a 1.3-billion-star catalog. There are 1e18 pairs! And even deciding not to look at a pair takes time. So Lang started to build us a very nice data structure for doing the two-point work while Bedell looked at the restricted sample that matches the Kepler targets.
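A rough illustration of why a data structure helps: the sketch below counts the pairs and then finds close pairs by hashing points into grid cells, so that each point is only compared with points in neighbouring cells. It assumes a flat two-dimensional plane and a simple distance cut; the real problem lives on the sphere and also involves proper motions and parallaxes. All function names are hypothetical, and this is not the structure Lang built.

```python
import itertools
import math
import random
from collections import defaultdict


def n_pairs(n):
    """Number of unordered pairs among n objects."""
    return n * (n - 1) // 2


def close_pairs(points, radius):
    """Return sorted index pairs (i, j), i < j, separated by <= radius.

    Points are hashed into square cells of side `radius`, so each point is
    compared only with points in its own cell and the 8 neighbouring cells.
    """
    cells = defaultdict(list)
    for idx, (x, y) in enumerate(points):
        cells[(math.floor(x / radius), math.floor(y / radius))].append(idx)
    pairs = set()
    for (cx, cy), members in cells.items():
        for dx, dy in itertools.product((-1, 0, 1), repeat=2):
            neighbours = cells.get((cx + dx, cy + dy), ())
            for i in members:
                for j in neighbours:
                    if i < j and math.dist(points[i], points[j]) <= radius:
                        pairs.add((i, j))
    return sorted(pairs)


def brute_force_pairs(points, radius):
    """Reference answer that really does look at every pair."""
    return sorted(
        (i, j)
        for i, j in itertools.combinations(range(len(points)), 2)
        if math.dist(points[i], points[j]) <= radius
    )


random.seed(1)
points = [(random.uniform(0, 100), random.uniform(0, 100)) for _ in range(500)]
grid_result = close_pairs(points, radius=2.0)
total_pairs = n_pairs(1_300_000_000)  # a bit under 1e18
```

The grid version does work proportional to the number of points times the local density, rather than to the square of the number of points.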
In the mid-day check-in, some really impressive things were shown. Lang and David Schiminovich (Columbia) showed a set of UV color–magnitude diagrams that literally caused the audience to gasp. Stars look so different in the UV! And there are stars where there “shouldn't be”, because of binarity or chromospheric activity or something. So much structure! Kohei Hattori (Michigan) showed a hyper-velocity star that looks like it was launched from the disk towards the Galactic Center. Tim Morton (Princeton) showed that the Gaia stellar radii are good enough to bring out the radius gap in Kepler exoplanets. Ana Bonaca (Harvard) and Adrian Price-Whelan (Princeton) showed that the gaps in the GD-1 stellar stream are really there, and also had hints of kinematic offsets that might indicate dark-matter substructure!
On a more astrophysical note, Kareem El-Badry (Berkeley) spent yesterday and today becoming an expert on white-dwarf physics and was able to give a reasonable, quantitative explanation of the (exquisite, surprising) morphology of the white-dwarf part of the color-magnitude diagram, including a generative model! He finds that even if the IMF and star-formation history are monotonic, the white-dwarf mass distribution is not, because of wiggly initial-mass–final-mass relations. That gets much (but not all, I'm interested to note) of the multi-modal structure in the diagram.
2017-10-11
WDs in Gaia, M33, M stars, and more
In our weekly parallel-working Gaia DR2 prep meeting, two very good ideas came up. The first is to look for substructure in the white-dwarf sequence and see if it can be interpreted in terms of binarity. This is interesting for two reasons. The first is that unresolved WD binaries should be the progenitors of Type Ia supernovae. The second is that they might be formed by a different evolutionary channel than the single WDs and therefore be odd in interesting ways. The second idea was to focus on giant stars in the halo, and look for substructure in 3+2-dimensional space. The idea is: If we can get giant distances accurately enough (and maybe we can, with a model like this), we ought to see the substructure in the Gaia data alone; that is: No radial velocities necessary. Of course we will have radial velocities (and chemistry) for a lot of the stuff.
In the stars group meeting, many interesting things happened: Anna Ho (Caltech) spoke about time-domain projects just starting at Caltech. They sure do have overwhelming force. But there are interesting calibration issues. She has accidentally found many (very bright!) flaring M stars, which is interesting. Ekta Patel (Arizona) talked about how M33 gets its outer morphology. Her claim is that it is not caused by its interaction with M31. If she's right, she makes predictions about dark-matter substructure around M33! Emily Stanford (Columbia) showed us measurements of stellar densities from exoplanet transits that are comparable to asteroseismology in precision. Not as good, but close! And different.
In the afternoon I worked on GALEX imaging with Dun Wang (NYU), Steven Mohammed (Columbia), and David Schiminovich (Columbia). We discussed how to release our images and sensitivity maps such that they can be responsibly used by the community. And Andrina Nicola (ETH) spoke about combining many cosmological surveys responsibly into coherent cosmological constraints. The problem is non-trivial when the surveys overlap volumetrically.
2017-06-08
music and stars
First thing, I met with Schiminovich (Columbia), Mohammed (Columbia), and Dun Wang (NYU) to discuss our GALEX imaging projects. We decided that it is time for us to produce titles, abstracts, outlines, and lists of figures for our next two papers. We also realized that we need to produce pretty-picture maps of the plane survey data, and compare them to Planck and GLIMPSE and other related projects.
I had a great lunch meeting with Brian McFee (NYU) to catch up on his research (on music!) and ask his advice on various time-domain projects I have in mind. He has new systems to recognize chords in music, and he claims higher performance than previous work. We discussed time-series methods, including auto-encoders and HMMs. As my loyal reader knows, I much prefer methods that deal with the data probabilistically; that is, not methods that always require complete data without missing information, and so on. McFee had various thoughts on how we might adapt methods that expect complete data for tasks that are given incomplete data, like tasks that involve Kepler light curves.
2017-04-19
after SDSS-IV; red-clump stars
At Stars group meeting, Juna Kollmeier (OCIW) spoke about the plans for the successor project to SDSS-IV. It will be an all-sky spectroscopic survey, with 15 million spectroscopic visits, on 5-ish million targets. The cadence and plan are made possible by advances in robot fiber positioning, and The Cannon, which permits inferences about stars that scale well with decreasing signal-to-noise ratio. The survey will use the 2.5-m SDSS telescope in the North, and the 2.5-m du Pont in the South. Science goals include galactic archaeology, stellar systems (binaries, triples, and so on), evolved stars, origins of the elements, TESS scientific support and follow-up, and time-domain events. The audience had many questions about operations and goals, including the maturity of the science plan. The short story is that partners who buy in to the survey now will have a lot of influence over the targeting and scientific program.
Keith Hawkins (Columbia) showed his red-clump-star models built on TGAS and 2MASS and WISE and GALEX data. He finds an intrinsic scatter of about 0.17 magnitude (RMS) in many bands, and, when the scatter is larger, there are color trends that could be calibrated out. He also, incidentally, infers a dust reddening for every star. One nice result is that he finds a huge dependence of the GALEX photometry on metallicity, which has lots of possible scientific applications. The crowd discussed the extent to which theoretical ideas support the standard-ness of RC stars.
2017-03-21
half-pixel issues; building our own Gibbs sampler
First thing in the morning I met with Steven Mohammed (Columbia) and Dun Wang (NYU) to discuss GALEX calibration and imaging projects. Wang has a very clever astrometric calibration of the satellite, built by cross-correlating photons with the positions of known stars. This astrometric calibration depends on properties of the photons for complicated reasons that relate to the detector technology on board the spacecraft. Mohammed finds, in an end-to-end test of Wang's images, that there might be half-pixel issues in our calibration. We came up with methods for tracking that down.
Late in the day, I met with Ruth Angus (Columbia) to discuss the engineering in her project to combine all age information (and self-calibrate all methods). We discussed how to make a baby test where we can do the sampling with technology we are good at, before we write a brand-new Gibbs sampler from scratch. Why, you might ask, would any normal person write a Gibbs sampler from scratch when there are so many good packages out there? Because you always learn a lot by doing it! If our home-built Gibbs doesn't work well, we will adopt a package.
2016-12-02
stars, disruption, photometric redshifts
Today began with a meeting about GALEX, where Steven Mohammed (Columbia) showed that there is great metallicity information in the overlap of GALEX and Gaia, and we discovered that something must be seriously wrong with the astrometry in our re-calibration of the data.
Andy Casey (Cambridge) organized a phone meeting in which a bunch of us discussed possible scientific exploitation of the data in the ESO HARPS archive, which contains thousands of stars, each of which has tens to thousands of epochs, each with a signal-to-noise of hundred-ish and a resolution of 100,000. Incredibly huge amounts of data. Huge. Casey asked each of us to describe low-hanging fruit, and take on short-term tasks. One thing we might do is re-factor the archive into something more directly useful to investigators.
Sjoert Van Velzen (JHU) gave the astrophysics seminar about tidal disruption events. He has a great set of results, starting from search and discovery, going through theory and models, and continuing on to multi-wavelength follow-up. The most intriguing result is that the TDEs are amazingly over-represented in post-starburst (E+A) galaxies (which I used to work on). It is hard to imagine any origin for TDEs that would so strongly concentrate them into these environments. It makes me wonder whether the things they are seeing aren't TDEs at all?
After the seminar, Boris Leistedt (NYU) posted to the arXiv our new paper on photometric redshifts. The idea is that we use what we know about Doppler Shift and bandpasses and calibration of photometry, but let the galaxy SEDs themselves be inferred, latent variables. This combines the best properties of machine-learning methods (that is, flexibility, non-parametrics) with the best properties of template-based methods (that is, regularization to physically realizable models, a generative model, and interpretability). It seems to work very well!
2016-10-14
GALEX, Gaia, and MCMC
Early in the morning, I met with Dun Wang (NYU), Steven Mohammed (Columbia), and David Schiminovich (Columbia) to discuss our GALEX imaging of the Galactic Plane. We gave Wang and Mohammed tasks of writing titles and abstracts for their papers on the subject. Also, Mohammed showed us his exploration of the GALEX–TGAS match, which looks like it is filled with good stuff.
In the afternoon, Dan Foreman-Mackey (UW) and I met to discuss exoplanet results, where Foreman-Mackey has new results on multiplicity based on ABC inference. We followed this with parallel work on our Data Analysis Recipes tutorial on MCMC inference. We re-organized some of the content, reduced scope very slightly, and tried to close issues.
I also worked on posterior samplings for star distances, given parallaxes. I am using Simple Monte Carlo, with two techniques, one that works well for high signal-to-noise parallaxes, and one that works well for low signal-to-noise. The issues are very subtle; a uniform-density prior has a lot of very bad properties in parallax space. I got something working and posted a gif on the twitters.
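For readers who want to see what such a sampling might look like, here is a minimal sketch with two weighted-sampling routes. The post does not say which two techniques were used, so both routes below are assumptions: one draws distances from the prior and weights by the likelihood (sensible at low signal-to-noise), the other draws parallaxes around the measurement and weights by the prior times the Jacobian (sensible at high signal-to-noise). It assumes a Gaussian parallax likelihood, a uniform-density prior truncated at a hypothetical `d_max`, parallax in mas, and distance in kpc.

```python
import numpy as np


def sample_from_prior(parallax, sigma, d_max, n, rng):
    """Low-S/N route: draw distances from the prior, weight by the likelihood."""
    u = 1.0 - rng.uniform(size=n)          # in (0, 1], avoids d = 0
    d = d_max * u ** (1.0 / 3.0)           # p(d) proportional to d^2 on (0, d_max]
    w = np.exp(-0.5 * ((parallax - 1.0 / d) / sigma) ** 2)
    return d, w


def sample_from_likelihood(parallax, sigma, d_max, n, rng):
    """High-S/N route: draw parallaxes around the measurement and invert them.

    The weight is prior / proposal = d^2 / (1 / d^2) = d^4, because the
    change of variables from parallax to distance contributes 1 / d^2.
    """
    varpi = rng.normal(parallax, sigma, size=n)
    ok = varpi > 1.0 / d_max               # keeps 0 < d < d_max
    d = 1.0 / varpi[ok]
    w = d ** 4
    return d, w


rng = np.random.default_rng(42)
parallax, sigma, d_max = 2.0, 0.2, 2.0     # made-up numbers, S/N = 10
d_a, w_a = sample_from_prior(parallax, sigma, d_max, 200_000, rng)
d_b, w_b = sample_from_likelihood(parallax, sigma, d_max, 200_000, rng)
mean_a = np.average(d_a, weights=w_a)
mean_b = np.average(d_b, weights=w_b)
```

The `d ** 4` weight is one way to see the trouble with a uniform-density prior in parallax space: as the sampled parallax approaches zero the weight blows up, so at low signal-to-noise the answer is controlled by the truncation `d_max` rather than by the data.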
2016-02-12
following up GW150914; new disrupting cluster
The day started with Dun Wang, Steven Mohammed, David Schiminovich, and I meeting to discuss GALEX projects. Of course instead we brain-stormed projects we could do around the LIGO discovery of gravitational radiation. So many ideas! Rates, counterparts, and re-analysis of the raw data emerged as early leaders in the brain-storming session.
Adrian Price-Whelan crashed the party and showed me evidence he has of a disrupting globular cluster. Not many are known! So then we dropped everything and spent the day getting membership probabilities for stars in the field. The astrophysical innovation is that Price-Whelan found this candidate on theoretical grounds: What Milky Way clusters are most likely to be disrupting? The methodological innovation is that we figured out a way to do membership likelihoods without an isochrone model: We are completely data-driven! We fired a huge job into the NSF supercomputer Stampede. Holy crap, that computer is huge.
2016-02-05
modeling spacecraft imaging
Dun Wang, Steven Mohammed (Columbia), David Schiminovich and I met to discuss GALEX. Wang has absolutely beautiful images of the GALEX flat, and he can possibly separate the flat appropriate for stars from the flat appropriate for background photons. We realized we might need some robust estimation to deal with transient reflections from bright stars.
Matthew Penny (OSU) showed up and distracted us onto K2 matters; Penny is involved in our efforts to deliver photometry from the crowded fields of K2 Campaign 9 in the Milky Way bulge. Wang showed his CPM-based prediction of the crowded field in K2C0 test data, where he has an absolutely beautiful time-domain image model. This is like difference imaging, except that the prediction is made not from a master image, but from the time-domain behavior of other (spatially separated) pixels. The variable stars and asteroids stick out dramatically. So I think we are close to having a plan.
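The idea of predicting a pixel from other pixels can be sketched in a few lines. This is a toy under stated assumptions, not Wang's CPM code: the data are synthetic, the model is a plain ridge regression fit to all times at once (with no held-out time window), and every name and number is hypothetical.

```python
import numpy as np


def predict_pixel(target, predictors, ridge=1e-3):
    """Ridge-regress one pixel's light curve on other pixels' light curves.

    target has shape (n_times,); predictors has shape (n_times, n_pixels).
    Returns the predicted light curve for the target pixel.
    """
    A = np.column_stack([predictors, np.ones(len(target))])
    lhs = A.T @ A + ridge * np.eye(A.shape[1])
    coeffs = np.linalg.solve(lhs, A.T @ target)
    return A @ coeffs


rng = np.random.default_rng(0)
t = np.arange(500)
# Two shared "spacecraft" systematics that every pixel sees with its own gain.
basis = np.column_stack([5.0 * np.sin(2 * np.pi * t / 100), 0.02 * (t - 250)])
predictors = basis @ rng.normal(size=(2, 8)) + 0.1 * rng.normal(size=(500, 8))
target = basis @ np.array([1.5, -0.7]) + 0.1 * rng.normal(size=500)
target[250:260] += 20.0                    # an astrophysical event in this pixel only

residual = target - predict_pixel(target, predictors)
```

Because the other pixels share the systematics but not the event, the residual is close to flat except at the event, which is why variable stars and asteroids stick out.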
2016-01-22
candidate Wang
Today Dun Wang (NYU) passed his oral candidacy exam. His PhD thesis is pretty ambitious: A self-calibration of the Kepler Spacecraft main-mission data, an ultraviolet map of the Milky Way from GALEX data (which he will also self-calibrate), and photometry in crowded fields for the K2 mission!
2015-11-06
writing and talking
In another day with limited motility, one small victory was drafting an abstract for upcoming work on The Cannon with Andy Casey (Cambridge). I like to draft an abstract, introduction, and method section before I start a project, to check the scope and (effectively) set the milestones. We plan to obtain benefit from both great model freedom and parsimony by using methods from compressed sensing.
I also had a few conversations; I spoke with Dun Wang and Schiminovich about Wang's work on inferring the GALEX flat-field. We made a plan for next steps, which include inferring the stellar flat and the sky flat separately (which is unusual for a spacecraft calibration). I spoke with Magland about colors, layout, and layering in his human interface to neuroscience data. This interface has some sophisticated inference under the hood, but needs also to have excellent look and feel, because he wants customers.
2015-10-16
GPs in the Fourier domain
The day started with Dun Wang, Steven Mohammed (Columbia), David Schiminovich and I discussing the short-term plans for our work with GALEX. My top priority is to get the flat-field right, because if we can do that, I think we will be able to do everything else (pointing model, focal-plane distortion model, etc.).
Over lunch, Greengard and Jeremy Magland (SCDA) “reminded me” how the FFT works in the case of irregularly sampled data. This was in the context of using Gaussian-process kernels built not in real space but in Fourier space. And then Greengard and Magland more-or-less simultaneously suggested that maybe we can turn all our Gaussian process problems into convolution problems! The basic idea is that the matrix product of a kernel matrix and a vector looks very close to a convolution, and the product with the inverse matrix looks like a deconvolution. And we know how to do this fast in Fourier space. This could be huge for asteroseismology. The log-determinant may also be simple when we think about it all in Fourier space. We will reconvene this conversation late next week.
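The kernel-times-vector claim is easiest to check in the special case of regularly sampled data and a stationary kernel, where the kernel matrix is symmetric Toeplitz and the product is exactly a convolution. The sketch below assumes that special case (irregular sampling would need a non-uniform FFT, which is not shown) and a squared-exponential kernel chosen only for illustration; the function name is hypothetical.

```python
import numpy as np


def toeplitz_matvec(first_column, v):
    """Multiply a symmetric Toeplitz matrix by a vector in O(n log n).

    The n x n matrix is embedded in a (2n - 1) x (2n - 1) circulant matrix,
    whose action is a circular convolution and hence a product in Fourier space.
    """
    n = len(v)
    m = 2 * n - 1
    # First column of the circulant: c_0 .. c_{n-1}, then c_{n-1} .. c_1.
    circ = np.concatenate([first_column, first_column[:0:-1]])
    product = np.fft.rfft(circ) * np.fft.rfft(v, n=m)   # rfft zero-pads v
    return np.fft.irfft(product, n=m)[:n]


rng = np.random.default_rng(3)
t = np.arange(256) * 0.1                   # regular time grid
first_column = np.exp(-0.5 * (t / 0.7) ** 2)
K = np.exp(-0.5 * ((t[:, None] - t[None, :]) / 0.7) ** 2)
v = rng.normal(size=256)

fast = toeplitz_matvec(first_column, v)    # never builds K
slow = K @ v                               # dense O(n^2) reference
```

With a fast product in hand, a linear solve against the kernel matrix can be done iteratively (for example with conjugate gradients) without ever forming the matrix, which is one route to the "deconvolution" side of the idea.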
2015-09-16
space, space, sports, and black holes
Group meeting today was a pleasure. Laura Norén (NYU) talked about ethnography efforts across the Moore–Sloan Data Science Environments, including some analysis of space. This is relevant to my group and also the NYU Center for Data Science. She talked also about the graph of co-authorship that she and a team are compiling, to look at the state of data-science collaborations (especially interdisciplinary ones) before, during, and after the M-S DSE in the three member universities and also comparison universities. There was some excitement about looking at that graph.
Nitya Mandyam-Doddamane (NYU) showed us results on the star-formation rates in galaxies of different optical and ultraviolet colors. She is finding that infrared data from WISE is very informative about hidden star formation, and this changes some conclusions about star formation and environment (local density).
Dun Wang talked about how he is figuring out the pointing of the GALEX satellite by cross-correlating the positions of photons with the positions of stars. This fails at bright magnitudes, probably because of pile-up or saturation. He also showed preliminary results on the sensitivity of the detector, some of which appear to be different from the laboratory calibration values. The long-term goal is a full self-calibration of the satellite.
Dan Cervone (NYU) spoke about statistics problems he has worked on, in oceans and in sports. We talked about sports, of course! He has been working on spatial statistics that explain how basketball players play. We talked about the difference between normative and descriptive approaches. Apparently we are not about to start a betting company!
Daniela Huppenkothen spoke about the outburst this summer of V404 Cygni, a black hole that hasn't had an outburst since 1989. There are many observatories that observed the outburst, and the question (she is interested in) is whether it shows any oscillation frequencies or quasi-periodic oscillations. There are many spurious signals caused by the hardware and observing strategies, but apparently there are some potential signatures that she will show us in the coming weeks.
2015-05-15
exoplanet dynamics, sucking
In group meeting, Dun Wang talked about astrometric calibration of the GALEX Satellite, and Kat Deck (Caltech) talked about the dynamical evolution of exoplanetary systems. She pointed out that we naively expect lots of planets close in period to be locked in resonances, but in fact such resonances are rare, empirically in the Kepler sample. She has explanations for this involving the evolving proto-planetary disk.
After lunch, Deck gave the astro seminar, on planetary system stability and the Kepler planets. She discussed chaos, stability, and heuristic stability criteria. One interesting thing is that there really is no non-heuristic stability criterion: We think of a planetary system as "stable" if there are no catastrophic, order-unity changes to any of the orbital osculating elements. That's not really an equation! And at the talk there was some discussion of the point (counter-intuitive and important) that a system can be stable (by our astronomer definition) for far, far longer than the Lyapunov time. Awesome and important.
At the end of the day, Foreman-Mackey and I made the (astonishing) decision to abort and fail on our NASA proposal: We just ran out of time. I am disappointed; we have an eminently fundable pitch. That said, we just didn't start early enough to make that pitch at the level we wanted. Not sure how to feel about it, but I sure need to catch up on sleep!
2015-04-24
self-calibration of GALEX, regularizing a PSF model
At group meeting, Dun Wang showed his first results from his work on the GALEX photons. He showed some example data from a scan across the Galactic plane and back, performed by Schiminovich in the spacecraft's last days. The naively built image has a double point-spread function, because the satellite attitude file is not quite right. Wang then showed that on second (or even half-second) time scales, he can infer the pointing, either by cross-correlating images, or else correlating with known stars. So the satellite pointing could be very well calibrated with a data-driven model. That's awesome!
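The correlate-with-known-stars idea can be sketched as follows: within one short time slice, take every photon-minus-star offset, histogram the offsets, and read the pointing error off the peak. This is a toy on a flat tangent plane with synthetic data, assuming a pure translation and no rotation or distortion; it is not Wang's pipeline, and the names and numbers are hypothetical.

```python
import numpy as np


def pointing_offset(photons, stars, max_offset, bin_size):
    """Estimate a (dx, dy) shift from the peak of the photon-star offset histogram.

    photons has shape (n_photons, 2) and stars has shape (n_stars, 2),
    both in the same flat coordinates.
    """
    diffs = (photons[:, None, :] - stars[None, :, :]).reshape(-1, 2)
    n_bins = int(round(2 * max_offset / bin_size))
    edges = np.linspace(-max_offset, max_offset, n_bins + 1)
    hist, xedges, yedges = np.histogram2d(diffs[:, 0], diffs[:, 1], bins=[edges, edges])
    ix, iy = np.unravel_index(np.argmax(hist), hist.shape)
    return 0.5 * (xedges[ix] + xedges[ix + 1]), 0.5 * (yedges[iy] + yedges[iy + 1])


rng = np.random.default_rng(7)
stars = rng.uniform(0, 1000, size=(50, 2))
true_offset = np.array([3.2, -1.7])        # the pointing error to recover
# Each star emits 20 photons, blurred by a narrow PSF and shifted by the error.
star_photons = np.repeat(stars, 20, axis=0) + true_offset
star_photons = star_photons + 0.3 * rng.normal(size=star_photons.shape)
background = rng.uniform(0, 1000, size=(500, 2))
photons = np.vstack([star_photons, background])

dx, dy = pointing_offset(photons, stars, max_offset=10.0, bin_size=0.5)
```

Repeating this in every one-second (or half-second) slice would give the pointing as a function of time, to within the bin size; a finer estimate could come from a centroid around the peak.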
Also at group meeting, Vakili discussed taking his model of the point-spread function up to super-resolution (that is, modeling the PSF at a resolution higher than the imaging data with which we constrain it). The model is super-degenerate, so we are in the process of adding (willy nilly) lots of different regularizations. My "big idea" at the meeting was to model the PSF using only smooth functions, because we know (for very deep physical reasons) that the PSF cannot have features or structure below some fundamental angular scale (set by the diameter of the telescope aperture!).
2015-03-24
dissertation transits
Schölkopf, Foreman-Mackey, and I discussed the single-transit project, in which we are using standard machine learning and a lot of signal injections into real data to find single transits in the Kepler light curves. This is the third chapter of Foreman-Mackey's thesis, so the scope of the project is limited by the time available! Foreman-Mackey had a breakthrough on how to split the data (for each star) into train, validate, and test such that he could just do three independent trainings for each star and still capture the full variability. False positives remain dominated by rare events in individual light curves.
With Dun Wang, we discussed the GALEX photon project; his job is to see what about the photons is available at MAST, if anything, especially anything about the focal-plane coordinates at which they were detected (as opposed to celestial-sphere coordinates). This was followed by lunch at Facebook with Yann LeCun.
2014-06-13
raw data, uv flares
First thing in the morning, I broke it to Jeffrey Mei that his results on spectral features associated with MW dust are almost certainly strongly affected by the SDSS spectroscopic calibration pipeline. He took it well, and we realized that we can just re-run our analysis on the completely uncalibrated spectrograph counts! This might just work. He is tasked with understanding how to access the raw counts from the SDSS-III spectroscopic pipeline.
At arXiv coffee, Patel summarized two papers by Gezari and collaborators on discoveries in GALEX and PanSTARRS time-domain data: A shock breakout flash at the start of a supernova and a putative tidal-disruption flare from a star destroyed by a close encounter with a black hole. These results provide a strict lower limit to what we might find in a search of the GALEX photon list.
2014-06-11
flares in GALEX; non-Gaussian Processes
I spent an hour this morning with Patel, discussing possible projects with the GALEX photon catalog. We tentatively decided to look for flashes or short-lived brightness increases, either on top of known sources or else in isolated regions. This project is interesting in itself, exercises the catalog, and also provides useful information for improving calibration and the instrument model.
At group meeting, Vakili showed a variant of a Gaussian Process, in which the latent function is still drawn from a Gaussian, but the data are related to the latent function by a fatter-tailed (t) distribution. He showed a beautiful simulation and output in which outliers totally mess up a Gaussian Process fit but are just straight-up ignored by the modified method. At this point, I don't even remember what the method is called, but it is extremely relevant to the quasar-fitting work by Mykytyn.