2018-09-28

AstroFest, day 3

Today was the third and final Friday of the Gotham AstroFest series, in which we have a very large fraction of the entire astrophysics community in New York City give short talks. This one was at NYU, and had contributions from NYU, AMNH, and CUNY scientists. There were a huge number of interesting results during the day. One of the most remarkable things is that fully one quarter of the talks were about black holes. Between NYU and CUNY, there is a lot of research going on related to black holes: Their formation, primordial black holes, their binary dynamics, gravitational-wave signatures, and so on. That's excellent.

A few random highlights for me included: Evidence for weather on brown dwarfs as a function of temperature and gravity by Vos (AMNH), and (relatedly) comparisons between planet and brown-dwarf spectra by Popinchalk (CUNY). It really does appear that there are no strong differences between brown dwarfs and planets (something I discussed with Oppenheimer, AMNH, at lunch). Gandhi (NYU) showed some chemistry and orbits work she has done with Ness (Flatiron) before coming to NYU; that's very related to my interests! Williamson (NYU) visualized a linear SVM, which is beautiful (and old-school). MacFadyen (NYU) convinced us beautifully that his models of the NS–NS merger are really the best!

There was lots on dark-matter detection and dark-matter candidates, including even baryonic and black-hole types. And Tinker (NYU) showed beautiful satellite-galaxy statistics that he got by stacking and background-subtracting galaxy counts in the Legacy Survey imaging for DESI.

If you want to see the full slide deck for the event, it is here.

2018-09-27

how to write a discussion section

In a low-research day, a highlight was a long conversation with Bonaca (Harvard) about the writing of her paper on the GD-1 stream interaction. We discussed structure, and especially the discussion. In a discussion, I like a humble sandwich on proud bread: Start by saying what's most impressive about what we've done, then go into caveats, limitations, approximation wrongness, and the consequences of all that. And then end on a positive note about what kinds of great new things this work will enable going forward.

Late in the day, Alex Kusenko (UCLA, IPMU) spoke about a very wide range of subjects. He claims to have a full explanation for why we don't see the cutoff in the gamma-ray occurrence rate required by photon–photon interactions with the infrared background. He claims that the gamma rays we see from blazars are really reprocessed from cosmic rays. Plausible! But I would need to know a lot more. He also claims to have a way to naturally make primordial black holes in the end stages of inflation, and make all of the dark matter that way. That's interesting. Unfortunately it was such a long and tiring day I couldn't get it together to really check either of these ideas carefully.

2018-09-26

data science for stars; phase space

Our weekly Stars meeting at Flatiron was a pleasure today, as it usually is. Angus (Columbia) and Contardo (Flatiron) are looking at the possibility that we might be able to deblend binary and overlapping stars in the TESS data by their light curves alone. That's crazy, but just crazy enough that I love it! We discussed different ways they might get a training set for this. Luger (Flatiron) asked whether it might be possible to figure out the ell and em (the spherical-harmonic degree and order) of the asteroseismic modes by using projections onto transits. That also led to some good discussions about possible methods; many of the crowd liked the ideas that look like lock-in amplification. Marchetti (Leiden) gave us a nice discussion of the high-velocity-star results from Gaia DR2. It's too early: The really exciting results will come in data releases 3 and 4, when the magnitude limit for the RVS data gets fainter.
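
To make the lock-in idea concrete, here is a minimal sketch (my own toy, with made-up cadence, frequency, and amplitudes, not anything we wrote down at the meeting): demodulate a light curve against a reference sinusoid at a candidate mode frequency, so that only power coherent with the reference survives the averaging.

```python
import numpy as np

# Hypothetical example: recover the amplitude of a weak oscillation at a
# known candidate frequency by "lock-in" demodulation against a reference.
rng = np.random.default_rng(42)
t = np.arange(0.0, 27.4, 2.0 / 60.0 / 24.0)   # ~27 d at 2-min cadence (days)
f_mode = 283.0                                # candidate frequency, cycles per day (made up)
signal = 1e-4 * np.sin(2.0 * np.pi * f_mode * t + 0.7)
flux = 1.0 + signal + 5e-4 * rng.standard_normal(t.size)

# Demodulate: project onto cosine and sine references and low-pass by averaging.
ref_c = np.cos(2.0 * np.pi * f_mode * t)
ref_s = np.sin(2.0 * np.pi * f_mode * t)
X = 2.0 * np.mean((flux - flux.mean()) * ref_c)   # in-phase component
Y = 2.0 * np.mean((flux - flux.mean()) * ref_s)   # quadrature component
amplitude = np.hypot(X, Y)   # close to 1e-4 despite the larger white noise
print(f"recovered amplitude: {amplitude:.2e}")
```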

Matt Buckley (Rutgers) showed Adrian Price-Whelan (Princeton) and me his results on measuring phase-space volumes of bound and disrupted objects. The idea is that you might be able to reconstruct the mass of a disrupted object, and say whether it was dark-matter dominated. And get all the attendant dark-matter-theory consequences of that. He showed (unsurprisingly) that observational noise increases the phase-space volume that you naively measure. So we discussed how to approach this. If we are frequentists, maybe we can just “greedily” correct the measurements in the direction that lowers the phase-space volume? If we are Bayesians, we have to make more assumptions, I think!
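
As a toy illustration of the noise effect (my sketch, not Buckley's code; all numbers are invented): proxy the occupied phase-space volume by the square root of the determinant of the 6-d sample covariance, and watch it inflate when you add anisotropic observational noise. A "greedy" frequentist fix might subtract the known noise covariance before taking the determinant.

```python
import numpy as np

# Toy illustration (my sketch, not Buckley's method): a cluster's occupied
# phase-space volume, crudely proxied by sqrt(det) of the 6-d sample
# covariance, is inflated when observational noise is added.
rng = np.random.default_rng(0)
n_stars = 2000
true_sigmas = np.array([50.0, 50.0, 50.0, 5.0, 5.0, 5.0])    # pc x3, km/s x3
noise_sigmas = np.array([10.0, 10.0, 100.0, 1.0, 1.0, 10.0])  # anisotropic errors

true_w = true_sigmas * rng.standard_normal((n_stars, 6))
observed_w = true_w + noise_sigmas * rng.standard_normal((n_stars, 6))

def volume_proxy(w):
    """sqrt(det(cov)) as a crude 6-d phase-space 'volume' of the sample."""
    return np.sqrt(np.linalg.det(np.cov(w, rowvar=False)))

print("true volume proxy:     ", volume_proxy(true_w))
print("observed volume proxy: ", volume_proxy(observed_w))

# One 'greedy' correction: subtract the known noise covariance from the
# empirical covariance before taking the determinant.
corrected_cov = np.cov(observed_w, rowvar=False) - np.diag(noise_sigmas ** 2)
print("corrected volume proxy:", np.sqrt(np.linalg.det(corrected_cov)))
```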

2018-09-25

structure of all models, ever; correlation-function representation

Early in the day I had a long conversation with Leistedt (NYU) about the philosophy of our machine-learning projects. We refined further our view that the machine learning should be part of a larger causal structure that makes sense. My position is that you can think of most (hard) physics problems as having some kind of generalized graphical model with three high-level boxes. One is called “things I know well but don't care about”, which is things like noise model, instrument model, and calibration parameters. Another is called “things I don't know and don't care about”, which is things like foregrounds, backgrounds, and other nuisances. And the last is called “things I don't know and deeply care about”. This last one is our rigid physics model. And the middle one is where the machine learning goes! If we could build models like this very generally, we would be infinitely powerful.

At mid-day, Storey-Fisher and I talked about all the things we could do if we had a correlation function that is not values-in-bins but is instead a linear combination of functions. We could look for cosmological gradients. We could do clustering multipoles at small scales, estimate the correlation function and power spectrum simultaneously, and extract Fisher-optimal summary statistics for cosmological parameter estimation. And all these things are possible with our new correlation-function estimator. Next step: Getting the code fast enough to do non-trivial tests.
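
A cartoon of the basis-function idea (not our actual estimator, which works directly on pair counts; the basis and the fake measurements here are invented): represent xi(r) as a linear combination of smooth basis functions and fit the coefficients, instead of reporting independent values in bins.

```python
import numpy as np

# Cartoon of the basis-function idea (not our actual pair-count estimator):
# represent xi(r) as a linear combination of smooth basis functions and fit
# the coefficients by least squares to noisy binned measurements.
rng = np.random.default_rng(3)
r = np.linspace(5.0, 150.0, 60)   # separations in Mpc/h (toy numbers)
xi_true = (r / 5.0) ** -1.8 + 0.002 * np.exp(-0.5 * ((r - 105.0) / 10.0) ** 2)
xi_obs = xi_true + 0.01 * rng.standard_normal(r.size)

# Basis: a handful of Gaussian bumps in log r (any smooth family would do).
centers = np.linspace(np.log(5.0), np.log(150.0), 12)
width = centers[1] - centers[0]
A = np.exp(-0.5 * ((np.log(r)[:, None] - centers[None, :]) / width) ** 2)

coeffs, *_ = np.linalg.lstsq(A, xi_obs, rcond=None)
xi_model = A @ coeffs   # the fit is continuous: evaluate the same basis at any r
print("rms residual:", np.sqrt(np.mean((xi_model - xi_true) ** 2)))
```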

In the astro seminar at NYU, Savvas Koushiappas (Brown) showed us weak but very interesting evidence that maybe there is a dark-matter annihilation signature in the NASA Fermi data on the Reticulum II dwarf galaxy. Obviously this is incredibly important if it holds up as more data and better calibrations come.

2018-09-24

writing; not ready for TESS

I got some actual writing time in today! I worked on places in the Birky (UCSD) paper (on M-dwarf spectral models) where Birky had left me notes marked "HOGG". That's a great tool: She leaves "HOGG" notes; I search for them in my text editor, and I make the relevant changes or add the relevant text.

Late in the day I had a great conversation with Ben Pope (NYU) about things we can do right now or very soon with TESS artificial data or the first data release of full-frame images. We talked about dimensionality-reduction methods, like the robust PCA methods from Candès and related methods that use convex optimization. We also talked about independent components analysis. In general, when the first data arrive, there will be lots of low-hanging fruit. We also discussed what could be done in advance, with the available artificial data.
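
As a placeholder for the kind of thing we discussed, here is a tiny hypothetical sketch: build a systematics basis from many other stars' light curves with plain PCA (FastICA from scikit-learn would slot into the same place) and regress it out of a target star. All the data and numbers here are fake.

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical sketch: build a systematics basis from other stars' light
# curves with PCA and regress it out of a target star. The data are fake;
# real FFI photometry would replace `other_fluxes` and `target`.
rng = np.random.default_rng(1)
n_stars, n_times = 500, 1000
common_trend = np.cumsum(rng.standard_normal(n_times)) * 1e-3   # shared systematic
other_fluxes = (1.0
                + np.outer(rng.uniform(0.5, 2.0, n_stars), common_trend)
                + 2e-4 * rng.standard_normal((n_stars, n_times)))
target = 1.0 + 1.3 * common_trend + 2e-4 * rng.standard_normal(n_times)
target[400:420] -= 1e-3   # a fake transit we hope to preserve

pca = PCA(n_components=5)
pca.fit(other_fluxes - other_fluxes.mean(axis=1, keepdims=True))
basis = pca.components_                           # (5, n_times) systematics basis
coeffs, *_ = np.linalg.lstsq(basis.T, target - target.mean(), rcond=None)
detrended = target - target.mean() - basis.T @ coeffs
print("scatter before / after:", np.std(target), np.std(detrended))
```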

2018-09-23

finishing a paper; latents

I dusted off the draft of my paper with Eilers (MPIA) and Rix (MPIA) about spectrophotometric measurements of red-giant distances or parallaxes using Gaia, SDSS APOGEE, 2MASS, and WISE data. It is nearly done! But we put it on ice while Eilers finished other things. I worked through more than half of the text, making notes on what small things remain to do.

The biggest to-do item? We have a linear model (for the log distance or log parallax or absolute magnitude). That's sweet, because it is simple, and it is interpretable, at least partially. Now we have to make that true by interpreting. Interpreting a linear model is harder than fitting a linear model!

I also had conversations with Storey-Fisher (NYU) about models for the correlation function and Price-Whelan (Princeton) about Milky Way non-equilibrium dynamical models. On the former, we discussed the difference between the correlation function and any particular estimate of the correlation function. It's a bit complex, because I'm not sure there is even agreement in the community about what would be considered the true latent correlation function in the low-ish redshift Universe.

2018-09-21

stream-as-torus; TESS FFIs

I met up early with Price-Whelan (Princeton) to work on the chemical-tangents method papers. This work devolved into rearranging the to-do list and organizing it into categories, using GitHub's project tools. That was useful! But it felt a bit like we didn't get anything done. I know that isn't true!

A bit later in the morning we called Jo Bovy (Toronto) to get some advice for Lauren Anderson (Flatiron) on fitting streams in the Milky Way halo. I had been summarizing one of Bovy's papers as saying that streams are close to orbits (that is, you can fit a stream as an orbit) but Bovy corrected us: His paper shows that streams are close to tori. That is, you can expect all the stars in the stream to have similar actions or invariants, but they will not line up as a line on the torus the same way that a single segment of a single orbit would. Duh! That makes good sense and suggests a beautifully simple method for modeling streams with tilted torus sections. I think I almost know how we might do that.
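
My reading of the point, in angle-action language (a paraphrase in my own notation, not equations from Bovy's paper):

```latex
% A single orbit lives on one torus (fixed actions J) and traces a line in
% angle space:
\theta(t) = \theta_0 + \Omega(J)\,t , \qquad J = \mathrm{const} .
% Stream stars share the progenitor's actions only approximately,
J_i = J_0 + \Delta J_i , \qquad |\Delta J_i| \ll |J_0| ,
% so their angles drift apart along the frequency-gradient direction,
\theta_i(t) \simeq \theta_{i,0} + \Omega(J_0)\,t
             + \frac{\partial \Omega}{\partial J}\,\Delta J_i\,t ,
% and the stream spreads over a patch of the (nearly common) torus rather
% than lying along any single orbit on it.
```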

I also checked in with the group working on NASA TESS full-frame images (FFIs), led by Ben Montet (Chicago), who have been hacking at Flatiron all week. They intend to reformat the full-frame images into manageable (and more useful) data objects, extract aperture photometry flexibly, and perform best-in-class de-trending using other stars or other pixels, in the spirit of many things we have done over the years with Kepler data. They really look like a team that might take over the world! For context: The TESS Mission plans to release the raw FFIs with no proprietary period, and they plan to leave it to the community to build open-source (or not!) data-analysis tools around them. Go team!

2018-09-20

GD-1 and chemical tangents

Tuesdays and Thursdays are lower on the research this semester! But I did get in two excellent discussions. The first was with Bonaca (Harvard) who has made an absolutely great visualization that compares the Gaia data on GD-1 and her model for GD-1. I think this figure might get featured in a lot of talks! We are still checking things, but it looks great. We discussed what would be the final scope of her paper, because—as with all projects—there is a huge possible scope but we need to finish a paper soon! I'm happy with the scope and it seems achievable and sensible. The big issue is that the constraints we have on the perturber that interacted with GD-1 come from a model that has toy aspects to it, while the full generative stream model is expensive enough that we don't want to go there for inference just yet. Soon, but not for this paper.

Over a late lunch I discussed many things with Price-Whelan (Princeton), both about GD-1 and about our chemical tangents project. On the latter, we discussed (for approximately the millionth time) how to describe the project most compactly. This project is strange enough to the typical astronomer that we have to think carefully about how we present it. There are a lot of things that sound right but are wrong. And I am a huge believer in repeatedly re-describing projects. I think every time you go through it, you learn something new, and improve your presentation. This is a huge benefit of fully Open Science.

In that spirit: We are trying to find the surfaces in phase space along which the distribution of stars in abundance space is constant. Not that the abundances are constant, but that the distribution of abundances is. Those surfaces contain the orbits! In some sense it comes down to the point that the joint distribution in actions and abundances is not separable, so the abundances can inform you about the actions! But that description is too terse. And Rix likes to say: Stars don't change their abundances as they orbit! So if you have drawn orbits through phase space that would require abundance changes, either your population isn't mixed or else you are wrong about those orbits.
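
One compact way to write the idea down (my notation; not necessarily how it will appear in the paper):

```latex
% For a phase-mixed population, the abundance distribution at a phase-space
% point can depend on that point only through the actions (or invariants):
p(\mathrm{abundances} \mid \boldsymbol{x}, \boldsymbol{v})
  = p\!\left(\mathrm{abundances} \mid \boldsymbol{J}(\boldsymbol{x}, \boldsymbol{v}; \Phi)\right) ,
% so the surfaces on which this conditional distribution is constant contain
% the orbits, and any trial potential \Phi whose orbits would require the
% abundance distribution to vary along them is disfavored. The information
% comes from the joint distribution being non-separable:
p(\boldsymbol{J}, \mathrm{abundances}) \neq p(\boldsymbol{J})\,p(\mathrm{abundances}) .
```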

2018-09-19

dynamics and chemistry

Today Kathryn Johnston (Columbia) test-drove a group meeting at Flatiron on Dynamics, to which I was honored to be invited. We went around the table and described our current dynamics-related projects. After that, it was Stars Meeting, which was its usual hugeness. At the suggestion of its (rotating) organizers, we are experimenting with different ways of making sure many voices are all involved in the conversation. That's a hard problem!

At Stars meeting, many interesting things happened. A highlight for me was Adrian Price-Whelan (Princeton) describing work done at Aspen in the last few weeks on the Orphan stream. For dynamical and chemical reasons it looks like a disrupted dwarf galaxy, and it may fully wrap the Galaxy. Another highlight was a contribution from Victor Debattista (UCLAN) looking at chemical abundances in toy (that is, non-cosmological) simulations of star-forming disk galaxies. He has a new explanation for the bimodality between the alpha-rich thick disk and the alpha-poor thin disk, and his explanation is general, so it implies (as he explicitly predicts in his new paper) that the bimodality will be observed in all disk galaxies! That's exciting. Of course it is hard to observe.

In other news, Matthew Buckley (Rutgers) showed me really beautiful results, in which he can measure the mass of a globular cluster by using phase-space density or volume information, even in the presence of real data issues. The reason it is hard is that the data quality is extremely anisotropic in phase space. It looks extremely promising. I want to figure out how this relates to old-school methods, like virial methods and caustic methods.

2018-09-18

large-scale structure

Tuesdays are low-research days! But I did have a good conversation with Storey-Fisher (NYU) about our correlation-function estimator, and how to precisely test it. It has so many applications! We also discussed how our three projects fit together: Correlation-function estimation, adversarial mock catalogs, and searching for anomalies in the large-scale structure. The middle project—adversarial mocks—is about making mocks that have systematics that would defeat current systematics correction, and also making methods that would defeat even those mocks.

2018-09-16

there's no such thing as a Jeans model!

The Jeans Equations are remarkable: They relate moments and integrals of distribution functions to the underlying gravitational potential (or really force law), for phase-mixed populations. They are true for any distribution function! But they are equations, and they are not models. As my loyal reader knows, for me a model is a likelihood function!
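
For concreteness, the simplest member of the family, the one-dimensional vertical equation (with the tilt term dropped), reads:

```latex
% steady-state vertical Jeans equation for a tracer with density \nu(z) and
% vertical velocity dispersion \sigma_z(z) in potential \Phi(z):
\frac{1}{\nu}\,\frac{\partial\!\left(\nu\,\sigma_z^{2}\right)}{\partial z}
  = -\,\frac{\partial \Phi}{\partial z} .
```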

When people do what is called Jeans modeling, they turn the equations into some procedure for estimating the gravitational potential (or force law or mass density). And although the Equations are independent of distribution function, the performance of this heuristic procedure—that goes from velocity moments to gravitational model parameters or densities—has statistical properties that do depend strongly on the distribution function. That is, you can't make a probabilistic statement (like a measurement and an uncertainty) of anything (like a density at the Milky Way disk midplane) without assuming things about the distribution function.

And because the Jeans Equations are independent of the distribution function, it is tempting to claim or believe that the results of the inference are also independent of the DF, which they aren't. There is no procedure you can write down whose output doesn't depend on the DF. I spent time this weekend writing words about this, for reasons I can't currently understand.

2018-09-14

Gotham AstroFest, day 2

As my loyal reader will recall, there are AstroFest events this September at Columbia (last week), Flatiron (today), and NYU (in two weeks). Today's meeting was long but excellent. I learned many things and was pleased to see all the new faces (so many new faces)! Here are a few personal highlights:

Shy Genel (Flatiron) showed that the details of star formation and feedback affecting a simulated galaxy's disk and stars are very sensitive to the initial conditions, or to perturbations of the conditions made mid-simulation. That caused me to wonder if it is going to be very hard to infer things about galaxies from their observed properties! But Foreman-Mackey (Flatiron) pointed out that the sensitivity might be high but also highly structured, so not necessarily a problem. Good point; but it might take a lot of simulations to find out! Whatever the case, this is an excellent line of research.

Francisco Villaescusa-Navarro (Flatiron) described a project to see if, in the non-linear regime of large-scale structure evolution, the one-, two-, and higher-point functions, all combined, contain as much information as the one- and two-point functions do in the linear regime. That is: What is the information content in the observables? This is, in some sense, the key question of cosmology at the present day! And it relates to things I have been thinking about (but doing nothing about) for years.

Suvodip Mukherjee (Flatiron) delivered a beautifully simple (and yet novel) idea: He is looking at the gravitational-wave analogs of all the cosmological observables we have with galaxies and the CMB. That's clever! It includes the GW ISW effect, and GW lensing. He pointed out that there might be new cosmological constraints from cross-correlating GW event properties with CMB properties, like the CMB lensing map. Clever! And possibly big, in the mid-term to long-term future.

Doyeon Avery Kim (Columbia) is building spectral-spatial models of the all-sky fields or maps that act as CMB foregrounds. She is doing this by interpolating, in the spatial and spectral directions, the information from many large-angular-scale surveys (which is necessarily incomplete, with different sky coverages and different angular resolutions). This is also very much related to my (vapor-ware) latent-variable model approach here, and it is looking like it is delivering exciting results.

2018-09-13

chatting

I spoke briefly with Chris Ick (NYU) about quasi-periodic oscillations in Solar flares, Megan Bedell (Flatiron) about telluric lines in stars observed with HARPS, and Adrian Price-Whelan (Princeton) about finding overdensities in the halo in Gaia DR2 data. With Ick we discussed whether to use the Bayesian evidence or a parameter estimate to compare nested models. My loyal reader knows which side I was on! With Bedell we discussed how we might verify that our telluric model is good, using line covariances. With Price-Whelan we discussed how to estimate local overdensity in both position and proper motion that would be maximally sensitive to streams and the like.

2018-09-12

dust mapping; information theory and orbits

In Stars Meeting today, visitors Greg Green (KIPAC) and Richard Teague (Michigan) both talked about mapping dust. Teague is working at protoplanetary-disk scale (using velocity maps to find planets), while Green is working at Milky Way scale (making 3-d extinction maps). Teague is working with Foreman-Mackey (Flatiron) to get better velocity maps out of ALMA data, and they are having good success with one of my favorite tricks: Fit the peak with a quadratic. We have shown, in astrometric contexts, that this saturates information-theoretic bounds. They have gorgeous maps!
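
In case the trick is unfamiliar, here is a generic sketch (not their pipeline; the line profile and numbers are made up): fit a parabola to the maximum sample and its two neighbors, and read off a sub-pixel peak location.

```python
import numpy as np

# Generic sketch of the "fit the peak with a quadratic" trick (not the ALMA
# pipeline itself): given samples of a peaked profile, fit a parabola to the
# maximum sample and its two neighbors to get a sub-pixel peak location.
def quadratic_peak(x, y):
    i = np.argmax(y)
    i = np.clip(i, 1, len(y) - 2)    # need both neighbors
    y0, y1, y2 = y[i - 1], y[i], y[i + 1]
    dx = x[i + 1] - x[i]             # assumes uniform spacing
    offset = 0.5 * (y0 - y2) / (y0 - 2.0 * y1 + y2)
    return x[i] + offset * dx

# Demo on a noisy Gaussian line profile.
rng = np.random.default_rng(7)
v = np.linspace(-10.0, 10.0, 41)     # velocity channels (km/s)
profile = np.exp(-0.5 * ((v - 1.234) / 2.0) ** 2) + 0.02 * rng.standard_normal(v.size)
print("true peak: 1.234   estimated:", quadratic_peak(v, profile))
```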

Green is trying to apply more useful spatial priors to the dust maps he has made of the Milky Way, which are (currently) independently sampled in pixels. He is resampling the pixels, using neighbor information to regularize or as a prior. His method is slow, but a lot faster than using a fully general Gaussian Process prior. And it appears to be a good approximation thereto. Certainly the maps look better!
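
Here's a cartoon of the kind of neighbor-regularized resampling I understood him to describe (my own sketch with made-up arrays and a made-up prior width, not his code or his model):

```python
import numpy as np

# Cartoon of neighbor-regularized resampling (my sketch, not Green's code):
# each pixel has independent posterior samples of extinction; re-draw each
# pixel's value with weights from a Gaussian "prior" centered on the current
# mean of its neighbors, which smooths the map without a full GP.
rng = np.random.default_rng(5)
ny, nx, n_samp = 32, 32, 64
truth = np.cumsum(np.cumsum(rng.standard_normal((ny, nx)), axis=0), axis=1) * 0.01
samples = truth[..., None] + 0.3 * rng.standard_normal((ny, nx, n_samp))
current = samples[..., 0].copy()     # start from an arbitrary sample per pixel
prior_sigma = 0.2                    # strength of the neighbor prior (invented)

for _ in range(10):                  # a few sweeps of resampling
    neighbor_mean = 0.25 * (np.roll(current, 1, 0) + np.roll(current, -1, 0)
                            + np.roll(current, 1, 1) + np.roll(current, -1, 1))
    logw = -0.5 * ((samples - neighbor_mean[..., None]) / prior_sigma) ** 2
    w = np.exp(logw - logw.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    # draw one sample per pixel according to the neighbor-informed weights
    u = rng.random((ny, nx, 1))
    idx = np.minimum((np.cumsum(w, axis=-1) < u).sum(axis=-1), n_samp - 1)
    current = np.take_along_axis(samples, idx[..., None], axis=-1)[..., 0]

print("rms error, raw sample:", np.sqrt(np.mean((samples[..., 0] - truth) ** 2)))
print("rms error, resampled: ", np.sqrt(np.mean((current - truth) ** 2)))
```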

I presented my project to figure out orbits from chemistry. There was good discussion. Spergel (Flatiron) opined that I would do no better than Jeans modeling if I did the Jeans modeling conditioned on chemistry. I am sure that's wrong! But I have to demonstrate it with a good information-theoretic argument.