long-form writing

I'm spending some time over the break thinking about possible long-form writing projects. I have an Atlas of Galaxies to finish, and I have ideas about possible books on introductory mechanics, and ideas about something on the practice and deep beliefs of scientists. And statistics and data analysis, of course! I kicked those around and wrote a little in a possible mechanics preface.


scientific priorities

I spent a piece of the morning exhaustively going through short-term priorities with Bedell (Flatiron). We discussed strategy given her stage. She has enough projects to last a decade! I guess we all do, but it is still amazing when we list them. We decided to focus on things that make direct use of the technologies we have built and not particularly build new technology for a bit. We also decided to submit the wobble paper right after the break.

After this, we segued into a conversation about the (badly named) Rossiter–McLaughlin effect with Luger (Flatiron) and Beale (Flatiron). The effect is the apparent change in a star's radial velocity as a planet transits its surface, since the star is rotating and has a spatial gradient in surface RV. We discussed what is involved in modeling this more accurately than is currently done. There were some philosophical issues coming up around flux conservation, limb darkening, and continuum normalization. All hard issues!

At the end of the day I got in a short quality conversation (over wine) with Alex Barnett (Flatiron) so I could pre-flash him the correlation-function and power-spectrum problems that Storey-Fisher (NYU) and I will bring him in January. He agreed that we are going to effectively unify Fourier-space and real-space approaches when we make them all more efficient and more accurate. So excited about a winter of clustering!


almost nothing

My only research today was a short conversation with Bedell (Flatiron) about finishing up our paper on wobble.


office hours

It is exam week here, so I spent my whole day holding marathon office hours. That was fun! But not research.



I was out sick today. The only thing I did was think a bit about the project code-named myspace in which Adrian Price-Whelan (Princeton) and I warp phase-space coordinate systems to sharpen up velocity-space structure in Gaia data in the local disk. This looks like it works and maybe will provide insights about where the structure is coming from.


writing projects

I spent a bit of research time on the weekend on writing projects. In one, I am writing about the algorithmic observing strategies that involve sensible objectives, adaptation to what's known previously, look-ahead to the future, and a discount rate. The idea is that observing decisions should be made algorithmically but also just-in-time. And perhaps simply and interpretably, which is even harder.

In another writing project (which is perhaps not really research by my strict rules) I am trying to set down my thoughts about the moderation of the submissions to arXiv. Why? Because this blew up this week and I didn't agree with a lot of the things that people were saying, on all sides. Of course if I really write something good, will the arXiv accept it? I think it will get rejected by moderation!


random catalogs are dumb

Kate Storey-Fisher (NYU) and I continue to discuss tools for cosmology. She is working on a new estimator for the correlation function, which is fun! But we were tipped off a month or two ago by Tom Abel (KIPAC) about the point that a non-adaptive Poisson random catalog is pretty much known to be the worst possible way to integrate a function. That is, the random catalogs used for correlation-function estimators are almost certainly a dead-wrong method. And this also connects to the comments we got from Roman Scoccimarro (NYU) this week about the point that the random–random term in the correlation-function estimators being used in eBOSS takes weeks to execute! We discussed this in more detail today, and I made a mental note to check in with our math colleagues in January.
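
To make the integration point concrete, here is a minimal sketch (not our estimator; the integrand, the point counts, and the function names are arbitrary choices for illustration). It integrates a smooth function over the unit square with a Poisson random point set and with a regular midpoint grid that has the same number of points.

```python
import numpy as np

def f(x, y):
    # Smooth toy integrand on the unit square
    return np.sin(np.pi * x) * np.sin(np.pi * y)

EXACT = (2.0 / np.pi) ** 2  # analytic integral of f over the unit square

def poisson_random_estimate(n_points, rng):
    # Average f over uniformly random points (a "random catalog")
    x, y = rng.uniform(size=(2, n_points))
    return f(x, y).mean()

def midpoint_grid_estimate(n_side):
    # Average f over the cell centers of a regular n_side x n_side grid
    c = (np.arange(n_side) + 0.5) / n_side
    x, y = np.meshgrid(c, c)
    return f(x, y).mean()

n_side = 32
grid_err = abs(midpoint_grid_estimate(n_side) - EXACT)

# Root-mean-square error of the random estimate over 50 realizations
rng = np.random.default_rng(17)
mc_errs = [poisson_random_estimate(n_side**2, rng) - EXACT for _ in range(50)]
mc_rms = float(np.sqrt(np.mean(np.square(mc_errs))))

print("midpoint grid error:", grid_err)
print("Poisson random rms error:", mc_rms)
```

At fixed point count the random error falls only as one over the square root of the number of points, while the grid error for a smooth integrand falls much faster. A real survey window is not a unit square, of course, so this is only the cartoon version of the argument.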


dynamics for integration

My research highlight for the day was a visit by Eric Vanden-Eijnden (NYU) in Math. He showed me (among other things) some new MCMC methods that can make use of dynamics to sample difficult probability distributions or compute fully marginalized likelihoods (evidences). They involve dissipative dynamics, not the purely Hamiltonian dynamics of Hamiltonian Monte Carlo methods. We resolved to try them out on the binary-star problem that we solved with The Joker, because these problems are multi-modal but in ways we understand.


actions, planet spectroscopy, dust

Discussions continued at Flatiron about Galactic dynamics and actions. We laid out uses for actions and then discussed more results from Beane (Flatiron) on the inconsistency of actions when you have wrong coordinate systems or potential.

Stars meeting featured various interesting discussions. But during a discussion led by Kreidberg (Harvard) about temperature-mapping hot rocky planets, I had an idea: We could use the strong absorption lines in stellar spectra to increase the planet-to-star brightness ratio. If we have full-phase coverage with high-resolution spectroscopy, we can look for the hot planet to “fill in” some of the absorption lines at full phases, and the amount it fills in for lines at different wavelengths would tell you the temperature (or low-resolution spectrum) of the planet! I want to do this with our HARPS data and our wobble pipeline!
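
A back-of-the-envelope sketch of why the line cores help, under my toy assumptions: the planet's emission is featureless across the stellar line, and `line_depth` is the fractional depth of that line. The function name and the numbers are hypothetical.

```python
import math

def contrast_in_line_core(continuum_contrast, line_depth):
    """Planet-to-star flux ratio in the core of a stellar absorption line.

    Assumes the planet flux is unchanged across the line while the stellar
    flux drops by the fraction line_depth.
    """
    if not 0.0 <= line_depth < 1.0:
        raise ValueError("line_depth must be in [0, 1)")
    return continuum_contrast / (1.0 - line_depth)

# Hypothetical numbers: 1e-4 contrast in the continuum, a line that is 90 percent deep
boosted = contrast_in_line_core(1e-4, 0.9)
print(boosted)
```

In this cartoon the contrast gain is 1 / (1 - depth), so only the deepest lines buy a large factor.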

In the afternoon, Boris Leistedt (NYU) and I made a plan with David Blei (Columbia) and Andrew Miller (Columbia) to build our 3-d dust model out of dust measurements. There are many problems to solve! But we are starting by assuming that Leistedt's data-driven dust measurements are correct and have Gaussian noise, the stellar positions are well known, and the dust field can be represented by a Gaussian process. In terms of challenges, we are starting by working on the scaling problem: How to make things run on millions or hundreds of millions of stars at a time? One dispute we had was about which line-of-sight integral of the dust corresponds to the extinction.


are jets beamed? correlation function slowness

Today Kate Alexander (Harvard) gave the Astro Seminar. She talked about the observational properties of jets across wavelength but especially in the radio. And unresolved jets, understood through their spectral energy distributions. One point which came up is that there does still seem to be a beaming puzzle: The models of the observations imply high beaming factors, but off-axis examples are very hard to find. So is the model ruled out? MacFadyen (NYU) implied yes, even though he is one of the principal authors of the theories! I think this is a super-important area for multi-messenger and time-domain astrophysics.

Before lunch, Kate Storey-Fisher (NYU) and I had an absolutely great discussion with Roman Scoccimarro (NYU) about our correlation function estimator. He started off very skeptical and ended up a huge fan, which was fun to see, because I am pretty stoked about it! But then he said something off-topic but super-interesting: He has a standard experience on huge projects of the following form: While the correlation-function team is waiting for the data center to compute the correlation-function estimator (which involves an enormous pair-count operation in data and (much more importantly) random catalogs), he computes the power spectrum for the same data sample on his laptop! And yet the correlation function and the power spectrum are (in principle) the same information! What gives?

The answer—which I have to say I haven't fully figured out yet—is in part that the standard power-spectrum estimation doesn't consider explicitly the off-diagonal (k not equal to k-prime) mode cross-correlations, and in part that the standard power-spectrum estimation assumes that the window function is simple enough that a random catalog is not necessary. Those are huge approximations! However, if they are good enough for the power spectrum on baryon-acoustic scales, then they must be good enough for the correlation function on those same scales and maybe we can build a far, far faster estimator?


cross-correlations of maps

It is a low-research time of year! But a highlight today was the PhD candidacy exam of Shengqi Yang (NYU), who is working with Anthony Pullen (NYU) on new data analysis techniques for cosmology. She is doing a number of things, but the part I am most interested in is manipulation of combinations of cross- and auto-correlation functions to determine other cross- and auto-correlation functions. These combinations are very simple and valuable! And you can combine observed and theoretical functions as you see fit. I got an argument started in the room about the conditions under which these relationships are exact, or true in the limit, or approximations. I would like to understand that better!


actions useful? and stars meeting

In the morning I met with Gus Beane (Flatiron) to discuss his use and understanding of actions in empirical work on the Milky Way, following up on the blow-up of last week. We discussed the point that small issues with the Galactocentric coordinate system could totally mess up any action calculation, even if the actions make sense, and the point that there are many possible galaxies we might live in for which the actions don't even make sense. We vowed to move the conversation / argument going on at Flatiron towards the question of what we are trying to achieve with these calculations. Are they just orbit labels? Or are they quasi-invariants? Or are we using them to match up stars that are far apart?

Stars Meeting was a whirlwind of interesting things! Kim Bott (UW) told us about polarimetry for exoplanet discovery and characterization. Evan Bauer (UCSB) told us about accretion signatures on white dwarfs and how they have probably been mis-interpreted (but in a way that makes accretion more important!). And Suroor Gandhi (NYU) told us about relationships between ages, abundances, and actions in the Milky Way that suggest that there are interesting relationships at all ages, and at all abundances. Hard to summarize, but there are lots of things to think about in there.


#AstroData2020s, day 1

Today was the first day of a meeting hosted by the NASA IPAC (home of IRSA and Spitzer among other important projects) to start the discussion of the response of the astronomical data archives to the US Astrophysics Decadal Survey. Not everyone agreed on the point of the meeting, but I think it is to create talking points that connect to archives, but which could be incorporated into community science white papers. These white papers are due in February.

There were several highlights for me at the meeting today. One was Hillenbrand (Caltech) summarizing the white paper process from last decade, and giving advice for white-paper submitters. She emphasized that the white paper text should be cut-and-paste ready for inclusion in the final report. That is, it isn't like a proposal to be approved; it is like a community contribution to the writing of the report. And she emphasized that it doesn't make sense to make points in white papers that will be obvious to the committee!

One of the technical concepts that was discussed today by archives was that of science platforms, in which archives might provide compute resources or other scientific facilities to their users: The idea is to bring the code and the analysis to the data, since the alternatives might be too expensive. But (as I brought up in discussion) then that gets into the space of archives making decisions about what science they do and don't support, which might conflict with peer review, or put scientific projects under various kinds of double jeopardy. And it might mean that projects like LSST, which are doing related things, might end up interfering in unintended ways with the astronomical community and its scientific priorities. These are interesting issues to keep track of.


Spitzer oversight

Today was the NASA Spitzer Science Center Oversight Committee meeting. As usual, the day was filled with insights about how you run a large integrated engineering and science operation. The project team particularly called out Lockheed Martin and the Deep Space Network (separately) for praise in how they have both supported the mission. One of the interesting stories of the day is how the mission depends on legacy hardware, including ground-based computers that are way out of date! The project has no programmers, so it has to keep its software running on late-1990s computer hardware!


simple, explainable algorithms

I spent the day in Princeton, visiting both the Department of Astrophysical Sciences and the Center for Statistics and Machine Learning. I had many great conversations, and I gave a talk that is about data-driven methods and their relationships to what we think of as being machine learning.

One highlight was a conversation with Ryan Adams (Princeton), who has brought some serious probabilistic methods to astronomy but is himself a computer scientist and statistician. He and I discussed the issues of algorithmic real-time, adaptive target selection for astronomical projects (especially EPRV-like). There is the full Bayesian decision thing, which I know how to do but which is expensive. But there is the idea that the decisions should be simple, and explainable. He pointed out that this is a huge area of research right now, and it connects to many things, especially in ethical situations: We want simple, explainable decisions! That's an interesting idea to bring into astrophysics.

There were many other great conversations, ranging across polarimetry of exoplanets, star shades and coronagraphs, neuroscience, stellar surface mapping, photometric redshifts, and astronomical catalog making.


phase-space volume; Oort dynamics

The research highlights of the day were a call with Matt Buckley (Rutgers) and a Physics Colloquium by Scott Tremaine (IAS). In the former, we discussed the design of a first paper about Buckley's work on measuring phase-space volumes of bound and disrupting dynamical objects in the Milky Way halo. He has some great results! But we don't understand the sensitivities to noise yet, or the in-practice issues of making robust measurements. And I mean “robust” here in the statistical inference sense.

In the latter, Tremaine answered most of the questions we formulated a few weeks ago about the origin and properties of the Oort cloud. My loyal reader may know that I am suspicious about many of the things that are said about the Oort cloud, but Tremaine showed numerical results that seem to back up most of the lore. He then switched to talking about interstellar asteroid 'Oumuamua. Aside from the usual loose talk of aliens, Tremaine said something remarkable: The pre-Solar-System velocity vector of the object is very close to current consensus on the Local Standard of Rest (something else of which I doubt the existence). Tremaine noted that it might conceivably represent an amazingly accurate measurement of the LSR! Too early to tell yet.


actions or orbit labels?

In our semi-regular Dynamics Meeting at Flatiron, Gus Beane (Flatiron) showed some simulations in a realistic potential that suggest that the actions we compute for the Milky Way might not be even close to being invariant. That caused a big fight to break out! Some were arguing that the actions shouldn't be called actions but rather “orbit labels”. Some were arguing that the effects he is seeing are exactly what you expect. Some argued that the variations he was seeing are way too large and there must be a bug! And some were arguing that it might be problems as fundamental as the reference frame: If you have the axes or origin slightly wrong, you compute everything wrong! But all these things are possibly happening in our analyses of the Milky Way, so Beane stepped on a very important issue for everything that is going on now in Milky Way dynamics with Gaia. As my loyal reader knows, I don't like action space: There aren't necessarily actions at all, and anything you compute using them might be extremely misleading, especially if you implicitly assume that they are invariant, or close.

In another conversation, Katie Breivik (CITA), Adrian Price-Whelan (Princeton), and I discussed a possible Decadal Survey science white paper about binary stars, population synthesis, and interdisciplinarity within astrophysics. That's a good project, but it requires a group effort from a large community; how to organize that?


writing and listening

As per usual, Tuesdays are low-research days. But I did get in some time with Bonaca (Harvard) about her next projects on Milky Way and tidal streams. And some time with Leistedt (NYU) about our possible pedagogical paper on hierarchical inference and models. We got through half of a paper outline.

The Astro Seminar was Jo Dunkley (Princeton) talking about Simons Observatory. It is a beautiful project and many things will come of it. But it is not obvious to me that it will be possible to see the gravitational radiation from inflation. Obviously we should look, though!


no short-cuts for planet searching?

I came in early to do some writing with Megan Bedell (Flatiron). We are so close to having her paper on the wobble software for data-driven modeling of HARPS spectra done! The results are just incredible and amazing. Can't wait. I still have to-do items there.

Late in the day I bounced off of Dan Foreman-Mackey (Flatiron) the idea that Price-Whelan (Princeton) and I had about making an effective noise model that would permit us to search for medium-period planets in radial-velocity data without explicitly modeling and marginalizing out the short-period planets. He simultaneously thought the idea was wrong (duh) but that it is an important thing to be thinking about: How do we make decisions about promising stellar targets for RV observations without keeping fully updated posterior or likelihood information about their full multi-planet planetary systems?


bits of non-traditional scientific writing

I spent my research time today working on various long-term writing projects. I wrote a summary of all the things we could or should be doing to improve extreme-precision radial-velocity measurements from the software side. It is a long list! I sent it to various friendlies for comment; I am trying to prioritize my work in this area.

I also edited some documents in which I am brain-storming ideas about 2020 Decadal-Survey white papers. There are science white papers due in January and more project-specific ones due later in the year. On science, I am kicking the tires on something about binary star populations, since they cut across almost all areas of astrophysics. On projects, I am thinking about things involving the (otherwise moth-balled) LSST hardware and also EPRV spectrographs.

And I worked out a strategy for making very challenging (adversarial, almost) time-variable spectroscopy data, and a strategy for beating it in a data analysis. This is a great unsolved problem in astronomical data analysis: Precision radial-velocity measurement in the face of spectral variability!


information, interpretability, emulation, isocurvature

I started my day with a call to Adrian Price-Whelan (Princeton) to discuss my ideas around making decisions for observing using the EFDIG, which is my new acronym for expected future-discounted information gain. I want to make a real-time decision-making system based on this, but I don't want to spend tons of compute, for pragmatic reasons about comprehensibility and model-ability. I might be in trouble. In the middle of the call we had a funny idea about effectively marginalizing out short-period planets in a search for year-ish-period planets.

At lunch I discussed machine learning with Gabriella Contardo (Flatiron) and had a couple of duh moments: She pointed out that if you have a function or computation or simulation that can quickly go one way (from input to output) but cannot or cannot quickly go the other way (from output to input), then you have an ideal case for machine learning. Just generate data and train a model to go the other way. Duh! Machine learning to invert functions!

She pointed out that if you are trying to model a function that isn't one-to-one or many-to-one but rather many-to-many or one-to-many, in some sense, then vanilla machine-learning approaches won't be good: They are deterministic and single-valued, once trained. Vanilla methods, anyway. That was another duh for me. And yet I hadn't had these points emphasized so clearly and so sensibly.
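
Both points can be seen in a toy sketch. My stand-ins, not anything Contardo showed me: a brute-force k-nearest-neighbour regressor plays the vanilla model, and two made-up forward functions play the simulation. The model is trained on (output, input) pairs generated by running the forward function.

```python
import numpy as np

def knn_regress(train_in, train_out, query, k):
    """k-nearest-neighbour regression in 1-d: a vanilla, single-valued model."""
    dist = np.abs(query[:, None] - train_in[None, :])
    nearest = np.argpartition(dist, k, axis=1)[:, :k]
    return train_out[nearest].mean(axis=1)

rng = np.random.default_rng(42)

# Case 1: one-to-one forward function; cheap to evaluate, awkward to invert by hand
def forward_monotone(x):
    return x**3 + x

x_train = rng.uniform(-2.0, 2.0, size=20000)
y_query = np.linspace(-9.0, 9.0, 50)
x_hat = knn_regress(forward_monotone(x_train), x_train, y_query, k=5)
roundtrip_err = np.max(np.abs(forward_monotone(x_hat) - y_query))

# Case 2: two-to-one forward function; the inverse is double-valued
def forward_square(x):
    return x**2

x_train2 = rng.uniform(-1.0, 1.0, size=20000)
y_query2 = np.array([0.81])  # the true pre-images are x = +0.9 and x = -0.9
x_hat2 = knn_regress(forward_square(x_train2), x_train2, y_query2, k=200)

print("one-to-one round-trip error:", roundtrip_err)
print("two-to-one prediction:", x_hat2[0])
```

In the second case the regressor averages over both branches and lands somewhere between them, which is the single-valued failure mode in miniature. Something that returns a distribution over inputs would be needed there.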

Our conversation ventured into interpretability-land. I am all for generalizability—that's my jam—but recently I have been giving up on interpretability. Contardo isn't: Her feeling is that if she does her current project right (her project is to determine which light curves of stars in Kepler are in fact light curves of unresolved binaries), the features she obtains for light curves will be interpretable. Interesting!

Somewhere in the day I also had a nice chat with Stephen Feeney (Flatiron) about isocurvature perturbations in the initial conditions of the Universe and what they might do to Hubble Constant estimation. It looks like they might cause trouble! I wondered aloud about maximally adversarial isocurvature contributions.


hierarchical models

Tuesdays are low-research days, because teaching. But Boris Leistedt (NYU) made a proposal a few weeks ago that we write a pedagogical document on hierarchical modeling. He is thinking: in the style of our Data Analysis Recipes contributions. So we spent our meeting today brainstorming the content for the note. There definitely is a contribution for us to make. Key is to have some toy problems that map onto things astronomers care about but also illustrate the relevant methods and issues, and also are comprehensible to others in the natural sciences. We discussed fitting a line, fitting a mixture of lines, mixture-of-Gaussian models, and foreground-background models. I also like calibration models that have good causal structure.

I’m not sure if it counts as research, but I was involved today also in contract discussions for the Terra Hunting Experiment with HARPS3. We are trying to structure the Flatiron and Princeton buy-ins so that they help the project but also are executable by our institutions. That’s way above my pay grade but somehow I have to partially navigate it.


the measure problem

Not a high-research day. Conversations with Bedell (Flatiron) about papers we need to finish, and a plan to sprint in one week's time. Also a great Brown-Bag talk by Matt Kleban (NYU) about the many vacuum states available in a string-theory-like model with lots of axions. There will always be lots of volume that has a cosmological constant near our observed value, no matter that it is so damned low. But figuring out what this means for generic observers is hard, both because of the anthropic issues, but also because there is no measure for the spacetime. That's crazy and led to yet another discussion of the measure problem. It always surprises me that this is an unsolved problem.


Gaia and the halo

Today was the Big Apple Colloquium, with Amina Helmi (Groningen) presenting on Gaia results relating to Milky Way dynamics. I don't love the chemical arguments she gives that the observed halo stars must have come from a single progenitor; since we know that nucleosynthesis is pretty low-dimensional, all merging progenitors might lie on very similar chemical tracks. But the data do show great evidence of merging and non-equilibrium dynamics in the disk and halo!


variational inference for dust

As always, Wednesdays are research-filled days. At the beginning of the day, Bedell (Flatiron) reminded me that I have a boat-load of writing to do on our joint projects, and the urgency is high: We will have a submittable paper by next week if all goes well. So I printed out some old, old text to revise.

One of the research highlights of the day was a joint conversation with Anderson (Flatiron), Leistedt (NYU), and David Blei (Columbia), about building a big, self-consistent, probabilistically justifiable, statistically isotropic model of the Milky Way dust. This is a project we have been kicking around for years now, but never really got serious about. My thoughts were sharpened in the last two summers at MPIA by Sara Rezaei Kh (MPIA), with whom I did a bit of Gaussian process work. But that isn't really tractable for large data, and can't really deal with non-trivial likelihoods (like distance uncertainties, and covariant distance and extinction uncertainties). It looks like we might try black-box variational inference on the problem. This won't give exact inferences but it might be able to handle the size and complexity of the data we have.


initial conditions

In the astrophysics seminar today Tristan Smith (Swarthmore) convinced us, perhaps incidentally to his main point, that the Hubble Constant controversy could be resolved if the baryon acoustic feature (or peaks in the CMB) is moved relative to the vanilla CDM prediction. And it doesn't have to be moved far! So maybe there is just a bit of initial-conditions manipulation to make that happen, and then everything is in agreement! Interesting take on things.

My only other significant research for the day was a discussion with Kate Storey-Fisher (NYU) about the outline of our paper about correlation-function estimation, and a chat with Boris Leistedt (NYU) about a possible pedagogical piece on hierarchical modeling and graphical models.


R-M effect; failure

The day started with a great discussion with Luger (Flatiron) and Bedell (Flatiron) about the Rossiter–McLaughlin effect, which is the apparent velocity shift as a planet transits a rotating star. We discussed how this effect really is different from a radial-velocity shift; it is a line-shape change, and how we might model that within an extension to the wobble framework. That's a great idea and possibly an important contribution. The R–M effect has been important in exoplanets.

Late in the day, I experienced complete failure to produce a grant proposal. It was effectively due late last week, so I really had to produce today, but under the gun I failed. That was a hard blow! I love my job, but sometimes I find it to be difficult.



I spent the day today at CITA, which is my childhood home: My first-ever scientific paper was written here (when I was an undergraduate researcher) with Scott Tremaine (now IAS) and Gerry Quinlan. At the CITA weekly grass-roots discussion of matters cosmological, Deyan Mihaylov (Cambridge) spoke about gravitational-wave detection with Gaia. He made an amazing point (which, like most amazing points, is obvious in retrospect): The GW signature in Gaia has an Earth term but no “pulsar term”, in the language of pulsar timing. That is, it only depends on local metric perturbations! That is extremely good for scaling and precision.

In that same forum, I spoke for the first time ever about the correlation function estimators I have been developing with Kate Storey-Fisher (NYU). I spoke extemporaneously—it's a discussion forum—but I realized that we do have a great story to tell. It includes context from the Landy-Szalay estimator world and context from the linear-fitting world. Plus some information theory for spice! It is a great audience at CITA and they helped me sharpen my case well.
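
For context, here is a minimal brute-force sketch of the standard Landy–Szalay estimator, (DD - 2 DR + RR) / RR with normalized pair counts. This is the baseline, not our new estimator, and the toy catalogs, bin edges, and function names are my own choices for illustration.

```python
import numpy as np

def pair_counts(a, b, edges, auto=False):
    """Histogram of pair separations between point sets a and b (brute force)."""
    sep = np.sqrt(((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1))
    if auto:
        # Keep each unordered pair once and drop self-pairs
        sep = sep[np.triu_indices(len(a), k=1)]
    counts, _ = np.histogram(sep, bins=edges)
    return counts.astype(float)

def landy_szalay(data, randoms, edges):
    """Landy-Szalay correlation-function estimate in separation bins."""
    n_d, n_r = len(data), len(randoms)
    dd = pair_counts(data, data, edges, auto=True) / (n_d * (n_d - 1) / 2)
    rr = pair_counts(randoms, randoms, edges, auto=True) / (n_r * (n_r - 1) / 2)
    dr = pair_counts(data, randoms, edges) / (n_d * n_r)
    return (dd - 2.0 * dr + rr) / rr

rng = np.random.default_rng(0)
data = rng.uniform(size=(300, 2))     # unclustered toy "data" in the unit square
randoms = rng.uniform(size=(600, 2))  # Poisson random catalog in the same window
edges = np.linspace(0.05, 0.5, 6)
xi = landy_szalay(data, randoms, edges)
print(xi)
```

The toy data here are unclustered, so the estimate should scatter around zero. The random–random count scales as the square of the random-catalog size, which is why it dominates the run time in real analyses.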

A highlight of a long day of conversations was a chat with Katie Breivik (CITA) about binary population synthesis. She is interested in predicting gravitational-wave sources. But the issues are general. We discussed what aspects of the theory are weakest, and where we might be able to patch in a data-driven replacement. That conversation has only just started, but it's something I want to bring home to NYC and think more about.


listening at Toronto Physics

I spent the day today at CITA and UofT Physics in Toronto. The CITA Seminar was given by Alexander van Engelen (CITA), who spoke about the things we can learn from the CMB in the near future. He emphasized that there are still interesting things to learn about the primary CMB, which violates some beliefs I held prior to the talk! But he also put a lot of emphasis on the lensing or convergence map, which can be combined with other tracers to do a lot of science.

I had so many great conversations and discussions, too many to describe! But some highlights included the following: I chatted with Patrick Breysse (CITA) about testing cross-correlations and self-calibration for line-intensity mapping experiments with toy models. He has some nice ideas there. I chatted with visitor Deyan Mihaylov (Cambridge) about the possibility that Gaia might detect gravitational radiation! Bart Netterfield (Toronto) talked about very precisely pointed balloon-borne optical telescope experiments. And Chris Thompson (CITA) had all sorts of crazy ideas about what might cause the fast radio bursts. His principal ideas involve cosmic strings and black holes!

I gave the UofT Physics Colloquium. I spoke about how Gaia and other kinematic surveys can measure the dark matter. I talked about the results that Ana Bonaca (Harvard) and Adrian Price-Whelan (Princeton) and Charlie Conroy (Harvard) and I will have on the arXiv on Monday!



I spent the day today at University of Waterloo, where I gave the Astro seminar. It was a great day! I prepared my talk on the bus from Toronto, which wasn't good from a nausea perspective! But I really find I give a better talk if I remake it from scratch before I give it. That is, old talks get stale, at least for me. So I have a brand-new talk about machine learning and data-driven models and criticisms thereof.

Before my talk, in the astro-ph discussion, and after my talk, with James Taylor (Waterloo) and with Mike Hudson (Waterloo), there were good ideas flowing about how to use galaxy morphologies and in particular galaxy granularity to determine galaxy distances and maybe also gravitational-lensing shear. This relates to photometric redshifts and also my ideas about making adversarial galaxies that don't reveal their shear via their ellipticities (or not strongly). Many other great conversations; too many to mention!

My visit ended with quality time with Dustin Lang (Perimeter), who always makes my day.


finding moons indirectly

The only research I personally did today was stressing out about the talks at Waterloo and Toronto that I haven't even started to prepare! And that isn't research either. However, Apurva Oza (Bern) gave a nice talk about sodium and potassium in the Solar System and in extra-solar systems. He pointed out that the outgassing / volcanism of Io means that there is a gas ring around Jupiter that might be visible in transit spectroscopy, and might permit the detection of moons even when there aren't visible moon transits. Or might confuse transit spectroscopy. In some cases the ring is partial and follows the moon, so it would lead to a predictable time-domain spectroscopic signal, in principle. Worth a search!


Finding planets near resonances

At breakfast, I had a long discussion with Megan Bedell (Flatiron) about what things should go into the discussion part of our wobble paper, in terms of the limitations and extensions of the model. We came up with quite a list! But I love any project that opens new paths.

I also had a long discussion with Rodrigo Luger (Flatiron) about searching for planets in Kepler data that are in 1:1 resonances. He is focused on the point that they will (in general) have large transit-timing variations. I would call these librations around their exact resonances. If we model these librations as approximately sinusoidal, the search space is tractable: A fixed period plus a TTV with some amplitude and period. That's a good idea! And Luger points out that there are strong priors on the amplitudes and periods of the librations. Of course there will be systems that even this setup will miss; there was a dispute between us and Foreman-Mackey (Flatiron) about what fraction. He argued for using a completely stochastic model for the librations. He might be right; but baby steps!
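
A minimal sketch of the search model described above: a fixed period plus a sinusoidal timing variation. The parameter names and the numbers are hypothetical, and this is only the timing model, not a search.

```python
import numpy as np

def transit_times(n, t0, period, amp, lib_period, phase):
    """Mid-transit times: linear ephemeris plus a sinusoidal libration term.

    n          : transit epoch numbers (integers)
    t0, period : reference time and mean orbital period
    amp        : amplitude of the timing variation (same time units)
    lib_period : period of the libration (same time units)
    phase      : phase of the libration at epoch zero (radians)
    """
    n = np.asarray(n, dtype=float)
    linear = t0 + period * n
    return linear + amp * np.sin(2.0 * np.pi * period * n / lib_period + phase)

# Hypothetical system: 10-day period, 0.2-day TTV amplitude, 300-day libration
epochs = np.arange(100)
times = transit_times(epochs, t0=0.0, period=10.0, amp=0.2, lib_period=300.0, phase=0.3)
o_minus_c = times - (0.0 + 10.0 * epochs)  # observed minus linear ephemeris
print(o_minus_c[:5])
```

The search space is then the usual period and reference time plus three extra parameters, which is what makes it tractable. The priors Luger mentions would go on `amp` and `lib_period`.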

All this motivated by the possible discovery of a 1:1 by Mitchell Karmen (NYU). Of course the actual system he found almost certainly isn't a 1:1, we now think: It has many signs of “just” being an incredibly eccentric eclipsing binary with dilution from a third star.



I spent a good part of the day working through Fourier transform issues with Kate Storey-Fisher (NYU). We started out confused both about what the transform should be giving us and how to run the code correctly. So we switched to Gaussian functions for which we know the correct answer and at least understood the interface. Now to understand the correlation function!
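
A Gaussian check of this kind might look like the following sketch. It assumes the convention G(k) = ∫ g(x) exp(-i k x) dx, for which a Gaussian of width sigma has the known transform sigma sqrt(2 pi) exp(-sigma² k² / 2); the grid size, spacing, and variable names are arbitrary choices, not the ones we used.

```python
import numpy as np

sigma = 1.5
n, dx = 1024, 0.05
x0 = -0.5 * n * dx                    # left edge of the grid
x = x0 + dx * np.arange(n)
g = np.exp(-0.5 * (x / sigma) ** 2)   # Gaussian sampled on the grid

# Angular wavenumbers in the order that np.fft.fft returns them.
k = 2.0 * np.pi * np.fft.fftfreq(n, d=dx)

# Approximate the continuous transform with the DFT: multiply by dx
# (the Riemann-sum measure) and by a phase that accounts for the grid
# starting at x0 rather than at zero.
G_fft = dx * np.exp(-1j * k * x0) * np.fft.fft(g)

# Analytic answer under the convention stated above.
G_true = sigma * np.sqrt(2.0 * np.pi) * np.exp(-0.5 * (sigma * k) ** 2)

max_err = np.max(np.abs(G_fft - G_true))
```

The two factors in front of `np.fft.fft` (the spacing `dx` and the phase for the grid offset) are exactly the kind of interface detail that is easy to get wrong.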

In group meeting, two threads came together today. Bedell (Flatiron) asked for feedback about how to present wobble results for maximum impact. And Luger (Flatiron) took some of those wobble radial velocities and fit them with a model for the Rossiter-McLaughlin effect made by punking his own STARRY code for modeling photometric transits.


not much!

In a day obliterated by letters of recommendation, Humzah Kiani and I discussed extinctions in the Gaia footprint, and Kate Storey-Fisher and I discussed the Gaussian random fields we have been trying to simulate.


Trojans, Oort Cloud, greedy algorithm paradox

Early in the day, undergraduate Mitchell Karmen (NYU) blew me away by showing a possible Trojan satellite hiding in the Kepler false-positive bin. It probably has some other explanation, but damn it's exciting! I discussed this with Rodrigo Luger (Flatiron) who dampened my excitement (for good reasons).

At stars meeting, Michele Bannister (Belfast) spoke about ways in which we might use the properties of the outer Solar System (and especially the things past the Kuiper Belt and including the Oort Cloud) to constrain the birth environment and subsequent dynamical environment of the Sun at formation. It appears that these structures could be created early and are strongly modified by nearby stars and close passages. One implication is that different stars should have very different Oort Clouds. That's a great prediction; now how to test it?

Mike Blanton (NYU) showed some very cool results from the work being done on SDSS-V robot fiber positioners. As you might guess, the positioning of fibers on a focal plane by robot arms that can collide is an intractable problem in general; it's like traveling salesman. But you might also know that most NP-hard problems are pretty well-served by sensible greedy algorithms. That is, you can usually do something akin to the simplest thing and still succeed most of the time.

Blanton showed the interesting thing (worked out by Conor Sayres at UW) that if they do a greedy algorithm to take the robot arms from the "home" state to the configuration they want, it is very slow and hard, and it still fails in many cases. But if they do the exact same greedy algorithm the other way—that is, to take the arms from the configuration they want back to the home state—it works fine! So they do that and then run the result backwards!

Crazy talk. And cool. And worthy of a lot more thought. And something about entropy? After all, the home state is like a crystal.



Today Michele Bannister (Belfast) gave a great talk about the outer Solar System. She was very clear that her observations do not rule out in any way the existence of Planet 9. But they do discredit every single shred of evidence in its favor! And she gave many other mechanisms that could explain the same data. That is, there really doesn't seem to be any reason to believe that there is an unknown planet hanging out in the outer Solar System. Lots of what she said relies on the following theoretical observation: When a planetesimal is perturbed by a massive body on an orbit interior to its perihelion, it tends to preserve its perihelion but change its semi-major axis. And the same but opposite when the massive body is outside its aphelion. All planetesimal migration scenarios must respect these constraints.

Before that, Kate Storey-Fisher (NYU) and I had a long conversation in which we re-discovered our confusions about the differences between the continuous Fourier transform (which never exists in any real-data context) and the discrete Fourier transform (which is what's appropriate when the data are treated as a patch of a periodic function). We got confused and then un-confused, but I am still somewhat confused!


asteroids and dark-matter halos

Today Michele Bannister (Belfast) showed up. We spent time talking about how asteroids are characterized in time-domain imaging surveys. The idea is to make a fictitious absolute magnitude, which is what the asteroid would look like if it were simultaneously 1 AU from the Sun and 1 AU from the Earth, and observed with the Sun and Earth both getting it from the same angle. That's not real! We discussed how we might improve that situation.
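
For concreteness, here is a minimal sketch of that fictitious quantity, assuming the standard relation m = H + 5 log10(r Δ) - 2.5 log10 Φ, where r and Δ are the distances to the Sun and the Earth in AU and Φ is a phase function equal to 1 at zero phase angle. The function and argument names are made up for illustration, and no particular phase-function model is assumed.

```python
import math

def absolute_magnitude(m_app, r_au, delta_au, phase_factor=1.0):
    """Absolute magnitude H from an apparent magnitude.

    r_au: asteroid-Sun distance in AU.
    delta_au: asteroid-Earth distance in AU.
    phase_factor: value of the phase function (1.0 at zero phase angle);
    which phase-function model to use is left to the caller.
    """
    distance_term = 5.0 * math.log10(r_au * delta_au)
    phase_term = 2.5 * math.log10(phase_factor)
    return m_app - distance_term + phase_term

# Made-up example: an asteroid seen at m = 20, at 2 AU from the Sun
# and 1 AU from the Earth, at zero phase angle.
H = absolute_magnitude(20.0, r_au=2.0, delta_au=1.0)
```

The geometry that defines H (1 AU from both bodies at zero phase angle) is not physically realizable, which is the point of the complaint above.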

I also spoke with Lauren Anderson (Flatiron) about how we might reduce the dimensionality of cosmological simulations of galaxies to a small parameterization of what's possible. The idea is to get a not-too-complex parameterization of the triaxiality of galaxy dark-matter halos and their dependences on time. I have a vision here, but it isn't clear it is possible to execute. We discussed the issues of using existing simulations or running our own.


simplest possible model

Today was a fun day at Flatiron. I saw Didier Queloz (Cambridge) and Karin Öberg (Harvard), who are in town for a Simons program on Origins of Life. With Queloz we discussed target selection for the Terra Hunting Experiment and with Öberg we discussed data-driven methods for finding planets embedded in proto-planetary disks observed by ALMA.

Today was the last day of the visit by Heather Knutson (Caltech). We decided to implement the simplest possible version of the data-driven models for planet and brown-dwarf spectroscopy that we have been talking about all week. This would mean one spectral template per object, and one telluric template per night. This might not be good enough, but it is worth a shot, and might teach us a lot. The idea is to structure the model very much like Bedell's wobble model.


target selection for THE@INT

This morning, Megan Bedell (Flatiron) and I joined a telecon relating to target selection for the new Terra Hunting Experiment with HARPS3 on the Isaac Newton Telescope. It was a great conversation; we are new to this project; we learned that the project has some good and sensible ideas:

One is that the target list must be at least twice as large as what they can handle, so that changes can be made on the fly during the project's 10-year baseline. Another is that target changes must be made algorithmically, to preserve the statistical value of the sample. Another is that the strategy cannot be as dumb as we might like because discovery rate is a driver of policy. And another is that the observing decisions will be made just-in-time, on the fly, at the telescope. Again, algorithmically. My loyal reader knows that I Love These Rules. Now to play!


Patel, Knutson, Rey

Ekta Patel (Arizona), former NYU undergraduate researcher extraordinaire, showed up at Flatiron today. We spoke about all the new ideas around making inferences about the Milky Way and its formation and dynamics, given that we can't treat the Galaxy as a time-independent, symmetric, steady-state object (and we really can't, especially in the halo). Right now all the methods are either based on very questionable assumptions (like when can a time-dependent system be treated as a small perturbation away from a time-independent system, etc) or on super-brute-force methods (like find, among billions of simulated galaxies, a few that look like what we see!). Patel has been a pioneer in the latter, but there is lots more to do.

At Stars Meeting, Patel told us about possible strong selection effects in the MW-satellite game, which might mean that we are missing many! Missing satellite non-problem? Martin Rey (UCL) told us about how you might answer semantically causal questions about galaxy evolution with quantifiable and sensible adjustments to initial conditions in simulations. That got me all philosophical about causality in a unitary universe! And Heather Knutson (Caltech) told us about metallicity effects in the spectroscopy of directly detected exoplanets; it turns out her study is limited by the quality of the stellar metallicities. Maybe Birky (UCSD) and I could help with that?

All this after an early-morning discussion with Knutson about building a data-driven model with good causal structure to explain her exoplanet spectra. I argued that once you have the causal structure in place, good inferences become optimization (or sampling) problems. I hope this is true!


reducing and assembling spectroscopic data

As per usual, Tuesdays are low-research days! But I did get in some time with Heather Knutson (Caltech) on her spectra of directly imaged exoplanets and brown dwarfs. We had a call with her team in California to discuss what's involved in getting together all the data we'd ideally like to have assembled. Of course it takes time: Spectroscopists expect that a night of good data might take many days to reduce and get into useable form. We discussed a bit how we might make all that more efficient. But making that efficient is not our priority, at least not right now. Eventually!


Knutson, calibration, hot stars

Heather Knutson (Caltech) arrived for a week of hacking on exoplanet and brown-dwarf spectroscopy. She has a number of things she has brought for our consideration. But the one that seems to be sticking is the inadequacy of her theory-driven or physical tellurics model. It has systematic residuals. We are going to explore options for tweaking the model using a data-driven fit to the residuals. This is a structure that I would like to try also for The Cannon: Instead of making a data-driven model for the stellar spectra, we could make a data-driven model for the residuals of the spectra away from best-fit models. And the parameters for the physics-driven model and the data-driven model could be tied together (or not) in various clever ways. So much idea.

At lunch, Anthony Pullen (NYU) gave a great talk about foreground mitigation in line-intensity mapping experiments. He went through all the kinds of auto-correlations, cross-correlations, and de-correlations that can be done to remove or mitigate foregrounds. The talk reminded me of many conversations I have had over my life about self-calibration, which led me to think about whether we could replace the cross-correlation parts of his model with a kind of self-calibration. Worth thinking about!

Late in the day, Benjamin Pope (NYU) and I came up with a good plan for looking at hot stars in Kepler. We could look at modeling them as a mixture of asteroseismic modes, spacecraft systematics, and planets. And then probably find nothing! But find nothing better than it has been found before. I like that kind of project.


ready to submit!

I worked on the weekend to finish my paper with Eilers (MPIA) and Rix (MPIA). It is ready to submit! And yet I can't push my changes properly to GitHub because they are (in a very rare moment) down! I made some compromises in finishing up this paper; I can only justify them by promising myself I will address the final issues while the referee considers the manuscript.


target selection; rock and metal

At Flatiron we have purchased a share in the Terra Hunting Experiment, which will be a big, long-term radial-velocity monitoring program with HARPS3. Today Megan Bedell (Flatiron) and I had a conversation about target selection for that survey. There are many choices that could be made in target selection that could make populations or astrophysics inferences very difficult or even impossible later. These conversations remind me of the great and hard work that went into target selection in the SDSS family of surveys.

The day ended with a great talk by Leslie Rogers (Chicago) about the things that set planet sizes (as a function of mass). She always phrases her results in terms of what isn't rocky, because of the one-sided-ness of some or most of the composition-related observational uncertainties, but it sure looks to my eyes like the smallest planets are rock and metal, like the Earth. She has one extremely good case, which is orbiting so close to its host star that tidal-disruption arguments come into play! She also was optimistic that transit-timing information might be informative in the near future. There were jokes about water planets and soda-water planets, because many planets that are rich in water are also expected to be very rich in CO2.


convexity in machine learning

Thursdays are low-research! But there was a great NYU Physics Colloquium at the end of the day by Eric Vanden-Eijnden (NYU) about the mathematical properties of neural networks. I would say “deep learning” but in fact the networks that are most amenable to mathematical analysis are actually shallow and wide.

I am not sure I fully understood EVE's talk, but if I did, he can show the following: Although the optimization of the network (which is a shallow but wide fully connected logistic network, maybe) is not in any sense convex, and although the model is non-identifiable, with certain (or any?) convex loss function, and with enough data (maybe), the optimum of the loss is convex in the approximation of the model to the function it is trying to emulate.

If anything even close to this is true it is extremely important: Can an optimization be non-convex in the parameter space of a function but convex in the function space? I am sure there are trivial examples, but non-trivially? This might relate to things I have wondered about bi-linear models and related, previously.


bar, spiral structure, and interactions

Stars meeting at Flatiron was absolutely great today. Discussions by Cunningham (UCSC) who has done HST astrometry, Keck spectroscopy, and kinematic analyses of large samples of Milky Way halo stars. She is full stack! And by Brendan Brewer (Auckland) who is working on information theory (in a Bayesian context) to think about experimental design in realistic contexts. My loyal reader knows how close to my heart that is.

Also at Stars meeting, Pearson (Flatiron) and Laporte (UVic) showed models of the effects of the Sagittarius merger on the Milky Way disk. Because the disk is such a sensitive dynamical “antenna”, it should show evidence of this encounter. In the simulations, it appears that the encounter is capable of raising the bar and spiral structure that is very similar to what is observed. Like very similar. This is incredibly exciting: If this pans out, it opens up use of bars and spirals to find or time or weigh galaxy encounters and interactions. Maybe even with dark-matter substructures! Super exciting.

Before all that, Sinan Deger showed me nice results on galaxy morphologies as a function of environment and location around clusters, and Ari Pakman (Columbia) gave a beautiful math-filled talk about Hamiltonian Monte Carlo. He had a very nice, extremely simple proof and picture for why HMC works.


Is the Milky Way halo really a thing?

A very low-research day was saved by Suroor Gandhi (NYU) who showed me work she is doing with Melissa Ness (Columbia) on stellar chemistry and kinematics. We discussed the question of whether the Milky Way stellar halo really looks like a distinct kinematic and chemical component (as it should!) or whether it just looks like some kind of continuous extension of the disk (which it should not, but does). Interesting, and how to dig deeper?


stacking residuals?

In an extremely rare event, I finished a paper! Well, a second draft anyway. The plan is to submit next week. This is my paper with Eilers (MPIA) and Rix (MPIA) on spectrophotometric distances.

Other research today included a conversation with Bedell (Flatiron) about how to look at telluric variability in the wobble residuals. In general the residuals are informative! More thoughts about that happened late in the day with Ben Pope (NYU) who had ideas about stacking the wobble residuals in the planet or companion rest frame to find interesting things for different kinds of companions.

And I had a long conversation with Anderson (Flatiron) about applying variational inference to dust or extinction estimates in the Milky Way. We are making a proposal to David Blei (Columbia) and his group to start a collaboration along these lines.


machine learning; finishing a paper

I worked a bit of the weekend. I had a great conversation with Francois Lanusse (Berkeley) about the uses and abuses of machine learning in astrophysics. We agreed on most things. He sang the praises of some of the newly available cloud services that do machine learning for you. We discussed some pie-in-sky projects.

Months ago, I promised Christina Eilers (MPIA) that when she finished her paper on her Jeans model of the Milky Way disk, I would finish my paper on spectrophotometric parallax (or distance) estimates. Well, today she finished her paper! So I went into panic mode and by the end of the day I was nearly finished. Nearly. I must get up early and finish tomorrow. If I really do finish it, it will be a rare and special thing: A first-author paper! I only write one of those every two or three years.


#DSESummit2018, day 3

In one of today's lightning talks, Chris Holdgraf (Berkeley) showed us JupyterHub and related projects, which are methods for distributing data, software, and compute to students (or members of a group) so they can transparently use a non-trivial data-science environment, through any kind of client. It is beautiful stuff, but also very interesting in its origins: It grows out of the undergraduate class Data 8 at Berkeley, which is an innovative project to teach the fundamentals of data science to all Berkeley undergrads, independent of their backgrounds. And much later, over drinks, Holdgraf explained to me lots of chaos-monkey-ish and sensible things they do at Berkeley to make sure that their code is truly and absolutely platform-neutral and vendor-independent. The intellectual content of these projects is truly impressive.

In the afternoon, I got some quality time in with Sarah Stone (UW) on our commitments to produce final products for this project around spaces. We discussed the role of ethnography, architects, and data scientists in figuring out what is and isn't working in our spaces. We also discussed what kinds of products we want to produce.

The last event of the day included a great plenary by Huppenkothen (UW) about the AstroHackWeek and related projects. She emphasized its interdisciplinarity, its values of experimentation, and above all, its commitment to broadening the fields of study and being welcoming to all. It was inspiring and enjoyable. I am extremely proud to have been a part of these projects.


#DSESummit2018, day 2

It is such a great meeting, this meeting. And I think it is because we spent a lot of time early on in this project in building community. That is, we made sure we feel like we are part of a greater whole. Learning from this, I would love to try to bring this community-first thinking to all the things I do. It requires attention!

In the middle of the day, the core team on the project met with the funding officers and we discussed the ramp-down and close-out of the grant. This has two important and very difficult aspects. The first is that we need to finish what we started: The project is to learn about how to do interdisciplinary things in the university, and to communicate successes and failures to other universities and the larger world. I have a role in that and I agreed to take on some of this final communication. The second is to take the best things we are doing in this funded project and figure out how to continue them after the funding is no longer flowing from these granting agencies. That's critical to our success at the NYU CDS. I left the meeting energized, but a bit concerned about what I need to do in the next year or so!

After tremendously interesting discussions and talks, the day ended with a brainstorming session with Richard Galvez (NYU) about possible projects that bring machine learning to the Gaia data. We worked through some simple ideas that I have been thinking about. I like the idea of modeling the Gaia data with deep learning, because even a deep network acting on such small (per-star) data will be tractable, and maybe even interpretable! We ended on optimism, but not with a final decision about what we are going to do.


#DSESummit2018, day 1

Today was the start of the annual Moore-Sloan Data Science Environments summit. I led an ice-breaker in which we split into small groups and discussed figures and data visualizations. It's a great community, so it was fun to get started. But as for research: I read and commented on text for Bedell (Flatiron) on the plane, and I worked with Richard Galvez (NYU) on designing a small project that brings machine learning to the Gaia data.


finishing papers; galaxy morphology regressions

The morning started with a conversation between Eilers (MPIA) and me in which we decided that we will finish our connected papers (first draft anyway) by Friday. I think she will make it! But will I make it? I am going to be strong. We also went through some ideas about testing the assumptions that underlie our Jeans model for the Milky Way disk, and what to write about the outcomes of those tests.

Mid-day I had good conversations with Storey-Fisher (NYU) about building pseudo-simulations that make point sets with low-amplitude non-trivial power spectra. We spent an unfortunate amount of time figuring out how the numpy fft module organizes and stores Fourier transform data. It isn't trivial!
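
A tiny example shows the storage order. It uses only `np.fft.fft`, `np.fft.fftfreq`, `np.fft.fftshift`, and `np.fft.rfftfreq`; the sample count, spacing, and test signal are arbitrary choices for illustration.

```python
import numpy as np

n, d = 8, 0.5   # number of samples and sample spacing

# Frequencies in the order np.fft.fft stores them: zero first, then the
# positive frequencies, then the negative frequencies (most negative first).
freqs = np.fft.fftfreq(n, d=d)
# freqs -> [0., 0.25, 0.5, 0.75, -1., -0.75, -0.5, -0.25]

# fftshift reorders so the frequencies run monotonically, zero in the middle.
freqs_sorted = np.fft.fftshift(freqs)
# freqs_sorted -> [-1., -0.75, -0.5, -0.25, 0., 0.25, 0.5, 0.75]

# For real input, rfft keeps only the non-negative half.
rfreqs = np.fft.rfftfreq(n, d=d)
# rfreqs -> [0., 0.25, 0.5, 0.75, 1.]

# A cosine at frequency 0.5 shows up at index 2 (+0.5) and index 6 (-0.5),
# each with amplitude n / 2 in numpy's unnormalized forward transform.
t = d * np.arange(n)
X = np.fft.fft(np.cos(2.0 * np.pi * 0.5 * t))
```

Note that for even `n` the Nyquist frequency is stored once, with a negative sign in `fftfreq` and a positive sign in `rfftfreq`.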

In the afternoon, Elisa Chisari (Oxford) gave a nice (and pleasantly technical) talk about weak lensing, which evolved into a longer discussion about how we might get more information out of galaxy imaging surveys. I pitched my ideas of thinking about how we might train regression models that can predict dark-matter structure from galaxy morphologies or even better large-scale-structure morphologies. And Chisari has (indirect) evidence that such approaches might be very powerful, because (with simulations) she showed (in the context of intrinsic-alignment contamination of weak-lensing data) that even simple measures of galaxy morphology are expected to be very sensitive to the local gravitational tidal field.

One thing that came up in this discussion is my suspicion that ellipticity is a very blunt tool. I have counter-examples that show that ellipticity is not necessarily the galaxy property most sensitive to the weak-lensing field (in an information-theoretic sense). But we formulated a challenge: Make an adversarial morphology distribution for galaxies such that none of the weak-lensing information in the data is in the galaxy ellipticities. That would be hilarious (or instructive, or both).


so many things!

Ahhh research. After a rocky morning, it was a great research day. Bedell (Flatiron) may have fully debugged all the bugs we introduced earlier this week when we audited and changed the handling of bad and low signal-to-noise data in the HARPS spectra. Price-Whelan (Princeton), Bedell, and I tentatively planned to run The Joker on all of the public exoplanet-relevant extreme-precision radial-velocity data there is. At a meeting, Tomer Yavetz (Columbia) showed the parts of phase space that are at the boundaries between resonant and regular orbits, and he finds that these regions (if there are disrupting objects on these orbits) produce stellar streams that are not thin but fan out chaotically. That delivers some more detailed theoretical understanding of results that Sarah Pearson (Flatiron) obtained and understood a few years ago. Pearson herself is looking at the orbits of the red-giant stars from Eilers (MPIA) and me to see if she can just see the bar, kinematically. Birky (UCSD) and I discussed validation of her results with The Cannon on M-dwarf spectra in APOGEE. She finds that some isochrone models are very consistent with our results, and that we can also estimate stellar radii (which is super-relevant for TESS). Kate Storey-Fisher (NYU) and I broke down what we need to do for our correlation-function estimator to a small set of well-defined sub-projects. Next up: Cheaply simulating weak, Gaussian clustering.


gravitational wave inferences

Thursdays are low-research days! But I did have a great conversation with Bonaca (Harvard) about the paper we are writing on the GD-1 stellar stream. We talked about the discussion section: What can we say about black-hole models for the gravitational perturbation we observe? What can we say about the population of perturbers from this one perturbing event?

At the end of the day, Will Farr (Flatiron) gave the Departmental Colloquium about gravitational-wave events, with a focus on statistical inference issues. He made some nice points, including that if Advanced LIGO works according to plans, it will generate enough black-hole and neutron-star inspiral events to solve a bunch of cosmological questions, like the Hubble Constant, whether there are pair-instability supernovae and at what masses, and how black-hole binaries form. That is, it will be routine, high-throughput astronomy! Farr is one of the people responsible for the excellent statistical inference underlying the LIGO results.


more pair-coding; dotastronomy

I got another good pair-coding session in today with Bedell (Flatiron). We had resolved to work on continuum normalization of the HARPS spectra, but instead we ended up working on how to zero-out or delete or censor bad orders and bad epochs of the multi-epoch, multi-order spectra. We came up with methods that are hacky but simple and sensible. The whole code seems to be working!
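
One simple way to censor data of this kind is to set inverse variances to zero, so that the censored pixels drop out of any chi-squared objective. The sketch below illustrates that idea only; it is not the wobble code, and the array layout (epochs, orders, pixels), the function name, and the signal-to-noise threshold are all assumptions.

```python
import numpy as np

def censor_spectra(flux, ivar, min_snr=5.0):
    """Zero the inverse variance of bad pixels and of low-SNR order-epochs.

    flux, ivar: arrays of shape (n_epochs, n_orders, n_pixels).
    Returns cleaned copies plus a boolean (n_epochs, n_orders) mask of
    the order-epochs that were censored entirely.
    """
    flux = np.array(flux, dtype=float)   # np.array copies by default
    ivar = np.array(ivar, dtype=float)
    bad_pix = ~np.isfinite(flux) | ~np.isfinite(ivar) | (ivar <= 0.0)
    flux[bad_pix] = 0.0                  # any finite value; the weight is zero
    ivar[bad_pix] = 0.0
    # Median per-pixel signal-to-noise for each (epoch, order) pair.
    snr = np.median(np.abs(flux) * np.sqrt(ivar), axis=-1)
    censored = snr < min_snr
    ivar[censored] = 0.0
    return flux, ivar, censored

# Fake data: 5 epochs, 4 orders, 100 pixels, continuum near 1.
rng = np.random.default_rng(42)
flux = 1.0 + 0.01 * rng.standard_normal((5, 4, 100))
ivar = np.full(flux.shape, 1.0e4)
ivar[2] = 1.0             # epoch 2 is very noisy
flux[:, 3, :] = np.nan    # order 3 is unusable at every epoch
clean_flux, clean_ivar, censored = censor_spectra(flux, ivar)
```

Zeroing weights rather than deleting rows keeps the arrays rectangular, which is hacky but keeps the rest of a pipeline unchanged.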

At Stars Meeting, Rocio Kiman (CUNY) told us about her experiences at dotastronomy X, the tenth incarnation of the influential meeting that is the probable origin of hack days, hack weeks, and unconferencing in astrophysics. The short summary is that she loved the meeting and its culture. Congratulations to the dotastronomy crew, who have changed the world, and Rob Simpson, who started it lo so many years ago.


power-spectrum estimators

Tuesdays are low-research days, but Kate Storey-Fisher (NYU) and I got to reading the classic FKP paper about how to estimate a power spectrum in a galaxy survey. We think we can do better; maybe much better! But we don't yet understand. Late in the day I mentioned all this to Roman Scoccimarro (NYU) and he gave me some better methods than FKP. I am still optimistic that we have something very very new to say!


extreme precision radial-velocity; GD-1; TESS

The highlight of my day was a pair-coding session with Bedell (Flatiron) in which we worked through issues with our code wobble that measures radial velocities in extremely high-resolution multi-epoch spectroscopy. The model includes star and telluric models, and regularizations that constrain unconstrained freedoms. The issues are all related to these regularizations: How to set their values, and why various optimization strategies aren't working. We found a few bugs, made a lot of plots, and experimented. In the end: It looks like it is all working! I am so stoked. This could end up being the key project of the Astronomical Data Group at Flatiron. This working session also strongly endorsed (for me, once again) the value of pair coding.

At lunch time I gave the CCPP Brown-Bag talk about the GD-1 projects I am doing with Bonaca (Harvard) and others. It was fun. Several questions from the audience were about what we can understand about the population of perturbers, from this one perturber. That's a good question, to which I have no (current) answer.

Late in the day, I talked to Ben Pope (NYU) about projects in astronomical time-series imaging. He has nice results that show that independent components analysis might be very valuable; this is something that my former student Dun Wang was interested in. And we also discussed things that relate to speckle imaging, lucky imaging, and interferometry. Can we reconstruct good images from many bad ones? And should we? We resolved to do some experiments with the simulated TESS data.


AstroFest, day 3

Today was the third and final Friday of the Gotham AstroFest series, in which we have a very large fraction of the entire astrophysics community in New York City give short talks. This was at NYU, and had contributions from NYU, AMNH, and CUNY scientists. There were a huge number of interesting results in the day. One of the most remarkable things about the day is that fully one quarter of the talks were about black holes. Between NYU and CUNY, there is a lot of research going on related to black holes: Their formation, primordial black holes, their binary dynamics, gravitational-wave signatures, and so on. That's excellent.

A few random highlights for me included: Evidence for weather on brown dwarfs as a function of temperature and gravity by Vos (AMNH), and (relatedly) comparisons between planet and brown-dwarf spectra by Popinchalk (CUNY). It really does appear that there are no strong differences between brown dwarfs and planets (something I discussed with Oppenheimer, AMNH, at lunch). Gandhi (NYU) showed some chemistry and orbits work she has done with Ness (Flatiron) before coming to NYU; that's very related to my interests! Williamson (NYU) visualized a linear SVM, which is beautiful (and old-school). MacFadyen (NYU) convinced us beautifully that his models of the NS-NS merger are really the best!

There was lots on dark-matter detection and dark-matter candidates, including even baryonic and black-hole types. And Tinker (NYU) showed beautiful satellite-galaxy statistics that he got by stacking and background-subtracting galaxy counts in the Legacy Survey imaging for DESI.

If you want to see the full slide deck for the event, it is here.


how to write a discussion section

In a low-research day, a highlight was a long conversation with Bonaca (Harvard) about the writing of her paper on the GD-1 stream interaction. We discussed structure, and especially the discussion. In a discussion, I like a humble sandwich on proud bread: Start by saying what's most impressive about what we've done, then go into caveats, limitations, approximation wrongness, and the consequences of all that. And then end on a positive note about what kinds of great new things this work will enable going forward.

Late in the day, Alex Kusenko (UCLA, IPMU) spoke about a very wide range of subjects. He claims to have a full explanation for why we don't see the cutoff in the gamma-ray occurrence rate required by photon–photon interactions with the infrared background. He claims that the gamma rays we see from blazars are really reprocessed from cosmic rays. Plausible! But I would need to know a lot more. He also claims to have a way to naturally make primordial black holes in the end stages of inflation, and make all of the dark matter that way. That's interesting. Unfortunately it was such a long and tiring day I couldn't get it together to really check either of these ideas carefully.


data science for stars; phase space

Our weekly Stars meeting at Flatiron was a pleasure today, as it usually is. Angus (Columbia) and Contardo (Flatiron) are looking at the possibility that we might be able to deblend binary and overlapping stars in the TESS data by their light curves alone. That's crazy, but just crazy enough that I love it! We discussed different ways they might get a training set for this. Luger (Flatiron) asked whether it might be possible to figure out the ell and em (spherical-harmonic order) of the asteroseismic modes by using projections onto transits. That also led to some good discussions about possible methods; many of the crowd liked the ideas that look like lock-in amplification. Marchetti (Leiden) gave us a nice discussion of the high-velocity star results from Gaia DR2. It's too early: The really exciting results will come in data releases 3 and 4 when the magnitude limit for the RVS data gets fainter.

Matt Buckley (Rutgers) showed Adrian Price-Whelan (Princeton) and me his results on measuring phase-space volumes of bound and disrupted objects. The idea is that you might be able to reconstruct the mass of a disrupted object, and say whether it was dark-matter dominated. And get all the attendant dark-matter-theory consequences of that. He showed (unsurprisingly) that observational noise increases the phase-space volume that you naively measure. So we discussed how to approach this. If we are frequentists, maybe we can just “greedily” correct the measurements in the direction that lowers the phase-space volume? If we are Bayesians, we have to make more assumptions, I think!
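To make the noise point concrete, here is a minimal sketch. It is not Buckley's method: it assumes a toy two-dimensional Gaussian population, uses sqrt(det(covariance)) as a stand-in for phase-space volume, and assumes the noise covariance is known exactly. The correction shown (subtracting the known noise covariance) is just the simplest moment-based option, not the greedy scheme mentioned above. All names and numbers are made up.

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy "phase space": a 2-d Gaussian population (one position, one velocity).
true_cov = np.array([[1.0, 0.3], [0.3, 0.5]])
noise_cov = np.diag([0.4, 0.2])  # assumed known, homogeneous measurement noise

n_stars = 20000
true_points = rng.multivariate_normal(np.zeros(2), true_cov, size=n_stars)
noise = rng.multivariate_normal(np.zeros(2), noise_cov, size=n_stars)
observed = true_points + noise


def volume_proxy(cov):
    """sqrt(det(cov)), proportional to the area of the 1-sigma ellipse."""
    return np.sqrt(np.linalg.det(cov))


vol_true = volume_proxy(np.cov(true_points, rowvar=False))
vol_naive = volume_proxy(np.cov(observed, rowvar=False))
# Moment-based correction: subtract the (assumed known) noise covariance.
vol_corrected = volume_proxy(np.cov(observed, rowvar=False) - noise_cov)

print(f"true {vol_true:.3f}  naive {vol_naive:.3f}  corrected {vol_corrected:.3f}")
```

The naive volume comes out larger than the true one, and the corrected one lands close to the truth, but only because the toy noise model is exactly right and isotropic across the sample. Real data will not be so kind.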


structure of all models, ever; correlation-function representation

Early in the day I had a long conversation with Leistedt (NYU) about the philosophy of our machine-learning projects. We refined further our view that the machine learning should be part of a larger causal structure that makes sense. My position is that you can think of most (hard) physics problems as having some kind of generalized graphical model with three high-level boxes. One is called “things I know well but don't care about”, which is things like noise model, instrument model, and calibration parameters. Another is called “things I don't know and don't care about” which is things like foregrounds, backgrounds, and other nuisances. And the last is called “things I don't know and deeply care about”. This last one is our rigid physics model. And the middle one is where the machine learning goes! If we could build models like this very generally, we would be infinitely powerful.

At mid-day, Storey-Fisher and I talked about all the things we could do if we had a correlation function that is not values-in-bins, but was a linear combination of functions. We could look for cosmological gradients. We could do clustering multipoles at small scales, we could estimate the correlation function and power spectrum simultaneously, we could extract Fisher-optimal summary statistics for cosmological parameter estimation. And all these things are possible with our new correlation-function estimator. Next step: Getting the code fast enough to do non-trivial tests.
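Here is a minimal sketch of what "a linear combination of functions" buys you at the pair-counting level. It is not our estimator code; it only shows that summing basis-function values over pairs reduces to ordinary binned pair counts when the basis is a set of top-hats, and that swapping in a smooth basis is a one-line change. The toy catalog, the Gaussian-bump basis, and all function names are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
points = rng.uniform(0.0, 1.0, size=(200, 3))  # toy catalog in a unit box

# All unique pair separations.
i, j = np.triu_indices(len(points), k=1)
seps = np.linalg.norm(points[i] - points[j], axis=1)

edges = np.linspace(0.0, 0.5, 6)  # five separation bins


def tophat_basis(r, edges):
    """Array of shape (n_pairs, n_bins): 1 where r falls in a bin, else 0."""
    r = np.asarray(r)[:, None]
    return ((r >= edges[:-1]) & (r < edges[1:])).astype(float)


def smooth_basis(r, centers, width):
    """Gaussian bumps: one possible smooth alternative to bins."""
    r = np.asarray(r)[:, None]
    return np.exp(-0.5 * ((r - centers) / width) ** 2)


# "Pair counts" generalize to sums of basis-function values over pairs.
dd_tophat = tophat_basis(seps, edges).sum(axis=0)
dd_hist, _ = np.histogram(seps, bins=edges)  # the standard binned count

centers = 0.5 * (edges[:-1] + edges[1:])
dd_smooth = smooth_basis(seps, centers, width=0.05).sum(axis=0)
```

With the top-hat basis the two counts agree, so the binned estimator is a special case. Everything interesting comes from choosing a different basis.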

In the astro seminar at NYU, Savvas Koushiappas (Brown) showed us weak but very interesting evidence that maybe there is a dark-matter annihilation signature in the NASA Fermi data on the Reticulum II dwarf galaxy. Obviously this is incredibly important if it holds up as more data and better calibrations come.


writing; not ready for TESS

I got some actual writing time in today! I worked on places in the Birky (UCSD) paper (on M-dwarf spectral models) where Birky had left me notes marked "HOGG". That's a great tool: She leaves "HOGG" notes; I search for them in my text editor, and I make the relevant changes or add the relevant text.

Late in the day I had a great conversation with Ben Pope (NYU) about things we can do right now or very soon with TESS artificial data or the first data release of full-frame images. We talked about dimensionality reduction methods, like the robust PCA methods from Candès and related methods that use convex optimization. We also talked about independent components analysis. In general, when the first data arrive, there will be lots of low-hanging fruit. We also discussed what could be done in advance, with the available artificial data.


finishing a paper; latents

I dusted off the draft of my paper with Eilers (MPIA) and Rix (MPIA) about spectrophotometric measurements of red-giant distances or parallaxes using Gaia, SDSS APOGEE, 2MASS, and WISE. It is nearly done! But we put it on ice while Eilers finished other things. I worked through more than half of the text, making notes on what small things remain to do.

The biggest to-do item? We have a linear model (for the log distance or log parallax or absolute magnitude). That's sweet, because it is simple, and it is interpretable, at least partially. Now we have to make that true by interpreting. Interpreting a linear model is harder than fitting a linear model!

I also had conversations with Storey-Fisher (NYU) about models for the correlation function and Price-Whelan (Princeton) about Milky Way non-equilibrium dynamical models. On the former, we discussed the difference between the correlation function and any particular estimate of the correlation function. It's a bit complex, because I'm not sure there is even agreement in the community about what would be considered the true latent correlation function in the low-ish redshift Universe.


stream-as-torus; TESS FFIs

I met up early with Price-Whelan (Princeton) to work on the chemical-tangents method papers. This work devolved into rearranging and organizing into categories the to-do list, using GitHub's project tools. That was useful! But it felt a bit like we didn't get anything done. I know that isn't true!

A bit later in the morning we called Jo Bovy (Toronto) to get some advice for Lauren Anderson (Flatiron) on fitting streams in the Milky Way halo. I had been summarizing one of Bovy's papers as saying that streams are close to orbits (that is, you can fit a stream as an orbit) but Bovy corrected us: His paper shows that streams are close to tori. That is, you can expect all the stars in the stream to have similar actions or invariants, but they will not line up as a line on the torus the same way that a single segment of a single orbit would. Duh! That makes good sense and suggests a beautifully simple method for modeling streams with tilted torus sections. I think I almost know how we might do that.

I also checked in with the group working on NASA TESS full-frame images (FFIs), led by Ben Montet (Chicago), who have been hacking at Flatiron all week. They intend to reformat the full-frame images into manageable (and more useful) data objects, extract aperture photometry flexibly, and perform best-in-class de-trending using other stars or other pixels, in the spirit of many things we have done over the years with Kepler data. They really look like a team that might take over the world! For context: The TESS Mission plans to release the raw FFIs with no proprietary period, and they plan to leave it to the community to build open-source (or not!) data-analysis tools around them. Go team!


GD-1 and chemical tangents

Tuesdays and Thursdays are lower on the research this semester! But I did get in two excellent discussions. The first was with Bonaca (Harvard) who has made an absolutely great visualization that compares the Gaia data on GD-1 and her model for GD-1. I think this figure might get featured in a lot of talks! We are still checking things, but it looks great. We discussed what would be the final scope of her paper, because (as with all projects) there is a huge possible scope but we need to finish a paper soon! I'm happy with the scope and it seems achievable and sensible. The big issue is that the constraints we have on the perturber that interacted with GD-1 come from a model that has toy aspects to it, while the full generative stream model is expensive enough that we don't want to go there for inference just yet. Soon, but not for this paper.

Over a late lunch I discussed many things with Price-Whelan (Princeton), both about GD-1 and about our chemical tangents project. On the latter, we discussed (for approximately the millionth time) how to describe the project most compactly. This project is strange enough for the typical astronomer, that we have to think carefully about how we present it. There are a lot of things that sound right but are wrong. And I am a huge believer in repeatedly re-describing projects. I think every time you go through it, you learn something new, and improve your presentation. This is a huge benefit of fully Open Science.

In that spirit: We are trying to find the surfaces in phase space along which the distribution of stars in abundance space is constant. Not the abundance is constant, but the distribution in abundances. Those surfaces contain the orbits! In some sense it comes down to the point that the joint distribution in actions and abundances is not separable, so the abundances can inform you about the actions! But that description is too terse. And Rix likes to say: Stars don't change their abundances as they orbit! So if you have drawn orbits through phase space that would require abundance changes, either your population isn't mixed or else you are wrong about those orbits.


dynamics and chemistry

Today Kathryn Johnston (Columbia) test-drove a group meeting at Flatiron on Dynamics, to which I was honored to be invited. We went around the table and described our current dynamics-related projects. After that, it was Stars Meeting, which was its usual hugeness. At the suggestion of its (rotating) organizers, we are experimenting with different ways of making sure many voices are all involved in the conversation. That's a hard problem!

At Stars Meeting many interesting things happened. A highlight for me was Adrian Price-Whelan (Princeton) describing work done at Aspen in the last few weeks on the Orphan stream. It looks for dynamical and chemical reasons like a disrupted dwarf galaxy, and it may fully wrap the Galaxy. Another highlight was a contribution from Victor Debattista (UCLAN) looking at chemical abundances in toy (that is, non-cosmological) simulations of star-forming disk galaxies. He has a new explanation for the bimodality between alpha-rich thick disk and alpha-poor thin disk, and his explanation is general, so it implies (as he explicitly predicts in his new paper) that the bimodality will be observed in all disk galaxies! That's exciting. Of course it is hard to observe.

In other news, Matthew Buckley (Rutgers) showed me really beautiful results, in which he can measure the mass of a globular cluster by using phase-space density or volume information, even in the presence of real data issues. The reason it is hard is that the data quality is extremely anisotropic in phase space. It looks extremely promising. I want to figure out how this relates to old-school methods, like virial methods and caustic methods.


large-scale structure

Tuesdays are low-research days! But I did have a good conversation with Storey-Fisher (NYU) about our correlation-function estimator, and how to precisely test it. It has so many applications! We also discussed how our three projects fit together: Correlation-function estimation, adversarial mock catalogs, and searching for anomalies in the large-scale structure. The middle project—adversarial mocks—is about making mocks that have systematics that would defeat current systematics correction, and also making methods that would defeat even those mocks.


there's no such thing as a Jeans model!

The Jeans Equations are remarkable: They relate moments and integrals of distribution functions to the underlying gravitational potential (or really force law), for phase-mixed populations. They are true for any distribution function! But they are equations, and they are not models. As my loyal reader knows, for me a model is a likelihood function!

When people do what is called Jeans modeling, they turn the equations into some procedure for estimating the gravitational potential (or force law or mass density). And although the Equations are independent of distribution function, the performance of this heuristic procedure—that goes from velocity moments to gravitational model parameters or densities—has statistical properties that do depend strongly on the distribution function. That is, you can't make a probabilistic statement (like a measurement and an uncertainty) of anything (like a density at the Milky Way disk midplane) without assuming things about the distribution function.

And because the Jeans Equations are independent of the distribution function, it is tempting to claim or believe that the results of the inference are also independent of the DF, which they aren't. There is no procedure you can write down that isn't. I spent time this weekend writing words about this, for reasons I can't currently understand.


Gotham AstroFest, day 2

As my loyal reader will recall, there are AstroFest events this September at Columbia (last week), Flatiron (today), and NYU (in two weeks). Today's meeting was long but excellent. I learned many things and was pleased to see all the new faces (so many new faces)! Here are a few personal highlights:

Shy Genel (Flatiron) showed that the details of star formation and feedback affecting a simulated galaxy disk or stars are very sensitive to the initial conditions or to perturbations to the conditions made mid-simulation. That caused me to wonder if it is going to be very hard to infer things about galaxies from their observed properties! But Foreman-Mackey (Flatiron) pointed out that the sensitivity might be high but also highly structured, so not necessarily a problem. Good point; but it might take a lot of simulations to find out! Whatever the case, this is an excellent line of research.

Francisco Villaescusa-Navarro (Flatiron) described a project to see if, in the non-linear regime of large-scale structure evolution, the one, two, and higher-point functions, all combined, contain as much information as the one- and two-point functions in the linear regime. That is: What is the information content in the observables? This is, in some sense, the key question of cosmology at the present day! And relates to things I have been thinking about (but doing nothing about) for years.

Suvodip Mukherjee (Flatiron) delivered a beautifully simple (and yet novel) idea: He is looking at all the cosmological observables with gravitational-wave sources that we have with galaxies and the CMB. That's clever! It includes the GW ISW effect, and GW lensing. He pointed out that there might be new cosmological constraints from cross-correlating GW event properties with CMB properties, like the CMB lensing map. Clever! And possibly big, in the mid-term to long-term future.

Doyeon Avery Kim (Columbia) is building spectral-spatial models of the all-sky fields or maps that act as CMB foregrounds. She is doing this by interpolating in spatial and spectral directions the (necessarily incomplete, different sky coverage, different angular resolutions) information from many large-angular-scale surveys. This is also very much related to my (vapor-ware) latent-variable model approach here, and is looking like it is delivering exciting results.



I spoke briefly with Chris Ick (NYU) about quasi-periodic oscillations in Solar flares, Megan Bedell (Flatiron) about telluric lines in stars observed with HARPS, and Adrian Price-Whelan (Princeton) about finding overdensities in the halo in Gaia DR2 data. With Ick we discussed whether to use the Bayesian evidence or a parameter estimate to compare nested models. My loyal reader knows which side I was on! With Bedell we discussed how we might verify that our telluric model is good, using line covariances. With Price-Whelan we discussed how to estimate local overdensity in both position and proper motion that would be maximally sensitive to streams and the like.


dust mapping; information theory and orbits

In Stars Meeting today, visitors Greg Green (KIPAC) and Richard Teague (Michigan) both talked about mapping dust. Teague is working at protoplanetary-disk scale (using velocity maps to find planets), while Green is working at Milky Way scale (making 3-d extinction maps). Teague is working with Foreman-Mackey (Flatiron) to get better velocity maps out of ALMA data and they are getting good success with one of my favorite tricks: Fit the peak with a quadratic. We have shown, in astrometric contexts, that this saturates information-theoretic bounds. They have gorgeous maps!
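For the record, the fit-the-peak-with-a-quadratic trick is tiny. The sketch below is a generic three-point version, not the Teague and Foreman-Mackey code: it assumes a uniformly sampled spectrum with the peak away from the edges, and the toy line profile and function name are made up for illustration.

```python
import numpy as np


def quadratic_peak(x, y):
    """Peak location from a parabola through the maximum sample and its two neighbors.

    Assumes x is uniformly spaced and the maximum is not at either end.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    k = int(np.argmax(y))
    if k == 0 or k == len(y) - 1:
        raise ValueError("peak is at the edge of the sampled range")
    y_minus, y_0, y_plus = y[k - 1], y[k], y[k + 1]
    curvature = y_minus - 2.0 * y_0 + y_plus  # negative at a maximum
    offset = 0.5 * (y_minus - y_plus) / curvature  # in units of the grid spacing
    return x[k] + offset * (x[1] - x[0])


# Toy emission line on a coarse velocity grid (all values are made up).
velocity = np.arange(-5.0, 5.5, 0.5)
true_center = 0.13
line = np.exp(-0.5 * ((velocity - true_center) / 1.2) ** 2)
estimate = quadratic_peak(velocity, line)
print(f"true center {true_center:.3f}, quadratic estimate {estimate:.3f}")
```

On an exact parabola the estimate is exact; on this Gaussian line it is good to a small fraction of a pixel even though the grid spacing is 0.5.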

Green is trying to apply more useful spatial priors to the dust maps he has made of the Milky Way, which are (currently) independently sampled in pixels. He is resampling the pixels, using neighbor information to regularize or as a prior. His method is slow, but a lot faster than using a fully general Gaussian Process prior. And it appears to be a good approximation thereto. Certainly the maps look better!

I presented my project to figure out orbits from chemistry. There was good discussion. Spergel (Flatiron) opined that I would do no better than Jeans modeling if I did the Jeans modeling conditioned on chemistry. I am sure that's wrong! But I have to demonstrate it with a good information-theoretic argument.


new correlation-function estimator

My research tidbit for the day was a long conversation with Kate Storey-Fisher (NYU), in which we discussed our new estimator for the correlation function that can estimate vector (or tensor or higher order) quantities. That is, it doesn't have to estimate the correlation function in bins, it can estimate any arbitrary parameterized representation sensibly. This includes, say, a parameterization that is derivatives with respect to cosmological parameters. That would estimate the cosmological parameters directly from the positions of galaxies! It also includes, say, a fourier representation. That would estimate the correlation function and the power spectrum simultaneously! It also includes, say, dependencies of the correlation function on redshift or position, which would test cosmological growth of structure and cosmological homogeneity. Etc! I'm stoked.

In the course of the discussion we came up with a strong test of the estimator: An affine-invariance test: If we make an affine transformation of the model regressors, do we get the same results at the correlation-function level? That's a great test, and something we can do easily and now. If we don't pass, our estimator is just plain wrong!
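A sketch of what the affine-invariance test could look like in code. The estimator form here is an assumption made for illustration (solve Q a = v, where v is a difference of mean basis vectors over data and random pairs and Q is the mean outer product of basis vectors over random pairs); the real estimator has more terms. The polynomial basis, the fake separations, and the matrix A and offset b are all hypothetical. Note that the basis includes a constant function, so an affine map of the regressors stays inside the same function space.

```python
import numpy as np

rng = np.random.default_rng(1)


def basis(r):
    """Hypothetical regressors: 1, r, r^2. Returns shape (n, 3)."""
    r = np.asarray(r, dtype=float)
    return np.stack([np.ones_like(r), r, r**2], axis=1)


# A fixed invertible matrix and an offset define the affine map f -> A f + b.
A = np.array([[2.0, 1.0, 0.0], [0.0, 1.0, 0.5], [1.0, 0.0, 3.0]])
b = np.array([0.5, -1.0, 2.0])


def transformed_basis(r):
    """The same regressors after the affine transformation."""
    return basis(r) @ A.T + b


def estimate_xi(r_data, r_random, r_eval, basis_fn):
    """Assumed estimator form: amplitudes a solve Q a = v; xi(r) = f(r) . a."""
    f_data = basis_fn(r_data)
    f_random = basis_fn(r_random)
    v = f_data.mean(axis=0) - f_random.mean(axis=0)
    Q = f_random.T @ f_random / len(f_random)
    amplitudes = np.linalg.solve(Q, v)
    return basis_fn(r_eval) @ amplitudes


# Stand-ins for pair separations (not a real catalog).
r_data = rng.uniform(0.0, 1.0, 5000) ** 1.2
r_random = rng.uniform(0.0, 1.0, 5000)
r_eval = np.linspace(0.05, 0.95, 10)

xi_original = estimate_xi(r_data, r_random, r_eval, basis)
xi_transformed = estimate_xi(r_data, r_random, r_eval, transformed_basis)
print(np.max(np.abs(xi_original - xi_transformed)))
```

The amplitudes themselves change under the transformation, but the function they represent should not. If the printed difference is not at round-off level, the estimator (or the code) is wrong.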



For many years, Columbia Astronomy has had a tradition of having everyone in the Department give a short talk in a monster, full-day event called AstroFest. This year, we extended it to three Fridays, covering all parts of NYC Astronomy. The first of these days was today, at Columbia, and it was great! I learned many things. Here is a smattering:

There is interesting laboratory astrophysics going on at Columbia, including experiments to measure deuterium molecular formation and dissociation rates (reported by Bowen) and experiments to measure aspects of Alfven wave propagation that might be relevant to Solar Coronal heating (reported by Bose).

Spinning black holes in a magnetic field charge up, and this might lead to pulsar-like activity in the late stages of a BH-NS inspiral (reported by Levin). After that I asked if any of the electromagnetic effects might affect the gravitational-wave signal itself, and the answer is unlikely, or only at a very low level.

You can't tell the shape of a transiting object from the shape of the transit (reported by Sandford). There are strict degeneracies! That led the audience to ask about regularization. You can break these degeneracies with regularization, but the answers will depend on the form of that regularization. I was wondering if star spots or limb darkening could break the degeneracies interestingly?

If you slowed down the rotation of the Earth, it would get colder, and more uniform in temperature between equator and pole (reported by Jansen)! That was a great use of Earth climate models to inform the study of exoplanets. And it maybe violates my simplest intuitions. New cure for global warming: Slow down Earth rotation!

And I was only there for the morning.



My only real research today was conversations with Bonaca (Harvard) about her stream–dark-matter substructure collision problem, explaining features in the GD-1 stream. We discussed ways to analyze how important the stream thickness is, without actually building a realistic model of the stream thickness. That is, our simplest simulations treat the stream as arbitrarily thin, but the best impact models may not be clearly in the thin-stream limit.


comoving stars and chaos, tellurics and radial velocities

I'm back in the city and back for the Stars Meeting at Flatiron. It did not disappoint! We went around the room and did long, post-summer introductions. In that process, many good ideas came up! I learned that John Brewer (Yale) has a great result on the metallicity-dependence of the occurrence of various different kinds of planetary systems (currently under review). I learned that Spergel (Flatiron) is pursuing halo binaries at wide separations to look at halo dynamics. And I learned that Kathryn Johnston (Columbia) is thinking about how chaos might affect not just streams but also wide binaries or unbound comoving pairs. Maybe the comoving pairs will highlight regular (non-chaotic) orbits! That would be a super-cool constraint on Milky Way dynamics.

Late in the day I sat down with Bedell (Flatiron) who showed me the current state of our wobble project to model stars and the atmosphere in extreme-precision radial-velocity projects. It looks great! The data are very well described by the model, our statistical regularizations seem to be working, and there is every evidence that we are getting great telluric spectra. Now, are we doing well on the radial-velocity determination? Damn I hope so!


luminous red giants in APOGEE

Eilers (MPIA) and I went on the APOGEE science telecon to describe our results. I talked about how we calibrated a (purely linear) spectro-photometric distance estimate for luminous red giants that manages to correct for dust and luminosity, and Eilers talked about how we used those tracers to measure the circular velocity of the Milky Way disk (that is, the potential). We use the Jeans equation in cylindrical symmetry. We got great feedback from the APOGEE team, which we will use to improve our discussion in our papers.


how to describe my current project

My flight home got seriously delayed and I had an extra day in Aspen. I spent it talking about (and working on) my project to infer dynamical invariants in the well-mixed parts of the Milky Way from chemical (element-abundance) invariants. I had various epiphanies and useful discussions:

Rix (MPIA) and I worked on how you explain the project to the world. One explanation is this: In addition to dynamical invariants, there are chemical abundances, which depend on the dynamical invariants (and not on the conjugate angles). Therefore inference of the dynamical invariants must be better or improved if you model the abundance invariants as well or in tandem. Another explanation is this: Imagine you do a dynamical inference (like Jeans modeling) and you (effectively) determine orbit structure. If you are slightly off, the element abundances you have measured will reveal the issues, and can be used to adjust or update or improve the orbit-structure inference, because stars don't change their abundances as they orbit!

Price-Whelan (Princeton) and I worked on how to compare the project with Jeans modeling, Schwarzschild modeling, or fully marginalized forward modeling of the kinematics (which has almost never been done). I have a scaling argument that my new method must be better than any of these: Each of these methods gets some amount of information out of the positions and velocities of the stars. My chemical-tangents method gets more information from every new element abundance you measure (even if each new element is fully covariant physically with the ones you have measured before; it is the measurements that are near-independent). So in some limit (and I think that limit arrives very early), it will have more information than any of these methods. But of course I need to demonstrate this quantitatively in the very near future.

Another point of comparison is related to the conditional or generative or causal structure of the model: I am modeling the abundance distribution conditioned on the phase-space positions. This means that I don't need to know the selection function of the survey, which Jeans modeling does (to some extent) and the more serious methods do (to great precision). On the other hand, because I am conditioning on the positions and not generating them, I can't (gracefully) account for measurement uncertainties in position. (Of course that's true for Jeans too.)

Anyway, the reason I am writing all this is because: The best practice for writing (the paper) is writing (this blog and emails and etc).


Aspen, day 5

Today was a whirlwind of meetings and sessions. I started with a great conversation with Vasiliev (Cambridge) and Valluri (Michigan) about statistical inference problems related to Schwarzschild modeling a galaxy using an orbit library. I'm not sure I helped much! Right now no-one knows how to realistically marginalize out the effective distribution function, although I think there might be good ideas somewhere in the probabilistic machine-learning world.

We had a plenary discussion of non-steady-state and non-equilibrium aspects of the Milky Way and how we will model or understand them. Fundamentally, we only know how to infer the dynamics of the Milky Way by making strong assumptions: Either (in the case of Jeans or Schwarzschild modeling, say) that there is time symmetry and also cylindrical or spherical symmetries, or (in the case of stream modeling, say) that the stars were put onto orbits in some collectively informative way. Since the real Milky Way violates these assumptions for most stars at some level, we need qualitatively new kinds of assumptions to make. My proposal: That the Milky Way grew from the cosmological initial conditions! That's the right thing, I think, but we don't yet have any tractable way to think about (say) the Gaia data in that context. At least not precisely.

In the afternoon, there was a cross-meeting dark-matter session in which a large set of particle physicists and a large set of astrophysicists interacted over testing dark-matter models. I learned that there is a huge literature we don't know enough about. I am very interested in going down this path, because it connects Gaia and SDSS-V to fundamental properties of the Universe. That's what I would really love to do (and that's in part why my original idea for SDSS-V was called "Disco": Cosmology with the disk).

At some point in the day, I realized that we can test the chemical-tangents method (my baby) in fake data! I discussed this with Loebman (Davis) and Price-Whelan (Princeton). I also realized that we can compare it to Jeans modeling, and show (I hope) that it always wins.

At the very end of the day, I had a conversation with Anderson (Flatiron) about advising and mentoring of postdocs. I feel very lost, I have to say: I want to give my attention to all my projects, my attention and time are limited, I don't spend my attention in the right places, I disappoint many of my people, and I impede their progress. I feel like I am doing it wrong! I don't feel like I understand how to be a mentor, and I am starting to feel stressed about it. One of the strange things is that the postdocs with whom I work are both the best and most fun collaborators I have, and also independent, capable scientists. That would seem to make it all easy and fun, but instead it somehow makes it confusing and existential. I went home unhappy from an absolutely great week in Aspen.


Aspen, day 4

Adrian Price-Whelan (Princeton) resolved some of our code differences today as unit or dimensions differences. That was good! But we still have the problem that different elements (in comparison with kinematics) lead to different inferences about orbits in the Milky Way disk. Don't know what to do about that! Either the data are wrong, or there is a big discovery here.

Ana Bonaca (Harvard), Price-Whelan, and I discussed how to build a pseudo-likelihood for comparing the models that Bonaca has for a stream perturbation to the real data. This is a bit of a hard problem, because we want objectives that improve as the agreement improves, but we don't want to build a fully generative model of the data. Why not? Because we don't have a good generative model, and perturbations away from a bad generative model could lead to very wrong inferences. All we want, after all, is a rough sense of what kinds of events are consistent with the data.

In the afternoon, I gave the Aspen Center for Physics Colloquium. I spoke about Gaia and dark matter, but I also threw in my thinking about the inference of Solar System dynamics in the 17th Century: We would do it very differently now! I have much more to say but I am too tired to write it here.

Aspen, day 3

Side by side, Price-Whelan (Princeton) and I worked through and discussed code inconsistencies between my code and his code to compute the likelihood function for my chemical-tangents (working title) project, that uses chemical invariants to find dynamical invariants. It was a frustrating discussion, because we couldn't figure out either the issues or how to test. That's our project for tomorrow!

Bonaca (Harvard) reported in: She can show that the gaps and associated loops we find in the GD-1 stream cannot be caused by interaction with a molecular cloud on a disk orbit! That means that the only explanation remaining is dark-matter substructure. Awesome! I'm stoked!


Aspen, day 2

After I found an insanely huge and existential bug in my code, Price-Whelan (Princeton) and I did a full code review of my project that takes advantage of chemical-abundance invariance to determine dynamical invariants. The big issue is that the action computation is expensive; it involves some kind of integration or quadrature. As we were discussing this, Eugene Vasiliev (Cambridge) joined us and suggested ways to speed up the action calculation using the energy invariant to aid in the integration.

[Insert tire screeching noise] We don't need actions! We only need some kind of invariant for this project. Indeed, I have tried various different invariants and they all work equally well. So we can use the energy invariant, which requires no integration! Woah, and thanks Vasiliev! That sped up the code by factors of many hundreds, which we then partially compensated down by doing more complete MCMC samplings. But the development cycle is far improved.
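To see why the energy is so much cheaper, here is a minimal sketch for purely vertical motion. It assumes a toy harmonic potential (not the potential in my project), where the turning point is analytic and the action is known to be E / omega, so the quadrature can be checked. All function names are made up.

```python
import numpy as np


def harmonic_potential(z, omega):
    """Toy vertical potential, assumed for illustration: 0.5 * omega^2 * z^2."""
    return 0.5 * omega**2 * z**2


def vertical_energy(z, vz, omega):
    """Energy invariant: one potential evaluation, no integration."""
    return 0.5 * vz**2 + harmonic_potential(z, omega)


def vertical_action(z, vz, omega, n_grid=2001):
    """Action J = (1/pi) * integral of |v_z| dz between turning points, by quadrature."""
    energy = vertical_energy(z, vz, omega)
    z_max = np.sqrt(2.0 * energy) / omega  # analytic only for this toy potential
    theta = np.linspace(-0.5 * np.pi, 0.5 * np.pi, n_grid)
    z_grid = z_max * np.sin(theta)  # substitution tames the sqrt at the turning points
    v_squared = 2.0 * (energy - harmonic_potential(z_grid, omega))
    v_grid = np.sqrt(np.clip(v_squared, 0.0, None))
    integrand = v_grid * z_max * np.cos(theta)
    # Trapezoid rule written out by hand.
    integral = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(theta))
    return integral / np.pi


omega = 2.0
z, vz = 0.3, 0.4
energy = vertical_energy(z, vz, omega)
action = vertical_action(z, vz, omega)
print(energy, action, energy / omega)
```

The action needs thousands of potential evaluations per star (and, for a realistic potential, a root-find for the turning points), while the energy needs one. Since the action is a monotonic function of the energy for this kind of one-dimensional motion, either one labels the orbit.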

I also had a great planning session with Bonaca (Harvard) where we worked out coordinate systems and methods for making far more realistic our project to model the GD-1 stream gaps. We are modeling one of the gaps and spurs with a dark-matter (or really dense-object) interaction. We needed to make things far more realistic, because we want to rule out disk-passage events as gap causes. That requires that we have GD-1 orbiting in a Galaxy with a disk and the observer (us) in the right places at the right times. Things are complicated, because everything happens on the kinematic equivalent of the past light cone—the past star cone?


Aspen, day 1

I'm in Aspen for the week, working on Milky Way dynamics and chemical abundances. The day started with introductions, in which many themes arose. One is that detailed abundances are here, are good, are numerous, and are under-utilized. So the work I am doing fits in pretty well! It occurred to me (Duh!), when some of the more particle-oriented people spoke, that imaging the dark matter is no different from testing gravity, at least conceptually. So I can spin off a testing-gravity side project.

The MCMC runs I sent off working on my laptop on Friday did not disappoint: I got amazingly strong constraints on the dynamics of the disk, including a percent-level measurement of the disk density! That's a precision of course; at percent level none of my assumptions are defensible. But it works really well: The iso-chemical contours really do show you the orbits, and precisely.

But, the most interesting respect in which my assumptions are violated is that the different elements want to put the midplane of the disk in different locations! Huh? And the effect is just clearly visible in the GALAH data. Rix (MPIA) pointed out today that the phase-mixing times can be long near the disk center because in the limit of a harmonic potential, all frequencies are degenerate and mixing doesn't happen. So maybe we have clear evidence for non-phase-mixing, vertically, in the disk. Or of course maybe there are (very adversarial, I might say) issues with the GALAH data. But the nice thing is you can just see it in the element abundances. Look for yourself: The midplane in [Fe/H] looks different from the midplane in [Si/Fe]. (Plots have z-velocity on the horizontal axis and z-position on the vertical axis).


bad development cycle is bad

My day started with a conversation with Christina Eilers (MPIA) about the Milky Way rotation curve. We found some strange kinematic points that might be messing with us, and realized that they are almost all stars at or past the Bulge, and therefore not affecting our results, which are only for Galactocentric radii greater than 5 kpc, to avoid the craziness of the bar (which violates our dynamical assumptions). Her figures are ready, so I encouraged her to write figure captions and assemble the paper.

I spent my research time getting MCMC running on my Chemical Tangents project. I have a marginalized likelihood, so all I had to do was put on priors and insert it into emcee. Oh how I would have benefitted from a testing environment! When I packaged it all up for emcee I messed up the units of almost all the inputs, so I got garbage in every MCMC run. And the runs took a long time, so diagnosis was painful. Unit testing. And for units! Live and fail to learn, that's what I say.
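For the record, a units test can be tiny. This is a minimal sketch and not the actual Chemical Tangents code: the helper `vertical_energy`, the convention (positions in kpc, velocities in km/s, times in Myr), and the numbers in the test are all assumptions made up for illustration. The only hard fact in it is that 1 km/s is about 1.0227e-3 kpc/Myr.

```python
import math

# Assumed convention for this sketch: kpc, km/s, Myr.
KMS_TO_KPC_PER_MYR = 1.0227e-3  # 1 km/s expressed in kpc/Myr

def vertical_energy(z_kpc, vz_kms, omega_per_myr):
    """Hypothetical helper: vertical energy per unit mass, in (kpc/Myr)^2,
    in a harmonic potential with angular frequency omega (1/Myr)."""
    vz = vz_kms * KMS_TO_KPC_PER_MYR  # convert once, at the boundary
    return 0.5 * vz ** 2 + 0.5 * (omega_per_myr * z_kpc) ** 2

def test_turning_point_matches_midplane_crossing():
    """A star at rest at z_max and the same star crossing the midplane
    must have the same energy; a units slip breaks this immediately."""
    omega = 0.07   # 1/Myr, illustrative value only
    z_max = 0.3    # kpc, illustrative value only
    vz_mid_kms = omega * z_max / KMS_TO_KPC_PER_MYR
    e_top = vertical_energy(z_max, 0.0, omega)
    e_mid = vertical_energy(0.0, vz_mid_kms, omega)
    assert math.isclose(e_top, e_mid, rel_tol=1e-12)
    # Sanity check on magnitude: tens of km/s, not thousands or thousandths.
    assert 10.0 < vz_mid_kms < 40.0

if __name__ == "__main__":
    test_turning_point_matches_midplane_crossing()
    print("units test passed")
```

A test like this runs in microseconds, so it can gate every long MCMC run instead of being diagnosed after one.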

Once everything appeared to be working, I set up some (nasty) multi-processing, set my laptop to stay awake all night, and blew processes. I should have converged samplings by morning.


best constraints on disk dynamics ever

In my Chemical-Tangents project (working title only), I am modeling the abundance distribution as a function of dynamical actions to determine the shapes of the orbits in phase space, or, equivalently, the force law or the density distribution or the potential. I have parameters for the force law but also for the abundances and their variation. I spent the morning figuring out how to marginalize out those abundance parameters, which are nuisances (for my purposes). I got it working; much of it is even analytic (so I only have to do one numerical integral per element ratio).
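To show what an analytic marginalization looks like in the simplest case, here is a toy sketch. It is not the Chemical-Tangents model: it assumes data with a common unknown mean `a` (the nuisance), known Gaussian noise `sigma`, and a Gaussian prior on `a` with mean `mu0` and width `s0`. All names and numbers are made up for illustration. The marginal likelihood is then a multivariate Gaussian with covariance sigma^2 I + s0^2 1 1^T, and the code checks the closed form against a brute-force integral over `a`.

```python
import math

def log_marginal_analytic(y, sigma, mu0, s0):
    """Closed-form log of the likelihood with the mean `a` integrated out.
    Uses the eigenvalues of sigma^2 I + s0^2 11^T for the determinant and
    the Sherman-Morrison identity for the inverse."""
    n = len(y)
    r = [yi - mu0 for yi in y]
    sum_r = sum(r)
    sum_r2 = sum(ri * ri for ri in r)
    denom = sigma ** 2 + n * s0 ** 2
    chi2 = (sum_r2 - s0 ** 2 * sum_r ** 2 / denom) / sigma ** 2
    log_det = 2.0 * (n - 1) * math.log(sigma) + math.log(denom)
    return -0.5 * (n * math.log(2.0 * math.pi) + log_det + chi2)

def log_gauss(x, mean, std):
    """Log of a one-dimensional Gaussian density."""
    return (-0.5 * ((x - mean) / std) ** 2
            - math.log(std) - 0.5 * math.log(2.0 * math.pi))

def log_marginal_numeric(y, sigma, mu0, s0, n_grid=20001, half_width=10.0):
    """Brute-force check: trapezoid rule over `a` on a grid spanning
    mu0 +/- half_width * s0."""
    lo = mu0 - half_width * s0
    step = 2.0 * half_width * s0 / (n_grid - 1)
    total = 0.0
    for k in range(n_grid):
        a = lo + k * step
        log_integrand = log_gauss(a, mu0, s0) + sum(
            log_gauss(yi, a, sigma) for yi in y)
        weight = 0.5 if k in (0, n_grid - 1) else 1.0
        total += weight * math.exp(log_integrand)
    return math.log(total * step)

if __name__ == "__main__":
    y = [0.9, 1.4, 1.1, 0.7, 1.3]  # made-up data
    print(log_marginal_analytic(y, 0.5, 0.0, 2.0))
    print(log_marginal_numeric(y, 0.5, 0.0, 2.0))
```

The closed form costs a couple of sums; the grid costs tens of thousands of likelihood evaluations. That difference is the reason to do analytically whatever can be done analytically and save the numerical integral for the piece that can't.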

I then ran on everything, and I find that I have the best constraints on the Milky Way disk dynamics, ever! That is, on the kinematic location of the midplane, the scale height of the mass distribution, and the central or mean density. I am pretty stoked: Each abundance ratio individually gives good constraints on these parameters, and their combination will be exceedingly constraining. So I am pretty confident that I have a great project. My next job is to put this all into a sampler (like emcee, which is good for low-dimensional problems) and sample it.