so many things (I love Wednesdays)

In the stars group meeting at CCA, attendance was huge today. David Spergel (CCA) opened by giving a sense of the WFIRST GO and GI discussion that will happen this week at CCA. The GI program is interesting: It is like an archival program within WFIRST. The announcement quickly led to an operational discussion about what WFIRST can do to avoid saturating on bright stars.

Katia Cunha (Observatorio Nacional, Brazil) spoke about two topics in APOGEE. The first is that they have found new elements in the spectra! They did this by looking at the spectra of s-process-enhanced (metal-poor) stars and finding strong, unidentified lines. This is exciting because, before this, APOGEE had no measurements of the s-process. The second topic is that they are starting to get working M-dwarf models, which is a first, and can now measure 13 elemental abundances in M dwarfs. Verne Smith (NOAO) noted that this is very important for the future use of these spectrographs and for exoplanet science in the age of TESS. On this latter point, the huge breakthrough was in improvements to the molecular line lists.

Dave Bennett (GSFC) talked to us about observations of the Bulge with K2 and other instruments to do microlensing, microlensing parallax, and exoplanet discovery. He noted that there isn't a huge difference between doing characterization and doing search: The photometry has to be good to find microlensing events and not be fooled by false positives. He is in NYC this week working with Dun Wang (NYU).

Jeffrey Carlin (NOAO) led a discussion of detailed abundances for Sagittarius-stream stars as obtained with a CFHT spectrograph fiber-fed from Gemini N. These abundances might unravel the stream for us, and inform dynamical models. This morphed into a conversation about why the stellar atmosphere models are so problematic, which we didn't resolve (surprised?). I pitched a project in which we use Carlin's data at high resolution to train a model for the LAMOST data, as per Anna Y. Q. Ho (Caltech), and then do science with tens of thousands of stars.

In the cosmology group meeting, we discussed the possibility of evaluating (directly) the likelihood for a CMB map or time-ordered data given the C-ells and a noise model. As my loyal reader knows, this requires not just performing solve (inverse multiplication) operations but also (importantly) determinant evaluations. For the discussion, mathematicians Mike O'Neil (NYU), Leslie Greengard (CCA), and Charlie Epstein (Penn) joined us, with O'Neil leading the discussion about how we might achieve this, computationally. O'Neil outlined two strategies, one of which takes advantage of a possible HODLR form (Ambikasaran et al.), and another of which takes advantage of the spherical-harmonic transform. There was some disagreement about whether the likelihood function is worth computing, with Hogg on one end (guess which) and Naess and Hill and Spergel more skeptical. Spergel noted that if we could evaluate the likelihood function for the CMB, it would open up the possibility of doing it for LSS or intensity mapping in a three-dimensional (thick) spherical shell (think: redshift distortions and fingers of god and so on).
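To see why the determinant is the sticking point, here is a minimal sketch (my toy, not anyone's pipeline code) of the Gaussian log-likelihood of a data vector given a covariance built from the C-ells plus noise; it needs both a linear solve and a log-determinant, and the latter is what naive dense methods can't do at CMB scale:

```python
import numpy as np

# Gaussian log-likelihood of a data vector d given a model covariance C.
# Both the solve and the log-determinant appear; for a full-sky map, C is
# far too large for these dense operations, hence HODLR-type ideas.
def gaussian_lnlike(d, C):
    sign, logdet = np.linalg.slogdet(C)  # stable log-determinant
    chi2 = d @ np.linalg.solve(C, d)     # the "solve" operation
    return -0.5 * (chi2 + logdet + d.size * np.log(2.0 * np.pi))
```

For C equal to the identity this reduces to the familiar -0.5 (d.d + n log 2 pi); structured factorizations aim to make the logdet term tractable when C is large but compressible.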

Between meetings, I discussed deconvolutions of the TGAS color-magnitude diagram with Leistedt and Anderson, and low-hanging fruit in the comoving-star world with Oh and Price-Whelan.


unsupervised models of stars

I am very excited these days about the data-driven model of stellar spectra that Megan Bedell (Chicago) and I are building. In its current form, all it does is fit multi-epoch spectra of a single star with three sets of parameters: a normalization level (one per epoch) times a wavelength-by-wavelength spectral model (one parameter per model wavelength) shifted by a Doppler shift (one per epoch). This very straightforward technology appears to be fitting the spectra to something close to the photon noise limit (which blows me away). The places where it doesn't fit appear to be interesting: Some of them are telluric absorption residuals, and some are intrinsic variations in the stellar lines that are sensitive to activity and convection.
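The model really is as simple as it sounds. Here is a toy sketch of the per-epoch prediction (the names and the linear-interpolation choice are mine for illustration, not the actual implementation):

```python
import numpy as np

C_KMS = 299792.458  # speed of light in km/s

def predict_epoch(lam_obs, template_lam, template_flux, norm, v_kms):
    """One epoch: a normalization times the shared template, Doppler-shifted."""
    # map each observed wavelength back to the template rest frame
    lam_rest = lam_obs / (1.0 + v_kms / C_KMS)
    return norm * np.interp(lam_rest, template_lam, template_flux)
```

The fit then optimizes one template (one flux per model wavelength), plus one norm and one v_kms per epoch, against all epochs simultaneously.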

Today we talked about scaling this all up; right now we can only do a small part of the spectrum at a time (and we have a few hundred thousand spectral pixels!). We also spoke about how to regress the residuals against velocity or activity. The current plan is to investigate the residuals, but of course if we find anything, we should add it into the generative model and re-start.



Not much research today, but I did have conversations with Lauren Anderson (Flatiron) about deconvolving the observed (by Gaia TGAS and APASS) color-magnitude diagram of stars, with Leslie Greengard (Flatiron) and Alex Barnett (Dartmouth) about cross-over activities between CCA and CCB at Flatiron, and with Kyle Cranmer (NYU) about his immense NSF proposal.


#hackAAS at #aas229

Today was the (fifth, maybe?) AAS Hack Day; it was also the fifth day of #aas229. As always, I had a great time and great things happened. I won't use this post to list everything from the wrap-up session, but here are some personal, biased highlights:

Inclusive astronomy database
Hlozek, Gidders, Bridge, and Law worked together to create a database and web front-end for resources that astronomers can read (or use) about inclusion in astronomy, inspired in part by things said earlier at #aas229 about race and astronomy. Their system is just a prototype, but it has a few things in it already, and it is designed to help you both find and add resources.
Policy letter help tool
Brett Morris led a hack that created a web interface into which you can input a letter you would like to write to your representative about an issue. It searches for words that are bad to use in policy discussions and asks you to change them, and also gives you the names and addresses of the people to whom you should send it! It was just a prototype, because it turns out there is no way right now to automatically obtain representative names and contact information. That was a frustrating finding about the state of #opengov.
Budget planetarium how-to
Ellie Schwab and a substantial crew put together a budget and resources for building a low-budget but fully functional planetarium. One component was WorldWide Telescope (WWT), which is now open source.
Differential equations
Horvat and Galvez worked on solving differential equations using basis functions, to learn (and re-learn) methods that might be applicable to new kinds of models of stars. They built some notebooks that demonstrate that you can easily solve differential equations very accurately with basis functions, but that if you choose a bad basis, you get bad answers!
K2 and the sky
Stephanie Douglas made an interface to the K2 data that shows a postage stamp from the data, the light curve, and then aligned (overlaid, even) imaging from other imaging surveys. This involved figuring out some stuff about K2's world coordinate systems, and making it work for the world.
Poster clothing
Once again, the sewing machines were out! I actually own one of these now, just for hack day. Pagnotta led a very successful sewing and knitting crew. Six of the team members used a sewing machine for the first time today! In case you are still stuck in 2013: The material for sewing is the posters, which all the cool kids have printed on fabric, not paper, these days!
Hack archiving
Erik Tollerud built some tools for the long-term storage and archiving of #hackAAS hacks. These leverage GitHub under the hood.
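To give a flavor of the differential-equations hack above (this toy example is mine, not Horvat and Galvez's notebooks): expand the solution of u'(x) = -u(x), u(0) = 1, in a polynomial basis and solve for the coefficients by least-squares collocation; a well-chosen basis matches exp(-x) to high accuracy.

```python
import numpy as np

# Solve u'(x) = -u(x), u(0) = 1 on [0, 1] with a polynomial basis,
# by least-squares collocation.
deg = 8
x = np.linspace(0.0, 1.0, 50)  # collocation points
Phi = np.vander(x, deg + 1, increasing=True)             # phi_k(x) = x^k
dPhi = np.hstack([np.zeros((len(x), 1)),
                  Phi[:, :-1] * np.arange(1, deg + 1)])  # d/dx x^k = k x^(k-1)
# enforce u' + u = 0 at the collocation points, plus the condition u(0) = 1
A = np.vstack([dPhi + Phi, np.eye(1, deg + 1)])
b = np.concatenate([np.zeros(len(x)), [1.0]])
c, *_ = np.linalg.lstsq(A, b, rcond=None)
u = Phi @ c
err = np.max(np.abs(u - np.exp(-x)))  # compare with the exact solution
```

Swap the monomials for a badly scaled or ill-conditioned basis and the same procedure goes wrong, which is exactly the lesson of the hack.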

There were many other hacks, including people learning how to use testing and integration tools, people learning to use the ADS API, people learning how to use version control and GitHub, testing of different kinds of photometry, and visualization of various kinds of data. It was a great day, and I can't wait for next year.

Huge thanks to our corporate sponsor, Northrop Grumman, and my co-organizers Kelle Cruz, Meg Schwamb, and Abigail Stevens. NG provided great food, and Schwamb did a great job helping everyone in the room understand the (constructive, open, friendly, fun) point of the day.


#aas229, day 4

I arrived at the American Astronomical Society meeting this morning, just in time (well, a few minutes late, actually) for the Special Session on Software organized by Alice Allen (ASCL). There were talks about a range of issues in writing, publishing, and maintaining software in astrophysics. I spoke about software publications (slides here) and software citations. Not only were the ideas in the session diverse, the presenters had a wide range of backgrounds (three of them aren't even astronomers)!

There were many interesting contributions to the session. I was most impressed with the data that people are starting to collect about how software is built, supported, discovered, and used. Along those lines, Iva Momcheva (STScI) showed some great data she took about how software projects are funded and built. This follows great work she did with Erik Tollerud (STScI) on how software is used by astronomers (paper here). In their new work, they find that most software is funded by grants that are not primarily (or in many cases not even secondarily) related to the software, and that most software is written by early-career scientists. These data have great implications for the next decade of astrophysics funding and planning. In the discussion afterwards, there were comments about how hard it is to fund the maintenance of software (something I feel keenly).

Similarly, Mike Hucka (Caltech) showed great results he has on how scientists discover software for use in their research projects (paper here). He finds (surprise!) that documentation is key, but there are many other contributing factors to make a piece of research software more likely to be used or re-used by others. His results have strong implications for developers finishing software projects. One surprising thing is that scientists are less platform-specific or language-specific in their needs than you might think.

I spent part of the afternoon hiding in various locations around the meeting, hacking on an unsupervised data-driven model of stellar spectra with Megan Bedell (Chicago).


making slides

My only real research accomplishment today was to make slides for my AAS talk on software publications, which is for a special session organized by Alice Allen (ASCL). The slides are available here.


carbon stars, regulation of star formation, and so much more

Rix called me to discuss the problem that when we compare the chemical abundances in pairs of stars, we get stars that are more identical than we expect, given our noise model for chemical abundances. That is, we see things with chi-squared (far) less than the number of elements. This means (I think) that our noise estimation is overly conservative: There are (at least some) stars that we are observing at very good precision. Further evidence for my view is that there are more such (very close) pairs within open clusters than across open clusters (or in the field).
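The statistic in question, sketched with made-up numbers (these are not our measured abundances or uncertainties):

```python
import numpy as np

# Chi-squared of the abundance difference between two stars: if the noise
# model is right, this should average ~ n_elem over many pairs.
n_elem = 15
diff = np.full(n_elem, 0.05)     # per-element [X/H] differences, star 1 minus star 2
s1 = s2 = np.full(n_elem, 0.05)  # reported per-element uncertainties
chi2 = np.sum(diff ** 2 / (s1 ** 2 + s2 ** 2))
# here chi2 = 7.5 << n_elem: the pair looks "too identical",
# suggesting the reported uncertainties are overestimated
```

Seeing chi2 far below n_elem for many pairs is exactly the symptom described above: the true per-element precision is better than the noise model claims.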

In stars group meeting, Jill Knapp (Princeton) spoke about carbon stars (stars with more carbon than oxygen, and I really mean more in counts of atoms). She discussed dredge-up and accretion origins for these, and how we might distinguish them. She has some results on the abundance of carbon stars as a function of expected (from stellar models) surface-convection properties, which suggest accretion origins. But it is early days.

Chang-Goo Kim (Princeton) told us about simulations that are designed to understand the regulation of star formation in galaxy disks (kpc scales). He pointed out the importance of gravity in setting the star-formation rate; these arguments are always reminiscent (to me) of the Eddington argument. His simulations include supernova feedback in the form of mechanical and radiation energy, plus magnetic turbulence and cosmic-ray pressure. He emphasized that conclusions about feedback-regulated star formation depend strongly on assumptions about the spatial correlations and locations (think: escape over time) of the supernovae relative to the dense molecular clouds in which the star formation occurs. Fundamentally, the thing that sets the star-formation rate is the pressure, which can be hydrostatic or turbulent or both.

Semyeong Oh (Princeton) and I led a discussion on the lowest-hanging fruit for projects that exploit her comoving star (and group) catalog from TGAS. Some of the lowest-hanging include investigations of the locations of the pairs in phase space, to look at heating, age, and formation mechanisms.


deconvolution of labels

Lauren Anderson (CCA) and I discussed the state of our project to put spectroscopic parameters onto photometrically discovered stars using colors and magnitudes from APASS, parallaxes from Gaia TGAS, and spectroscopic parameters from the RAVE-on Catalog. We want to take the nearby neighbors in color-magnitude space and deconvolve their noisy spectroscopic parameters to make a less noisy estimate for (what you might call) the test objects. We have been using extreme deconvolution (Bovy et al.) for this, deconvolving the labels for the nearest neighbors (weighted by a likelihood). That is, find neighbors first, deconvolve second. After hours staring at the white board, we decided that maybe we should just deconvolve all the inputs up front, and do inference under the prior created by that deconvolution. Question: Is this computationally feasible?
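For intuition about why the deconvolve-first plan is attractive (one-dimensional toy with my made-up numbers): once the deconvolution delivers a Gaussian for a label, the per-star inference is just a product of Gaussians, which is trivially cheap; the open question is whether the up-front deconvolution of all the inputs is affordable.

```python
# Combine a deconvolved Gaussian prior (mean mu, variance V) for a label
# with a noisy measured label (value y, noise variance S). All numbers
# here are illustrative, not RAVE-on values.
mu, V = 0.0, 1.0   # deconvolved prior
y, S = 0.8, 0.25   # noisy measurement
post_var = 1.0 / (1.0 / V + 1.0 / S)
post_mean = post_var * (mu / V + y / S)
# the posterior shrinks the noisy label toward the prior mean
```

In the real project the prior would be a Gaussian mixture over the full label space (from extreme deconvolution), so each star's posterior is a mixture of such products rather than a single Gaussian.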


stars, planets, SPHEREx, and black-hole dark matter

In the last stars group meeting of the year, we had special guests John Brewer (Yale, AMNH) talking about the chemical abundances of stars hosting planets, Ellie Schwab (CUNY) talking about magnetic activity in low-mass stars and brown dwarfs, and Jackie Faherty (AMNH) talking about searches for long-period companions to solar-like stars. Brewer killed the diamond-planet hypothesis that was so cool a few years ago. Ue-Li Pen (CITA) commented to Schwab that 21-cm surveys (see yesterday's post) will and even already do have time-domain radio observations of thousands to millions of stars. And Faherty showed that searches for long-period companions have been incredibly productive, even though they haven't led to exoplanet discoveries (yet).

In the last cosmology group meeting of the year, we had special guests Roland de Putter (Caltech) talking about the observing plans for SPHEREx and Yacine Ali-Haimoud (JHU) talking about black-hole dark matter (my favorite theory of dark matter). SPHEREx will perform a very cleverly designed 0.75-to-5-micron all-sky low-resolution spectral survey of every point on the sky. It will get redshifts for hundreds of millions of sources, with small photometric-redshift uncertainties. De Putter also talked about primordial non-Gaussianity; the survey will get limits on (or a detection of) fNL at the <1 level. The audience was very interested in foregrounds, including Milky-Way stars and even the zodiacal light in the Solar System.

Ali-Haimoud spoke about 2-body and 3-body effects in a black-hole theory of dark matter to get rates for LIGO. With careful re-analysis, he revises (heavily) the Ricotti et al. 2008 limit on black holes as a dark-matter candidate, and greatly weakens the constraints from CMB spectral distortions and anisotropies. But in the end he was very careful not to endorse black holes as a dark-matter candidate. I'm stoked nonetheless!


reconstructing the initial conditions

Today was a 21-cm cosmology meeting at Flatiron. Unfortunately I could only do the morning. Ue-Li Pen (CITA) spoke about reconstruction using a simplified dynamics. I suggested that anything that could be done with simplified dynamics could be done better with machine learning. I think I can even prove this, since the machine learning could be trained on the residuals away from the simplified dynamics reconstruction! In his talk, however, he mentioned this incredible Wang et al paper that does full reconstruction of the initial conditions for the entire SDSS Main Sample volume! This gives me hope for the future of cosmology.


small planets are all rocky?

At lunch today, Angie Wolfgang (PSU) gave a talk at the CCA on hierarchical inference of small and rocky exoplanet population properties. She made a nice set of arguments for the hierarchical Bayesian methodology, which was preaching to the converted (but good). She showed her results on exoplanet compositions and H/He envelopes, both of which are impressive, and then she went on to look at parametric and non-parametric fitting of the mass–radius relationship at small radii. She excludes zero mass scatter at fixed radius at all except the smallest radii. There is a consistent story emerging that the very smallest planets are indeed (pretty much) all rocky.

I had some back-and-forth with Megan Bedell (Chicago) about derivatives of our spectral model with respect to parameters. She has this all correct but I recommended parameterization changes, and the whole time, in the background, Dan Foreman-Mackey (UW) was saying things like “you should never take your own derivatives” and “you should use Theano”. I ignored him, probably at my peril.
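Short of adopting an autodiff library, the spirit of "never take your own derivatives" can at least be enforced with a finite-difference check. Here is the kind of test I have in mind, on a toy absorption-line model (the parameterization is mine for illustration, not Bedell's):

```python
import numpy as np

# Toy Gaussian absorption-line model and its analytic derivative with
# respect to the line center.
def model(lam, depth, center, width):
    return 1.0 - depth * np.exp(-0.5 * ((lam - center) / width) ** 2)

def dmodel_dcenter(lam, depth, center, width):
    g = np.exp(-0.5 * ((lam - center) / width) ** 2)
    return -depth * g * (lam - center) / width ** 2

# central-difference check of the hand-coded derivative
lam = np.linspace(-1.0, 1.0, 5)
eps = 1e-6
numeric = (model(lam, 0.5, 0.1 + eps, 0.3)
           - model(lam, 0.5, 0.1 - eps, 0.3)) / (2.0 * eps)
analytic = dmodel_dcenter(lam, 0.5, 0.1, 0.3)
```

A check like this, run for every parameter, catches most sign and chain-rule mistakes; an autodiff framework makes it unnecessary, which was Foreman-Mackey's point.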



Andy Casey (Cambridge) organized a call today in which we discussed his project to re-factor the ESO HARPS archive of high-resolution stellar spectra into a more useful form. We discussed our needs for this new archive for our own personal science projects. The idea is to have the schema and work on the archive be directly related to short-term science goals, and then hope that it will be useful for many other goals.


map-making, self-calibration, stellar rotation

In my stars group meeting at CCA, Ruth Angus (Columbia) told us about her work to replace standard methods for determining stellar rotation with a probabilistic model, based on Gaussian Processes with a quasi-periodic kernel. This seems to work extremely well! There are some pesky outlier stars, even in simulated data, in which all methods (including Angus's best) seem to get the wrong answer for the stellar rotation; these are interesting for further investigation.
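The quasi-periodic kernel in question is (up to parameterization choices, which I am guessing at here, not quoting from Angus's code) a periodic term damped by a squared-exponential envelope, so the rotation signal can slowly change shape from cycle to cycle:

```python
import numpy as np

# Quasi-periodic covariance: squared-exponential envelope times a
# periodic term. Hyperparameter names are illustrative.
def quasi_periodic_kernel(t1, t2, amp, period, length, gamma):
    dt = t1[:, None] - t2[None, :]
    return (amp ** 2
            * np.exp(-0.5 * dt ** 2 / length ** 2
                     - gamma * np.sin(np.pi * dt / period) ** 2))

t = np.linspace(0.0, 10.0, 100)                          # observation times (days, say)
K = quasi_periodic_kernel(t, t, 1.0, 2.5, 20.0, 5.0)     # period ~ rotation period
K += 1e-8 * np.eye(len(t))                               # jitter for numerical stability
L = np.linalg.cholesky(K)                                # valid covariance: Cholesky succeeds
```

The rotation period is then just one hyperparameter of the GP, inferred with honest uncertainties instead of being read off a periodogram peak.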

In my cosmology group meeting, I got fully taken to school. First Brian Keating (UCSD) told us about self-calibration of CMB polarimetry. It turns out that you can't self-calibrate for some of the most important science. That is, your signal might be in exactly the modes to which the self-calibration is orthogonal! That's bad. And an aspect of self-calibration that I haven't thought about before. He discussed many crazy and creative ways to do absolute calibration of polarimetry devices; none of them look good enough (or cheap enough) at this point.

Then Colin Hill (Columbia) told us about map-making things he is working on in CMB data. I got all crazy because he is only considering linear combinations of observed data to (say) produce the thermal S-Z map (from, say, Planck data). But then he pointed out (correctly) that all least-squares methods return linear combinations of the data! Oh duh! All L2-like methods return linear combinations of the data. So then we went on to think about combined L1 and L2 methods that could permit him to open up his model space without enormously over-fitting. At the end of the discussion I had a job to do: Write down the largest class of convex map-making methods I can, given what I know about L1 and L2.

In between group meetings Cameron Hummels (Caltech) talked about open-source codes he is building that take simulation outputs from cosmological hydro simulations and predict observables, especially those that relate to the inter-galactic medium and circum-galactic medium. We talked a lot about the differences between resolution effects and sub-grid physics choices, which are confusingly inter-related.


Spitzer Oversight, day 2

Today was the second day of the Spitzer Oversight Committee meeting. Far and away the highlight for me was a presentation by Sean Carey (IPAC) about the overall health of the spacecraft and the imaging instrument. He showed the photometric throughput of the system as a function of time, which has been amazingly constant (sub-percent level), and yet showing a consistent and repeatable trend (of less than one mmag per year, ish?). He showed the bad pixel count, which has risen linearly with time, but only to a few hundred (and many of those are nonetheless still calibrated and useful pixels). He showed the astrometric wobble and drifts, associated with spacecraft thermal events. These are substantial, but changing, as the spacecraft goes through more and more extreme solar-angle events to downlink its data to the Deep Space Network. Spitzer is in an Earth-trailing orbit, so as time goes on, it has to point at worse and worse Sun angles to send its data home.

The latter point is very interesting: Carey showed that the batteries are still just as good as new, but that the spacecraft draws 11 A on average (yes, 11 amps at 300 volts, which is more than 3 kW; don't ask why), and when it is in downlink to Earth, the solar panels are not getting enough insolation to cover it. This leads to non-trivial scheduling in which the spacecraft must point near-orthogonal to the Sun vector for a while (hours) after downlink. This complexity is handled without issues by the non-trivial scheduling systems, and the overall spacecraft health is excellent. The mission can run until March 2019, when it is cut off, both by funding and by these Sun-angle issues. Carey also showed the long-term performance of the solar panels. This also declines linearly with time, but is easily within spec to keep the mission running (barring any severe micrometeorite hit). Knock wood!

It has been a great two days, with an absolutely great team, working on an absolutely great mission.