Scott Collard of NYU Libraries organized an interdisciplinary panel across all of NYU today to discuss open research. I often talk about “open science”, but this discussion was explicitly to cover the humanities as well. We talked about the different cultures in different fields, and the roles of funding agencies, universities, member societies, journals, and so on. One idea that I liked from the conversation was that participants should try to ask what they can do from their position and not try to ask what other people should do from theirs. We had recommendations for making modifications to NYU promotion and tenure, putting open-research considerations into annual merit review, and asking the Departments to think about how, in their field, they could move to the most open edge of what's acceptable and conventional. Another great idea is that open research is directly connected to ideas of inclusion and equity, especially when research is viewed globally. That's important.
Today Teresa Huang (JHU) re-started our conversations about adversarial attacks against popular machine-learning methods in astrophysics. We started this project (ages ago, now) thinking about test-time attacks: You have a trained model, how does it fail you at test time? But since then, we have learned a huge amount about training-time attacks: If you add a tiny change to your training data, can you make a huge change to your model? I think some machine-learning methods popular in astronomy are going to be very susceptible to both kinds of attacks!
When we discussed these ideas in the before times, one of the objections was that adversarial attacks are artificial and meaningless. I don't agree: If a model can be easily attacked, it is not robust. If you get a strange and interesting result in a scientific investigation when you are using such a model, how do you know you didn't just get accidentally pwned by your noise draw? Since—in the natural sciences—we are trying to learn how the world works, we can't be putting in model components or pipeline components that are capable of leading us very seriously astray.
Conor Sayres (UW) and I spoke again today about the fiber positioning system (fiber robots) that lives in the two focal planes of the two telescopes that take data for SDSS-V. One of the many things we talked about is how precisely we need to position the fibers, and how accurately we will be able to observe their positions in real time. It's interestingly marginal; the accuracy with which the focal-plane viewing system (literally a camera in the telescope that looks the wrong way) will be able to locate the fiber positions depends on details that we don't yet know about the camera, the optics, the fiducial-fiber illumination system, and so on. There are different kinds of sensible procedures for observatory operations that depend very strongly on the accuracy of the focal-viewing system.
If you have a set of vectors, what are all the scalar functions you can make from those vectors? That is a question that Soledad Villar (JHU) and I have been working on for a few days now. Our requirements are that the scalar be rotationally invariant. That is, the scalar function must not change as you rotate the coordinate system. Today Villar proved a conjecture we had, which is that any scalar function of the vectors that is rotationally invariant can only depend on scalar products (dot products) of the vectors. That is, you can replace the vectors with all the dot products and that is just as expressive.
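The easy direction of that statement can be checked numerically. Here is a minimal sketch, assuming numpy; the function names and the particular `example_scalar` are made up for illustration. It only shows that the dot products (and anything built from them) don't change under a rotation; it is not the proof of the converse.

```python
import numpy as np

def random_rotation(d, rng):
    # QR decomposition of a Gaussian matrix gives an orthogonal matrix
    q, r = np.linalg.qr(rng.normal(size=(d, d)))
    q = q * np.sign(np.diag(r))
    # flip one column if needed so that the determinant is +1 (a proper rotation)
    if np.linalg.det(q) < 0:
        q[:, 0] = -q[:, 0]
    return q

def gram(vectors):
    # all pairwise dot products; rows of `vectors` are the vectors
    return vectors @ vectors.T

def example_scalar(vectors):
    # hypothetical scalar function that depends only on dot products
    g = gram(vectors)
    return np.sum(np.sqrt(np.diag(g))) + g[0, 1] ** 2

rng = np.random.default_rng(42)
vectors = rng.normal(size=(5, 3))
rot = random_rotation(3, rng)
rotated = vectors @ rot.T  # rotate every vector

gram_matches = np.allclose(gram(vectors), gram(rotated))
scalar_matches = np.isclose(example_scalar(vectors), example_scalar(rotated))
print(gram_matches, scalar_matches)
```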
After that proof, we argued about vector functions of a set of vectors. Here it turns out that there are a lot more options if you want your answer to be equivariant (not invariant but equivariant) to rotations than if you want your answer to be equivariant to rotations and parity swaps. We still don't know what our options are, but because it's so restrictive, I think parity is a good symmetry to include.
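To make the rotation-versus-parity distinction concrete, here is a small numerical sketch, assuming numpy. The two functions are my own illustrative choices, not a characterization of all the options: a sum of the input vectors weighted by dot-product scalars, and a cross product. The first is equivariant to both rotations and parity; the second is equivariant to rotations but fails under parity (it is a pseudovector).

```python
import numpy as np

def weighted_sum(vectors):
    # sum_i w_i v_i, where the weights w_i are built only from dot products
    weights = (vectors @ vectors.T).sum(axis=1)
    return weights @ vectors

def cross_of_first_two(vectors):
    # the cross product of the first two vectors (3-d only)
    return np.cross(vectors[0], vectors[1])

def is_equivariant(func, vectors, transform):
    # equivariance: transforming the inputs transforms the output the same way
    return bool(np.allclose(func(vectors @ transform.T), transform @ func(vectors)))

rng = np.random.default_rng(0)
vectors = rng.normal(size=(4, 3))
theta = 0.7
rotation = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                     [np.sin(theta), np.cos(theta), 0.0],
                     [0.0, 0.0, 1.0]])
parity = -np.eye(3)

results = {
    "weighted_sum": (is_equivariant(weighted_sum, vectors, rotation),
                     is_equivariant(weighted_sum, vectors, parity)),
    "cross": (is_equivariant(cross_of_first_two, vectors, rotation),
              is_equivariant(cross_of_first_two, vectors, parity)),
}
print(results)
```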
I had a great conversation with Andy Casey (Monash) at the end of the day. We discussed many things related to APOGEE and SDSS-V. One of the things I need is the code that makes the synthetic (physical model) spectra for the purposes of obtaining parameter estimates in APOGEE and the derivatives of that model with respect to stellar parameters. That is, I want the physical-model derivatives of spectral expectation with respect to parameters (like temperature, surface gravity, and composition). It turns out that, at this point, the model is a set of synthetic spectra generated on a grid in parameter space! So the model is the grid, and the derivatives are the slopes of a cubic-spline interpolation (or something like that). I have various issues with this, but I'll be fine.
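What "the model is the grid" means for derivatives can be sketched like this, assuming numpy. Everything here is a toy: the one-line "spectrum", the temperature grid, and the pixel grid are invented, and I use simple finite differences along the grid (`np.gradient`) as a stand-in for the cubic-spline slopes.

```python
import numpy as np

def toy_spectrum(teff, wavelengths):
    # hypothetical one-line spectrum whose line depth depends on temperature
    depth = 0.5 * np.exp(-(teff - 4000.0) / 1500.0)
    line = np.exp(-0.5 * ((wavelengths - 16000.0) / 2.0) ** 2)
    return 1.0 - depth * line

wavelengths = np.linspace(15990.0, 16010.0, 41)
teff_grid = np.arange(3500.0, 6001.0, 250.0)

# the "model" is just the set of synthetic spectra on the grid
grid_spectra = np.array([toy_spectrum(t, wavelengths) for t in teff_grid])

# derivative of every pixel with respect to temperature, from grid differences
grid_derivs = np.gradient(grid_spectra, teff_grid, axis=0)

# compare to the analytic derivative of the toy model at one grid point
index = 4  # teff_grid[4] is 4500 K
line = np.exp(-0.5 * ((wavelengths - 16000.0) / 2.0) ** 2)
analytic = (0.5 / 1500.0) * np.exp(-(teff_grid[index] - 4000.0) / 1500.0) * line
max_abs_error = np.max(np.abs(grid_derivs[index] - analytic))
print(max_abs_error)
```

The derivative you get this way is only as good as the grid spacing, which is one of the issues with treating a grid as a model.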
I've had the pleasure of serving on the PhD committee of Shengqi Yang (NYU) who defended her PhD today. She worked on a range of topics in cosmological intensity mapping, with a concentration on the aspects of galaxy evolution and galaxy formation that are important to understand in connecting the intensity signal to the cosmological signal. But her thesis was amazingly broad, including theoretical topics and making observational measurements, and also ranging from galaxy evolution to tests of gravity. Great stuff, and a well-earned PhD.
Today Jason Cao (NYU) defended his PhD on the galaxy–halo connection in cosmology. He has built a version of subhalo abundance matching that has a stochastic component, so he can tune the information content in the galaxies about their host halos. This freedom in the halo occupation permits the model to match more observations, and it is sensible. He also explored a bit the properties of the dark-matter halos that might control halo occupation, but he did so observationally, using satellite occupation as a tracer of halo properties. These questions are all still open, but he did a lot of good work towards improving the connection between the dark sector and the observed galaxy populations. Congratulations, Dr Cao; welcome to the community of scholars!
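For readers who haven't seen abundance matching, here is a toy sketch of the general idea with a tunable stochastic component, assuming numpy. This is not Cao's model: the halo masses, the luminosities, the scatter parameter `scatter_dex`, and the function names are all invented for illustration. Halos are rank-ordered by a noisy version of their mass and matched to rank-ordered luminosities; more noise means the galaxies carry less information about their halos.

```python
import numpy as np

def abundance_match(log_halo_mass, luminosities, scatter_dex, rng):
    # perturb the halo property, then match ranks to ranks
    noisy_mass = log_halo_mass + scatter_dex * rng.normal(size=log_halo_mass.size)
    order = np.argsort(noisy_mass)
    assigned = np.empty_like(luminosities)
    assigned[order] = np.sort(luminosities)
    return assigned

def rank_correlation(a, b):
    # correlation coefficient of the ranks of a and b
    rank_a = np.argsort(np.argsort(a))
    rank_b = np.argsort(np.argsort(b))
    return np.corrcoef(rank_a, rank_b)[0, 1]

rng = np.random.default_rng(3)
log_halo_mass = rng.uniform(11.0, 15.0, size=5000)
luminosities = rng.lognormal(mean=0.0, sigma=1.0, size=5000)

tight = abundance_match(log_halo_mass, luminosities, 0.0, rng)
loose = abundance_match(log_halo_mass, luminosities, 0.5, rng)
corr_tight = rank_correlation(log_halo_mass, tight)
corr_loose = rank_correlation(log_halo_mass, loose)
print(corr_tight, corr_loose)
```

Note that the luminosity distribution is preserved exactly in both cases; only the assignment of luminosities to halos changes.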
At the end of the day I met with Conor Sayres (UW) to discuss the problem of measuring the position of focal-plane fiber-carrying robots given images from the focal viewing cameras inside the telescopes that are operating the SDSS-V Project. We have not installed the fiber robots yet, but Sayres has a software mock-up of what the focal viewing camera will see and all its optics. We also discussed some of the issues we will encounter in commissioning and operation of this viewing system.
Later, in the night, I worked on data-driven transformations between focal-plane position (in mm) in the telescope focal plane and position in the focal viewing camera detector plane (in pixels). I followed the precepts and terminology described in this paper on interpolation-like problems. My conclusion (which agrees with Sayres's) is that if these simulations are realistic, the fitting will work well, and we will indeed know pretty precisely what all the fiber robots are doing.
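A data-driven transformation of this kind can be sketched as a linear least-squares fit, assuming numpy. The `fake_camera` distortion, the noise level, the polynomial degree, and the 300 mm scaling are all invented here; this is not Sayres's mock-up and not necessarily the basis I used.

```python
import numpy as np

def design_matrix(xy, degree):
    # all monomials x^i y^j with i + j <= degree
    x, y = xy[:, 0], xy[:, 1]
    columns = [x ** i * y ** j
               for i in range(degree + 1)
               for j in range(degree + 1 - i)]
    return np.stack(columns, axis=1)

def fake_camera(xy_mm):
    # hypothetical optics: offset, scale, and a mild radial distortion
    r2 = np.sum(xy_mm ** 2, axis=1, keepdims=True)
    return 4000.0 + 10.0 * xy_mm * (1.0 + 1e-6 * r2)

rng = np.random.default_rng(17)
scale = 300.0  # mm; rescale inputs so the fit is well conditioned
degree = 3

# training set: known focal-plane positions and noisy pixel centroids
train_mm = rng.uniform(-300.0, 300.0, size=(200, 2))
train_pix = fake_camera(train_mm) + 0.05 * rng.normal(size=(200, 2))
coeffs, *_ = np.linalg.lstsq(design_matrix(train_mm / scale, degree),
                             train_pix, rcond=None)

# held-out test: predict pixel positions for new focal-plane positions
test_mm = rng.uniform(-300.0, 300.0, size=(50, 2))
predicted = design_matrix(test_mm / scale, degree) @ coeffs
rms = np.sqrt(np.mean((predicted - fake_camera(test_mm)) ** 2))
print(rms)  # rms prediction error in pixels
```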
Today Kate Storey-Fisher (NYU) and I spent more time working with Weichi Yao (NYU) and Soledad Villar (JHU) on creating a good, compact, but real test problem for gauge-invariant graph neural networks. We discussed a truly placeholder toy example in which we ask the network to figure out the identity of the most-gravitationally-bound point in a patch of a simulation. And we discussed a more real problem of inferring things about the occupation or locations of galaxies within the dark-matter field. Tomorrow Storey-Fisher and I will look at the IllustrisTNG simulations, which she has started to dissect into possible patches for Yao's model.
A lot of conversations in the Dynamics group at Flatiron recently have been about spirals: Spirals in phase space, spirals in the disk, even spirals in the halo. In general, as a perturbed dynamical system (like a galaxy or a star cluster) evolves towards steady-state, it goes through a (or more than one) spiral phase. We've (collectively) had an interest in unwinding these spirals, to infer the initial conditions or meta-data about the events that caused the disequilibrium and spiral-winding. Jason Hunt (Flatiron) discussed these problems with Adrian Price-Whelan (Flatiron) and me today, showing some attempts to unwind (what I call) The Snail. That led to a long conversation about what would make a good “loss function” for unwinding. If something was unwinding well, how would we know? That led to some deep conversations.
I got some real hacking time in this afternoon with Gaby Contardo (Flatiron). We worked through some of the code issues and some of the conceptual issues behind our methods for finding gaps in point clouds using (what I call) geometric data analysis, in which we find critical points (saddles, minima, maxima) and trace their connections to map out valleys and ridges. We worked out a set of procedures (and tested some of them) to find critical points, join them up with constrained gradient descents, and label the pathways with local meta-data that indicate how “gappy” they are.
Adrian Price-Whelan is building a next-generation spectrophotometric distance estimation method that builds on things that Eilers, Rix, and I did many moons ago. Price-Whelan's method splits the stars up in spectrophotometric space and builds local models for different kinds of stars. But within those local patches, it is very similar to what we've done before, just adding some (very much) improved regularization and a (very much) improved training set. And now it looks like we might be at the few-percent level in terms of distance precision! If we are, then the entire red-giant branch might be just as good for standard-candlyness as the red clump. This could really have a big impact on SDSS-V. We spent part of the day making decisions about spectrophotometric neighborhoods and other methodological hyper-parameters.
Today we had the second in a series of telecons to discuss how we get, confirm, adjust, and maintain the mapping, in the SDSS-V focal planes (yes there are two!) between the commands we give to the fiber-carrying robots and the positions of the target stellar images. It's a hard problem! As my loyal reader might imagine, I am partial to methods that are fully data-driven, and fully on-sky, but their practicality depends on a lot of prior assumptions we need to make about the variability and flexibility of the system. One thing we sort-of decided is that it would be good to get together a worst-case-scenario plan for the possibility that we install these monsters and we can't find light down the fibers.
I am in a project with Weichi Yao (NYU) and Soledad Villar (JHU) to look at building machine-learning methods that are constrained by the same symmetries as Newtonian mechanics: Rotation, translation, Galilean boost, and particle exchange, for example. Kate Storey-Fisher (NYU) joined our weekly call today, because she has ideas about toy problems we could use to demonstrate the value of encoding these symmetries. She steered us towards things in the area of “halo occupation”, or the question of which dark-matter halos contain what kinds of galaxies. Right now halo occupation is performed with very blunt tools, and maybe a sharp tool could do better? We would have the advantage (over others) that anything we found would, by construction, obey the fundamental symmetries of physical law.
At the end of the day I had a wide-ranging conversation with Andy Casey (Monash) about all things spectroscopic. I mentioned to him my new interest in domain adaptation, and whether it could be used to build data-driven models. The SDSS-V project has two spectrographs, at two different telescopes, each of which observes stars down different fibers (which have their own idiosyncrasies). Could we build a data-driven model to see what any star observed down one fiber of one spectrograph would look like if it had been observed down any other fiber or any fiber of the other spectrograph? That would permit us to see what systematics are spectrograph-specific, and whether we would have got the same answers with the other spectrograph, and other questions like that.
There are some stars observed multiple times and by both observatories, but I'm kind-of interested in whether we could do better using the huge number of stars that haven't been observed twice instead. Indeed, it isn't clear which contains more information about the transformations. Another fun thing: The northern sky and the southern sky are different! We would have to re-build domain adaptation to be sensitive to those differences, which might get into causal-inference territory.
Over the last few weeks—and the last few decades—I have had many conversations about all the things that are way more important to being a successful astrophysicist than facility with electromagnetism and quantum mechanics: There's writing, and mentoring, and project design, and reading, and visualization, and so on. Today I fantasized about a (very long) book entitled The Practice of Astrophysics that covers all of these things.
Adrian Price-Whelan (Flatiron) and I encountered an interesting conceptual point today in our distance estimation project: When you are doing cross-validation to set your hyper-parameters (a regularization strength in this case), what do you use as your validation scalar? That is, what are you optimizing? We started by naively optimizing the cost function, which is something like a weighted L2 of the residual and an L2 of the parameters. But then we switched from the cost function to just the data part (not the regularization part) of the cost function, and everything changed! The point is duh, actually, when you think about it from a Bayesian perspective: You want to improve the likelihood not the posterior pdf. That's another nice point for my non-existent paper on the difference between a likelihood and a posterior pdf. It also shows that, in general, the data and the regularization will be at odds.
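Here is a minimal sketch of the two possible validation scalars, assuming numpy. It is a toy ridge regression with made-up data, not our distance model; the function names and the grid of regularization strengths are invented. The fit always minimizes the full cost on the training half, but the held-out half can be scored with either the full cost or the data term alone.

```python
import numpy as np

def fit_ridge(X, y, ivar, lam):
    # minimize sum_n ivar_n (y_n - x_n . beta)^2 + lam * beta . beta
    A = X.T @ (ivar[:, None] * X) + lam * np.eye(X.shape[1])
    return np.linalg.solve(A, X.T @ (ivar * y))

def data_term(X, y, ivar, beta):
    # the likelihood-like part: inverse-variance-weighted squared residuals
    return np.sum(ivar * (y - X @ beta) ** 2)

def full_cost(X, y, ivar, beta, lam):
    # the posterior-like objective: data term plus regularization
    return data_term(X, y, ivar, beta) + lam * (beta @ beta)

rng = np.random.default_rng(5)
n, p, sigma = 60, 30, 1.0
X = rng.normal(size=(n, p))
y = X @ (0.3 * rng.normal(size=p)) + sigma * rng.normal(size=n)
ivar = np.full(n, 1.0 / sigma ** 2)
train = np.arange(n) % 2 == 0
valid = ~train

lams = 10.0 ** np.arange(-3, 4)
data_scores, cost_scores = [], []
for lam in lams:
    beta = fit_ridge(X[train], y[train], ivar[train], lam)
    data_scores.append(data_term(X[valid], y[valid], ivar[valid], beta))
    cost_scores.append(full_cost(X[valid], y[valid], ivar[valid], beta, lam))

best_by_data = lams[np.argmin(data_scores)]
best_by_cost = lams[np.argmin(cost_scores)]
print(best_by_data, best_by_cost)
```

The two choices need not agree, because the full cost includes a regularization term that itself depends on the hyper-parameter you are trying to set.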
Sarah Blunt (Caltech) crashed Stars & Exoplanets Meeting today. She told us about her ambitious, community-built orbitize project, and also results on a mysterious binary-star system, HD 104304. This is a directly-imaged binary, but when they took radial-velocity measurements, the mass of the primary came out way too high for its color and luminosity. The beauty of orbitize is that it can take heterogeneous data, and it uses brute-force importance sampling (like my one true love The Joker), so she can deal with very non-trivial likelihood functions and low signal-to-noise, sparse data.
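The brute-force idea can be sketched in a few lines, assuming numpy. This is a toy sinusoid with invented priors, noise, and sample size; it is not the orbitize or The Joker code or API. You draw a huge number of samples from the prior, evaluate the likelihood at each, and use the likelihoods as importance weights.

```python
import numpy as np

def model(times, period, phase, amplitude=1.0):
    # toy sinusoidal signal; broadcasts over arrays of periods and phases
    return amplitude * np.sin(2.0 * np.pi * times / period + phase)

rng = np.random.default_rng(8)
sigma = 0.5
times = np.sort(rng.uniform(0.0, 100.0, size=8))  # sparse, irregular sampling
data = model(times, 12.3, 1.0) + sigma * rng.normal(size=times.size)

# draw many samples from the prior: log-uniform in period, uniform in phase
n_samples = 200_000
periods = np.exp(rng.uniform(np.log(2.0), np.log(50.0), size=n_samples))
phases = rng.uniform(0.0, 2.0 * np.pi, size=n_samples)

# Gaussian log-likelihood of the data at every prior sample
predictions = model(times[None, :], periods[:, None], phases[:, None])
log_like = -0.5 * np.sum(((data[None, :] - predictions) / sigma) ** 2, axis=1)

# importance weights proportional to the likelihood, normalized to sum to one
weights = np.exp(log_like - log_like.max())
weights /= weights.sum()
effective_sample_size = 1.0 / np.sum(weights ** 2)
print(effective_sample_size)
```

Nothing here requires the likelihood to be unimodal or well behaved, which is the point; the cost is that the effective sample size can be a tiny fraction of the number of prior draws.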
The crowd had many reactions, one of which is that probably the main issue is that ESA Gaia is giving a wrong parallax. That's a boring explanation, but it opens a nice question of using the data to infer or predict a distance, which is old-school fundamental astronomy.
I had a nice meeting (in person, gasp!) with Alberto Bolatto (Maryland) about his beautiful results in the EDGE-CALIFA survey of galaxies, and (yes) patches of galaxies. Because they have an IFU, they can look at relationships between gas, dust, composition, temperature, star-formation rate, mean stellar age, and so on, both within and across galaxies. He asked me about some difficult situations in understanding empirical correlations in a high dimensional space, and (even harder) how to derive causal conclusions. As my loyal reader might guess, I wasn't much help! I handed him a copy of Regression and Other Stories and told him that it's going to get harder before it gets easier! But damn what a beautiful data set.
Against my better judgement, I am writing a paper on the question of whether we live inside a computer simulation. Today I was discussing this with Paula Seraphim (NYU), who has been doing research with me on this subject. We decided to re-scope the paper around the question “Is the simulation hypothesis a physics question?” instead of the direct question “Do we live in a simulation?”, which can't be answered very satisfactorily. But I think when you flow it down, you conclude that this question is, indeed, a physics question! And the simulation hypothesis motivates searches for new physics in much the same way that dark matter and inflation do: The predictions are not specific, but there are general signatures to look for.
I am trying to re-state the problem of putting labels on SDSS-IV APOGEE spectra as a transfer learning problem, since the labels come from (slightly wrong) stellar models. Or maybe domain adaptation. But the form of the problem we face in astronomy is different from that faced in most domain-adaptation contexts. The reasons are: The simulated stars are on a grid, not (usually) drawn from a realistically correct distribution. There are only labels on the simulated data, not on the real data (labels only get to real data through simulated data). And there are selection effects and noise sources that are unique to astronomy.
Building on conversations we had yesterday about the geometry and topology of gradients of a scalar field, Gaby Contardo (Flatiron) and I worked out at the end of the day today that valleys of a density field (meaning here a many-times differentiable smooth density model in some d-dimensional space) can be traced by looking for paths along which the density gradient has zero projection onto the principal component (largest-eigenvalue eigenvector) of the second-derivative tensor (the Hessian, to some). We looked at some toy-data examples and this does look promising as a technique for tracing or finding gaps or low-density regions in d-dimensional point clouds.
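The valley criterion can be sketched numerically like this, assuming numpy. The two-ridge `density` is a toy of my own, and the derivatives are plain finite differences; none of this is our actual code. One assumption I'm adding for the sketch: I also look at the sign of the largest Hessian eigenvalue, since a positive value means the density curves upward across the path.

```python
import numpy as np

def density(point):
    # toy density: two ridges at x = -1.5 and x = +1.5 with a valley along x = 0
    x, y = point
    ridges = np.exp(-0.5 * (x - 1.5) ** 2) + np.exp(-0.5 * (x + 1.5) ** 2)
    return ridges * np.exp(-y ** 2 / 50.0)

def gradient(func, point, h=1e-4):
    # central finite-difference gradient
    point = np.asarray(point, dtype=float)
    grad = np.zeros(point.size)
    for i in range(point.size):
        step = np.zeros(point.size)
        step[i] = h
        grad[i] = (func(point + step) - func(point - step)) / (2.0 * h)
    return grad

def hessian(func, point, h=1e-3):
    # finite-difference Hessian built from gradients, symmetrized
    point = np.asarray(point, dtype=float)
    hess = np.zeros((point.size, point.size))
    for i in range(point.size):
        step = np.zeros(point.size)
        step[i] = h
        hess[:, i] = (gradient(func, point + step)
                      - gradient(func, point - step)) / (2.0 * h)
    return 0.5 * (hess + hess.T)

def valley_statistic(func, point):
    # projection of the gradient onto the largest-eigenvalue Hessian eigenvector
    grad = gradient(func, point)
    eigenvalues, eigenvectors = np.linalg.eigh(hessian(func, point))
    principal = eigenvectors[:, -1]  # eigh returns eigenvalues in ascending order
    return float(principal @ grad), float(eigenvalues[-1])

on_valley, curvature_on = valley_statistic(density, [0.0, 1.0])
off_valley, curvature_off = valley_statistic(density, [0.4, 1.0])
print(on_valley, off_valley)
```

At the on-valley point the gradient is not zero (the density still changes along the valley), so this is not a critical point; it is only the component across the valley that vanishes.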
Teresa Huang's paper with Soledad Villar and me got a very constructive referee report, which led to some discoveries, which led to more discoveries, which led to a massive revision and increase in scope. And all under deadline, as the journal gave us just 5 weeks to respond. It is a really improved paper, thanks to Huang's great work and the referee's inspiration. Today we went through the changes. It's hard to take a paper through a truly major revision: Everything has to change, including the parts that didn't change! Because: Writing!