latent variables in generative models

I worked a bit on my project to compare generative and discriminative models. As my loyal reader knows, I love generative models and am suspicious of discriminative models, and I'm trying to understand whether my prejudices are justified in any way. I have been thinking about this in terms of adversarial attacks, and also in terms of information theory. I keep finding (to my delight) that discriminative models are worse on both counts. However, I have been unfair, because I have been fitting generative models that are precisely matched to how the data are themselves generated. I worked out a more general setting today, where the generative model will be appropriately and realistically wrong; will it still be more robust against attacks? The main idea is that in the real world, there are latent variables you can't know, even for your training data. And you not only don't know these latents, you don't even know how many there are, or how nonlinear their effects are.
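To make this concrete for myself, here is a minimal sketch of the setting. Everything here is my own toy construction (the dimensions, the tanh nonlinearity, the linear-Gaussian fit are all arbitrary choices, not the actual project): each datum x is generated from a label y plus K hidden latents z entering nonlinearly, and the fitter sees only (x, y), never z or K, so a mis-specified linear-Gaussian generative model has to absorb the latent structure into its noise estimate.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setting (my construction, not the actual project):
# each datum x is generated from a label y plus K hidden latents z,
# entering nonlinearly; the fitter sees only (x, y), never z or K.
n, d, K = 500, 10, 3
y = rng.normal(size=n)
z = rng.normal(size=(n, K))              # latents: unknown to any model
B = rng.normal(size=(K, d))
a = rng.normal(size=d)
x = (np.outer(y, a)                      # signal from the label
     + np.tanh(z @ B)                    # nonlinear latent contribution
     + 0.1 * rng.normal(size=(n, d)))    # true observation noise, 0.1

# A deliberately mis-specified generative model: x | y is linear-Gaussian,
# so the latent structure gets absorbed into an inflated noise estimate.
a_hat = np.linalg.lstsq(y[:, None], x, rcond=None)[0].ravel()
noise_hat = np.std(x - np.outer(y, a_hat), axis=0)
```

The per-dimension `noise_hat` comes out far above the true 0.1, which is the realistic kind of wrongness: the model is not just noisy, it is structurally blind to the latents.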



resolving frequency differences

For the first time in the quarantine, I did some actual direct research myself. I made a Jupyter (tm) notebook in which I simulate and analyze a time-domain signal containing an asteroseismic-like forest of coherent oscillators. I then use likelihood methods to see if I can extract or infer the frequency spacing of the coherent modes. The answers are a bit messy, but I think it is possible to measure frequency differences below the “uncertainty-principle” naive limit. That is, I think we (Bonaca and I in this case, but I have also worked on this problem with Feeney, Foreman-Mackey, and others) can resolve differences well below 1/T, where T is the duration of the full set of observations. That is, I think we can do better than the usual method of taking a periodogram and looking at the distances between peaks.
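The basic trick can be sketched in a few lines. This is my own minimal version, not the notebook itself, and it cheats by assuming the central frequency f0 is known so that only the spacing df is fit: for each candidate df, solve linearly for the mode amplitudes and phases, and take the chi-squared minimum over df. All the specific numbers are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(17)

# Simulate two coherent modes separated by df < 1/T.
T = 100.0                       # total time baseline, so 1/T = 0.01
t = np.sort(rng.uniform(0.0, T, 1000))
f0, df_true = 1.0, 0.004        # spacing well below the naive 1/T limit
sigma = 0.5                     # per-point noise level
y = (np.sin(2 * np.pi * f0 * t)
     + 0.8 * np.sin(2 * np.pi * (f0 + df_true) * t + 0.3)
     + sigma * rng.normal(size=t.size))

def chi2(df):
    """Best-fit chi^2 with modes at f0 and f0 + df (amplitudes are linear)."""
    A = np.column_stack([np.sin(2 * np.pi * f0 * t),
                         np.cos(2 * np.pi * f0 * t),
                         np.sin(2 * np.pi * (f0 + df) * t),
                         np.cos(2 * np.pi * (f0 + df) * t)])
    resid = y - A @ np.linalg.lstsq(A, y, rcond=None)[0]
    return resid @ resid / sigma ** 2

dfs = np.linspace(0.001, 0.009, 161)
df_hat = dfs[np.argmin([chi2(df) for df in dfs])]
```

At high signal-to-noise the likelihood localizes df far more tightly than the 1/T periodogram resolution would suggest, because the two modes accumulate a measurable relative phase drift across the baseline even when their periodogram peaks would blend.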


instrument epochs

In another not-much-research-day, I did get in an interesting call with Megan Bedell (Flatiron) and Lily Zhao (Yale) about our project to precisely calibrate the EXPRES spectrograph. Zhao is using our dimensionality reduction to look at instrument changes. She can use it to split the months of instrument use into sensible (what we call) epochs. Each of these epochs has a wavelength calibration with a sensible, low-dimensional representation. So the value of the dimensionality reduction is not just to make the calibration hierarchical, but also to find change points and—more generally—put eyes on the data.


information theory for machine learning

I met early (by videocon, of course) with Teresa Huang (NYU) and Soledad Villar (NYU) to talk about our projects to develop adversarial attacks against regressions of discriminative and generative forms. We ended up talking a bit about information theory. I gave my minimal description of Fisher Information. I was recalling that I was taught some of that back in my PhD, but I forgot it all and re-learned it by using it in real data analyses. I feel like it would be a good subject for an arXiv-only post.

The question at hand today was this: You are given a set of data x that contain information about some quantity y. For a training subset, you are also given labels y, which are noisy. That is, the labels you are given do not exactly match the true values of y. Which contains more information about the true labels: the labels you are given, or the data? This is a question answerable (under exceedingly strong assumptions) within information theory.
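In the jointly Gaussian case the answer reduces to arithmetic, because the mutual information between the true y and any Gaussian-noisy view of it is 0.5 ln(1 + SNR^2). Here is a toy numerical version; the setup (scalar Gaussian y, one noisy label, n independent noisy pixels of data) and all the numbers are my own assumptions, not anything from today's discussion.

```python
import numpy as np

# Toy Gaussian setting (my assumption):
# true y ~ N(0, s^2); label = y + N(0, sig_label^2);
# data = y + N(0, sig_x^2) per pixel, with n independent pixels.
s, sig_label, sig_x, n = 1.0, 0.5, 2.0, 64

def mutual_info_nats(snr2):
    # I(y; z) = 0.5 ln(1 + snr2) for jointly Gaussian y and z
    return 0.5 * np.log1p(snr2)

I_label = mutual_info_nats(s ** 2 / sig_label ** 2)
# n iid pixels: the sufficient statistic is the mean, with variance sig_x^2 / n
I_data = mutual_info_nats(n * s ** 2 / sig_x ** 2)
```

With these numbers the data carry more information than the label, even though each individual pixel is four times noisier: the pixel count wins. Flip the numbers and the label wins, which is the whole point of asking the question quantitatively.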


common-envelope binary stars

[Research barely proceeds during this pandemic. Don't take these posts to be evidence of a lot of research activity here.]

Mathieu Renzo (Flatiron) showed some calculations in Stars & Exoplanets meeting (now virtual) about how common-envelope stars might appear in LISA. The idea is that when stars with compact cores in binary systems orbitally decay, they hit a point at which they must merge into a single star. These sources might be in the LISA band for a while. My loyal reader knows that Adrian Price-Whelan (Flatiron) and I have found some very short-period binary companions in radial-velocity data from APOGEE; some too short-period to be orbiting outside their host stars! We have presumed that these signals are mis-classified asteroseismic oscillations. However, maybe they could be common-envelope? Renzo pointed out that there should be interesting spectral signatures if they are common-envelope. Let's check!


asteroseismology from the ground

By phone I discussed with Ana Bonaca (Harvard) this paper by Auge et al, which (very sensibly) looks at the possibility of doing asteroseismology from the ground. My loyal reader knows that this is something I have been thinking about for a long time (I think it is mentioned in the shouty slide deck linked to from this old blog post), and Bonaca has too. Auge et al show that they can get nu-max from the ground for very luminous (and hence large, and hence large-amplitude-oscillating, and hence long-period) giant stars. Can they also get delta-nu? They say that it is rarely possible. But their tool is something like a Fourier transform followed by searching for peaks. If your main goal were to determine delta-nu, this would not be the tool of choice, I think. Not that I have a tool ready! Bonaca and I resolved to take a look at this problem.
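One standard alternative to reading off individual peak positions is to exploit the regularity of the whole comb at once, for instance by autocorrelating the power spectrum: a comb with spacing delta-nu produces an autocorrelation peak at lag delta-nu even when individual peaks are weak. A minimal sketch, with entirely made-up frequencies and evenly spaced noiseless-cadence data for the sake of a plain FFT (real ground-based data would be irregularly sampled and need a proper periodogram):

```python
import numpy as np

rng = np.random.default_rng(42)

# Simulate a comb of seven modes with spacing dnu around nu_max (toy numbers).
dt = 0.05
t = np.arange(0.0, 200.0, dt)            # evenly sampled, baseline T = 200
nu_max, dnu = 2.0, 0.08
freqs = nu_max + dnu * np.arange(-3, 4)
y = sum(np.sin(2 * np.pi * f * t + rng.uniform(0, 2 * np.pi)) for f in freqs)
y += 0.5 * rng.normal(size=t.size)

# Power spectrum, then autocorrelation of the power spectrum:
# a comb with spacing dnu gives an autocorrelation peak at lag = dnu.
power = np.abs(np.fft.rfft(y)) ** 2
f_grid = np.fft.rfftfreq(t.size, d=dt)
p = power - power.mean()
ac = np.correlate(p, p, mode="full")[p.size - 1:]   # non-negative lags only
dlag = f_grid[1] - f_grid[0]
lo, hi = int(0.04 / dlag), int(0.12 / dlag)         # search window, skip lag 0
dnu_hat = (lo + np.argmax(ac[lo:hi])) * dlag
```

This treats delta-nu as the primary parameter rather than a by-product of peak-finding; a full likelihood fit of the comb (frequencies, amplitudes, spacing) would be the next step up in principled-ness.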


modeling the Milky Way disk

[Okay now I'm really trying to get back to blogging my research. If you out there are having trouble getting things done: I hear you. I have done very little in this time of quarantine, and I'm trying to be kind to myself about it (and not always succeeding). Take care of yourselves out there.]

The only research I did today was a couple of phone calls. The first was with Christina Eilers (MIT) about determining and implementing a selection function for the APOGEE spectroscopic survey, and using it to measure the scale length of the disk. This is a bit of a boring project! But it would lead to lots of follow-on projects. A good selection function makes you very powerful! For example, the spiral structure we see in kinematics would become abundantly clear in stellar density if we had a selection function and an azimuthally-averaged mean model for the disk.

The other phone call was with Jason Hunt (Flatiron). He has a medium-term goal of applying the made-to-measure method of modeling stellar systems to the entirety of the ESA Gaia data set. I love that goal! We discussed changes to M2M to let it be more responsible with noisy and incomplete data. We resolved that Hunt would teach me M2M in our next (remote) meeting.


visualizing the kinematics of the disk

My only research today was reading and signing off on a paper by Jason Hunt (Flatiron) about the kinematics of stars in the Milky Way disk. His innovation was to plot the stars in something akin to action-angle coordinates (guiding-center-position coordinates). It's a good space to look at spiral structure and other velocity substructure. And to compare to simulations. The visualizations in the paper are lovely.


nothing; recovery?

[I've been in and out for many weeks dealing with family crises. Hence the interruption in a blog that has gone essentially uninterrupted since January 2005. Indeed, the interruption made me miss this blog's 15th birthday party on 2020 January 27. I hope to re-start this week. But posting might be spotty, because I'm still in recovery from those crises. Take care of yourselves out there.]

ps. I got nothing done today. But I'm being kind to myself about it!