Hogg's Research

2020-03-31

latent variables in generative models

I worked a bit on my project to compare generative and discriminative models. As my loyal reader knows, I love generative models and am suspicious of discriminative models, and I'm trying to understand whether my prejudices are justified in any way. I have been thinking about this in terms of adversarial attacks, and also in terms of information theory. I keep finding (to my delight) that discriminative models are worse on both counts. However, I have been unfair, because I have been fitting generative models that are precisely matched to how the data are themselves generated. I worked out a more general setting today, where the generative model will be appropriately and realistically wrong; will it still be more robust against attacks? The main idea is that in the real world, there are latent variables you can't know, even for your training data. And you not only don't know these latents, you don't even know how many there are, or how nonlinear their effects are.

2020-03-30

nope

No research of note today. It's a hard time.

2020-03-29

resolving frequency differences

For the first time in the quarantine, I did some actual direct research myself. I made a Jupyter (tm) notebook in which I simulate and analyze a time-domain signal containing an asteroseismic-like forest of coherent oscillators. I then use likelihood methods to see if I can extract or infer the frequency spacing of the coherent modes. The answers are a bit messy, but I think it is possible to measure frequency differences below the “uncertainty-principle” naive limit. That is, I think we (Bonaca and me in this case, but I have also worked on this problem with Feeney, Foreman-Mackey, and others) can resolve differences well below 1/T, where T is the duration of the full set of observations. That is, I think we can do better than the usual method of taking a periodogram and looking at the distances between peaks.
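For concreteness, here is a minimal sketch of the kind of experiment described above. It is not the notebook itself. Everything specific in it is an assumption made for illustration: exactly evenly spaced modes, a known lowest frequency `nu0`, a known number of modes, white Gaussian noise of known amplitude, and a profile likelihood (linear amplitudes fit by least squares at each trial spacing) standing in for whatever likelihood method was actually used.

```python
import numpy as np

def design_matrix(t, nu0, dnu, n_modes):
    """Cosine and sine columns for modes at nu0 + k * dnu, k = 0..n_modes-1."""
    freqs = nu0 + dnu * np.arange(n_modes)
    phases = 2.0 * np.pi * np.outer(t, freqs)
    return np.hstack([np.cos(phases), np.sin(phases)])

def profile_chi2(t, y, sigma, nu0, dnu, n_modes):
    """Chi-squared at trial spacing dnu, minimized over the linear amplitudes."""
    A = design_matrix(t, nu0, dnu, n_modes)
    coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)
    return np.sum((y - A @ coeffs) ** 2) / sigma**2

# Toy data (all values hypothetical): three modes separated by half of 1/T.
rng = np.random.default_rng(42)
T = 10.0                          # total duration of the observations
t = np.linspace(0.0, T, 2000)
nu0, dnu_true, n_modes, sigma = 3.0, 0.05, 3, 0.02
true_coeffs = np.array([1.0, 0.8, 0.6, 0.3, -0.5, 0.7])  # cos then sin amplitudes
y = design_matrix(t, nu0, dnu_true, n_modes) @ true_coeffs
y = y + sigma * rng.normal(size=t.size)

# Scan trial spacings; the minimum chi-squared is the maximum profile likelihood.
dnu_grid = np.arange(0.02, 0.1001, 0.0005)
chi2 = np.array([profile_chi2(t, y, sigma, nu0, d, n_modes) for d in dnu_grid])
dnu_hat = dnu_grid[np.argmin(chi2)]

print(f"1/T = {1.0 / T:.3f}, true dnu = {dnu_true:.4f}, estimated dnu = {dnu_hat:.4f}")
```

In this toy the spacing is half of 1/T, and the signal-to-noise is high. How far below 1/T one can push should depend strongly on the noise level and on how wrong the evenly-spaced assumption is.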

2020-03-27

instrument epochs

In another not-much-research day, I did get in an interesting call with Megan Bedell (Flatiron) and Lily Zhao (Yale) about our project to precisely calibrate the EXPRES spectrograph. Zhao is using our dimensionality reduction to look at instrument changes. She can use it to split the months of instrument use into sensible (what we call) epochs. Each of these epochs has a wavelength calibration with a sensible, low-dimensional representation. So the value of the dimensionality reduction is not just to make the calibration hierarchical, but also to find change points and, more generally, put eyes on the data.

2020-03-26

information theory for machine learning

I met early (by videocon, of course) with Teresa Huang (NYU) and Soledad Villar (NYU) to talk about our projects to develop adversarial attacks against regressions of discriminative and generative forms. We ended up talking a bit about information theory. I gave my minimal description of Fisher Information. I was recalling that I was taught some of that back in my PhD, but I forgot it all and re-learned it by using it in real data analyses. I feel like it would be a good subject for an arXiv-only post.

The question at hand today was this: You are given a set of data x that contain information about some quantity y. For a training subset, you are also given labels y, which are noisy. That is, the labels you are given do not exactly match the true values of y. Which contains more information about the true labels? The labels you are given or the data? This is a question answerable (under exceedingly strong assumptions) within information theory.
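To make "exceedingly strong assumptions" concrete, here is one toy version of the comparison, phrased in terms of Fisher information. The assumptions are mine, for illustration only: the data are a known linear function of the true y plus Gaussian noise with known covariance, x = a y + noise, and the given label is the true y plus Gaussian noise of known variance. Under those assumptions the information about y in the data is a^T C^{-1} a, and the information in the label is 1 / sigma_label^2. The function names and numbers below are hypothetical.

```python
import numpy as np

def fisher_info_from_data(a, C):
    """Fisher information about scalar y in x = a * y + noise, noise ~ N(0, C)."""
    return float(a @ np.linalg.solve(C, a))

def fisher_info_from_label(sigma_label):
    """Fisher information about y in a noisy label y + N(0, sigma_label**2)."""
    return 1.0 / sigma_label**2

# Toy numbers: 16 pixels, each weakly sensitive to y, with unit-variance noise.
a = 0.5 * np.ones(16)
C = np.eye(16)
info_data = fisher_info_from_data(a, C)
info_label = fisher_info_from_label(1.0)

print(f"information in data: {info_data:.2f}, information in label: {info_label:.2f}")
```

With these particular numbers the data carry more information about the true y than the given label does, but that is entirely a statement about the chosen a, C, and label noise. Change them and the answer flips.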

2020-03-25

common-envelope binary stars

[Research barely proceeds during this pandemic. Don't take these posts to be evidence of a lot of research activity here.]

Mathieu Renzo (Flatiron) showed some calculations in the Stars & Exoplanets meeting (now virtual) about how common-envelope stars might appear in LISA. The idea is that when stars with compact cores in binary systems orbitally decay, they hit a point at which they must merge into a single star. These sources might be in the LISA band for a while. My loyal reader knows that Adrian Price-Whelan (Flatiron) and I have found some very short-period binary companions in radial-velocity data from APOGEE; some too short-period to be orbiting outside their host stars! We have presumed that these signals are mis-classified asteroseismic oscillations. However, maybe they could be common-envelope? Renzo pointed out that there should be interesting spectral signatures if they are common-envelope. Let's check!

2020-03-24

asteroseismology from the ground

By phone I discussed with Ana Bonaca (Harvard) this paper by Auge et al, which (very sensibly) looks at the possibility of doing asteroseismology from the ground. My loyal reader knows that this is something I have been thinking about for a long time (I think it is mentioned in the shouty slide deck linked to from this old blog post), and Bonaca has too. Auge et al show that they can get nu-max from the ground for very luminous (and hence large, and hence large-amplitude-oscillating, and hence long-period) giant stars. Can they also get delta-nu? They say that it is rarely possible. But their tool is something like a Fourier transform followed by searching for peaks. If your main goal was to determine delta-nu, this would not be the tool of choice, I think. Not that I have a tool ready! Bonaca and I resolved to take a look at this problem.

2020-03-23

modeling the Milky Way disk

[Okay now I'm really trying to get back to blogging my research. If you out there are having trouble getting things done: I hear you. I have done very little in this time of quarantine, and I'm trying to be kind to myself about it (and not always succeeding). Take care of yourselves out there.]

The only research I did today was a couple of phone calls. The first was with Christina Eilers (MIT) about determining and implementing a selection function for the APOGEE spectroscopic survey, and using it to measure the scale length of the disk. This is a bit of a boring project! But it would lead to lots of follow-on projects. A good selection function makes you very powerful! For example, the spiral structure we see in kinematics would become abundantly clear in stellar density if we had a selection function and an azimuthally-averaged mean model for the disk.

The other phone call was with Jason Hunt (Flatiron). He has a medium-term goal of applying the made-to-measure method of modeling stellar systems to the entirety of the ESA Gaia data set. I love that goal! We discussed changes to M2M to let it be more responsible with noisy and incomplete data. We resolved that Hunt would teach me M2M in our next (remote) meeting.

2020-03-17

visualizing the kinematics of the disk

My only research today was reading and signing off on a paper by Jason Hunt (Flatiron) about the kinematics of stars in the Milky Way disk. His innovation was to plot the stars in something akin to action-angle coordinates (guiding-center-position coordinates). It's a good space to look at spiral structure and other velocity substructure. And to compare to simulations. The visualizations in the paper are lovely.

2020-03-16

nothing; recovery?

[I've been in and out for many weeks dealing with family crises. Hence the interruption in a blog that has gone essentially uninterrupted since January 2005. Indeed, the interruption made me miss this blog's 15th birthday party on 2020 January 27. I hope to re-start this week. But posting might be spotty, because I'm still in recovery from those crises. Take care of yourselves out there.]

ps. I got nothing done today. But I'm being kind to myself about it!

2020-02-27

writing about kinematics

My only research today was a tiny bit of writing in Christina Eilers's paper on kinematically measured spiral structure in the Milky Way disk.

2020-02-16

LVM self-calibration

I read and commented on some documents today related to the calibration of the Local Volume Mapper part of the SDSS-V family of projects. The project is an intensity-mapping project to observe the interstellar medium in the Milky Way and nearby galaxies, using one spectrograph but many different telescopes (with different apertures). It's clever! The question is: Does this project need calibration telescopes in addition to the science telescope? My position is that it doesn't. Well, calibration telescopes might be very useful for debugging things and understanding things! But at the end of the day, calibration will be self-calibration I bet. I'm offering very good odds.

One point is the following: When you have an imager or a spectrographic imager, you have to calibrate so that every exposure has calibration consistent with every other exposure, and every pixel has calibration consistent with every other pixel. Good! Now imagine you introduce a calibration telescope. Now you have to do the same for the calibration system, and you have to understand the cross-calibration between the systems (science and calibration). So it greatly increases the difficulty of the task, introduces new variables, and (usually) reduces precision of the final results. The self-consistency of the science data (provided that it is properly taken) is always the strongest constraint on calibration. See, for example, Planck, WMAP, SDSS, PanSTARRS, and so on.
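Here is a toy illustration of what self-calibration means in the simplest possible case. It is a sketch under my own assumptions, not anything from the LVM documents: a set of stars is observed in several exposures, each exposure has one unknown additive zero-point (in magnitudes), the noise is Gaussian, and there are no per-pixel terms at all. The unknown star magnitudes and exposure zero-points are fit simultaneously by linear least squares, using only the internal consistency of the science data. All names and numbers are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_stars, n_exp, sigma = 50, 8, 0.01

# Hypothetical truth: star magnitudes and per-exposure zero-point offsets.
true_mag = rng.uniform(10.0, 15.0, n_stars)
true_zp = rng.normal(0.0, 0.1, n_exp)

# Each star lands in each exposure with probability 0.6, so exposures overlap.
observed = rng.random((n_stars, n_exp)) < 0.6
star_idx, exp_idx = np.nonzero(observed)
meas = true_mag[star_idx] + true_zp[exp_idx] + sigma * rng.normal(size=star_idx.size)

# Linear model: measurement = star magnitude + exposure zero-point.
rows = np.arange(meas.size)
A = np.zeros((meas.size, n_stars + n_exp))
A[rows, star_idx] = 1.0
A[rows, n_stars + exp_idx] = 1.0

# The model is degenerate to a constant shift between stars and zero-points;
# lstsq returns one valid solution, and we fix the gauge by zeroing the mean.
params, *_ = np.linalg.lstsq(A, meas, rcond=None)
zp_hat = params[n_stars:] - params[n_stars:].mean()
zp_err = zp_hat - (true_zp - true_zp.mean())

print(f"max zero-point error: {np.max(np.abs(zp_err)):.4f} mag")
```

Note what the toy cannot do: the overall zero-point is unconstrained. The relative calibration comes from the overlaps between exposures, which is why the data have to be properly taken.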

2020-02-15

a survey to support EPRV target selection

In a very low research day, Megan Bedell (Flatiron) and I discussed proposals for inter-disciplinary and inter-group spectroscopic surveys of very bright stars. These would give general information about abundances, binarity, activity, variability, and suitability for further study (by, say, extreme precision radial-velocity projects). She has been thinking about target selection, and we discussed ways to make it very very simple. My position (as my loyal reader knows) is that it is better to be simple and somewhat inefficient than it is to be complex and very efficient. For legacy value, anyway, which is the whole point of a survey like this.

2020-02-14

Is the snail two snails?

Friday mornings in NYC usually start with a free-form meeting on the 11th floor of Flatiron. Today Spergel, Johnston, Gandhi, and Price-Whelan were all at the table. We began by discussing some of the accomplishments that have set the tone and agenda of the data-group and dynamics-group activities at Flatiron. Then we started to discuss what I call The Snail: The phase spiral found by Antoja in the ESA Gaia DR2 data. As my loyal reader knows, we are trying to use it to infer the dynamical properties of the Milky Way disk. And we would also like to use it to infer things about events in the recent past of the Milky Way. We discussed the possibility (suggested by the observations) that The Snail is not just one event but really two. It looks different when you look at stars with different angular momenta (different guiding centers, and hence different histories in their orbits around the Milky Way). In general the question is: Do Snails created in simulated galaxies look anything like the Snail we have?

2020-02-13

EURion

It was a fun morning with Zach Martin (NYU) and Teresa Huang (NYU), talking about adversarial attacks against astronomical machine-learning methods. In this context I mentioned the EURion. I said that money can't be photocopied. They didn't believe me. Am I a conspiracy theorist? Yes! But I'm right on this one. We went to the photocopier room and demonstrated the awesome that is secret agreements between electrostatic hardware companies and Western governments. But then Martin said “They went through all that trouble to stop what? Who photocopies money? What kind of a stupid scam is that?”