I had conversations with Nora Eisner (Flatiron) and Abby Shaum (CUNY) today about how we report the significance of a signal we find in a time series, in particular a periodic signal. It's an old, unsolved problem, with a lot of literature, and there are various hacks that are popular in the exoplanet community (and binary-star community!). My position is very simple: Since all methods for determining significance are flawed, and since when you fit a signal you also have to estimate an uncertainty on that signal's parameters, the simplest and most basic test of significance is the significance with which you measure the amplitude of the proposed signal. That is, if the amplitude is well measured (it is many times larger than its own uncertainty), the signal is real. Of course there are adversarial data sets I can make where this isn't true! But that's just a restatement of the point that this is an unsolved problem. For deep reasons!
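To make the amplitude test concrete, here is a minimal Python sketch. It is an illustration under assumptions, not something that came out of the conversations above: it assumes the period is already known, the noise is independent and Gaussian with known per-point uncertainty `sigma`, and the amplitude uncertainty can be obtained by linear error propagation. The function name `amplitude_snr` and the simulated data are made up for the example.

```python
import numpy as np

def amplitude_snr(t, y, sigma, period):
    """Weighted linear fit of a sinusoid at a fixed (assumed known) period.

    Returns (amplitude, amplitude uncertainty, amplitude / uncertainty).
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    omega = 2.0 * np.pi / period
    # Design matrix: sine term, cosine term, constant offset.
    X = np.column_stack([np.sin(omega * t), np.cos(omega * t), np.ones_like(t)])
    # Inverse-variance weights; sigma may be a scalar or one value per point.
    w = 1.0 / np.broadcast_to(np.asarray(sigma, dtype=float), t.shape) ** 2
    cov = np.linalg.inv(X.T @ (w[:, None] * X))  # parameter covariance
    beta = cov @ (X.T @ (w * y))                 # best-fit coefficients
    a, b = beta[0], beta[1]
    amp = np.hypot(a, b)
    # Linear propagation of the (a, b) covariance onto amp = sqrt(a^2 + b^2).
    # This breaks down when amp is comparable to its uncertainty.
    grad = np.array([a, b]) / amp
    amp_err = np.sqrt(grad @ cov[:2, :2] @ grad)
    return amp, amp_err, amp / amp_err

# Simulated example (all numbers here are arbitrary choices).
rng = np.random.default_rng(1)
t = np.sort(rng.uniform(0.0, 100.0, 200))
period, sigma = 7.3, 1.0
y = 1.0 * np.sin(2.0 * np.pi * t / period + 0.4) + 2.0 + rng.normal(0.0, sigma, t.size)

amp, amp_err, snr = amplitude_snr(t, y, sigma, period)
print(f"amplitude = {amp:.2f} +/- {amp_err:.2f} (ratio {snr:.1f})")
```

The ratio in the last line is the "significance" in the sense of this post. In a real search the period is also fit, which this sketch ignores.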
In practice, the biggest problem with this (or, actually, any) approach is how to account for any *other* variability in the data. In particular, it's extremely common for astrophysical time series to exhibit "red" (or, in any case, non-white) noise. Such time series can look amazingly (quasi-)periodic -- even though there is no preferred time-scale in them at all, just a power-law distribution of variability power. Unless this sort of variability is explicitly and carefully accounted for, fits with periodic components will typically favour statistically significant amplitudes, even if there is no such signal.
I guess that's all a long-winded way of saying that those "adversarial" data sets are pretty common and scary...
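The red-noise concern in the comment above can be illustrated with a small simulation. This is a hedged sketch, not anyone's actual analysis: it assumes a random walk as a stand-in for red noise, an arbitrary trial period, and a naive fit that treats the residuals as independent when estimating the amplitude uncertainty. The names `amplitude_ratio` and `fraction_significant` are made up for the example. Neither kind of series contains a periodic signal.

```python
import numpy as np

def amplitude_ratio(t, y, period):
    """Naive sinusoid fit: amplitude divided by its white-noise uncertainty."""
    omega = 2.0 * np.pi / period
    X = np.column_stack([np.sin(omega * t), np.cos(omega * t), np.ones_like(t)])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    # Per-point variance from the residuals, ASSUMING independent noise.
    # That assumption is exactly what red noise violates.
    s2 = resid @ resid / (len(t) - X.shape[1])
    cov = s2 * np.linalg.inv(X.T @ X)
    a, b = beta[0], beta[1]
    amp = np.hypot(a, b)
    grad = np.array([a, b]) / amp
    return amp / np.sqrt(grad @ cov[:2, :2] @ grad)

rng = np.random.default_rng(0)
n = 2000
t = np.arange(n, dtype=float)
trial_period = 1000.0  # arbitrary; no real signal exists at this period
n_trials = 200
threshold = 5.0        # "5-sigma" amplitude

def fraction_significant(make_series):
    """Fraction of signal-free simulations with a 'significant' amplitude."""
    hits = 0
    for _ in range(n_trials):
        hits += int(amplitude_ratio(t, make_series(), trial_period) > threshold)
    return hits / n_trials

# White noise: independent draws.
frac_white = fraction_significant(lambda: rng.normal(0.0, 1.0, n))
# Red noise: a random walk (cumulative sum of white noise).
frac_red = fraction_significant(lambda: np.cumsum(rng.normal(0.0, 1.0, n)))

print(f"white noise: {frac_white:.2f} of trials above threshold")
print(f"red noise:   {frac_red:.2f} of trials above threshold")
```

Under these assumptions the white-noise series should almost never pass the threshold, while most of the random-walk series should, even though they contain no preferred time-scale.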