2025-07-24

how significant is your anomaly?

So imagine that you have a unique data set Y, and in that data set Y you measure a bunch of parameters θ by a bunch of different methods. Then you find, in your favorite analysis, your estimate of one particular parameter is way out of line: All of physics must be wrong! How do you figure out the significance of your result?

If you only ever have data Y, you can't answer this question very satisfactorily: You searched Y for an anomaly, and now you want to use that same Y to test its significance. That's why so many a posteriori anomaly results end up going away: The search probably tested way more hypotheses than you think it did, so any quoted significance should be reduced accordingly.
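To get a feeling for how much the significance gets reduced, here is a minimal sketch. It assumes (and this is my assumption for illustration, not something any real search satisfies exactly) that the search amounted to n_trials independent tests with Gaussian noise. The function names are made up for this example.

```python
import math

def local_p_value(n_sigma):
    """Two-sided Gaussian tail probability for an n_sigma deviation."""
    return math.erfc(n_sigma / math.sqrt(2.0))

def global_p_value(p_local, n_trials):
    """Chance that at least one of n_trials independent tests reaches
    p_local under the null: 1 - (1 - p_local)**n_trials.
    Independence of the trials is an assumption of this sketch."""
    # log1p/expm1 keep the arithmetic stable when p_local is tiny
    return -math.expm1(n_trials * math.log1p(-p_local))

p_local = local_p_value(3.0)
for n_trials in (1, 10, 100, 1000):
    print(n_trials, global_p_value(p_local, n_trials))
```

Under these assumptions, a "3 sigma" result found after something like 100 implicit looks would show up by chance roughly a quarter of the time. The hard part in practice is that you rarely know n_trials.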

The best approach is to use only part of your data (somehow) to search, then use any anomaly you find to propose a specific hypothesis test, and then perform that test in the held-out or new data. That often isn't possible, or it is already too late. But if you can do this, then there is usually a likelihood ratio that is decisive about the significance of the anomaly!
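Here is a toy sketch of that search-then-test split. Everything in it is an assumption for illustration: the data are scalar Gaussian draws with known unit variance, the null says the mean is zero, the "search" is just taking the mean of the search half, and the function names are invented. A real analysis would have a much messier search step.

```python
import math
import random

def log_likelihood(data, mu, sigma=1.0):
    """Gaussian log-likelihood of the data for mean mu and known sigma."""
    norm = math.log(sigma * math.sqrt(2.0 * math.pi))
    return sum(-0.5 * ((y - mu) / sigma) ** 2 - norm for y in data)

def split_and_test(data, rng, null_mu=0.0, sigma=1.0):
    """Search one half for an anomaly; test it on the held-out half."""
    shuffled = list(data)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    search, held_out = shuffled[:half], shuffled[half:]
    # "Search": the search half proposes a fixed alternative hypothesis.
    proposed_mu = sum(search) / len(search)
    # "Test": log likelihood ratio, alternative vs null, on unseen data.
    log_ratio = (log_likelihood(held_out, proposed_mu, sigma)
                 - log_likelihood(held_out, null_mu, sigma))
    return proposed_mu, log_ratio

rng = random.Random(42)
data = [rng.gauss(0.5, 1.0) for _ in range(400)]  # fake data, true mean 0.5
proposed_mu, log_ratio = split_and_test(data, rng)
print(proposed_mu, log_ratio)
```

The point of the design is that the alternative hypothesis is frozen before the held-out data are touched, so the likelihood ratio is comparing two fully specified hypotheses and carries no hidden trials factor.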

I discussed all these issues today with Kate Storey-Fisher (Stanford) and Abby Williams (Chicago), as we are trying to finish a paper on the anomalous amplitude of the kinematic dipole in quasar samples.
