2021-02-05

more data, less good answer

I brought up the following issue at group meeting: When Lily Zhao (Yale) looks at how well spectral-shape changes predict radial-velocity offsets (in simulated spectroscopic data from a rotating star with time-dependent star spots), she finds that there are small segments of data that predict the radial-velocity offsets better than the whole data set does. That is, if you start with a good, small segment, and add data, your predictions get worse. Add data, do worse! This shouldn't be.

Of course, whenever this happens, it means there is something wrong with the model. But what to do to diagnose this and fix it? Most of the crowd was in support of what I might call “feature engineering”, in which we identify the best spectral regions and just use those. I don't like that solution, but it's easier to implement than a full shake-down of the model assumptions.
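
To make the "add data, do worse" symptom concrete, here is a minimal toy sketch. It is not Zhao's data or model: the setup is assumed (Gaussian random "pixels", a linear relation to the velocity offset, unregularized least squares, few training epochs), and the names `heldout_mse` and `compare_segment_to_full` are hypothetical. It shows only one way the symptom can arise, namely a model that spends its limited training data fitting coefficients for uninformative pixels.

```python
import numpy as np


def heldout_mse(X_train, y_train, X_test, y_test):
    """Fit ordinary least squares on the training set; return held-out mean squared error."""
    coeffs, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)
    return float(np.mean((X_test @ coeffs - y_test) ** 2))


def compare_segment_to_full(n_train=40, n_test=2000, n_good=5, n_extra=30, seed=17):
    """Compare held-out error using only the informative pixels vs. all pixels.

    Assumed toy setup: the first n_good columns carry signal, the remaining
    n_extra columns are pure noise with no relation to the target.
    """
    rng = np.random.default_rng(seed)
    n_total = n_train + n_test
    n_pix = n_good + n_extra

    # Fake "spectral shape" features, one row per epoch.
    X = rng.normal(size=(n_total, n_pix))

    # Only the good segment actually predicts the velocity offset.
    true_w = np.zeros(n_pix)
    true_w[:n_good] = 1.0
    y = X @ true_w + rng.normal(size=n_total)  # unit-variance measurement noise

    X_train, X_test = X[:n_train], X[n_train:]
    y_train, y_test = y[:n_train], y[n_train:]

    mse_segment = heldout_mse(X_train[:, :n_good], y_train, X_test[:, :n_good], y_test)
    mse_full = heldout_mse(X_train, y_train, X_test, y_test)
    return mse_segment, mse_full


if __name__ == "__main__":
    mse_segment, mse_full = compare_segment_to_full()
    print(f"good segment only: held-out MSE = {mse_segment:.2f}")
    print(f"all pixels:        held-out MSE = {mse_full:.2f}")
```

In this toy the full data set loses because the fit has nearly as many free coefficients as training epochs. Selecting the good segment by hand removes the problem, which is the feature-engineering fix; a prior or regularization on the coefficients would be a model-side fix. Whether either corresponds to what is going wrong in the real analysis is not something this sketch can say.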
