My day started with coffee with Paul Ginsparg (Cornell), who is the originator of the arXiv. He is also a faculty member in both Information Science and Physics. We discussed a wide range of things, but we ended up at experiments we could do inside the arXiv, which is not just a project that transformed all of scientific publishing but also a huge repository of information about how literature is written and ideas are propagated. We discussed the things that NASA ADS and INSPIRE have that arXiv doesn't, like, for instance, a citation graph and a concordance of different versions of papers. Completely randomly, we ran into Josh Greenberg (Sloan Foundation) at the Ithaca-to-NYC bus, and he agreed that the arXiv is an amazing source of empirical data about how publishing and science work (perhaps not surprisingly!). We tentatively agreed to explore ideas by email and see if anything catches.
On the bus ride home, I built a nucleosynthetic model of the detailed chemical abundances we are getting out of The Cannon. Right now there are various idiotic things about my model: It uses no physics inputs, and it is ridiculously slow. However, it is a skeleton on which we could build an interpretable, physical model of how the stars got their elements. The idea I have in mind is to build data-driven yield “vectors”, but to build them as perturbations on theoretically computed yield vectors, and thereby preserve some aspects of interpretability relative to a truly free model.
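To make the perturbation idea concrete, here is a minimal sketch in Python. It is not the model I built on the bus. It assumes (my assumptions, for illustration only) that each star's linear abundances are a sum of a few processes, each with a per-star amplitude and a shared yield vector, and that each yield vector is a theoretical vector plus an additive perturbation that is penalized by a ridge term. The function name `fit_yield_perturbations`, the penalty strength `lam`, and the alternating least-squares scheme are all hypothetical choices, and the sketch ignores things a real model would need, like non-negative amplitudes and measurement uncertainties.

```python
import numpy as np


def fit_yield_perturbations(X, Y0, lam=10.0, n_iter=50):
    """Fit X ~ A @ (Y0 + D) by alternating least squares (illustrative sketch).

    X   : (n_stars, n_elements) abundances, assumed linear (not logarithmic).
    Y0  : (n_processes, n_elements) theoretically computed yield vectors.
    lam : ridge penalty on the perturbations D; a large lam pins the yields to Y0.

    Returns the amplitudes A (n_stars, n_processes) and the perturbed yields Y0 + D.
    """
    n_processes = Y0.shape[0]
    D = np.zeros_like(Y0, dtype=float)
    for _ in range(n_iter):
        Y = Y0 + D
        # Amplitude step: ordinary least squares for A at fixed yields.
        A = np.linalg.lstsq(Y.T, X.T, rcond=None)[0].T
        # Perturbation step: ridge regression for D at fixed amplitudes,
        # minimizing ||X - A @ Y0 - A @ D||^2 + lam * ||D||^2.
        R = X - A @ Y0
        D = np.linalg.solve(A.T @ A + lam * np.eye(n_processes), A.T @ R)
    return A, Y0 + D


if __name__ == "__main__":
    # Synthetic demo: "true" yields are the theoretical ones, slightly perturbed.
    rng = np.random.default_rng(42)
    Y0 = rng.uniform(0.5, 1.5, size=(2, 6))
    Y_true = Y0 * (1.0 + 0.1 * rng.standard_normal(Y0.shape))
    A_true = rng.uniform(0.1, 1.0, size=(100, 2))
    X = A_true @ Y_true
    A, Y = fit_yield_perturbations(X, Y0, lam=0.1)
    print("rms residual:", np.sqrt(np.mean((X - A @ Y) ** 2)))
```

The penalty is where the interpretability comes from in this sketch: as `lam` grows the fitted yields stay close to the theoretical ones, and as it shrinks the model approaches a truly free factorization, in which the processes can mix and lose their physical meaning.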