I spent time today (at the bar!) understanding the data model and directory structure for the raw, uncalibrated APOGEE data. The idea is that I want to do a real-data example for my paper with Casey (Monash) on combining spectra, and I want to get back to the raw inputs. I also might use these spectra for a problem set in my machine-learning class. The code I wrote is all urllib and requests and re, because I think it is necessary to read directories to understand the data dependencies in the survey. Is that bad?
Putting aside my concerns: The coolest thing about this project is that the SDSS family of projects (currently SDSS-V) puts absolutely every bit of its data on the web, in raw and reduced form, for re-analysis at any level or stage. That's truly, really, open science. If you don't believe me, check out this code that spelunks the raw data. It's all just URL requests with no authentication!
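To give a flavor of what reading directories looks like, here is a minimal sketch; it is not the code linked above. It assumes the server returns an ordinary HTML index page for each directory, and the names `parse_listing` and `list_directory`, along with any URL you pass in, are hypothetical placeholders rather than real SDSS paths.

```python
import re
import urllib.request
from urllib.parse import urljoin

# Match href="..." targets, skipping query-string links (e.g. column-sort links).
HREF_RE = re.compile(r'href="([^"?#]+)"', re.IGNORECASE)


def parse_listing(html, base_url):
    """Split the links on an HTML index page into (subdirs, files).

    Assumes relative links ending in "/" are subdirectories and all other
    relative links are files. Both lists hold absolute URLs.
    """
    subdirs, files = [], []
    for href in HREF_RE.findall(html):
        # Skip parent-directory, site-root, and off-site links.
        if href.startswith(("/", "..")) or "://" in href:
            continue
        url = urljoin(base_url, href)
        if href.endswith("/"):
            subdirs.append(url)
        else:
            files.append(url)
    return subdirs, files


def list_directory(url):
    """Fetch one directory index with a plain, unauthenticated GET."""
    with urllib.request.urlopen(url) as response:
        html = response.read().decode("utf-8", errors="replace")
    return parse_listing(html, url)
```

Calling `list_directory` on a directory URL (ending in a slash) returns the subdirectories and files one level down; recursing on the subdirectories walks the whole tree.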