Hogg's Research: public release of data

2006-11-13

Productive and lively discussion at coffee today (Ben Weiner visiting from NOAO, Zabludoff, Moustakas, Blanton, myself) about good policies for the journals and large surveys to adopt regarding the public release of data and code.

My vision (with which no-one agreed) is that the journals should require that every paper be released with a tarball containing all the data and code, such that a reader can unpack the tarball, type make, and reproduce all of the analyses and figures for that paper! Of course there are implementation issues, but if we want to stay clamped to the manifold of repeatable science (Sam Roweis) we have no other option. I could write about this for hours, but I have an NSF proposal due. I hope to return to this later, because in the course of the discussion I discovered many new reasons to support my view.
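To make the idea concrete, here is a minimal Python sketch of the packaging step. It is only an illustration under assumed conventions, not an existing tool or any journal's actual requirement: the directory layout (`data/`, `code/`, a top-level `Makefile`), the file names, and the function names `build_release` and `demo` are all hypothetical.

```python
import tarfile
import tempfile
from pathlib import Path


def build_release(paper_dir: Path, out_path: Path) -> list[str]:
    """Pack paper_dir into a gzipped tarball and return its member names.

    Refuses to build a release without a top-level Makefile, since the
    point is that typing `make` regenerates the analyses and figures.
    """
    if not (paper_dir / "Makefile").is_file():
        raise FileNotFoundError(f"{paper_dir} has no Makefile")
    with tarfile.open(out_path, "w:gz") as tar:
        # arcname makes everything unpack under one top-level directory.
        tar.add(paper_dir, arcname=paper_dir.name)
    with tarfile.open(out_path, "r:gz") as tar:
        return sorted(tar.getnames())


def demo() -> list[str]:
    """Build a toy release from a hypothetical paper directory."""
    with tempfile.TemporaryDirectory() as tmp:
        paper = Path(tmp) / "paper"
        (paper / "data").mkdir(parents=True)
        (paper / "code").mkdir()
        # Hypothetical contents: one data file, one script, one Makefile.
        (paper / "data" / "catalog.csv").write_text("id,flux\n1,3.2\n")
        (paper / "code" / "make_figure.py").write_text("print('figure')\n")
        (paper / "Makefile").write_text("all:\n\tpython code/make_figure.py\n")
        return build_release(paper, Path(tmp) / "paper.tar.gz")
```

Calling `demo()` returns the sorted list of paths inside the tarball. A reader on the other end would unpack it and type make. The check for a `Makefile` is a design choice in this sketch: it makes "type make" part of the definition of a release, so a bare pile of files does not qualify.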

We are very far from my vision right now, and it is hard to get there incrementally.

4 comments:

Hogg, 14 November 2006 22:53
Now that my NSF proposal is in, I started to write a polemic on this subject here.
Anonymous, 18 November 2006 11:47
Hogg - you should set an example and do this for all your papers. Maybe the rest of the community will follow your example.
Hogg, 19 November 2006 01:19
Agreed! We did this for this paper but we didn't know where to host the tarball.
Anonymous, 23 November 2006 15:23
There is no way to take "the man" out of the equation. Peer review does not stop once the paper comes out; it continues well into the future as citations accrue and follow-up research is done.

Ultimately, some people end up being more trusted by others with their research, and no tarballs can solve the issue. And, naturally, it is a non-starter due to the technical issues involved.