Sunday, February 27, 2011

SIGMOD Experimental Repeatability Requirements

I was pleased to hear recently about the SIGMOD Experimental Repeatability Requirements. The stated goal is:
The goal of the repeatability/workability effort is to ensure that SIGMOD papers stand as reliable, referenceable works for future research. The premise is that experimental papers will be most useful when their results have been tested and generalized by objective third parties.
Apparently it was first done for SIGMOD 2008 and has been refined since then. This is something that I think has been lacking in computer science research for quite a while. The obvious benefit is that it makes it much harder to fake results or to report something in a paper that is not really what was implemented and tested. The other benefit is to researchers trying to build on, or provide a better alternative to, the method presented in a given paper. Anyone who has gone through a graduate computer science program will know the pain of trying to figure out exactly what was done in some paper using only the written description. If you are lucky, there is clear pseudocode that can be translated into a working program. But then you still have to worry about things such as magic numbers used to tune the system, or whether the differences you are seeing in the results could be due to other factors such as the machine architecture. Being able to grab the code used for the original paper provides a much faster and more accurate basis for comparing it to a new method.

Unfortunately, though it seemed to be a promising idea, I think their implementation is a crock. The first problem is that participation in the repeatability program is optional:
The repeatability & workability process tests that the experiments published in SIGMOD 2011 can be reproduced (repeatability) and possibly extended by modifying some aspects of the experiment design (workability). Authors participate on a voluntary basis, but authors benefit as well:
  • Mention on the repeatability website
  • The ability to run their software on other sites
  • Often, far better documentation for new members of research teams
The second problem is that participating does not mean the code will be made available to everyone. I didn't see any mention of archiving in the 2011 description, or of how I would be able to get the code for a given paper. The SIGMOD 2010 requirements say the following about code archiving:
Participating in the repeatability/workability evaluation does not imply that the code will be archived for everyone to use subsequently. Pending authors' agreement, the code could be uploaded in the SIGMOD PubZone.
If I understand correctly, this means that if an author chooses to participate, a committee for the conference will attempt to reproduce the experiments for that paper. After that is done, the code will be archived and made available only if the author agrees. What the hell were they thinking? I would much rather get rid of the committee that tries to reproduce the results and make it a requirement that the full source code and data sets be made available as supplemental material for all papers. The code needs to be out in the open so it can be scrutinized along with the paper by the broader community of researchers. Furthermore, this supplemental material should be available to those doing the initial reviews of the paper to decide whether or not to accept it.
