Open Source in Science
We believe in open software standards, open source licensing and open development processes.
We do science to discover knowledge and to improve the human condition. Computers help by enabling quantitative research:
- Data provenance - recording what we did and how we did it
- Knowledge transfer - explaining the research to someone else
- Reproducibility - verifying or invalidating the research of others
To fully achieve these ideals, open science practices are necessary. Given the complexity of 21st-century science, the full details of research methods must be open to public scrutiny if there is to be any hope of scientific reproducibility.
Lessons from software
"When in doubt, make it public."
—Jeff Atwood (co-creator of Stack Overflow)
Science can learn many lessons from the recent evolution of computing toward the Internet:
- Wikipedia: a public encyclopedia
- Stack Exchange: a public knowledge base
- GitHub: a public source code repository
- Facebook: a public social network
- Google: a searchable database of public online content
The success of such tools illustrates the power of public-facing resources and information.
Requirements for open scientific software
"An article about computational science in a scientific publication is not the scholarship itself, it is merely advertising of the scholarship."
—David Donoho, "Wavelab and Reproducible Research," 1995
In that spirit, a full research publication includes more than the article itself:
- Protocols and methodology
- Raw data
- Computer code
In particular, reproducibility demands the source code of any software used to process the data. Sharing that code also turns out to be good for your career:
"Papers describing software published as open source are amongst the most widely cited publications (e.g., BLAST, and Clustal-W), suggesting many scientific studies may not have been possible without some kind of open software to collect observations, analyze data, or present results."
—Andreas Prlić & James Procter
Ten Simple Rules for the Open Development of Scientific Software
The Science Code Manifesto concisely summarizes the requirements for scientific software source code:
| Code | All source code written specifically to process data for a published paper must be available to the reviewers and readers of the paper. |
| Copyright | The copyright ownership and license of any released source code must be clearly stated. |
| Citation | Researchers who use or adapt science source code in their research must credit the code’s creators in resulting publications. |
| Credit | Software contributions must be included in systems of scientific assessment, credit, and recognition. |
| Curation | Source code must remain available, linked to related materials, for the useful lifetime of the publication. |
"Nobody is entitled to demand technical support for freely provided code: if the feedback is unhelpful, ignore it."
—N. Barnes, Publish your computer code: it is good enough
The reasons people most often cite for not sharing their scientific source code include:
- Time to document and clean up
- Dealing with questions from users
- Not receiving attribution
Fortunately, thanks to software licenses like the Community Research and Academic Programming License (CRAPL), releasing source code for verification and reproducibility need not require any cleanup or support. As for attribution, releasing code publicly is the best way to establish "prior art" should infringement occur, and infringement can happen whether or not the source is published.
The CRAPL's stipulations include:
- Permission to validate results
- Disregard any evidence of quality
- No mocking the author
- No support
The CRAPL offers a pragmatic way of saying: "Here is what I did—with no promises whether it will work for you."
Beyond open results: towards an open process
In reality, neither science nor software is a finished result; both are continuous processes. The best and most enduring scientific software is built to grow a community through an open software development process:
- Improve software as a worldwide community
- Open access resources, including:
  - Open revision history (e.g., Git)
  - Open mailing lists and/or forums
  - Open issue tracker
  - Open project roadmap
  - Open contribution mechanism (e.g., GitHub pull requests)
- Responsive, reliable maintainers
- Powerful collaboration tools like GitHub
- A focus on interoperability and extensibility
All of LOCI's software efforts are driven with the above goals in mind.
Other aspects of an open scientific process may include:
- Open access journals, to make research results truly publicly accessible
- Open peer review, to increase editorial transparency
Of course, taking an open process this far brings new challenges, but we believe the benefits are well worth it.
The benefits of standardization
One key way of achieving software interoperability is the adoption, reuse and (where necessary) definition of software standards. Such adoption greatly benefits individual scientists as well as commercial and academic organizations. A brief article on the benefits of standardization from thinkstandards.net (now offline) provides an excellent summary:
An extensive study initiated by DIN (German Standards Institute) and the German Federal Ministry of Economic Affairs and Technology in 1997 was completed in May 2000. The study provides detailed insight into the economic benefits of standards to businesses and to the economy. Highlights of the study include:
- Standards contribute more to economic growth than patents and licenses
- Standards are of strategic significance to companies
- Companies that participate actively in standards work have a head start on their competitors in adapting to market demands
- Research risks and development costs are reduced for companies contributing to the standardization process
- Businesses that are actively involved in standards work reap short- and long-term benefits in costs and competitive status more frequently than those that do not participate
- Participating in standards development enables one to anticipate technology standardization, so that one's products progress in step with the technology
- Leaders in technology should become more involved in standards
- Standards are a positive stimulus for innovation
- Standards are internationally respected
You can read DIN's full publication, Economic Benefits of Standardization, in PDF format.
Many other articles have been published by various organizations documenting the advantages of standardization: