Archiving and FAIR principles

Long-term storage of software and data in a package containing all materials necessary for replication of research findings (FAIR)

The wider trend in academia is for research data to be stored according to the FAIR principles, which aim to make data:

  • Findable: achieved through storing in a trustworthy digital repository with metadata and a persistent identifier
  • Accessible: as openly accessible as possible, but closed if necessary
  • Interoperable: can be combined with other datasets (not yet widely achieved in practice)
  • Reusable: accurate and clearly documented

To address the reusability criterion for data, NWO coined the term replication package: the total set of information necessary to replicate a scientific result. This includes, for instance, the data, documentation, provenance information, (descriptions of) the necessary software and hardware, and the metadata and persistent identifier the data receive when archived.

For software, the replication package must at least include a README and/or scripts that explain how to compile and use the software on the data. To make replication easy, to make the package (more) hardware independent, and to be sure that all software dependencies are included, one can distribute it as a virtual machine image. This is done by performing the experiments on a virtual machine and archiving its disk image. Common examples of virtual machine technology are KVM (installed on CWI Linux machines), Xen, VirtualBox and VMware.

The virtual machine image(s) should then be archived at a trustworthy digital repository and registered at the CWI Institutional Repository.




Archive and register directly from GitHub

If your code is on GitHub, you can archive your repository directly at Zenodo, our preferred repository. At Zenodo, add your record to the CWI community. After curation, the record is automatically included in the CWI Institutional Repository.

Please ensure your Zenodo record has the following metadata:

  • Title
  • Author(s)
  • Description/short summary
  • License (Help with choosing a license)
  • Funding information
  • Reference(s)

You can add this metadata at Zenodo, or on GitHub with a CITATION.cff file in the default branch of your repository. Zenodo picks up this information automatically. Zenodo will provide a DOI.
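As an illustration of the citation file mentioned above, here is a minimal sketch in Python that renders a small citation file in the Citation File Format (version 1.2.0). The helper name `render_citation_cff` and all example values are hypothetical placeholders, not part of any CWI or Zenodo tooling. The sketch only covers title, authors, description and license; the remaining metadata can be added at Zenodo.

```python
def render_citation_cff(title, authors, abstract, license_id):
    """Return the text of a minimal citation file (CFF 1.2.0).

    `authors` is a list of (family_name, given_name) tuples.
    `license_id` is assumed to be an SPDX identifier such as "MIT".
    """
    def quote(value):
        # Emit a double-quoted YAML scalar, escaping backslashes and quotes.
        return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'

    lines = [
        "cff-version: 1.2.0",
        'message: "If you use this software, please cite it as below."',
        f"title: {quote(title)}",
        f"abstract: {quote(abstract)}",
        f"license: {license_id}",
        "authors:",
    ]
    for family, given in authors:
        lines.append(f"  - family-names: {quote(family)}")
        lines.append(f"    given-names: {quote(given)}")
    return "\n".join(lines) + "\n"


# Placeholder values for illustration only.
example = render_citation_cff(
    title="Example replication package",
    authors=[("Doe", "Jane")],
    abstract="Code and scripts to reproduce the results.",
    license_id="MIT",
)
print(example)
```

The returned text would be saved as the citation file in the root of the default branch. It is worth checking the result with a CFF validator before creating a release.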

If you have more than 50 GB to store, you can split it over multiple Zenodo records and refer to them in the metadata. In the CWI repository we will only register the first record, and provide links to the other records. If you have very large data, please contact us for a solution.
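Splitting a large dataset over several records can be planned up front. The sketch below groups files into batches that each stay within a size limit, using a simple first-fit-decreasing heuristic. The function names and the interpretation of the limit as 50 × 10^9 bytes are assumptions made for illustration; verify the actual limit at Zenodo before relying on it.

```python
from pathlib import Path

# Assumption: "50 GB" is read as 50 * 10^9 bytes per record.
LIMIT_BYTES = 50 * 10**9


def plan_records(sizes, limit=LIMIT_BYTES):
    """Group files into batches whose total size stays within `limit`.

    `sizes` maps a file name to its size in bytes. Files are placed
    largest first into the first batch that still has room.
    """
    batches = []  # each entry is [total_bytes, [file names]]
    for name, size in sorted(sizes.items(), key=lambda kv: (-kv[1], kv[0])):
        if size > limit:
            raise ValueError(f"{name} alone exceeds the per-record limit")
        for batch in batches:
            if batch[0] + size <= limit:
                batch[0] += size
                batch[1].append(name)
                break
        else:
            # No existing batch has room, so start a new one.
            batches.append([size, [name]])
    return [names for _, names in batches]


def sizes_in(directory):
    """Collect the sizes of all regular files under `directory`."""
    root = Path(directory)
    return {
        str(p.relative_to(root)): p.stat().st_size
        for p in root.rglob("*")
        if p.is_file()
    }
```

For example, `plan_records(sizes_in("my_dataset"))` (with `my_dataset` a hypothetical directory) returns one list of file names per planned record. A single file larger than the limit raises an error, since it cannot be placed in any record without being split first.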

Note: Zenodo lets you reserve a DOI. In a new upload, go to 'Basic information' and click 'Reserve DOI'.

Example: Bossema, F.G., & Coban, S.B. (2020). A fifteen-tile tomographic micro-CT dataset of a panel painting "Cadmus sowing dragon's teeth" 1/3. doi:10.5281/zenodo.4334010
