After you have completed your project, you will publish your article. Often publishers and funders require the data from your research to be made public as well.
Not all data can be made public. Data may be confidential because of patents, or privacy issues may prevent you from publishing (sensitive) personal data.
4.1. Why archive data
“Scientific integrity is a specific standard of conduct associated with the societal position of the researcher. It is about acting in accordance with the values of science, such as truthfulness, honesty and open reporting, even when no one is looking over the researcher’s shoulder.” – KNAW (Royal Netherlands Academy of Arts and Sciences)
Ensuring that a study can be validated and reproduced is an important part of scientific integrity. However, archiving and making data available is not only about preventing sloppy science and fraud, but also about reusing data. Reusing and combining data makes research more effective, enables new research and earns the researcher credit through citations.
Ten simple rules for the care and feeding of scientific data:
- Love your data, and help others love it too
- Share your data online, with a permanent identifier
- Conduct science with a particular level of reuse in mind
- Publish workflow as context
- Link your data to your publications as often as possible
- Publish your code (even the small bits)
- Say how you want to get credit
- Foster and use data repositories
- Reward colleagues who share their data properly
- Be a booster for data science
Source and full article: https://arxiv.org/pdf/1401.2134v1.pdf
4.2. Selection of data
Due to space and budget constraints, it is not possible to preserve all data. You must select the data to be preserved.
Data to be preserved are data that:
- Are likely to be used or reused
- Are unique
- Enrich an open access publication
- Need to be archived because of requirements by funders or the institution
- Are difficult to reproduce
Prerequisites for data preservation are:
- Usable file format
- Sufficient data documentation and metadata
- Consideration of legal and ethical limitations
- Financial considerations
The flowchart on RDNL can be useful in selecting data to be preserved.
4.3. Archiving process
During your project, in most cases you will use the storage facilities offered by your faculty.
At the end of the project, or following a publication, it is advisable to store the data in a midterm repository. Maastricht University Library supports DataverseNL as the midterm storage facility for our institution.
DataverseNL offers storage up to the prescribed ten years after the last publication based on the data or the completion of the study. This is in accordance with UM’s Code of Conduct for RDM. Depending on the discipline, the retention period may even be fifteen years or longer.
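As a rough illustration of the retention arithmetic, the sketch below adds a retention period to a reference date. The function name retention_end is hypothetical, and which reference date applies (last publication or completion of the study) is an assumption you should check against your faculty's policy.

```python
from datetime import date


def retention_end(reference: date, years: int = 10) -> date:
    """Return the date on which the retention period ends.

    `reference` is the date the period starts counting from (assumed here
    to be the last publication or the completion of the study).
    `years` defaults to the ten-year period; pass 15 where a discipline
    requires a longer period.
    """
    try:
        return reference.replace(year=reference.year + years)
    except ValueError:
        # 29 February has no counterpart in a non-leap target year,
        # so fall back to 28 February.
        return reference.replace(year=reference.year + years, day=28)


# Illustrative dates only.
print(retention_end(date(2020, 3, 15)))
print(retention_end(date(2020, 3, 15), years=15))
```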
About DataverseNL
DataverseNL makes it possible to store, share and register research data online, both during the research period and after the project. DataverseNL is a shared service between the participating institutions and DANS.
DataverseNL uses the Dataverse software developed by Harvard University, which is used worldwide.
Start by creating a Dataverse account using your institutional login. In order to use DataverseNL, your institutional account must be linked to an existing Dataverse. To link your account to a Dataverse, and for further support in using DataverseNL, please contact the data steward, information manager or data manager of your faculty and check the manual. You can also contact RDM support at the University Library.
Persistent identifier
Dataverse provides your dataset with a persistent identifier. Stating this persistent identifier (for example in a publication) is a simple way to increase the findability of your dataset. It is strongly recommended to provide the persistent identifier to the publisher rather than handing over the dataset itself, because handing over the dataset may have implications for the ownership of the data.
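A minimal sketch of how a persistent identifier can be turned into a link for a publication, assuming the identifier is a DOI. The helper names doi_to_url and data_availability_statement are hypothetical, and the DOI used is a made-up placeholder, not a real dataset.

```python
def doi_to_url(identifier: str) -> str:
    """Normalise 'doi:10.x/…', '10.x/…' or a doi.org link to an https://doi.org/ URL."""
    cleaned = identifier.strip()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if cleaned.lower().startswith(prefix):
            cleaned = cleaned[len(prefix):]
            break
    if not cleaned.startswith("10."):
        # All DOIs start with the directory indicator "10."
        raise ValueError(f"not a DOI: {identifier!r}")
    return "https://doi.org/" + cleaned


def data_availability_statement(title: str, identifier: str) -> str:
    """Build a one-sentence statement that cites the dataset by its identifier."""
    return f'The dataset "{title}" is available at {doi_to_url(identifier)}.'


# Placeholder title and DOI, for illustration only.
print(data_availability_statement("Example survey data", "doi:10.1234/EXAMPLE"))
```

Citing the resolvable link rather than attaching the files keeps the dataset in the repository while still letting readers find it.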
Metadata
Metadata are data about data. Assigning metadata means describing a dataset in such a way that it is readable and findable by computers. DataverseNL enables you to provide your data with a standardised set of metadata (Dublin Core and DDI). A sufficient and high-quality set of metadata will enhance the findability, interoperability and reusability of your data.
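To give an impression of what machine-readable metadata looks like, the sketch below writes a few Dublin Core elements as XML using Python's standard library. It is an illustration only: the wrapper element "metadata", the function name dublin_core_record and all field values are assumptions, and DataverseNL collects these fields through its own forms rather than through code like this.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"


def dublin_core_record(fields: dict) -> str:
    """Serialise a mapping of Dublin Core element names to lists of values."""
    ET.register_namespace("dc", DC_NS)
    root = ET.Element("metadata")  # hypothetical wrapper element
    for name, values in fields.items():
        for value in values:
            element = ET.SubElement(root, f"{{{DC_NS}}}{name}")
            element.text = value
    return ET.tostring(root, encoding="unicode")


# Made-up example values.
record = dublin_core_record({
    "title": ["Example survey data"],
    "creator": ["Doe, Jane"],
    "date": ["2020-03-15"],
    "rights": ["CC0 1.0"],
})
print(record)
```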
Licenses
A license on a dataset describes what someone is allowed to do with the dataset. All new datasets in DataverseNL will receive a CC0 (CC-zero) public domain dedication by default. CC0 facilitates reuse of research data. If you are not able to give your datasets a CC0 waiver, you can either create your own custom Terms of Use or use one of the Creative Commons copyright licenses.
Licenses such as Creative Commons (CC) replace ‘all rights reserved’ copyright with ‘some rights reserved’. There are seven standard CC-licenses. CC-BY is the most widely used license, requiring attribution when using data. Creative Commons offers an easy-to-use online Chooser to help you decide which license is right for you.
4.4. Long term storage
There are two Dutch repositories for storing your data for the long term (“eternity”) and making it permanently available:
- EASY offers sustainable archiving of research data and access to thousands of datasets. EASY is CoreTrustSeal certified and is hosted by DANS.
- 4TU Research Data is a Data Seal of Approval certified data repository focusing on technical, geospatial and engineering data. Although 4TU Research Data is offered by four Dutch technical universities, it is available for all Dutch universities.