3.1. Discover and reuse existing data
The research community is moving towards transparent, reproducible and open science. As part of this movement, funders, publishers and universities increasingly require researchers to make their data FAIR (Findable, Accessible, Interoperable and Reusable).
As more datasets become available, the chances increase that you will find existing datasets useful for your project. Therefore, before you start your project, you should check whether an available dataset is sufficient to conduct your research, or whether your data can be combined with an existing dataset. Reusing datasets can save you time and help you optimise your research design.
There are many sources of datasets. Here are a few examples:
- Open Access Directory: Data repositories
- UK Data Archive
- Data Archiving and Networked Services (DANS)
- CLARIN centres
- Data portal van de Nederlandse overheid (Dutch Government)
- Harvard Dataverse
In addition, the University Library offers commercial licensed datasets.
3.2. Access existing data
When reusing existing data, you need to know the legal status of the data. You may need the consent of the data’s author or creator, and you may have to deal with copyright, licences and fees.
For more information, see the SURF report from the Centre for Intellectual Property Law (CIER).
3.3. Research in progress (dynamic phase)
It is important to consider which file formats (i.e. the way the information is encoded) you use to store your data. As far as possible, choose vendor-independent (non-proprietary) file formats so that you can reliably open and display the file contents in the future. Lists of preferred formats are available from DANS and 4TU.ResearchData [PDF].
You can use one format for data collection and analysis and another format for archiving. After converting a file to another format, check your data for errors that the export process may have introduced, such as loss of content, metadata, layout or quality.
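The check after a format conversion can be partly automated. The following is a minimal sketch in Python, assuming both the original and the converted file can be read as CSV text; the sample data and the function names `read_rows` and `compare_tables` are hypothetical and only serve as an illustration. It compares row counts and cell values, and does not cover metadata or layout.

```python
import csv
import io


def read_rows(text):
    """Parse CSV text into a list of rows (each row is a list of strings)."""
    return list(csv.reader(io.StringIO(text)))


def compare_tables(original, converted):
    """Return a list of human-readable differences between two tables."""
    problems = []
    if len(original) != len(converted):
        problems.append(
            f"row count differs: {len(original)} vs {len(converted)}"
        )
    # Compare the rows that both tables have, position by position.
    for index, (row_a, row_b) in enumerate(zip(original, converted)):
        if row_a != row_b:
            problems.append(f"row {index} differs: {row_a} vs {row_b}")
    return problems


# Hypothetical example data: the converted copy dropped a trailing zero.
original_csv = "id,weight_kg\n1,70.25\n2,81.50\n"
converted_csv = "id,weight_kg\n1,70.25\n2,81.5\n"

differences = compare_tables(read_rows(original_csv), read_rows(converted_csv))
for line in differences:
    print(line)
```

An empty list means that no differences were found in the cell values. A visual check of layout and metadata is still needed.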
Storage and back-up
When choosing a storage solution for your data, keep backups and security in mind. Regular backups protect against accidental or malicious data loss. Personal or sensitive data require additional protective measures and are best stored on the institution’s network drives.
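A backup is only useful if the copy is intact. One common way to check this is to compare checksums of the original file and the backup copy. The sketch below uses Python's standard `hashlib` module with SHA-256; the file names and the helper functions `sha256_of` and `backup_matches` are hypothetical, and the example runs in a temporary folder rather than on a real storage location.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_matches(source, backup):
    """Return True if the backup has the same content as the source file."""
    return sha256_of(source) == sha256_of(backup)


# Demonstration in a temporary folder with hypothetical file names.
with tempfile.TemporaryDirectory() as folder:
    source = Path(folder) / "survey_raw.csv"
    backup = Path(folder) / "survey_raw_backup.csv"
    source.write_text("id,answer\n1,yes\n", encoding="utf-8")
    shutil.copy2(source, backup)
    intact = backup_matches(source, backup)

    # Simulate a damaged backup by changing its content.
    backup.write_text("id,answer\n1,no\n", encoding="utf-8")
    damaged = backup_matches(source, backup)

print("backup intact after copy:", intact)
print("backup intact after change:", damaged)
```

A check like this verifies the content of a copy only. It does not replace the backup and security measures of the storage solution itself.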
UM offers free extra storage space to help researchers store, edit, analyse and share data directly with other UM researchers. By using this storage space, you will have less to worry about in terms of GDPR compliance. A data management plan (DMP) is required to access this storage. Contact the Information Manager of your faculty if you are interested in this extra storage space.
Check the overview of the storage solutions at UM.
When collaborating with colleagues inside and outside the institution, exchange information only within a secure environment and on the basis of clear instructions and agreements.
UM offers a range of solutions for secure collaboration, such as SURFdrive, SURFfilesender and Virtual Research Environments (VREs). To start with SURFdrive or SURFfilesender, check the SURFdrive manual (PDF) and the SURFfilesender manual (PDF).
Do not use (‘free’) online storage alternatives such as Dropbox, Google Drive, Box, Hotmail, OneDrive, WeTransfer, Evernote and many others. It is unclear how safe your data is when you store it there. There are even services that require you to transfer intellectual property rights to the provider. UM has a legal duty to protect data, especially sensitive data, and intellectual property should never be transferred to a third party.
Think about how and with whom you share files and adhere to UM policies. Record the agreements you have made on data sharing and use. Comply with the legislation and, if applicable, the informed consent of your data subjects. If your project involves personal data, you will need a Data Processing Agreement (DPA) if a third party processes your data. For information on DPAs, contact the Information Manager of your faculty.
High-risk video interviews
Microsoft Teams is the only service available at UM for high-risk video conferencing, such as interviews with research participants. View the memo for information on when and how to use Microsoft Teams.
To keep your data usable and understandable to yourself and future users, you need to document your data. Data documentation should include information about the context of the data collection, collection methods, data manipulations, different versions, etc.
3.4. Organising your research data
Once you have started collecting, generating and analysing data, you can quickly lose the overview. You will save time and avoid mistakes by structuring files and folders from the beginning of your project. Some practical tips:
- Define a folder structure in advance
- Define logical categories
- Use a naming convention and document this in a README file
- Keep file names clear and short
- Avoid the use of spaces, dots and special characters in file names
- Make file names consistent, meaningful and easy to find
- Store raw data separately, leave it untouched and use a working copy
- Separate data in progress from completed data
- Avoid ambiguous filenames such as Final_1, Final_2
- Use a fixed notation for dates in file names, for example YYYYMMDD, and apply it consistently
- Use major and minor versions, for example:
  - Major versions: v01, v02, v03
  - Minor versions: v01_01, v02_02, v03_03
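The naming tips above can be captured in a small script so that every file name is built and checked in the same way. The following Python sketch assumes one possible convention, `<name>_<YYYYMMDD>_v<major>[_<minor>].<extension>`; the pattern, the function names `build_filename` and `follows_convention`, and the example names are illustrative assumptions, not a prescribed standard. The check looks only at the form of the name: it accepts any eight digits as a date.

```python
import re
from datetime import date

# Assumed convention: <name>_<YYYYMMDD>_v<major>[_<minor>].<extension>
# Only letters, digits, hyphens and underscores are allowed in the name part.
FILENAME_PATTERN = re.compile(
    r"^[A-Za-z0-9_-]+_\d{8}_v\d{2}(_\d{2})?\.[A-Za-z0-9]+$"
)


def build_filename(name, day, major, minor=None, extension="csv"):
    """Build a file name with a YYYYMMDD date and a zero-padded version."""
    version = f"v{major:02d}"
    if minor is not None:
        version += f"_{minor:02d}"
    return f"{name}_{day.strftime('%Y%m%d')}_{version}.{extension}"


def follows_convention(filename):
    """Return True if the file name matches the assumed convention."""
    return FILENAME_PATTERN.match(filename) is not None


# Hypothetical example: second minor revision of the first major version.
example = build_filename("interviews", date(2024, 3, 15), major=1, minor=2)
print(example)

for candidate in [example, "Final_1.docx", "my data 2024.csv"]:
    print(candidate, follows_convention(candidate))
```

If you use a script like this, describe the convention in your README file as well, so that collaborators who do not use the script can follow it.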