Statement of Originality and Disclosure: A Quick Guide for Authors
We’re continuing our Research Integrity Toolkit blog series with Scholastica in honor of this year’s Peer Review Week theme, "Research Integrity: Creating and supporting trust in research." In this post, Gareth Dyke discusses statements of originality and disclosure when making article submissions for peer review.
Updated on September 19, 2022
To avoid potential ‘disasters’ with data, it's important to know how to manage it. One of the most important aspects of data management is creating an effective data management plan. Once the data is collected in physical or digital form, planning how to store it is crucial. Planning which procedures will be applied to the data and how access to it will be managed is also essential for all research projects.
Always have a plan for how the data will be stored and protected until the research project needs it. The following are seven important tips for avoiding data disasters:
The data management plan should be documented. The plan should be a defined document containing all the relevant information on how to collect, store, analyze, preserve, and protect the data.
Data collection procedures should be defined. This means knowing exactly which collection methods are acceptable for further data management. Validation procedures can be complex and should be well documented. An effective data management plan focuses on ways to verify that the data is well managed across the different steps.
The ability to track different steps and data lineages is essential in a data management plan. All these details will be further expanded in this article.
1. Manage and store your data securely
A data management document should contain a clear plan for storing the data in a secure location or on a secure device, so that no part of the data is lost, stolen, or accessed by unauthorized users. Make secure data storage an official step in the research project. Here, ‘secure’ means that unauthorized access to the data is prevented and that any sensitive data is well protected.
According to the FAIR principles, any repository where data is managed should define data access clearly, including login details and authorization procedures. Whatever the level of authorization and openness of the data, making the rules for accessing it clear to potential users is one of the most important things.
To keep data management secure, check the location where the data is stored. If data is stored in an online repository, plan how to check the credibility of that repository. If data is to be stored on a physical device, planning for safe storage is essential.
2. Data validation and quality control management
First, have a clear definition of metrics and qualitative characteristics of the data. Further, management plans should have standards on what data can be included in the project.
Many characteristics can be used to assess and measure data quality. One of them is data accuracy. This characteristic might depend on instrument accuracy, the standards followed when making research observations, or other technical aspects.
Data accuracy might also depend on how much information each value stores. One example is the number of decimal places kept for numeric data. Another is the number of megapixels in scientific images: more megapixels generally means more information captured in the data.
All these data characteristics should be clearly defined in a data management plan.
Technical characteristics are important to make sure data is usable and manageable.
Quality standards make sure the data stored will have enough quality and integrity.
When developing the data management plan, define the steps needed to ensure the data is relevant to the research questions. Data should be stored in its complete form.
Lastly, make sure the dates associated with the data are recorded and that they are consistent with the research goals.
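The checks described in this section could be automated along the following lines. This is only a minimal sketch: the field names, the minimum number of decimals, and the project dates are hypothetical placeholders, and a real plan would define its own rules.

```python
from datetime import date

# Hypothetical quality rules; a real data management plan would define its own.
MIN_DECIMALS = 2
PROJECT_START = date(2022, 1, 1)
PROJECT_END = date(2022, 12, 31)
REQUIRED_FIELDS = ("sample_id", "value", "collected_on")


def count_decimals(text):
    """Number of digits after the decimal point in a numeric string."""
    return len(text.split(".")[1]) if "." in text else 0


def validate_record(record):
    """Return a list of problems found in one record (empty if it passes)."""
    problems = []
    # Completeness: every required field must be present and non-empty.
    for field in REQUIRED_FIELDS:
        if not record.get(field):
            problems.append(f"missing {field}")
    if problems:
        return problems
    # Precision: the value must keep at least MIN_DECIMALS decimal places.
    if count_decimals(record["value"]) < MIN_DECIMALS:
        problems.append("value has too few decimals")
    # Dates: the collection date must fall inside the project period.
    collected = date.fromisoformat(record["collected_on"])
    if not PROJECT_START <= collected <= PROJECT_END:
        problems.append("collection date outside project period")
    return problems


records = [
    {"sample_id": "S1", "value": "3.142", "collected_on": "2022-03-01"},
    {"sample_id": "S2", "value": "2.7", "collected_on": "2021-11-15"},
    {"sample_id": "S3", "value": "", "collected_on": "2022-05-20"},
]
report = {r["sample_id"]: validate_record(r) for r in records}
```

Values are kept as strings here so that the number of recorded decimals is not lost when the data is read; that is a design choice for the sketch, not a requirement.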
3. Data storage
Your data management document should contain guidelines on making secure backups of the data. Backups can be stored on a USB drive, in the cloud, on a CD (less common, but still used), or as scans or physical printouts of the data.
Backing up your data is the main protection against data loss. Losing data for lack of a backup is one of the most frequent and most avoidable mistakes researchers make. Back up every useful part of the data, taking care not to copy sensitive data to locations where it is not protected.
Ideally, the backup portion of the plan should name at least two different locations: one for the main data and at least one more for the backup or backups. Multiple backups are much better than one in terms of data preservation and safety, but they may require more resources. Adjust the backup procedure to the resources available, such as storage space and, where applicable, budget.
If both main and backup data are stored in the same location, the risk of data loss increases. Avoid backing up your data in the same location as the raw data.
The backup portion of the data management plan is just as important as the plan for the main data. Backup data stored online will have its own authorization procedures, so complete the login details section of the data management plan for every backup.
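A backup step with a basic integrity check might be sketched as follows. The function names are illustrative, and the SHA-256 comparison is one common way to confirm a copy is identical to the original, not something the plan itself prescribes.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up(source, backup_dir):
    """Copy source into backup_dir and confirm the copy is identical."""
    source = Path(source)
    backup_dir = Path(backup_dir)
    # Refuse to place the backup in the same folder as the main data.
    if backup_dir.resolve() == source.parent.resolve():
        raise ValueError("backup location must differ from the main data location")
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / source.name
    shutil.copy2(source, target)  # copy2 also preserves file timestamps
    if sha256_of(source) != sha256_of(target):
        raise IOError(f"backup of {source} does not match the original")
    return target
```

A call such as `back_up("data/raw/measurements.csv", "/mnt/external/backups")` (both paths hypothetical) would then be repeated for each backup location listed in the plan. Note that the folder check only catches the most obvious case; two different folders on the same disk still share the same risk.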
4. Data locations management - plan to keep track of where your data is stored
Keep track of whether you store the data physically or online. Data can be stored in many physical places, like USBs and personal computers. Data can also be stored online on databases and repositories, the cloud, and more.
Documenting where your data is stored is essential. Many different types of data might be included in some projects. It’s easy to forget the data locations.
Consider this strategy a storage location blueprint. It could be as simple as storing data on a laptop or USB device as a backup for research projects.
Create a section in your data management plan that defines exactly where the data is stored, so that each storage location or database can be easily found and accessed. The section should include the name of the repository or cloud storage, its URL, and details such as maximum storage space and login details.
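One way to keep such a section machine-readable is a small structured record per location. The sketch below is illustrative only: the field names, the example repository, and the URLs are made up, and the login field holds a pointer to where the credentials are kept rather than the credentials themselves, which is an assumption of this example.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class StorageLocation:
    name: str
    url: str
    max_storage_gb: float
    login_details_ref: str  # where the credentials are kept (assumed convention)
    role: str = "main"      # "main" or "backup"


# Hypothetical entries for illustration.
locations = [
    StorageLocation(
        name="Example Repository",
        url="https://repository.example.org/project-x",
        max_storage_gb=50.0,
        login_details_ref="password manager entry 'project-x-repo'",
    ),
    StorageLocation(
        name="Lab external drive",
        url="file:///mnt/external/project-x",
        max_storage_gb=1000.0,
        login_details_ref="not applicable (physical device)",
        role="backup",
    ),
]


def to_plan_section(entries):
    """Serialize the storage locations as JSON for the data management plan."""
    return json.dumps([asdict(entry) for entry in entries], indent=2)


section = to_plan_section(locations)
```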
5. Management of login details
Make sure to have a plan for storing any usernames, passwords, and other login information that protects your data. Databases, and even personal computers, often have login procedures. If you forget the login details, the data becomes inaccessible.
This is especially important when working with data repositories and large projects. Login credentials may belong to data managers and other personnel working with the data. Leaving them out of the document can result in lost access, which can slow down large projects if it persists.
Keeping careful track of login details also helps safeguard sensitive data and supports overall data security.
6. Management of data lineage
One of the most important mistakes to avoid is overwriting old files: if newer or cleaner versions of your data are created, keep the old files as well.
Why?
Older versions are important references: they let you go back to earlier stages of your work and see what was added or changed. Also, your original “raw” data (the first version) contributes to the overall data integrity.
Keeping all previous versions lets you track not only the version history but also the procedures applied. When keeping previous versions of the data, the plan should have a section on how to keep the meta-data for each new version of the data.
Defining this segment, the data lineage, is an integral part of the data management plan. It includes all versions, from raw to final.
Have the following segments in the data management plan to keep track of the data lineage:
- Raw data
- Versions created after the raw data is stored
- Changes made to the raw data
- Description of methods applied to the data
- Meta-data for each version including the raw data
- Amount of data (e.g. number of samples) associated with each version
- Memory consumption associated with each version (e.g. megabytes, gigabytes)
- Dates associated with each version
- Lineage graphic showing version changes over time
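Several of the items above can be recorded automatically each time a new version is saved. The following is a minimal sketch under assumed conventions: the `save_version` helper, the `v000_` file naming, and the `lineage.json` log are all hypothetical choices, not an established standard.

```python
import json
import shutil
from datetime import datetime, timezone
from pathlib import Path


def save_version(data_file, lineage_dir, method, n_samples):
    """Store data_file as the next numbered version and log its meta-data."""
    data_file = Path(data_file)
    lineage_dir = Path(lineage_dir)
    lineage_dir.mkdir(parents=True, exist_ok=True)
    log_path = lineage_dir / "lineage.json"
    log = json.loads(log_path.read_text()) if log_path.exists() else []

    version = len(log)  # version 0 is the raw data
    target = lineage_dir / f"v{version:03d}_{data_file.name}"
    if target.exists():
        # Never overwrite an earlier version.
        raise FileExistsError(f"{target} exists; refusing to overwrite")
    shutil.copy2(data_file, target)

    log.append({
        "version": version,
        "file": target.name,
        "method": method,                    # description of what was done
        "n_samples": n_samples,              # amount of data
        "size_bytes": target.stat().st_size, # memory consumption
        "saved_at": datetime.now(timezone.utc).isoformat(),
    })
    log_path.write_text(json.dumps(log, indent=2))
    return target
```

Calling `save_version` once for the raw file and again after each cleaning step leaves every version on disk, with a log that can later be turned into a lineage graphic.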
7. Financial aspects of data management
Consider the financial side of data management, especially when working with grants or budgeting for the research project. A significant portion of research funding is related to data management.
Cloud services and scientific data repositories often offer free storage, with paid plans for larger projects. Define how much storage will be needed and what that will cost. If the research project does not require much space, there are many free options available, and these should be considered as part of the data management plan. If, on the other hand, the required storage is larger and needs complex maintenance, try to define all aspects in detail, including the financial needs. Without clear calculations of the cost of data maintenance, a project can end up being paused or even canceled.
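A rough storage budget can be worked out with simple arithmetic. The sketch below makes simplifying assumptions: all copies sit with a single provider that offers one free tier and a flat monthly price per gigabyte. The numbers in the example are invented for illustration, not real prices.

```python
def estimate_storage_cost(data_gb, n_backups, months, free_gb, price_per_gb_month):
    """Estimate total storage cost over the project, in the provider's currency."""
    total_gb = data_gb * (1 + n_backups)         # main copy plus backups
    billable_gb = max(0.0, total_gb - free_gb)   # only storage above the free tier is paid
    return billable_gb * price_per_gb_month * months


# Hypothetical large project: 200 GB of data, two backups, three years.
large_project = estimate_storage_cost(
    data_gb=200, n_backups=2, months=36, free_gb=50, price_per_gb_month=0.02
)

# Hypothetical small project that fits entirely within the free tier.
small_project = estimate_storage_cost(
    data_gb=10, n_backups=1, months=12, free_gb=50, price_per_gb_month=0.02
)
```

With these made-up figures the large project comes to roughly 396 currency units over three years, while the small project costs nothing. Real plans should also account for maintenance and any data transfer fees, which this sketch leaves out.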