Data Curation is an Advanced Data Management and Curation service offered by the UNC Research Data Management Core (RDMC). This RDMC Add-on Data Management Service is designed to ensure that datasets housed in data repositories are discoverable, understandable, and usable. Members of the RDMC research data stewardship team work with investigators and project teams to develop and execute data curation workflows for data file packaging and repository ingest that align with standards and best practices for data preservation and data quality.
To learn more about the RDMC Data Curation service, submit a request via the RDMC Services website.
What is data curation?
Data curation refers to the various processes involved in ensuring that data can be discovered, accessed, understood, and used now and into the future. These processes are part of a disciplined practice that treats data as an object for long-term archival preservation and access, but does so within the broader context of responsible and ethical conduct of research, scientific rigor and integrity, disciplinary culture and practice, and stakeholder mandates and expectations.
The practice of data curation comprises an extensive range of activities that are applied to dataset files based on data type, file format, domain, original and intended uses, and other factors that determine how data files should be organized, documented, archived, and shared in a trustworthy data repository. The diagram below illustrates a typical data curation workflow.
Data curation is a set of data management activities that should be considered when developing data management and sharing plans. When planning for data management, it is important to consider the provisions needed to curate data, especially for data that are large in volume, require specialized hardware or software, contain sensitive information such as protected health information (PHI) or personally identifiable information (PII), or are otherwise complex.
The data curator
A data curator is responsible for executing the data curation workflow in accordance with standards and best practices for long-term preservation and access. The data curator is a professional who has specialized education (typically a graduate degree in information and library science) and training in archival principles and practice, digital preservation, electronic records management, information organization and retrieval, and other related topics. Data curators also have experience working in various research settings, which helps them understand how these areas of study are applied throughout the research lifecycle, from project planning to publication.
When planning for data management and sharing, a data curator will consider several aspects of the data within the context of the responsible conduct of research. Along with technical requirements for data handling and archival storage, data curators take into account data collection and analysis methods, informed consent requirements, standardized scientific metadata schemas, data quality control protocols, potential reuses of the data, and other critical matters for proper data management.
Please visit the RDMC Services website for more information on how RDMC data curators can support your research project.
Data curation standards and best practices
The RDMC Data Curation service was designed to align with prevailing standards and best practices for research data management and sharing, including those listed below.
Data Curation Lifecycle Model. The lifecycle model emphasizes the holistic nature of data curation practice and its application throughout the research process from project conceptualization to data sharing.
CURATE(D). The Data Curation Network, which is a membership organization of data repositories, developed a checklist of standard data curation steps for publishing high-quality data.
FAIR Principles. Originally published in Scientific Data in 2016, the FAIR Principles for findable, accessible, interoperable, and reusable data have been promoted by major funding agencies as a set of guidelines for ensuring access to scientific data.
10 Things for Curating Reproducible and FAIR Research. This document describes key issues for data curation practices that aim to ensure that the published findings of quantitative data-driven research can be computationally reproduced.
OAIS Reference Model. The Reference Model for an Open Archival Information System (OAIS) is an ISO standard (ISO 14721) that outlines recommended practices for digital archive organizations and the requirements of the systems they support.
Why curate data?
benefits of curating data
RDMC Data Curation Service
what is included
how we assess the scope of work
example
Budgeting for data curation
Federal funding agencies consider data curation to be an allowable cost that can be included in the proposal budget.
Budget justification
budget justification boilerplate