Data Curation is an Advanced Data Management and Curation service offered by the UNC Research Data Management Core (RDMC). This RDMC Add-on Data Management Service is designed to ensure that datasets housed in data repositories are discoverable, understandable, and usable. Members of the RDMC research data stewardship team work with investigators and project teams to develop and execute data curation workflows for data file packaging and repository ingest that align with standards and best practices for data preservation and data quality.

To learn more about the RDMC Data Curation service, submit a request via the RDMC Services website.

What is data curation?

Data curation refers to the various processes involved in ensuring that data can be discovered, accessed, understood, and used now and into the future. These processes are part of a disciplined practice that considers the technical aspects of data as an object for long-term archival preservation and access, but does so within the broader context of responsible and ethical conduct of research, scientific rigor and integrity, disciplinary culture and practice, and stakeholder mandates and expectations.

The practice of data curation comprises an extensive list of activities that are applied to dataset files based on data type, file format, domain, original and intended uses, and other important factors that determine how data files should be organized, documented, archived, and shared in a trustworthy data repository. The diagram below illustrates a typical data curation workflow.

[Diagram: typical data curation workflow]

Data curation is a set of data management activities that should be considered when developing data management and sharing plans. When planning for data management, it is important to consider the provisions needed to curate data, especially for data that are large in volume, require specialized hardware or software, contain sensitive information (e.g., protected health information (PHI) or personally identifiable information (PII)), or are otherwise complex.

Why curate data?

As a disciplined practice, data curation is essential for the long-term accessibility and usability of research data assets. Data curation goes beyond putting data files in a safe place. It not only anticipates the inevitable changes in technology that can make a file impossible to open within a few years, but is also sensitive to changes in how research is done. Even if data are housed in an established data repository, there are no guarantees that the files will be usable or even understandable if the data are insufficiently documented. The video below offers a light-hearted (but very real) portrayal of why data curation should be incorporated into the research process.

https://youtu.be/N2zK3sAtr-4?feature=shared

The data curator

A data curator is responsible for the execution of the data curation workflow in accordance with standards and best practices for long-term preservation and access. The data curator is a professional who has specialized education (typically a graduate degree in information and library science) and training in archival principles and practice, digital preservation, electronic records management, information organization and retrieval, and other related topics. They also have experience working in various research settings, which helps them understand how these areas of study are applied throughout the research lifecycle, from project planning to publication.

When planning for data management and sharing, a data curator will consider several aspects of the data within the context of the responsible conduct of research. Along with technical requirements for data handling and archival storage, data curators take into account data collection and analysis methods, informed consent requirements, standardized scientific metadata schemas, data quality control protocols, potential reuses of the data, and other critical matters for proper data management.

Please visit the RDMC Services website for more information on how RDMC data curators can support your research project.

Data curation standards and best practices

The RDMC Data Curation service was designed to align with prevailing standards and best practices for research data management and sharing including the ones listed below.

Data Curation Lifecycle Model. The lifecycle model emphasizes the holistic nature of data curation practice and its application throughout the research process from project conceptualization to data sharing.

CURATE(D). The Data Curation Network, which is a membership organization of data repositories, developed a checklist of standard data curation steps for publishing high-quality data.

FAIR Principles. Originally published in Scientific Data in 2016, the FAIR Principles for findable, accessible, interoperable, and reusable data have been promoted by major funding agencies as a set of guidelines for ensuring access to scientific data.

10 Things for Curating Reproducible and FAIR Research. This document describes key issues for data curation practices that aim to ensure that the published findings of quantitative data-driven research can be computationally reproduced.

OAIS Reference Model. The Reference Model for an Open Archival Information System (OAIS) is an ISO standard (ISO 14721) that outlines recommended practices for digital archive organizations and the requirements of the systems they support.

RDMC Data Curation Service

The RDMC Data Curation Service is offered to UNC researchers to assist with data file preparation and transfer to a data repository. Working with the project team, RDMC data curation specialists will design and execute standards-based data curation workflows, which may include file format normalization, document preparation, metadata generation, dataset package quality review, and transfer of dataset packages to the specified repository. These curation activities consider the specific needs of the data according to their type, format, size, and content to help ensure that the data are findable, accessible, interoperable, and reusable (i.e., FAIR), and satisfy all requirements of the data management and sharing plan (DMSP).

Scope of work

The Data Curation Service scope of work for most projects will include the following primary data curation activities:

Dataset file preparation. Assembly and review of dataset files and associated materials to ensure that the data package submitted to the designated repository upholds the FAIR data principles (findability, accessibility, interoperability, and reusability).

Dataset record creation. Creation of standardized descriptive and administrative metadata based on client-provided information about the data types, allowable uses, and research context.

Dataset file transfer. Upload of dataset files to a dataset record. Includes creation of file-level metadata, checksum verification, application of access restrictions (if required), and inspection of files in the repository to confirm successful transfer.
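As an illustration of the checksum step mentioned under dataset file transfer, the sketch below shows one common way to confirm that files arrived intact: compute a SHA-256 checksum for every file before upload, compute it again on the copy retrieved from the repository, and compare the two lists. This is a minimal sketch rather than a description of the RDMC workflow. The function names (`sha256_of`, `build_manifest`, `compare_manifests`) are hypothetical, and the choice of SHA-256 is an assumption, since repositories differ in the checksum algorithms they report.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file path (relative to root) to its checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_manifests(
    source: dict[str, str], deposited: dict[str, str]
) -> dict[str, list[str]]:
    """Report files that are missing, unexpected, or changed after transfer."""
    shared = set(source) & set(deposited)
    return {
        "missing": sorted(set(source) - set(deposited)),
        "unexpected": sorted(set(deposited) - set(source)),
        "mismatched": sorted(n for n in shared if source[n] != deposited[n]),
    }
```

A transfer would be considered clean under this sketch when all three lists in the report are empty. Calling `build_manifest` on the local dataset folder and on a downloaded copy of the deposited files, then passing both results to `compare_manifests`, produces that report.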

Requesting the Data Curation Service

Requests for the RDMC Data Curation Service can be submitted via the RDMC Services website.

Detailed information about your project allows the RDMC team to determine what services you need and accurately estimate costs based on those needs. Detailed information about the funding sponsor and their data management and sharing policies allows the RDMC team to recommend services that the sponsor considers allowable costs and that meet the sponsor’s specific data management and sharing requirements.

When submitting requests for the Data Curation Service, please be prepared to provide the following information:

  • Program solicitation URL

  • Data sharing policy URL

  • Project proposal draft

  • Data management and sharing plan

  • Project start and end date

  • Estimated total project budget

  • Estimated budget for data curation

Once the RDMC team reviews submitted materials, you will receive a quote for the Data Curation Service for a recommended scope of work. Please note that this quote is offered as a broad-brush scope of work estimate based on several assumptions about project needs and timelines. If these assumptions are inaccurate or if the scope of work changes, the associated fees will be updated.

Budgeting for data curation

Data curation is considered by federal funding agencies to be an allowable cost that can be included in the proposal budget. Costs for data curation vary based on several factors, including the type, volume, complexity, and sensitivity of the data.

Data Curation Service fees

Data Curation Service fees are based on hourly recharge rates approved by the UNC Office of Sponsored Programs. Please visit the RDMC Add-on Services webpage for current rates and hourly minimums.

Budget justification

The text below describing the RDMC Data Curation Service may be used in the budget justification component of the project budget.

The RDMC Data Curation Service Fee covers the cost of executing standards-based data curation workflows for data file packaging and repository ingest, which includes file format normalization, document preparation, metadata generation, dataset package quality review, and transfer of dataset packages to the specified repository. These curation activities are carried out by RDMC data curation specialists with the specific needs of the data type, format, size, and content in mind to ensure that the data are findable, accessible, interoperable, and reusable (i.e., FAIR), and to satisfy all DMSP requirements.
