...
Data curation is a set of data management activities that should be considered when developing data management and sharing plans. When planning for data management, it is important to consider the provisions needed to curate data, especially for data that are large in volume, require specialized hardware or software, contain sensitive information (e.g., protected health information (PHI) or personally identifiable information (PII)), or are otherwise complex.
Why curate data?
As a disciplined practice, data curation is essential for the long-term accessibility and usability of research data assets. Data curation goes beyond putting data files in a safe place. It not only anticipates the inevitable changes in technology that can make a file impossible to open in a few years' time, but is also sensitive to changes in how research is done. Even if data are housed in an established data repository, there is no guarantee that the files will be usable or even understandable if the data are insufficiently documented. The video below offers a light-hearted (but very real) portrayal of why data curation should be incorporated into the research process.
...
The data curator
A data curator is responsible for the execution of the data curation workflow in accordance with standards and best practices for long-term preservation and access. The data curator is a professional who has specialized education (typically a graduate degree in information and library science) and training in archival principles and practice, digital preservation, electronic records management, information organization and retrieval, and other related topics. They also have experience working in various research settings and understand how these areas of study are applied throughout the research lifecycle, from project planning to publication.
...
OAIS Reference Model. The Reference Model for an Open Archival Information System (OAIS) is an ISO standard (ISO 14721) that outlines recommended practices for digital archive organizations and the requirements of the systems they support.
RDMC Data Curation Service
The RDMC Data Curation Service is offered to UNC researchers to assist with data file preparation and transfer to a data repository. Working with the project team, RDMC data curation specialists will design and execute standards-based data curation workflows, which may include file format normalization, document preparation, metadata generation, dataset package quality review, and transfer of dataset packages to the specified repository. These curation activities consider the specific needs of the data according to their type, format, size, and content to help ensure that the data are findable, accessible, interoperable, and reusable (i.e., FAIR), and satisfy all requirements of the data management and sharing plan (DMSP).
Scope of work
The Data Curation Service scope of work for most projects will include the following primary data curation activities:
Dataset file preparation. Assembly and review of dataset files and associated materials to ensure that the data package submitted to the designated repository upholds the FAIR data principles (findability, accessibility, interoperability, and reusability).
Dataset record creation. Creation of standardized descriptive and administrative metadata based on client-provided information about the data types, allowable uses, and research context.
Dataset file transfer. Upload of dataset files to a dataset record. Includes creation of file-level metadata, checksum review, application of access restrictions (if required), and inspection of files in the repository to confirm successful transfer.
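As an illustration of the checksum review mentioned under dataset file transfer, the sketch below shows one common way to confirm that files arrive unchanged: record a checksum for every file before upload, recompute checksums on the copy retrieved from the repository, and compare the two. This is a minimal example and not a description of the RDMC workflow. The function names (sha256_of, build_manifest, find_mismatches) and the choice of SHA-256 are assumptions made for illustration; a given repository may require a different algorithm or manifest format.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large data files do not fill memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its checksum.

    Hypothetical helper: the manifest layout is an assumption,
    not a repository requirement.
    """
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def find_mismatches(before: dict[str, str], after: dict[str, str]) -> list[str]:
    """List files that are missing, added, or changed between two manifests."""
    return sorted(
        name
        for name in before.keys() | after.keys()
        if before.get(name) != after.get(name)
    )
```

In use, build_manifest would be run once on the local dataset package and once on the files downloaded back from the repository; an empty list from find_mismatches suggests the transfer preserved every file.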
Requesting the Data Curation Service
Requests for the RDMC Data Curation Service can be submitted via the RDMC Services website.
Detailed information about your project allows the RDMC team to determine what services you need and accurately estimate costs based on those needs. Detailed information about the funding sponsor and their data management and sharing policies allows the RDMC team to recommend services that the sponsor considers allowable costs and that meet the sponsor’s specific data management and sharing requirements.
When submitting requests for the Data Curation Service, please be prepared to provide the following information:
Program solicitation URL
Data sharing policy URL
Project proposal draft
Data management and sharing plan
Project start and end date
Estimated total project budget
Estimated budget for data curation
Once the RDMC team reviews submitted materials, you will receive a quote for the Data Curation Service for a recommended scope of work. Please note that this quote is offered as a broad-brush scope of work estimate based on several assumptions about project needs and timelines. If these assumptions are inaccurate or if the scope of work changes, the associated fees will be updated.
Budgeting for data curation
Data curation is considered by federal funding agencies to be an allowable cost that can be included in the proposal budget. Costs for data curation vary based on several factors, including the type, volume, complexity, and sensitivity of the data.
Data Curation Service fees
Data Curation Service fees are based on hourly recharge rates approved by the UNC Office of Sponsored Programs. Please visit the RDMC Add-on Services webpage for current rates and hourly minimums.
Budget justification
...
The text below describing the RDMC Data Curation Service may be used in the budget justification component of the project budget.
The RDMC Data Curation Service Fee covers the cost of executing standards-based data curation workflows for data file packaging and repository ingest, which includes file format normalization, document preparation, metadata generation, dataset package quality review, and transfer of dataset packages to the specified repository. These curation activities are carried out by RDMC data curation specialists with the specific needs of the data type, format, size, and content in mind to ensure that the data are findable, accessible, interoperable, and reusable (i.e., FAIR), and to satisfy all DMSP requirements.