Metadata Standards
This article defines metadata and metadata standards for research data, and provides resources to help researchers identify an appropriate metadata standard for their project.
Metadata and Metadata Standards
Metadata is simply data about data. It is the backbone of many search engines, databases, information systems, and libraries. Consider an online library catalog: the metadata it holds on books and articles enables users to search for a specific title or author, browse the holdings on a topic, and understand how to access an item of interest (e.g., as an eBook, at a library location, or through inter-library loan). Metadata can help you discover and find the right books or articles for your research. For research data, metadata can be helpful in searching for a known dataset, browsing and discovering unknown data, accessing data, and using and interpreting data.
A metadata standard is a structured way to organize and classify metadata. Usually, a metadata standard for research data includes a set of elements (or fields) that capture information on the research context, data structure and variables, and licensing and terms of use/access. The standard may specify required fields, controlled vocabularies, name authorities, and/or a format such as XML to enable human and machine readability. Using a metadata standard allows you to create metadata about your research data that will be searchable, interoperable, and understandable by humans and machines. It is also a best practice to write standardized metadata for any data that you plan to share.
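As a rough illustration of what required fields and a controlled vocabulary mean in practice, the sketch below checks a metadata record against a made-up mini-standard. The field names, the vocabulary, and the validate_record function are all hypothetical and are not taken from any published standard; real standards define their own elements and rules.

```python
# Hypothetical mini-standard: these field names and terms are illustrative
# only and do not come from any published metadata standard.
REQUIRED_FIELDS = {"title", "creator", "description", "license"}
CONTROLLED_VOCAB = {
    "resource_type": {"dataset", "software", "image", "text"},
}


def validate_record(record):
    """Return a list of problems found in the record; an empty list means it passes."""
    problems = []

    # Every required field must be present and non-blank.
    for field in sorted(REQUIRED_FIELDS):
        if not str(record.get(field, "")).strip():
            problems.append(f"missing required field: {field}")

    # Fields governed by a controlled vocabulary may only use listed terms.
    for field, allowed in CONTROLLED_VOCAB.items():
        value = record.get(field)
        if value is not None and value not in allowed:
            problems.append(f"{field}={value!r} is not in the controlled vocabulary")

    return problems


complete = {
    "title": "Stream temperature readings",
    "creator": "Example Lab",
    "description": "Hourly readings from three sites.",
    "license": "CC-BY-4.0",
    "resource_type": "dataset",
}
incomplete = {"title": "Untitled", "resource_type": "spreadsheet"}

print(validate_record(complete))
print(validate_record(incomplete))
```

The first record passes, while the second is reported as missing several required fields and using a term outside the vocabulary. Checks of this kind are what allow software, and not only people, to rely on standardized metadata.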
Metadata Standards for Research Data
Much like data repositories, the landscape of metadata standards ranges from discipline-specific to general-purpose. Generalist metadata standards are applicable to multiple disciplines and research data objects. For instance, Dublin Core (DC) is used by many data repositories for capturing general information about an object (creator, contributor, publisher, description, etc.). For data citations, DataCite is a standard that supports identifying data and minting persistent identifiers such as DOIs. A few metadata standards have been developed for certain data types, such as the Text Encoding Initiative (TEI) for textual data, and the Federal Geographic Data Committee (FGDC) Content Standard for Digital Geospatial Metadata and ISO 19115 for geographic information. These standards can be used by many communities.
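To make the generalist case concrete, the following sketch writes a few Dublin Core elements as XML using Python's standard library. The element names (title, creator, publisher, description) and the namespace URI belong to the Dublin Core element set; the wrapping "metadata" element, the to_dublin_core_xml helper, and the sample values are assumptions made for illustration, since each repository defines its own container format.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def to_dublin_core_xml(fields):
    """Serialize a dict of Dublin Core element names to an XML string.

    A value may be a single string or a list of strings; Dublin Core
    elements are repeatable, so a list produces one element per value.
    """
    # "metadata" is a hypothetical wrapper element chosen for this sketch.
    root = ET.Element("metadata")
    for name, values in fields.items():
        if isinstance(values, str):
            values = [values]
        for value in values:
            element = ET.SubElement(root, f"{{{DC_NS}}}{name}")
            element.text = value
    return ET.tostring(root, encoding="unicode")


# Sample values are invented for illustration.
record = {
    "title": "Example survey data",
    "creator": ["Researcher One", "Researcher Two"],
    "publisher": "Example Data Repository",
    "description": "Responses to a hypothetical campus survey.",
}

print(to_dublin_core_xml(record))
```

Because the elements carry the shared Dublin Core namespace, any system that understands Dublin Core can pick out the title or creators without knowing anything else about the project.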
Disciplinary metadata standards are developed by a research community to address its own needs. The prevalence of disciplinary metadata standards varies across the research enterprise. A few examples of disciplinary standards include Ecological Metadata Language (EML) for ecology, Darwin Core for biodiversity, and the Data Documentation Initiative (DDI) for the social sciences. If you are unsure whether your field has its own metadata standard, see the inventories maintained by the Digital Curation Centre and the Research Data Alliance, listed in the resources below.
Tip for Data Sharing
If you plan to share your research data, we recommend using a trusted data repository because the repository will have already selected and implemented multiple relevant metadata standards to support data description, preservation, discovery, access, and use. This relieves you of the burden of selecting and learning these standards yourself. Often, the repository will have interfaces or staff that make writing standardized metadata easy for you and help ensure your data package complies with standards and best practices. See the guidance on Identifying a Trustworthy Repository.
Resources for Finding Metadata Standards
Research Data Alliance (RDA) Metadata Standards Directory
Digital Curation Centre Disciplinary Metadata
Seeing Standards: A Visualization of the Metadata Universe by Jenn Riley
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
RDM Guidance formatting was influenced by The Writing Center, University of North Carolina at Chapel Hill Tips & Tools handouts.