Recommended Preservation File Formats for Qualitative Data

This article offers recommendations for formatting qualitative data that align with best practices and standards.

Formatting Qualitative Data

File formats matter. The format of your qualitative data affects how usable and accessible it is, now and in the future. A file format defines how the information in a file is structured so that both machines and humans can read it. Some formats are more widely supported and remain usable longer than others. Have you tried to open a file created more than 10 years ago? If you have, you probably had trouble finding software that could open and display its contents.

Best practices for data management and sharing recommend using preservation file formats and/or formats commonly used in your research field. This guidance focuses on preservation file formats for qualitative data.

If you use Computer Assisted Qualitative Data Analysis Software (CAQDAS), note that many popular programs comply with the REFI-QDA standard for interoperability between qualitative analysis programs. This means that users can transfer projects between REFI-QDA-compliant programs (e.g., from NVivo into ATLAS.ti). Examples of REFI-QDA-compliant programs include ATLAS.ti, MAXQDA, NVivo, and Dedoose.
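As an illustration of what an interoperable project file looks like, here is a minimal Python sketch for inspecting a REFI-QDA project export (.qdpx, listed in the table below). It assumes, as we understand the standard, that a .qdpx file is a ZIP archive holding an XML project description (a .qde file) together with the source documents. The function name `summarize_qdpx` is hypothetical and is not part of any CAQDAS program.

```python
import zipfile


def summarize_qdpx(path):
    """List the contents of a .qdpx export.

    Assumption: the export is a ZIP archive whose project description
    is stored in a file ending in .qde; everything else is treated as
    source material.
    """
    with zipfile.ZipFile(path) as archive:
        names = archive.namelist()
    project_files = [n for n in names if n.lower().endswith(".qde")]
    other_files = [n for n in names if n not in project_files]
    return {"project_files": project_files, "other_files": other_files}
```

Calling `summarize_qdpx("my_study.qdpx")` on an export (the file name is only an example) returns the project description file(s) and the remaining files, which is a quick way to confirm that an export contains your sources before you deposit it.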

The table of file format recommendations covers textual, image, audio, video, and social media data as well as data projects created in a CAQDAS, organized by primary and secondary categories. The primary recommendations are formats that will support long-term use, while the secondary recommendations are formats that are likely to support medium-term use but might need to be migrated for long-term use. As you plan for data formatting, both primary and secondary recommendations are good choices for your data.

Recommended File Formats for Qualitative Data Types

 Qualitative Data Types

Primary Recommendations

Secondary Recommendations

Word processing

.pdf (PDF/A)

.pdf

.rtf

Text

.txt

 

Structured text (markup)

.xml

.xhtml or .html

.dtd

.tex (LaTeX)

Image

.tif

.jp2 (JPEG 2000)

.png

.svg (Scalable Vector Graphics)

.gif

Video

.avi

.mov

.mpg, .mpeg (MPEG-2)

.mp4

Audio

.bwf

.wav

 

.mp3

.ogg

Social media (e.g., Twitter)

.csv, .txt, .json (open, non-proprietary format)

.WARC (Web Archive)

.ARC (Archive)

CAQDAS: Computer Assisted Qualitative Data Analysis Software (e.g., ATLAS.ti, NVivo)

.qdpx (REFI-QDA project)

.qdc (REFI-QDA codebook)

 

Export in the software's proprietary format along with an export in a common data format (.rtf, .txt)
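To check a project folder against the table above, a short script can group file names by data type based on their extensions. The following is a minimal Python sketch, not a validation tool: the names `FORMATS_BY_TYPE` and `inventory` are hypothetical, the extension lists are copied from the table, and .txt is assigned to "Text" only even though the table also lists it for social media data. An extension alone cannot show whether a .pdf file conforms to PDF/A.

```python
from collections import defaultdict
from pathlib import Path

# Extensions copied from the table above, grouped by data type.
# Assumption: .txt is counted under "Text" only.
FORMATS_BY_TYPE = {
    "Word processing": {".pdf", ".rtf"},
    "Text": {".txt"},
    "Structured text (markup)": {".xml", ".xhtml", ".html", ".dtd", ".tex"},
    "Image": {".tif", ".jp2", ".png", ".svg", ".gif"},
    "Video": {".avi", ".mov", ".mpg", ".mpeg", ".mp4"},
    "Audio": {".bwf", ".wav", ".mp3", ".ogg"},
    "Social media": {".csv", ".json", ".warc", ".arc"},
    "CAQDAS": {".qdpx", ".qdc"},
}


def inventory(filenames):
    """Group file names by data type; unknown extensions go to 'Not in table'."""
    lookup = {ext: dtype for dtype, exts in FORMATS_BY_TYPE.items() for ext in exts}
    groups = defaultdict(list)
    for name in filenames:
        ext = Path(name).suffix.lower()  # case-insensitive match on the extension
        groups[lookup.get(ext, "Not in table")].append(name)
    return dict(groups)
```

Files that land in "Not in table" (for example, a .docx transcript) are candidates for conversion to one of the recommended formats before deposit.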

References

Library of Congress. (2023). Library of Congress Recommended Formats Statement (Table of Contents). https://www.loc.gov/preservation/resources/rfs/TOC.html

Qualitative Data Repository. (2023). Formatting Data. https://qdr.syr.edu/guidance/managing/formatting-data

Smithsonian Institution Archives. (2023). Recommended Preservation Formats for Electronic Records. https://siarchives.si.edu/what-we-do/digital-curation/recommended-preservation-formats-electronic-records

 

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

 

RDM Guidance formatting was influenced by The Writing Center, University of North Carolina at Chapel Hill Tips & Tools handouts.
