Recommended Preservation File Formats for Qualitative Data
This article offers recommendations for formatting qualitative data that align with best practices and standards.
Formatting Qualitative Data
File formats matter. The formatting of your qualitative data affects its use and accessibility, both now and in the future. Formatting refers to how the information in a file is structured so that both machines and humans can interpret it. Some formats remain usable for longer, and by a wider range of software, than others. Have you tried to open a file created over 10 years ago? If you have, you probably had trouble finding software that would open and display the information in the file.
Best practices for data management and sharing recommend using preservation file formats and/or formats commonly used in your research field. This guidance focuses on preservation file formats for qualitative data.
If you use Computer Assisted Qualitative Data Analysis Software (CAQDAS), note that many popular programs comply with the REFI-QDA standard for interoperability between qualitative analysis programs. This means that users can transfer projects between REFI-QDA-compliant programs (e.g., from NVivo into ATLAS.ti). A few examples of REFI-QDA-compliant programs are ATLAS.ti, MAXQDA, NVivo, and Dedoose.
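As a rough illustration of what a REFI-QDA export looks like on disk, the sketch below opens a .qdpx file and lists its contents. It assumes that a .qdpx project is a ZIP archive holding an XML project file with a .qde extension alongside the source documents; check the REFI-QDA specification and your software's export before relying on that layout. The function name `inspect_qdpx` is hypothetical.

```python
import zipfile


def inspect_qdpx(path_or_file):
    """Return (has_project_file, member_names) for a .qdpx export.

    Assumption: a .qdpx file is a ZIP archive that contains an XML
    project file ending in .qde plus the project's source documents.
    """
    with zipfile.ZipFile(path_or_file) as archive:
        names = archive.namelist()
    # Look for the project description file anywhere in the archive.
    has_project_file = any(name.lower().endswith(".qde") for name in names)
    return has_project_file, names
```

Running this on an exported project before depositing it is a quick way to confirm that the export completed and that the source documents you expect were included.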
The table of file format recommendations covers textual, image, audio, video, and social media data as well as data projects created in a CAQDAS, organized by primary and secondary categories. The primary recommendations are formats that will support long-term use, while the secondary recommendations are formats that are likely to support medium-term use but might need to be migrated for long-term use. As you plan for data formatting, both primary and secondary recommendations are good choices for your data.
Recommended File Formats for Qualitative Data Types
Qualitative Data Types | Primary Recommendations | Secondary Recommendations |
Word processing | .pdf (PDF/A) | .rtf |
Text | .txt | |
Structured text (markup) | .xml | .xhtml or .html, .dtd, .tex (LaTeX) |
Image | .tif, .jp2 (JPEG 2000), .png, .svg (Scalable Vector Graphics) | .gif |
Video | .avi, .mov, .mpg, .mpeg (MPEG-2) | .mp4 |
Audio | .bwf, .wav | .mp3, .ogg |
Social media (e.g., Twitter) | .csv, .txt, .json (open, non-proprietary format), .warc (Web Archive) | .arc (Archive) |
CAQDAS project (e.g., ATLAS.ti, NVivo) | .qdpx (REFI-QDA project), .qdc (REFI-QDA codebook) | Export in the proprietary format along with an export in a common data format (.rtf, .txt) |
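The table above can be turned into a quick check of a project folder. The following is a minimal sketch, not an official tool: the names `PRIMARY`, `SECONDARY`, and `classify_format` are hypothetical, and the extension sets are simply transcribed from the table. Note that an extension alone cannot confirm the format inside the file. In particular, a .pdf file counts as a primary recommendation only if it is actually PDF/A.

```python
from pathlib import Path

# Extensions transcribed from the recommendations table (lowercased).
# Assumption: .pdf is listed as primary, but only PDF/A files qualify.
PRIMARY = {
    ".pdf", ".txt", ".xml", ".tif", ".jp2", ".png", ".svg",
    ".avi", ".mov", ".mpg", ".mpeg", ".bwf", ".wav",
    ".csv", ".json", ".warc", ".qdpx", ".qdc",
}
SECONDARY = {
    ".rtf", ".xhtml", ".html", ".dtd", ".tex",
    ".gif", ".mp4", ".mp3", ".ogg", ".arc",
}


def classify_format(filename):
    """Return 'primary', 'secondary', or 'unlisted' for a file name."""
    extension = Path(filename).suffix.lower()
    if extension in PRIMARY:
        return "primary"
    if extension in SECONDARY:
        return "secondary"
    return "unlisted"


def summarize(filenames):
    """Group file names by recommendation tier."""
    groups = {"primary": [], "secondary": [], "unlisted": []}
    for name in filenames:
        groups[classify_format(name)].append(name)
    return groups
```

For example, `summarize(["interview01.wav", "clip.mp4", "notes.docx"])` places the first file under "primary", the second under "secondary", and the third under "unlisted", which flags it as a candidate for conversion before deposit.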
References
Library of Congress. (2023). Library of Congress Recommended Formats Statement (Table of Contents). https://www.loc.gov/preservation/resources/rfs/TOC.html
Qualitative Data Repository. (2023). Formatting Data. https://qdr.syr.edu/guidance/managing/formatting-data
Smithsonian Institution Archives. (2023). Recommended Preservation Formats for Electronic Records. https://siarchives.si.edu/what-we-do/digital-curation/recommended-preservation-formats-electronic-records
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
RDM Guidance formatting was influenced by The Writing Center, University of North Carolina at Chapel Hill Tips & Tools handouts.