In-class Activities - Week 8
Metadata & Codebooks
1. Review of Other Researchers' Metadata
Data repositories such as Dryad and ICPSR are designed to permanently store the data used in research so that they are available to future scholars. To use the data, it is important to have good metadata… but how good are the metadata, really?
Go to either Dryad or ICPSR and download 2-3 datasets. Now review the data and metadata. Based on the information provided, could you explain what the abbreviations are? How the data were collected? What the values represent? Could you recreate the analysis? Is there anything missing or that stands out?
- Here is an example from Dryad: The page “Data from: Resilient networks of ant-plant mutualists in Amazonian forest fragments” includes an overview of the project and dataset, along with some other information. If you click “Download Dataset” you will get a zip file with the data in .csv format and the metadata in .txt format.
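One quick way to start the review is to check whether every column name in the data file is mentioned anywhere in the metadata file. The snippet below is a minimal sketch of that idea in Python, not part of the original activity. The function name `undocumented_columns` and the sample column names and descriptions are hypothetical, made up for illustration; they are not taken from any Dryad or ICPSR dataset.

```python
import csv
import io
import re


def undocumented_columns(csv_text, metadata_text):
    """Return the CSV column names that never appear in the metadata text."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])  # first row is assumed to hold the column names
    missing = []
    for name in header:
        name = name.strip()
        if not name:
            continue
        # Whole-word, case-insensitive match, so "ht" does not match "height".
        pattern = r"(?<!\w)" + re.escape(name) + r"(?!\w)"
        if not re.search(pattern, metadata_text, flags=re.IGNORECASE):
            missing.append(name)
    return missing


# Hypothetical data and metadata, for illustration only.
sample_csv = "plot_id,spp,ht_cm,notes\n1,ABC,12.5,\n"
sample_metadata = (
    "plot_id: unique plot identifier\n"
    "ht_cm: plant height in centimeters\n"
)

print(undocumented_columns(sample_csv, sample_metadata))
```

With real files you would read the downloaded .csv and .txt files into strings first. A column that is mentioned in the metadata is not necessarily well documented, so this only flags the most obvious gaps; you still need to read the metadata to judge whether units, collection methods, and value codes are explained.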
Break
2. Getting Started on your own Metadata
Metadata Templates
Today’s session is an opportunity to start drafting the metadata for your project. Although the notes for today’s session link to tools that will build your metadata in a machine-readable XML schema, for this class (and maybe even in most cases) a .txt or .Rmd file with information on the relevant Class Descriptors (sensu Michener et al. 1997) is sufficient.
To save you time, I have created metadata templates based on information from ICPSR (for social sciences) and Michener et al. (for biophysical sciences) that you can download and edit; you can add more fields or delete any that are not relevant. Note that Table 1 in Michener et al. is much more comprehensive and provides additional guidance on how to make sure the metadata are useful.
Workflow:
- Download the following templates: click the link for your preferred format (.txt or .Rmd) and save the file in the RStudio project you’ve created for your Course Project. The .txt version can be opened and edited in any word processor, a text editor, or in R. The .Rmd file is an R Markdown Document.
a. Metadata Template - Social Sciences: .txt format or .Rmd format. This template is from ICPSR; see note below for additional info on Qualitative Data.
b. Metadata template for Biophysical Sciences: .txt format or .Rmd format. This is Table 1 from Michener et al. 1997.
c. Note for researchers in the Humanities or those working with Qualitative Data: The metadata required often depend on the type of material with which you work (e.g., oral history, photos, digital, printed). If your data are in this domain, you can download templates here: Template 1 is from UF’s Samuel Proctor Oral History Project, and Template 2 is a more general one from the UF Humanities Archives. You can also review the metadata required by the Qualitative Data Repository.
- Choose the template that is most appropriate for your discipline, then review both templates. Is there metadata from the other one that would be useful to include in yours? If so, copy the items over and save the revised file with a new (correctly styled) name.
- Start filling out the metadata requested in the template. You might want to begin by flagging the fields for which you will need to report the range of possible values, the units, the names, brands, or models of equipment used to make or record measurements, etc.
- Submission: NONE. This is a component of the final project, so the goal for today is to jump-start your work and to realize that preparing a good metadata file takes longer than anyone anticipates.
Note: Be sure to check the Notes for today’s topic - they include excellent resources for preparing metadata.
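For the fields that ask for the range of possible values, a short script can produce a first draft from the data themselves. The following is a minimal sketch in Python under stated assumptions, not a required part of the workflow: the function name `draft_codebook`, the choice to treat empty cells and "NA" as missing, and the sample columns are all hypothetical. Units and equipment details cannot be inferred from the data, so they are left as a placeholder to fill in by hand.

```python
import csv
import io


def draft_codebook(csv_text, missing_codes=("", "NA"), max_levels=10):
    """Summarize each column: numeric range, or the distinct values observed.

    Assumption: cells equal to one of `missing_codes` count as missing.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    summary = {}
    for column in reader.fieldnames or []:
        values = [(row.get(column) or "").strip() for row in rows]
        present = [v for v in values if v not in missing_codes]
        entry = {
            "n_missing": len(values) - len(present),
            "units": "TODO: fill in by hand",
        }
        try:
            numbers = [float(v) for v in present]
        except ValueError:
            numbers = None  # at least one value is not a number
        if not present:
            entry["type"] = "empty"
        elif numbers is not None:
            entry["type"] = "numeric"
            entry["min"] = min(numbers)
            entry["max"] = max(numbers)
        else:
            entry["type"] = "text"
            # Cap the list so free-text columns do not flood the draft.
            entry["values"] = sorted(set(present))[:max_levels]
        summary[column] = entry
    return summary


# Hypothetical data, for illustration only.
sample_data = "site,ht_cm,treatment\nA,12.5,control\nB,NA,burned\nC,30,control\n"

for name, entry in draft_codebook(sample_data).items():
    print(name, entry)
```

The observed range is only a starting point: the metadata should state the range of possible values, which may be wider than what happens to appear in your data, and the meaning of every category code still has to be written out.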
Sources for Today’s Session
- DataONE Community Engagement & Outreach Working Group (2017) “Metadata Management”. Accessed through the Data Management Skillbuilding Hub at https://dataoneorg.github.io/Education/lessons/07_metadata/index on Aug 31, 2020.