Resources for Data Collection & Management
R Programming
Essential
-
Hadley Wickham wrote a book on using the tidyverse and the online version is FREE. This is a phenomenal resource on using R to import, tidy, and visualize data.
-
Posit Cheat Sheets: help with commands for using the different tidyverse packages, RStudio shortcuts and tricks, help with R commands, and more. You definitely want the ones for Data Import, Work with Strings, Factors, Data Transformation, and Base R.
-
Where and How to ask for help
- Hadley Wickham’s advice on how to write a good reproducible example for getting help with R
- How to post good questions on Stack Overflow
- The UF R-users listserv is very user friendly and a great place to post requests for help.
Tutorials and Books
-
Software Carpentry: Using RStudio for Project Organization & Management
-
Kieran Healy’s Data Visualization: a practical introduction is my favorite introductory (yet super-comprehensive) book on data visualization with R. If you scroll down to the bottom of the page you can download the datasets and code used to make the figures in the book, which makes life much easier.
-
ROpenSci: tools for accessing, manipulating, and visualizing open data
-
Learning R: Swirl
Specific Problems in Data Cleaning and Management
-
Text Mining: tidytext package
-
Working with Qualtrics survey data with the qualtRics package
-
Optical Character Recognition (OCR): extract text from images: tesseract package
-
Extract text & metadata from PDF files: pdftools package
-
Image processing in R: the magick package
Advanced R Packages
-
DataCurator package: ‘a simple desktop data editor to help describe, validate and share usable open data’.
-
RegExr: online tool to learn, build, & test Regular Expressions (RegEx / RegExp)
-
janitor package: cleanup of messy data (column names, etc.)
-
knitr overview: reproducible documents with R
Discipline-specific Resources
-
historydata package: Sample data sets for historians learning R. They include population, institutional, religious, military, and prosopographical data suitable for mapping, quantitative analysis, and network analysis.
-
The Programming Historian Website: wide range of topics, from text analysis to OpenRefine
Slide Presentations in R
Data Archives
Text Extraction and Organization
Plan for extraction and organization
Form Design
Data Security
UF Office of Information Security and Compliance
Cyber Safeguards for UF
UF IRB
UF Data Classification Policy