In-class Activities - Week 15
Data, Ethics, & The Law
Before we start…
- an example from our lab of a real data clean-up situation for which OpenRefine is ideal
Part 1: Anonymizing is hard
Exercise 1: Finalize this “anonymization log” for one of your interviews. How would you anonymize each entry?
Interview-Page | Original Info | Changed to |
---|---|---|
Int1 p1 | Age: 27 | “______________” |
Int1 p1 | Born in Venezuela | “______________” |
Int1 p2 | Born on 20 June | “______________” |
Int1 p2 | Name: Miguel | “______________” |
Int1 p3 | Lives in Atlanta | “______________” |
Int1 p4 | Known as ‘Tio Miguelito’ | “______________” |
Int2 p1 | his friend Maria | “______________” |
Int2 p8 | attended Talbot Elementary School | “______________” |
Int2 p10 | Director, Southern Region, Quantum Vehicles | “______________” |
Int2 p11 | lives in Grant Park | “______________” |
Exercise 2: Download this interview file to your computer. Try anonymizing the text. Was it harder or easier than anonymizing the more abstract list above? Why?
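Once an anonymization log is filled in, applying it to a transcript is essentially a find-and-replace. The sketch below is one possible way to do that in Python, not a prescribed method for this class. The function name `apply_anonymization_log` and the placeholder replacements such as `[PARTICIPANT 1]` are hypothetical, chosen only for illustration; your own log entries will differ.

```python
import re


def apply_anonymization_log(text, log):
    """Replace every original string in `log` with its anonymized form.

    `log` maps original text -> replacement text (both hypothetical here).
    """
    if not log:
        return text
    # Try longer originals first so "Tio Miguelito" wins over "Miguel".
    originals = sorted(log, key=len, reverse=True)
    pattern = re.compile(
        "|".join(re.escape(o) for o in originals), re.IGNORECASE
    )
    # Case-insensitive lookup from matched text back to its replacement.
    lookup = {o.lower(): r for o, r in log.items()}
    return pattern.sub(lambda m: lookup[m.group(0).lower()], text)


# Hypothetical log entries; the replacements are illustrative only.
log = {
    "Miguel": "[PARTICIPANT 1]",
    "Tio Miguelito": "[NICKNAME]",
    "Atlanta": "[CITY]",
}

sample = "Miguel lives in Atlanta, where he is known as Tio Miguelito."
anonymized = apply_anonymization_log(sample, log)
print(anonymized)
```

A substitution like this only catches strings that appear in the log. Misspellings, partial names ("Miguelito" on its own would become "[PARTICIPANT 1]ito"), and indirect identifiers such as a job title combined with a neighborhood still need a human read-through, which is part of why anonymizing running text is harder than anonymizing a list.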
Part 2: Research misconduct related to data
Take 15 minutes to read this article: Martinson, B., Anderson, M. & de Vries, R. 2005. Scientists behaving badly. Nature 435:737–738. link. Then go to the class Canvas page and tell the class what you thought of the paper. In particular:
- Did the percentage of scientists engaging in the different behaviors surprise you?
- Are all the behaviors equally ‘bad’? Are there cases (or disciplinary norms) in which they might be acceptable?