Introduction
Research data management has become increasingly important and is a
significant aspect of open science and reproducible research. Most
funders, such as the Swiss National Science Foundation, now require a
data management plan. Research data management has many elements; here
we focus on documentation and metadata. You will receive templates
that you can use for your own future data management practices.
Prerequisites
We assume that you have:
- A working account for Select Survey
- Access to spreadsheet-based software (if not, consider LibreOffice)
Learning Objectives
- Learners can use a survey design tool to build a questionnaire of 12
to 15 questions that uses at least two elements of survey logic
- Learners can apply 12 principles for data organisation in
spreadsheets in the layout of a collected dataset
- Learners understand the importance of documentation and metadata for
(research) data management
Terminology
Exercises
Exercise 0 - Feedback on Assignment 4
- Open your Assignment 4 on GitHub.
- Find the issue where you received feedback on Assignment 4.
- Incorporate this feedback into your submission for Assignment 5.
Exercise 1 - Clone the repo
- Open the rbtl organisation on GitHub and locate your repository for
this assignment (i.e. ae-05-data-management-GITHUB-USERNAME).
- Open the rbtl workspace on rstudio.cloud and clone your repository
into the workspace.
- Open the project
- Let git know who you are (we will have to do this every time now):
- Edit the details in the following code to your name and email. This
is the name and email that your commits will be associated with.
- usethis::use_git_config(user.name = "Jane", user.email = "jane@example.org")
- Copy the line of code to your clipboard (Ctrl + C).
- Paste the line of code in the Console and hit the Return (Enter)
key.
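To confirm that git picked up your details, you can read them back from the Console. The following is a minimal sketch, not part of the assignment: `git_identity()` is a hypothetical helper name, and it assumes the git command-line tool is available on the system path.

```r
# Hypothetical helper (not part of usethis): reads the name and email that
# git currently uses for commits. Assumes the `git` command is on the PATH.
git_identity <- function() {
  read_key <- function(key) {
    value <- tryCatch(
      suppressWarnings(
        system2("git", c("config", "--get", key), stdout = TRUE, stderr = FALSE)
      ),
      error = function(e) character(0)  # git missing or not runnable
    )
    # An unset key produces no output; report it as NA
    if (length(value) == 0) NA_character_ else value[[1]]
  }
  c(user.name = read_key("user.name"), user.email = read_key("user.email"))
}

# Print the identity; NA means the value has not been set yet
print(git_identity())
```

If either value is NA, run the `usethis::use_git_config()` line again with your own details.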
Exercise 2 - Read this paper
- Read: Data Organization in Spreadsheets
- Summarise what you have learned from reading Data Organization in
Spreadsheets. Add your text to the R Markdown file in the repository
you have cloned to your workspace.
- knit, commit, add, and push your changes to GitHub.
Exercise 3 - Add your data template
If you are using a survey as your data collection tool:
- Prepare the questionnaire on Select Survey.
- Answer the questionnaire at least twice on your own.
- Export the responses data as a CSV file.
- Save the exported UserResponses.csv file on your computer.
- Open the rbtl workspace on rstudio.cloud and open your project for
this assignment.
- Upload the exported UserResponses.csv file into the /data directory
(see screenshots below)
- commit, add, and push your changes to GitHub.
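Once the file is uploaded, a quick check in R can confirm that the export contains your test responses. This is a minimal sketch under two assumptions that the exercise does not state: that the export holds one row per completed response, and that the project's /data directory is reachable as the relative path `data`. `count_responses()` is a hypothetical helper name.

```r
# Hypothetical helper: counts the rows of a survey export.
# Assumes one row per completed response, which may not match every export layout.
count_responses <- function(path) {
  responses <- read.csv(path, stringsAsFactors = FALSE)
  nrow(responses)
}

# Assumed location of the uploaded file, relative to the project root
export_path <- file.path("data", "UserResponses.csv")

if (file.exists(export_path)) {
  n <- count_responses(export_path)
  message("Responses found: ", n)
  if (n < 2) message("The exercise asks for at least two responses.")
} else {
  message("No file found at ", export_path)
}
```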
If you are collecting experimental data:
- Use spreadsheet-based software of your choice and open a new
spreadsheet.
- Write your variables in the first row of your spreadsheet, starting
with the first column (cell A1).
- Save the file as a CSV file on your computer and apply the file
naming conventions shared during the lecture.
- Open the rbtl workspace on rstudio.cloud and open your project for
this assignment.
- Upload your created file into the /data directory (see screenshots
above).
- commit, add, and push your changes to GitHub.
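The same header-only template can also be created in R instead of a spreadsheet program. The sketch below is one possible approach, not the method required by the assignment. The helper name `create_template()`, the variable names, and the file name are all placeholders; replace them with your own variables and with a name that follows the conventions from the lecture.

```r
# Hypothetical helper: writes a CSV file that contains only a header row,
# one column per variable, starting in the first column.
create_template <- function(variables, path) {
  template <- data.frame(matrix(nrow = 0, ncol = length(variables)))
  names(template) <- variables
  write.csv(template, path, row.names = FALSE)
  invisible(path)
}

# Placeholder variable names and file name (assumptions for illustration only)
variables <- c("sample_id", "date", "temperature")
template_path <- file.path(tempdir(), "experiment-template.csv")

create_template(variables, template_path)

# Show the single header line that was written
print(readLines(template_path))
```

The example writes to a temporary directory so that it runs anywhere; point `template_path` at your project's data directory to store the file with the assignment.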
Exercise 4 - Add your documentation (codebook)
Independent of whether you are using a survey or experimental data as
your data collection tool:
- Use spreadsheet-based software of your choice and open a new
spreadsheet.
- Copy the variables of the attributes.csv file to your first row
(also displayed as an attributes table below)
- Save the file as a CSV file on your computer and name it
attributes.csv.
- Open the rbtl workspace on rstudio.cloud and open your project for
this assignment.
- Upload your created file into the /data directory and overwrite the
existing attributes.csv (see screenshots above).
- commit, add, and push your changes to GitHub.
| fileName | the name of the input data file(s) |
| variableName | the name of the measured variable |
| description | a written description of what that measured variable is |
| unitText | the units the variable was measured in |
| variableType | the variable data type |
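As an illustration of a filled-in codebook, the sketch below builds a data frame with the five attribute columns and writes it to a CSV file, with one row per variable in the data file. The two documented variables and their descriptions are invented placeholders, not part of the assignment data.

```r
# One row per variable in the data file; the column names follow the attributes table.
# The variables described here (respondent_id, age) are placeholders.
codebook <- data.frame(
  fileName     = "UserResponses.csv",
  variableName = c("respondent_id", "age"),
  description  = c("Identifier of the survey respondent",
                   "Age of the respondent at the time of the survey"),
  unitText     = c(NA, "years"),
  variableType = c("character", "numeric"),
  stringsAsFactors = FALSE
)

# Written to a temporary directory here; use your project's data directory instead
codebook_path <- file.path(tempdir(), "attributes.csv")
write.csv(codebook, codebook_path, row.names = FALSE, na = "")

print(codebook)
```

Variables without a unit, such as identifiers, are left empty in the unitText column rather than filled with a made-up value.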
Exercise 5 - Add your documentation (readme)
- Open the rbtl workspace on rstudio.cloud and open your project for
this assignment.
- Navigate to the /data directory in the file manager (bottom right
window) and click on the directory.
- Click on the DATASETNAME-readme.md file to open it in your code
editor (top left window).
- Add all information that is applicable and save the file.
- commit, add, and push your changes to GitHub.
Exercise 6 - Complete the assignment
- knit, commit, add, and push your changes to GitHub.
- Open an issue on the repo to let us know you have completed the
assignment.
Corrections
If you see mistakes or want to suggest changes, please create an issue on the source repository.
Reuse
Text and figures are licensed under Creative Commons Attribution-ShareAlike (CC BY-SA 4.0). Source code is available at https://github.com/rbtl-fs22/website, unless otherwise noted. Figures that have been reused from other sources do not fall under this license and can be recognized by a note in their caption: "Figure from ...".