class: center, middle, inverse, title-slide # rbtl - Research Beyond the Lab ## Reference management with Zotero ### Lars Schöbitz ### 2022-04-28 --- <script async defer data-domain="rbtl-fs22.github.io/website" src="https://plausible.io/js/plausible.js"></script> # Today 1. Homework Assignment 2 1. Reference Management - Zotero 1. Open Source - Licenses 1. Reproducible Research 1. Homework Assignment 3 --- # Homework Assignment 2 ```r dat_in_sum_day <- dat_in %>% filter(value <= 1000) %>% mutate(date = as_date(date_time)) %>% group_by(date, location, indicator) %>% summarise(min = min(value), median = median(value), mean = mean(value), sd = sd(value), max = max(value)) ``` --- # Homework Assignment 2 ```r *dat_in_sum_day <- dat_in %>% filter(value <= 1000) %>% mutate(date = as_date(date_time)) %>% group_by(date, location, indicator) %>% summarise(min = min(value), median = median(value), mean = mean(value), sd = sd(value), max = max(value)) ``` - Objects that store dataframes: `dat_in` and `dat_in_sum_day` --- # Homework Assignment 2 ```r dat_in_sum_day <- dat_in %>% * filter(value <= 1000) %>% * mutate(date = as_date(date_time)) %>% * group_by(date, location, indicator) %>% * summarise(min = min(value), * median = median(value), * mean = mean(value), * sd = sd(value), * max = max(value)) ``` - Objects that store dataframes: `dat_in` and `dat_in_sum_day` - Functions: `filter()`, `mutate()`,`as_date()`, `group_by()`, `summarise()`, etc. --- # Homework Assignment 2 ```r dat_in_sum_day `<-` dat_in %>% filter(value <= 1000) %>% mutate(date = as_date(date_time)) %>% group_by(date, location, indicator) %>% summarise(min = min(value), median = median(value), mean = mean(value), sd = sd(value), max = max(value)) ``` - Objects that store dataframes: `dat_in` and `dat_in_sum_day` - Functions: `filter()`, `mutate()`,`as_date()`, `group_by()`, `summarise()`, etc. - Assignment operator: `<-` --- # Homework Assignment 2 ```r dat_in_sum_day <- dat_in `%>%` filter(value <= 1000) `%>%` mutate(date = as_date(date_time)) `%>%` group_by(date, location, indicator) `%>%` summarise(min = min(value), median = median(value), mean = mean(value), sd = sd(value), max = max(value)) ``` - Objects that store dataframes: `dat_in` and `dat_in_sum_day` - Functions: `filter()`, `mutate()`,`as_date()`, `group_by()`, `summarise()`, etc. - Assignment operator: `<-` - Pipe operators: `%>%` --- # Homework Assignment 2 - Imported raw data ```r dat_link <- "https://raw.githubusercontent.com/Global-Health-Engineering/manuscript-hospital-air-quality/main/data/intermediate/malawi-hospitals-air-quality.csv" dat_in <- read_csv(dat_link) dat_in ``` ``` # A tibble: 203,806 × 6 date_time id location indicator value unit <dttm> <chr> <chr> <chr> <dbl> <chr> 1 2019-10-08 13:59:01 hos1 guardian pm2.5 19.4 uq_m3 2 2019-10-08 13:59:01 hos1 guardian pm10 27 uq_m3 3 2019-10-08 14:04:41 hos1 guardian pm2.5 44.9 uq_m3 4 2019-10-08 14:04:41 hos1 guardian pm10 56.7 uq_m3 5 2019-10-08 14:10:21 hos1 guardian pm2.5 202. uq_m3 6 2019-10-08 14:10:21 hos1 guardian pm10 240. uq_m3 # … with 203,800 more rows ``` --- # Summarised derived data .pull-left[ ```r dat_in_sum_day <- dat_in %>% filter(value <= 1000) %>% mutate(date = as_date(date_time)) %>% group_by(date, location, indicator) %>% summarise(min = min(value), median = median(value), mean = mean(value), sd = sd(value), max = max(value)) ``` ] .pull-right[ ``` # A tibble: 890 × 8 # Groups: date, location [445] date location indicator min median mean sd max <date> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> 1 2019-10-01 Lhouse pm10 22.8 121. 143. 107. 447. 2 2019-10-01 Lhouse pm2.5 12.2 46.6 54.8 41.1 194 3 2019-10-02 Lhouse pm10 24.9 205. 231. 153. 772. 4 2019-10-02 Lhouse pm2.5 12.8 72.6 88.0 65.2 375. 5 2019-10-02 Lions pm10 7.5 15.2 16.1 5.23 30.2 6 2019-10-02 Lions pm2.5 4.6 6 6.62 1.85 12.4 # … with 884 more rows ``` ] --- class: inverse, middle .big[Reference Management] --- # Why? - You will read a lot - You want to stay organized - You don't want to waste your time on formatting --- # Which tool? - Mendeley - EndNote - Zotero - many, many more ??? **Mendeley** 1. Mendeley is owned by Elsevier. 2. It encrypts your database and makes money with your data 3. You can only collaborate with 3 people on one project. **EndNote** 1. EndNote doesn't come free, you need to buy a license. 2. They also used a prioprietary citation style files that could only be opened by EndNote. --- # Which tool? - Mendeley - EndNote - **Zotero** - many, many more ??? **Mendeley** 1. Mendeley is owned by Elsevier. 2. It encrypts your database and makes money with your data 3. You can only collaborate with 3 people on one project. **EndNote** 1. EndNote doesn't come free, you need to buy a license. 2. They also used a prioprietary citation style files that could only be opened by EndNote. --- # Why Zotero? <img src="img/zotero-open-source.png" width="100%" style="display: block; margin: auto;" /> .footnote[[Screenshot taken from zotero.org on 2022-03-03](https://www.zotero.org/)] --- # Zotero is Open Source - Why is that good? - Free Software - Transparent about access to your own data - The source code that Zotero is developed in is public - Commitment to support open software and open standards - Zotero developers helped create the open [Citation Style Langauge (CSL)](https://citationstyles.org/) --- # Open Source - Licenses - [Open Source isn't just code on the internet](https://open-source-for-researchers.github.io/open-source-workshop/01-what-is-open-source) - Use permissive licenses to allow others to reuse, remix and build upon (also for commercial purposes) - Recommended licenses - **Text, slides, images**: [Creative Commons](https://creativecommons.org/about/cclicenses/) (CC0, CC-BY, CC-BY-SA) - **Software**: [MIT License](https://en.wikipedia.org/wiki/MIT_License), [Hippocratic License](https://firstdonoharm.dev/), [Unlicense](https://unlicense.org/) for software - https://tldrlegal.com/ - plain english explanations of licences in bullet form. - https://kbroman.org/steps2rr/pages/licenses.html - Read Karl Broman ??? - CCO: a public dedication tool, which allows creators to give up their copyright and put their works into the worldwide public domain - CC-BY: attribution to the creator. The license allows for commercial use. - CC-BY-SA: attribution. you must license the modified material under identical terms - MIT License: As a permissive license, it puts only very limited restriction on reuse - Hippocratic license: Ethical Source license that specifically prohibits the use of software to violate universal standards of human rights, and embodying the [Ethical Source Principles](https://ethicalsource.dev/principles/). - Unlicense: A template for dedicating your software to the public domain --- class: inverse, middle .big[Open Source != Open Access] ??? - Papers and journal articles that are not paywalled - Paywall removed from either publishers or research institutions/libraries From: https://open-source-for-researchers.github.io/open-source-workshop/01-what-is-open-source --- class: inverse, middle .big[Open Source != Open Data] ??? - Refers to the practice of sharing the data that produces your results - This is especially important if you write code that depends in the data - - the code can’t usually be re-run without it. From: https://open-source-for-researchers.github.io/open-source-workshop/01-what-is-open-source --- https://the-turing-way.netlify.app/_images/reproducibility.jpg class: inverse, middle .large[Open Source (Code) + Open Data = Reproducible Research] --- class: left background-image: url(https://the-turing-way.netlify.app/_images/reproducibility.jpg) background-position: right background-size: contain # Reproducible Research .footnote[The Turing Way Community, & Scriberia. (2021). Illustrations from the Turing Way book dashes. Zenodo. https://doi.org/10.5281/zenodo.5706310] --- class: inverse, middle .big[Homework Assignment] --- # Homework Assignment 3 - Learning Objectives These learning objectives are related to the assignment for this week. - Learners are able to import references to a Zotero group library - Learners can use an exported library from Zotero in Better BibTex Format to generate an automated reference list in an R Markdown file - Learners can edit a file in the Citation File Format (.cff) to add their name to the author list --- # Homework Assignment 3 - Due Date - Complete Assignment 2 before you complete Assignment 3 - Assignment 3, due on 15th March - Readings on Reproducible Research --- class: center, middle # Thanks! 🌻 Slides created via the R packages: [**xaringan**](https://github.com/yihui/xaringan)<br> [gadenbuie/xaringanthemer](https://github.com/gadenbuie/xaringanthemer) The chakra comes from [remark.js](https://remarkjs.com), [**knitr**](http://yihui.name/knitr), and [R Markdown](https://rmarkdown.rstudio.com). Access slides as [PDF on GitHub](https://rbtl-fs22.github.io/website/slides/pt2-d03-reference-management/pt2-d03-reference-management.pdf) All material is licensed under [Creative Commons Attribution Share Alike 4.0 International](https://creativecommons.org/licenses/by-sa/4.0/).