Intermediate Course on Reproducible Research in R for PhD Students & Postdocs – Expanding Your Data Analysis Toolkit
Reproducibility and open scientific practices are increasingly demanded of, and needed by, scientists and researchers in our modern research environments. We produce ever larger and more complex datasets that often need to be heavily cleaned, reorganized, and processed before they can be analyzed. This data processing phase often consumes the majority of the time spent coding and doing data analysis. Training for this aspect of research has sadly not kept pace with the demand.
We hope to begin addressing this gap in training with this course. Throughout the course we will be using a highly practical approach that revolves around code-along sessions (instructor and learner coding together), hands-on exercises, and group work.
By the end of the course, participants will: have improved their competency in processing and wrangling datasets; have improved their proficiency in using the R statistical computing language; know how to write re-usable and well-documented code; and know how to make modern and reproducible data analysis projects.
- Time & place
- Who can attend?
- Detailed description
- Programme and instructor
- Additional information
TIME & PLACE
Dates: 6-8 June 2023
Place: MBK, Pilestræde 61, Copenhagen, Denmark
WHO CAN ATTEND?
This course is designed in a specific way and is ideal for you if:
- You are a researcher, preferably working in the biomedical field (ranging from experimental to epidemiological). Specifically, this course targets those working in diabetes and metabolism.
- You currently do, or will soon do, some quantitative data analysis.
- You either:
  - have taken the introduction to Reproducible Research in R course, since this course is a natural extension to that one;
  - know a little to a moderate amount of R (or computing in general);
  - know how to use R and have some familiarity with the tidyverse and RStudio.
While these assumptions help to focus the content of the course, you are still welcome to attend if you have an interest in learning R but don't fit any of them. We welcome everyone until the course capacity is reached.
Priority is given to participants employed at Danish institutions and in the Danish life science industry. If the event is overbooked, the DDEA reserves its right to select participants based on the defined requirements and country of employment.
Please note that you are not guaranteed a seat if you do not meet the target group requirements.
DETAILED DESCRIPTION
Although data processing often consumes the majority of the time spent coding, there is little to no training and support provided for it. This has led to minimal attention, scrutiny, and rigour in describing, detailing, and reviewing these procedures in studies, and it contributes to the systemic lack of code sharing among researchers. This aspect of research is often completely hidden and is likely the source of many unintentionally irreproducible results. With this course we aim to begin addressing this gap in training.
The learning objectives of the course are to:
- Learn and demonstrate what an open and reproducible data analysis workflow looks like.
- Learn and apply the fundamental tools and skills for conducting a reproducible and modern analysis for a research project.
- Apply programming techniques to process and manage data in a reproducible and well-documented way.
- Learn where to go to get help and to continue learning modern data analysis skills.
The course will enable participants to answer questions such as:
- What does a modern data analysis setup and workflow look like?
- How can I ensure that my data analysis project is reproducible?
- How can I create pipelines that get, process, and clean my data quickly and that work regardless of whether there is one data file or hundreds of data files (i.e. pipelines that scale well)?
- How can I write code that is more reproducible and readable, and that can be easily re-used by my future self and by my collaborators and colleagues?
During the course, we will:
- learn how to use R at the mid-beginner to early-intermediate level
- focus only on the data processing and cleaning stage of a data analysis project
- teach from a reproducible research and open scientific perspective (e.g. by making use of Git)
- use practical, applied, and hands-on lessons and exercises
And we will not learn:
- the basics of using R and RStudio
- statistics (this is already covered by most university curricula)
Because this course is a natural extension of the introductory r-cubed course, it incorporates tools learned during that course, including basic Git usage and the use of RStudio R projects. If you are not familiar with these tools, you will need to go over the material from the introduction course beforehand (more details about pre-course tasks will be sent out a couple of weeks before the course).
PROGRAMME AND INSTRUCTOR
Luke Johnston, Postdoc, Steno Diabetes Center Aarhus (DK)
Luke Johnston, Team Leader
Steno Diabetes Center Aarhus
7 May 2023
Participants will have to reserve time in their calendar to do pre-course tasks. The course material is available online.
Deadline for completing pre-course tasks: 1 June 2023
Bring your own laptop
Make sure to bring your own laptop since the course includes hands-on learning.
The DDEA offers accommodation in Copenhagen to participants living outside of the Greater Copenhagen Area. When you register, please state whether you need accommodation on 5, 6, or 7 June.
The DDEA organizes Networking Dinners on 6 and 7 June. When you register, please state whether you would like to join one or both dinners.
A course certificate will be given to all attending participants on request at the end of the course. Full participation is required to obtain 2.3 ECTS points.
Please note that participation in the course is free of charge. However, the DDEA will charge a no-show fee of 1000 DKK if you do not show up and have not unregistered from the course before it starts.