Syllabus
Instructors: Cole Brookson & Emma Atkinson
Workshop Dates/Times: October 07, 14, 21, 28, and November 04, from 13:00 to 15:00 MT
Workshop Location: Online! This website and ZOOM will be used for this course.
General Description: It has never been more important to be able to quantitatively analyze biological data. This workshop series is designed to introduce you to programming and working with data in R/RStudio so you can 1) perform essential programming tasks, 2) read and manipulate your own data, and 3) visualize your data and draw appropriate conclusions. We assume no previous programming experience. We will lead you through the basic use of RStudio, an Integrated Development Environment (IDE) for the statistical programming language R, introduce you to object-oriented programming (OOP), and then help you develop the basic skills to work with data in R.
Workshop Format: Pre-recorded mini-lectures will be available to give brief overviews of specific topics. Synchronous ZOOM meetings will be held weekly and will consist mostly of live-coded examples and practice problems. The only way to learn how to program is to practice!
Learning Outcomes: By the end of this workshop series, we hope participants will have achieved the following learning outcomes:
- Understand why programming in biology is important in the 2020s.
- Understand basic R usage: reading data, cleaning data, visualizing data, and general programming concepts.
- Troubleshoot problems with R and biological data. The goal is for participants to be comfortable enough to make progress on their own.
- Most importantly, feel confident moving on to learn more skills and techniques independently.
Schedule:
Week 1: Introduction
- Part 1: Set Up, Goals, & Important Topics & Terms
- Why code in biology?
- Why R?
- Approaches to R
- What is this series? What is it not?
- Part 2: Basics of R, OOP, Data types
- Overview of RStudio
- Setting up a script
- Object-oriented Programming (OOP)
- Data types
- Assignment & Indexing
Week 2: Cornerstone Programming Skills
- If-statements
- Loops - Structure & Use
- Concepts of Iterations
- When to Loop?
- Functions - Best Practices
- Formatting Functions
- Making Functions Do Your Work for You
Week 3: Data Cleaning & Organization
- RProjects, here(), reproducible science
- What is the data cleaning process?
- How to clean?
- Biologically informed data cleaning
- Editing row names/column names
- Re-formatting dataframes (e.g., long to wide or vice versa)
- Locating errors in large datasets
Week 4: Plotting in R
- Intro to base R plotting
- Intro to ggplot()
- Basic rules for visualizing data
- Cornerstone plots
- Line plots
- Scatterplots
- Histograms
Week 5: Wrap-Up Activity
- Complete In-Class Activity
- Collaborate in Small Groups to Solve Problems
- The last session will be an in-class exercise where we will give you a novel dataset, and with our help, you’ll work through a series of questions that mimic the typical data cleaning process that occurs before a statistical analysis of biological data.