Basics of R for Data Science and Statistics

Files corresponding to Short Course: Introduction to Data Science Using R

Course Summary

This course introduces the powerful and popular R statistical software through the RStudio integrated development environment. R is a fully developed programming language and one of the major platforms for doing data science. This course covers frequently used data structures, importing raw data, common data manipulations, summary statistics, and data visualizations through the suite of packages called the tidyverse.

Why take this course?

R is an extremely versatile programming language: it can fit a wide array of statistical and machine learning models, it makes collaboration easy, and it lets you share your analyses widely.

To use these capabilities, however, we must first import the data, likely create new variables, and subset the data appropriately. We also want to understand and validate our data through summaries. R can handle these tasks in a multitude of ways, but that flexibility also makes it harder to learn. There are often many ways to do the same task, and it can be overwhelming at first to determine which method is best.

This course will help you to gain a solid foundation in the modern use of R to do the common tasks mentioned above.
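
As a rough illustration of the common tasks mentioned above (a sketch, not taken from the course materials), the code below imports a CSV file, creates a variable, subsets rows, and summarizes by group using tidyverse packages. The temporary file, the use of R's built-in mtcars data, the cutoff of 100 horsepower, and the new column name kpl are all assumptions chosen only to keep the example self-contained.

```r
# Illustrative sketch only; assumes the readr and dplyr packages are installed.
library(readr)
library(dplyr)

# Write a built-in data set to a temporary CSV so there is a file to import.
csv_path <- tempfile(fileext = ".csv")
write_csv(mtcars, csv_path)

# Import the raw data.
cars <- read_csv(csv_path, show_col_types = FALSE)

# Create a variable, subset the rows, and summarize by group.
cars_summary <- cars %>%
  mutate(kpl = mpg * 0.425144) %>%   # miles per US gallon -> km per litre
  filter(hp > 100) %>%               # keep cars above 100 horsepower
  group_by(cyl) %>%
  summarize(n = n(), mean_kpl = mean(kpl), .groups = "drop")

print(cars_summary)
```

Each step here could be written in several other ways in R; the pipeline style above is just one common tidyverse approach.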

Course Outline

The course provides a modern introduction to R through the extremely popular suite of packages called the tidyverse. A rough outline is given below:

Day 1:

Day 2:

Prerequisites and Requirements

This course will make heavy use of hands-on programming. We’ll generally introduce a topic and then have exercises to practice and explore. As such, participants must bring their own laptop with internet access and the ability to install programs and download files. This course assumes a strong working knowledge of computers. Past experience with the logic of programming and/or executing statistical analyses is beneficial but not required.