Files corresponding to Short Course: Introduction to Data Science Using R
This exercise uses the dcData and tidyverse packages. Note that dcData is installed from GitHub, so it requires an extra step:
- devtools::install_github("mdbeckman/dcData")
Using the dcData package, read the BabyNames data into your R environment with the data() function as follows: data(BabyNames)
Then look for the BabyNames object in the Environment pane.
# devtools::install_github("mdbeckman/dcData")
library(tidyverse) # this actually loads a group of packages all at once
## -- Attaching packages --------------------------------------- tidyverse 1.3.1 --
## v ggplot2 3.3.4 v purrr 0.3.4
## v tibble 3.1.2 v dplyr 1.0.7
## v tidyr 1.1.3 v stringr 1.4.0
## v readr 1.4.0 v forcats 0.5.1
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
library(dcData)
data(BabyNames)
# or, naming the package explicitly:
data("BabyNames", package = "dcData")
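Once a dataset has been loaded with data(), a quick inspection confirms that the object is what you expect. The sketch below is only an illustration: it uses ToothGrowth from the built-in datasets package as a stand-in, so that it runs even where dcData is not installed. The same calls should apply to BabyNames after the steps above.

```r
# ToothGrowth (from the built-in datasets package) stands in for BabyNames here;
# it is an assumption made only so the example runs without dcData.
data("ToothGrowth", package = "datasets")

# Basic checks on the loaded object
class(ToothGrowth)   # what kind of object was loaded
dim(ToothGrowth)     # number of rows and columns
head(ToothGrowth)    # first few rows
str(ToothGrowth)     # column names and types
```

Checking column types with str() at this stage makes it easier to spot a mismatch later, for example when combining this data with a file read from elsewhere.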
The file “BabyNamesSupp.csv” includes a few years of more recent data to augment the BabyNames data. The file can be read straight into R from the URL used in the code below. Read this file in using read_csv() from the readr package, and save the data as an object called BabyNamesSupp.
Note: Reading the data will produce a warning message! Read the warning message carefully; what seems to have gone wrong?
BabyNamesSupp <- read_csv("https://github.com/jbpost2/Basics-of-R-for-Data-Science-and-Statistics/raw/master/datasets/BabyNamesSupp.csv")
##
## -- Column specification --------------------------------------------------------
## cols(
## name = col_character(),
## sex = col_logical(),
## count = col_double(),
## year = col_double()
## )
## Warning: 84619 parsing failures.
## row col expected actual file
## 19208 sex 1/0/T/F/TRUE/FALSE M 'https://github.com/jbpost2/Basics-of-R-for-Data-Science-and-Statistics/raw/master/datasets/BabyNamesSupp.csv'
## 19209 sex 1/0/T/F/TRUE/FALSE M 'https://github.com/jbpost2/Basics-of-R-for-Data-Science-and-Statistics/raw/master/datasets/BabyNamesSupp.csv'
## 19210 sex 1/0/T/F/TRUE/FALSE M 'https://github.com/jbpost2/Basics-of-R-for-Data-Science-and-Statistics/raw/master/datasets/BabyNamesSupp.csv'
## 19211 sex 1/0/T/F/TRUE/FALSE M 'https://github.com/jbpost2/Basics-of-R-for-Data-Science-and-Statistics/raw/master/datasets/BabyNamesSupp.csv'
## 19212 sex 1/0/T/F/TRUE/FALSE M 'https://github.com/jbpost2/Basics-of-R-for-Data-Science-and-Statistics/raw/master/datasets/BabyNamesSupp.csv'
## ..... ... .................. ...... ..............................................................................................................
## See problems(...) for more details.
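The column specification above shows sex = col_logical(), and the parsing failures report that a logical value was expected where the file contains M. A plausible reading is that read_csv() guessed the column type from early rows that contain only F, which is a valid abbreviation for FALSE. One possible remedy, sketched here under that assumption, is to declare the column types explicitly with col_types. The small temporary file below is a hypothetical stand-in for BabyNamesSupp.csv, built so that the example runs without a network connection; its names and counts are made up for illustration.

```r
library(readr)

# Hypothetical stand-in for BabyNamesSupp.csv: many "F" rows followed by a few "M" rows.
# The names, counts, and year are invented for illustration only.
n_f <- 1500
demo_lines <- c(
  "name,sex,count,year",
  paste0("NameF", seq_len(n_f), ",F,10,2014"),
  paste0("NameM", seq_len(5), ",M,10,2014")
)
demo_path <- tempfile(fileext = ".csv")
writeLines(demo_lines, demo_path)

# Declare sex as character so that "F" is never interpreted as FALSE
# and "M" is read as an ordinary string.
demo_fixed <- read_csv(
  demo_path,
  col_types = cols(
    name  = col_character(),
    sex   = col_character(),
    count = col_double(),
    year  = col_double()
  )
)

# Both values should now be present in the sex column
table(demo_fixed$sex)
```

The same col_types argument can be passed to the read_csv() call that reads the course URL. Calling problems() on the object returned by the original call lists the individual parsing failures, which is a useful way to confirm the diagnosis before changing the code.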