Ensemble Models: Bagging and Random Forests

class: center, middle, inverse, title-slide

.title[
# Ensemble Models: Bagging and Random Forests
]
.author[
### Justin Post
]

---

# Recap

Looked at a few common supervised learning models for regression and classification tasks

- MLR, Penalized MLR, & Regression Trees

- Logistic Regression & Classification Trees

Now we'll investigate commonly used *ensemble* methods

---

# Prediction with Tree Based Methods

If we care mostly about prediction rather than interpretation

- Often use **bootstrapping** to get multiple samples to fit on  
- Can average across many fitted trees  
- Decreases variance relative to an individual tree fit

---

# Prediction with Tree Based Methods

If we care mostly about prediction rather than interpretation

- Often use **bootstrapping** to get multiple samples to fit on  
- Can average across many fitted trees  
- Decreases variance relative to an individual tree fit  
  
Major ensemble tree methods

1. Bagging (bootstrap aggregation)  
2. Random Forests (extends idea of bagging - includes bagging as a special case)  
3. Boosting (*slow* training of trees)
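
Why averaging lowers variance, as a rough sketch: write `$\hat{y}^{*j}(x)$` for the prediction from the tree fit to the `$j$`th of `$B$` bootstrap samples, and assume (for illustration only) that these predictions share a common variance `$\sigma^2$` and a common pairwise correlation `$\rho$`. The symbols `$\sigma^2$` and `$\rho$` are introduced here and are not used elsewhere in these slides.

`$$\mbox{Var}\left(\frac{1}{B}\sum_{j=1}^{B} \hat{y}^{*j}(x)\right) = \rho\sigma^2 + \frac{1-\rho}{B}\sigma^2$$`

- With `$\rho = 0$` this reduces to `$\sigma^2/B$`  
- Trees fit to resamples of the same data are usually positively correlated, so the `$\rho\sigma^2$` term does not shrink as `$B$` grows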

---

# Bagging

Bagging = Bootstrap Aggregation - a general method

Bootstrapping

- resample from the data (non-parametric) or a fitted model (parametric)

- for non-parametric

    + treats sample as population  
    + resampling done with replacement  
    + can get same observation multiple times  
    
    
---

# Bagging

Bagging = Bootstrap Aggregation - a general method

Bootstrapping

- resample from the data (non-parametric) or a fitted model (parametric)

- for non-parametric

    + treats sample as population  
    + resampling done with replacement  
    + can get same observation multiple times  
    
- method or estimation applied to each resample

- traditionally used to obtain standard errors (measures of variability) or construct confidence intervals

---

# Non-Parametric Bootstrapping
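
A minimal sketch of the non-parametric bootstrap, using only the Python standard library. The data values, the helper name `bootstrap_estimates`, the choice of the sample mean as the estimator, and `B = 1000` are all assumptions made for illustration.

```python
import random
import statistics

def bootstrap_estimates(sample, estimator, B=1000, seed=0):
    """Apply `estimator` to B resamples drawn with replacement from `sample`."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    n = len(sample)
    # rng.choices samples WITH replacement, so an observation can appear
    # several times (or not at all) in a given resample.
    return [estimator(rng.choices(sample, k=n)) for _ in range(B)]

# Hypothetical toy data, treated as if it were the population
data = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8, 4.7, 3.9, 2.5]

boot_means = bootstrap_estimates(data, statistics.mean)

# Bootstrap standard error: the spread of the estimator across resamples
se_boot = statistics.stdev(boot_means)
# Textbook standard error of the mean, for comparison
se_formula = statistics.stdev(data) / len(data) ** 0.5

print("bootstrap SE:", round(se_boot, 3))
print("formula SE:  ", round(se_formula, 3))
```

The two standard errors should be close here because the mean has a known formula. The bootstrap version also works for estimators that have no simple formula.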

---

# Bagging

- Create many bootstrap (re)samples, `$j = 1,..., B$`

---

# Bagging

- Create many bootstrap (re)samples, `$j = 1,..., B$`

- Fit a tree to each (re)sample

    + Have `$B$` fitted trees

---

# Bagging

- Create many bootstrap (re)samples, `$j = 1,..., B$`

- Fit a tree to each (re)sample

    + Have `$B$` fitted trees

- For a given set of predictor values, find `$\hat{y}$` for each tree

    + Call the prediction at a given set of `$x$` values `$\hat{y}^{*j}(x)$`

---

# Bagging

- Create many bootstrap (re)samples, `$j = 1,..., B$`

- Fit a tree to each (re)sample

    + Have `$B$` fitted trees

- For a given set of predictor values, find `$\hat{y}$` for each tree

    + Call the prediction at a given set of `$x$` values `$\hat{y}^{*j}(x)$`

- Combine the predictions from the trees to create the final prediction!

    + For regression trees, usually use the average of the predictions
    
    `$$\hat{y}(x) = \frac{1}{B}\sum_{j=1}^{B} \hat{y}^{*j}(x)$$`
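
The steps above can be sketched in a few lines of standard-library Python. To keep it short, each "tree" is a one-split stump on a single predictor, which is a simplification and not the full tree-growing procedure. The toy data and the names `fit_stump`, `bag_stumps`, and `predict_bagged` are hypothetical.

```python
import random
import statistics

def fit_stump(xs, ys):
    """Fit a one-split regression tree (a stump) on a single predictor.

    Returns a function mapping x to the mean response on its side of the split.
    """
    best = None
    # Candidate splits: every distinct x except the smallest,
    # so both sides of the split are non-empty.
    for split in sorted(set(xs))[1:]:
        left = [y for x, y in zip(xs, ys) if x < split]
        right = [y for x, y in zip(xs, ys) if x >= split]
        left_mean, right_mean = statistics.mean(left), statistics.mean(right)
        sse = (sum((y - left_mean) ** 2 for y in left)
               + sum((y - right_mean) ** 2 for y in right))
        if best is None or sse < best[0]:
            best = (sse, split, left_mean, right_mean)
    if best is None:  # resample had one distinct x: predict the overall mean
        overall = statistics.mean(ys)
        return lambda x: overall
    _, split, left_mean, right_mean = best
    return lambda x: left_mean if x < split else right_mean

def bag_stumps(xs, ys, B=200, seed=1):
    """Fit one stump to each of B bootstrap resamples of the rows."""
    rng = random.Random(seed)
    n = len(xs)
    trees = []
    for _ in range(B):
        idx = [rng.randrange(n) for _ in range(n)]  # rows drawn with replacement
        trees.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return trees

def predict_bagged(trees, x):
    """Average the B individual predictions, as in the formula above."""
    return sum(tree(x) for tree in trees) / len(trees)

# Hypothetical toy data: a noisy quadratic
noise = random.Random(0)
xs = [i / 10 for i in range(30)]
ys = [x ** 2 + noise.gauss(0, 0.2) for x in xs]

trees = bag_stumps(xs, ys)
single = fit_stump(xs, ys)
for x0 in (0.5, 1.5, 2.5):
    print(x0, round(single(x0), 2), round(predict_bagged(trees, x0), 2))
```

A single stump can only predict two values. The bagged prediction averages stumps whose splits fall in different places, so it changes more gradually with `x`.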

---

# Bagging

- Create many bootstrap (re)samples, `$j = 1,..., B$`

- Fit a tree to each (re)sample

    + Have `$B$` fitted trees

- For a given set of predictor values, find `$\hat{y}$` for each tree

    + Call the prediction at a given set of `$x$` values `$\hat{y}^{*j}(x)$`

- Combine the predictions from the trees to create the final prediction!

    + For classification trees, usually use the **majority vote** 
    
    `$$\mbox{Use most common prediction made by all bootstrap trees}$$`
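
A minimal sketch of the majority vote, assuming the class predicted by each bootstrap tree at a given `$x$` has already been collected into a list. The helper name `majority_vote` and the example votes are hypothetical.

```python
from collections import Counter

def majority_vote(tree_predictions):
    """Return the class predicted most often across the bootstrap trees."""
    # most_common(1) returns [(label, count)]; if two labels tie,
    # Counter keeps the one that was encountered first.
    return Counter(tree_predictions).most_common(1)[0][0]

# Hypothetical predictions from B = 5 trees for one set of predictor values
votes = ["yes", "no", "yes", "yes", "no"]
print(majority_vote(votes))  # yes (3 of the 5 trees)
```

The tie-breaking rule here is just what `Counter` happens to do. A real implementation might instead average predicted class probabilities or break ties at random.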

---

layout: false