Course Activities |
|||

Week 1-2 (August 24-Sep 6) |
Read Chapter 1: Overview of Data Mining |
Lecture 1: Introduction | |

Get familiar with R and RStudio | R Intro, RStudio Intro | ||

Supplementary Reading: Data mining and statistics: what is the connection? Friedman (1997) |
Homework 1 PDF, LaTeX. Assigned on August 25, due on Sep 8. |
||

Week 3 (Sep 7 - Sep 13) |
Read Chapter 2: Theory of Supervised Learning |
Lecture 2: Statistical Decision Theory (I) | |

Lecture 3: Statistical Decision Theory (II) | |||

Homework 2 PDF, LaTeX. Assigned on Sep 10, due on Sep 29.
| |||

Week 4 (Sep 14 - 20) |
Read Chapter 4.2-4.4: Linear Classification Methods for Binary Problems |
Lecture 4: Binary Classification (I): Basics | |

Week 5 (Sep 21 - Sep 27) |
Supplementary Reading: Choosing Between Logistic Regression and Discriminant Analysis, Press, S. and Wilson, S. (1978) |
Lecture 5: Binary Classification (II): Logistic Regression and Discriminant Analysis | |

Week 6 (Sep 28 - Oct 4) |
Curse of Dimensionality; Linear Binary Classification for High Dimensional Problems |
Lecture 6: Binary Classification (III): Extension to High Dimensional Classification Problems | |

Homework 3 PDF file, LaTeX File. Assigned on Sep 29, due on Oct 13. |
|||

Homework 3 Solution PDF file, R code |
|||

Week 7 (Oct 5 - 11) |
Read Chapter 4.1: Nonparametric Regression |
Lecture 6.2: Parametric vs Nonparametric Regression | |

Read Chapter 4.1: Nonlinear Classification Methods |
Lecture 7: K-Nearest Neighbor (KNN) Methods | ||

Week 8 (Oct 12 - 18) |
Topic: Introduction to Multiclass Classification |
Lecture 8: Multiclass Classification | |

Supplementary Reading: Diagnosis of multiple cancer types by shrunken centroids of gene expression | Homework 4 PDF file, LaTeX File. Assigned on Oct 12, due on Nov 3. |
||

Homework 4 Solution PDF file, R code |
|||

Week 9 (Oct 19 - 25) |
Supplementary Reading: Leave-one-out Cross Validation | Lecture 9: Model Selection and Assessment | |

Read Chapter 3: Linear Regression & Variable Selection |
Lecture 10: Linear Regression and Variable Selection | ||

Supplementary Reading: Linear Model Theory | |||

Week 10 (Oct 26 - Nov 1) |
Read Chapter 3: Variable Selection for Linear Regression |
Lecture 11: Shrinkage Methods by LASSO | |

Reading: Regression Shrinkage and Selection via the LASSO | Final Project: Project assigned on Oct 29, due on Dec 16 |
||

Final Project Suggested Reference List | |||

Lecture: Principal Component Analysis: PCA | |||

Lecture: Quadratic Discriminant Analysis: QDA | |||

Week 9-10 (Nov 2 - 15) |
Supplementary Reading: Regularization and variable selection via the elastic net | Lecture 12: Shrinkage Methods - Beyond LASSO | |

Homework 5 PDF file, LaTeX File. Assigned on Nov 3, due on Nov 19. |
|||

Homework 5 Solution PDF file, R code |
|||

Week 11 (Nov 16 - 22) |
Read Chapter 12: Support Vector Machines |
Lecture 13: Support Vector Machines | |

Supplementary Reading: The Entire Regularization Path for the Support Vector Machine | Homework 6: LaTeX, PDF. Assigned on Nov 19, due on Dec 8. |
||

Multiclass Support Vector Machines | Lecture 14: Multiclass Support Vector Machines | ||

Week 12 (Nov 23 - 29) |
Read Chapter 9 (9.2): Tree-based Methods |
Lecture 15: Classification and Regression Trees | |

Week 13 (Nov 30 - Dec 6) |
Read Chapter 8.7: Bootstrap and Bagging |
||

Supplementary Reading: Explaining AdaBoost | Lecture 16: Bagging and Boosting |