Data Science at GI-INSA

Welcome to the webpage for the Data Science course, taught at GI-INSA in 2017.

Schedule Please check the schedule weekly for uploaded material and other information.

# Date Lecture TP (Salle Blueberry) Material
1 Oct-10
  • Logistics
  • Introduction to Data Science
  • Frequencies and Probabilities
Lecture 01 Slides (pdf) 
2 Oct-17
  • Supervised learning
  • Classification
    • Least Squares
    • Nearest Neighbors
    • Naive Bayes
Lecture 02 Slides (pdf)
3 Oct-24
  • Numpy, scipy, matplotlib
  • Sklearn: basic setup
4 Nov-07
  • Classification
    • Logistic Regression
    • Decision trees
    • Random forests
Lecture 03 Slides (pdf)
5 Nov-14
  • Regression
    • Linear, Lasso, and Ridge Regression
    • Decision trees
    • Random forests
Lecture 04 Slides (pdf)
6 Nov-21
  • Sklearn: classification
  • Sklearn: regression
7 Nov-28
  • Unsupervised learning
    • Clustering
    • Dimensionality reduction
Lecture 05 Slides (pdf)
8 Dec-05
  • Unsupervised Learning: Topic models
  • Model Selection
  • Summary
Lecture 06 Slides (pdf)
9 Dec-12
  • Sklearn: clustering, dimensionality reduction
10 Dec-19
  • Exam

Evaluation There will be one closed-book exam at the end of the course. Duration: 1h50.

Prerequisites Familiarity with probability and statistics, and programming experience in a modern language (we’ll use Python, but experience in Java, C++, R, or Scala should be enough).

Textbooks There is no required textbook for the course. However, you are encouraged to consult at least one of the following textbooks if you need more information on a subject.

  • Friedman, Jerome, Trevor Hastie, and Robert Tibshirani. The elements of statistical learning. Springer series in statistics, Second edition, 2008.
  • Bishop, Christopher M. Pattern recognition and machine learning. Springer, 2006.
  • Gelman, Andrew, John B. Carlin, Hal S. Stern, David B. Dunson, Aki Vehtari, and Donald B. Rubin. Bayesian data analysis. CRC Press, 2014.

IFU Project After completing the course, you will work on the following transportation optimisation project (graded separately): G-Doc Link.

By email:
Use this format for your subject: ‘dsc17 [your topic here]’

Office Hours Mondays 08h30-10h00. Please book an appointment by email first.


Creative Commons License
Material for ‘Data Science’ Course at GI-INSA, 2017 by Michael Mathioudakis is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License