Foundations of Machine Learning (2018/19)

African Institute for Mathematical Sciences, Kigali, Rwanda

This course was part of the African Masters in Machine Intelligence (AMMI) at the African Institute for Mathematical Sciences (AIMS), Rwanda.

Part 1: Mathematical Foundations

  • Linear Algebra (MML book chapter 2; sketch below)
    • Groups
    • Vector spaces
    • Linear independence
    • Basis
    • Coordinate representation
    • Basis change
    • Linear mappings
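
To make bases and coordinate representations concrete, here is a minimal numpy sketch (all values made up) that computes the coordinates of a vector with respect to a non-standard basis:

```python
import numpy as np

# A basis of R^2 (the columns of B) and a vector x; values are made up.
B = np.array([[1.0, 1.0],
              [0.0, 2.0]])
x = np.array([3.0, 4.0])

# The coordinates c of x with respect to the basis B solve B @ c = x.
c = np.linalg.solve(B, x)
print(c, B @ c)  # B @ c reconstructs x from its coordinates
```
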
  • Analytic Geometry (MML book chapter 3; sketch below)
    • Eigenvalues
    • Norms and inner products
    • Distances and angles
    • Orthogonal projections
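
A minimal numpy sketch of an orthogonal projection onto a subspace, with a made-up basis matrix B and the standard projection formula pi(x) = B (B^T B)^{-1} B^T x:

```python
import numpy as np

# Columns of B span a 2D subspace of R^3; B and x are made up.
B = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
x = np.array([6.0, 0.0, 0.0])

# Orthogonal projection onto span(B): pi(x) = B (B^T B)^{-1} B^T x.
P = B @ np.linalg.inv(B.T @ B) @ B.T
x_proj = P @ x

# The residual x - pi(x) is orthogonal to the subspace (up to rounding).
print(x_proj, B.T @ (x - x_proj))
```
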
  • Vector Calculus (slides, MML book chapter 5; sketch below)
    • Scalar differentiation
    • Partial derivatives
    • Jacobian
    • Chain rule
    • Derivatives of matrices w.r.t. matrices
    • Gradients in a multi-layer neural network
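
The sketch below pushes gradients through a made-up two-layer network with the chain rule and checks one entry of the analytic gradient against a finite difference; the layer sizes and the tanh nonlinearity are illustrative choices only:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=3)           # input
y = rng.normal(size=2)           # target
W1 = rng.normal(size=(4, 3))     # first-layer weights
W2 = rng.normal(size=(2, 4))     # second-layer weights

def loss(W1):
    h = np.tanh(W1 @ x)          # hidden layer
    return 0.5 * np.sum((W2 @ h - y) ** 2)

# Backward pass via the chain rule:
a = W1 @ x
h = np.tanh(a)
e = W2 @ h - y                   # dL/d(output)
dh = W2.T @ e                    # dL/dh
da = dh * (1 - h ** 2)           # dL/da, since tanh'(a) = 1 - tanh(a)^2
dW1 = np.outer(da, x)            # dL/dW1

# Finite-difference check on one entry of W1.
eps = 1e-6
W1p = W1.copy()
W1p[0, 0] += eps
print(dW1[0, 0], (loss(W1p) - loss(W1)) / eps)
```
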
  • Statistics and Probability Theory (slides, MML book chapter 6)
    • Statistics to describe datasets: means, variances, covariances, medians
    • Basic probability distributions: Bernoulli, Binomial, Beta, Gaussian, Gamma
    • Parameter estimation (maximum likelihood, MAP estimation)
    • Key concepts in probability theory
  • Optimization (MML book chapter 7; sketch below)
    • Gradient descent
    • Stochastic gradient descent
    • Momentum
    • Constrained optimization
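
To make the update rules concrete, here is a minimal sketch of gradient descent with momentum on a made-up two-dimensional quadratic; the step size and momentum coefficient are arbitrary illustrative values:

```python
import numpy as np

# Minimise a toy quadratic f(x) = 0.5 x^T A x - b^T x; A and b are made up.
A = np.array([[3.0, 0.5], [0.5, 1.0]])
b = np.array([1.0, -1.0])
grad = lambda x: A @ x - b

x = np.zeros(2)
v = np.zeros(2)
lr, beta = 0.1, 0.9              # step size and momentum (illustrative values)

for _ in range(200):
    v = beta * v - lr * grad(x)  # momentum accumulates past gradients
    x = x + v

print(x, np.linalg.solve(A, b))  # should agree at the optimum, where A x = b
```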

Part 2: Machine Learning

  • Graphical Models (slides, chapter 8 of Chris Bishop’s PRML book)
    • Directed graphical models
    • Undirected graphical models
    • D-separation
  • Dimensionality Reduction with Principal Component Analysis (slides, MML book chapter 10; sketch below)
    • Maximum variance perspective
    • Projection perspective
    • Key steps of PCA in practice
    • Probabilistic PCA
    • Other perspectives on PCA
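
A minimal sketch of the key PCA steps on synthetic data: centre the data, eigendecompose the sample covariance, project onto the top-k eigenvectors, and reconstruct:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 5))        # toy data: 100 points in 5 dimensions

# Key steps of PCA: centre, eigendecompose the covariance, project.
Xc = X - X.mean(axis=0)
S = Xc.T @ Xc / len(X)               # sample covariance matrix
vals, vecs = np.linalg.eigh(S)       # eigh returns ascending eigenvalues
k = 2
Bk = vecs[:, -k:]                    # top-k principal directions
Z = Xc @ Bk                          # low-dimensional codes
X_rec = Z @ Bk.T + X.mean(axis=0)    # reconstruction in the original space

print(vals[::-1], np.mean((X - X_rec) ** 2))  # variances and reconstruction error
```
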
  • Linear Regression (slides, MML book chapter 9; sketch below)
    • Maximum likelihood estimation
    • Maximum a posteriori estimation
    • Bayesian linear regression
    • Distribution over functions
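
A minimal sketch of maximum likelihood and MAP estimation for linear regression on synthetic data; the noise variance and prior variance are treated as known, purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.uniform(-3, 3, size=50)
y = 0.5 * x - 1.0 + rng.normal(scale=0.3, size=50)  # synthetic data

Phi = np.column_stack([np.ones_like(x), x])  # design matrix [1, x]

# Maximum likelihood: theta_ML = (Phi^T Phi)^{-1} Phi^T y
theta_ml = np.linalg.solve(Phi.T @ Phi, Phi.T @ y)

# MAP with a zero-mean Gaussian prior gives the ridge-regularised solution,
# with lam = sigma^2 / b^2 for noise variance sigma^2 and prior variance b^2.
lam = 0.3 ** 2 / 1.0
theta_map = np.linalg.solve(Phi.T @ Phi + lam * np.eye(2), Phi.T @ y)
print(theta_ml, theta_map)
```
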
  • Model Selection (slides, MML book chapter 8; sketch below)
    • Cross validation
    • Information criteria
    • Bayesian model selection
    • Occam’s razor and the marginal likelihood
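
A minimal sketch of K-fold cross validation for choosing the regularisation constant of ridge-regularised linear regression; the dataset, feature map, and candidate values are all made up:

```python
import numpy as np

def k_fold_mse(Phi, y, lam, K=5):
    """Average validation MSE of ridge-regularised linear regression."""
    idx = np.random.default_rng(0).permutation(len(y))
    folds = np.array_split(idx, K)
    errs = []
    for k in range(K):
        val = folds[k]
        trn = np.concatenate([folds[j] for j in range(K) if j != k])
        A = Phi[trn].T @ Phi[trn] + lam * np.eye(Phi.shape[1])
        theta = np.linalg.solve(A, Phi[trn].T @ y[trn])
        errs.append(np.mean((Phi[val] @ theta - y[val]) ** 2))
    return np.mean(errs)

# Pick the regulariser with the lowest cross-validated error.
rng = np.random.default_rng(3)
x = rng.uniform(-3, 3, 40)
y = np.sin(x) + rng.normal(scale=0.2, size=40)
Phi = np.vander(x, 8)                # degree-7 polynomial features
lams = [1e-6, 1e-3, 1e-1, 1.0]
print({lam: k_fold_mse(Phi, y, lam) for lam in lams})
```
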
  • Gaussian Process Regression (slides, GPML book; sketch below)
    • Model
    • Inference with Gaussian processes
    • Training via evidence maximization
    • Model selection
    • Interpreting the hyper-parameters
    • Practical tips and tricks when working with Gaussian processes
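
A minimal numpy sketch of Gaussian process regression with a squared-exponential kernel on made-up 1D data; the hyper-parameters are fixed by hand here, whereas the lecture obtains them by evidence maximisation:

```python
import numpy as np

def rbf(X1, X2, ell=1.0, sf=1.0):
    """Squared-exponential kernel with lengthscale ell and signal std sf."""
    d2 = (X1[:, None] - X2[None, :]) ** 2
    return sf ** 2 * np.exp(-0.5 * d2 / ell ** 2)

# Toy 1D dataset.
rng = np.random.default_rng(4)
X = rng.uniform(-4, 4, 12)
y = np.sin(X) + rng.normal(scale=0.1, size=12)
Xs = np.linspace(-5, 5, 100)     # test inputs
sn = 0.1                         # noise std, fixed by hand here

K = rbf(X, X) + sn ** 2 * np.eye(len(X))
Ks = rbf(X, Xs)
L = np.linalg.cholesky(K)
alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))

mean = Ks.T @ alpha                                    # posterior mean
v = np.linalg.solve(L, Ks)
var = rbf(Xs, Xs).diagonal() - np.sum(v ** 2, axis=0)  # posterior variance
print(mean[:3], var[:3])
```
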
  • Bayesian Optimization (slides; sketch below)
    • Optimization of meta-parameters in machine learning systems
    • Acquisition functions
    • Practicalities
    • Applications
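
As one concrete acquisition function, here is a sketch of expected improvement for minimisation; the posterior mean and standard deviation arrays stand in for a surrogate model (e.g. a GP posterior) and are made up:

```python
import numpy as np
from scipy.stats import norm

def expected_improvement(mu, sigma, f_best, xi=0.01):
    """Expected improvement (for minimisation), given the surrogate's
    posterior mean mu and standard deviation sigma at candidate points."""
    imp = f_best - mu - xi           # improvement over the incumbent
    z = imp / sigma
    return imp * norm.cdf(z) + sigma * norm.pdf(z)

# Made-up posterior over five candidate points; f_best is the best
# objective value observed so far.
mu = np.array([0.2, 0.0, -0.1, 0.3, 0.1])
sigma = np.array([0.05, 0.2, 0.3, 0.1, 0.4])
ei = expected_improvement(mu, sigma, f_best=0.05)
print(ei.argmax())                   # index of the point to evaluate next
```
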
  • Sampling (slides; sketch below)
    • Monte Carlo estimation
    • Importance sampling
    • Rejection sampling
    • Metropolis-Hastings
    • Slice sampling
    • Gibbs sampling
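
A minimal sketch of Metropolis-Hastings with a symmetric random-walk proposal, targeting a made-up two-component Gaussian mixture; the sample mean and variance at the end are plain Monte Carlo estimates:

```python
import numpy as np

# Unnormalised log-density of a mixture of two Gaussians (made up).
def log_p(x):
    return np.logaddexp(-0.5 * (x - 2) ** 2, -0.5 * (x + 2) ** 2)

rng = np.random.default_rng(5)
x = 0.0
samples = []
for _ in range(10000):
    x_new = x + rng.normal(scale=1.0)       # symmetric random-walk proposal
    # Accept with probability min(1, p(x_new) / p(x)).
    if np.log(rng.uniform()) < log_p(x_new) - log_p(x):
        x = x_new
    samples.append(x)

print(np.mean(samples), np.var(samples))    # Monte Carlo estimates
```
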
  • Density Estimation with Gaussian Mixture Models (slides, MML book chapter 11; sketch below)
    • Mixture models
    • Parameter estimation
    • Implementation
    • Latent variable perspective
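
A minimal sketch of the EM algorithm for a one-dimensional two-component Gaussian mixture on synthetic data; the initial parameter values are rough guesses:

```python
import numpy as np

rng = np.random.default_rng(6)
# Synthetic 1D data drawn from two Gaussians.
x = np.concatenate([rng.normal(-2, 0.5, 150), rng.normal(1.5, 1.0, 150)])

# Initial guesses for the weights, means, and variances of K = 2 components.
pi, mu, var = np.array([0.5, 0.5]), np.array([-1.0, 1.0]), np.array([1.0, 1.0])

for _ in range(50):
    # E-step: responsibilities r[n, k] proportional to pi_k N(x_n | mu_k, var_k).
    logr = (np.log(pi) - 0.5 * np.log(2 * np.pi * var)
            - 0.5 * (x[:, None] - mu) ** 2 / var)
    r = np.exp(logr - logr.max(axis=1, keepdims=True))
    r /= r.sum(axis=1, keepdims=True)
    # M-step: re-estimate the parameters from the responsibilities.
    Nk = r.sum(axis=0)
    pi = Nk / len(x)
    mu = (r * x[:, None]).sum(axis=0) / Nk
    var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / Nk

print(pi, mu, np.sqrt(var))
```
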
  • Classification with Logistic Regression (slides; sketch below)
    • Logistic sigmoid as a posterior class probability
    • Implicit modeling assumptions
    • Maximum likelihood estimation
    • MAP estimation
    • Probabilistic model
    • Laplace approximation
    • Bayesian logistic regression
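
A minimal sketch of maximum likelihood estimation for logistic regression by gradient descent on synthetic 2D data; the step size and iteration count are illustrative only:

```python
import numpy as np

def sigmoid(a):
    return 1 / (1 + np.exp(-a))

rng = np.random.default_rng(7)
# Synthetic binary-classification data in 2D.
X = np.vstack([rng.normal(-1, 1, (50, 2)), rng.normal(1, 1, (50, 2))])
y = np.concatenate([np.zeros(50), np.ones(50)])
Phi = np.column_stack([np.ones(100), X])   # add a bias feature

# Maximum likelihood by gradient descent on the negative log-likelihood;
# its gradient is Phi^T (sigmoid(Phi theta) - y).
theta = np.zeros(3)
for _ in range(2000):
    theta -= 0.1 * Phi.T @ (sigmoid(Phi @ theta) - y) / len(y)

print(theta, np.mean((sigmoid(Phi @ theta) > 0.5) == y))  # training accuracy
```
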
  • Information Theory (slides by Pedro Mediano; sketch below)
    • Entropy
    • KL divergence
    • Mutual information
    • Coding theory
    • Information theory and statistical inference
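
A minimal sketch that computes entropy, KL divergence, and mutual information for discrete distributions; the joint distribution is made up:

```python
import numpy as np

def entropy(p):
    """Entropy in bits of a discrete distribution p."""
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def kl(p, q):
    """KL divergence D(p || q) in bits (assumes q > 0 wherever p > 0)."""
    mask = p > 0
    return np.sum(p[mask] * np.log2(p[mask] / q[mask]))

# Joint distribution of two binary variables (values made up).
pxy = np.array([[0.4, 0.1],
                [0.1, 0.4]])
px, py = pxy.sum(axis=1), pxy.sum(axis=0)

# Mutual information I(X; Y) = KL(p(x, y) || p(x) p(y)).
mi = kl(pxy.ravel(), np.outer(px, py).ravel())
print(entropy(px), mi)
```
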
  • Variational Inference (slides; sketch below)
    • Inference as optimization
    • Evidence lower bound
    • Conditionally conjugate models
    • Mean-field variational inference in conditionally conjugate models
    • Black-box variational inference for hierarchical Bayesian models
    • Gradient estimators
    • Amortized inference
    • Richer posteriors
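
A minimal sketch of black-box variational inference with reparameterisation gradients, fitting a Gaussian q to a made-up Gaussian target so that the correct answer (mean 3, standard deviation 0.5) is known; the step size and sample count are illustrative:

```python
import numpy as np

# Unnormalised target: log p(z) = -(z - 3)^2 / (2 * 0.25) + const, i.e. N(3, 0.5^2).
dlogp = lambda z: -(z - 3.0) / 0.25        # derivative of log p w.r.t. z

rng = np.random.default_rng(8)
mu, log_sigma = 0.0, 0.0                   # parameters of q = N(mu, sigma^2)
lr, S = 0.05, 64                           # step size and samples per step

for _ in range(500):
    sigma = np.exp(log_sigma)
    eps = rng.normal(size=S)
    z = mu + sigma * eps                   # reparameterisation z = mu + sigma * eps
    # ELBO = E_q[log p(z)] + entropy(q); the Gaussian entropy term
    # contributes exactly 1 to the gradient w.r.t. log_sigma.
    g_mu = np.mean(dlogp(z))
    g_ls = np.mean(dlogp(z) * sigma * eps) + 1.0
    mu += lr * g_mu                        # stochastic gradient ascent on the ELBO
    log_sigma += lr * g_ls

print(mu, np.exp(log_sigma))               # should approach 3 and 0.5
```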

References

  • Marc P. Deisenroth, A. Aldo Faisal, and Cheng Soon Ong: Mathematics for Machine Learning. Cambridge University Press (the “MML book” above).
  • Carl Edward Rasmussen and Christopher K. I. Williams: Gaussian Processes for Machine Learning. MIT Press, 2006 (the “GPML book” above).
  • Christopher M. Bishop: Pattern Recognition and Machine Learning. Springer, 2006 (Chris Bishop’s PRML book above).

Team

  • Marc Deisenroth (Lecturer)
  • Kossi Amouzouvi (Tutor, AIMS Rwanda)
  • Oluwafemi Azeez (Tutor, CMU Africa)
  • Steindór Sæmundsson (Tutor, Imperial College London)
  • Pedro Martinez Mediano (Tutor, Imperial College London)