Foundations for Machine Learning

Learn the #1 skill required to succeed as a machine learning engineer or data scientist

Math and Statistics

Unlike other math and statistics courses, this foundations series is built from the ground up to boost your understanding of machine learning principles. 

  • Linear Algebra for Machine Learning (Available On-Demand)

    This topic, Intro to Linear Algebra, is the first in the Machine Learning Foundations series. It is essential because linear algebra lies at the heart of most machine learning approaches and is especially predominant in deep learning, the branch of ML at the forefront of today’s artificial intelligence advances.

  • Calculus for Machine Learning (Available On-Demand)

    This topic, Calculus I: Limits & Derivatives, introduces the mathematical field of calculus -- the study of rates of change -- from the ground up. It is essential because computing derivatives via differentiation is the basis of optimizing most machine learning algorithms, including those used in deep learning.

  • Probability and Statistics (Available On-Demand)

    Probability & Information Theory introduces the mathematical fields that enable us to quantify uncertainty and to make predictions despite it. These fields are essential because ML algorithms are both trained on imperfect data and deployed into noisy, real-world scenarios.

  • Computer Science (Available On-Demand)

    This session, Algorithms & Data Structures, introduces the most important computer science topics for machine learning, enabling you to design and deploy computationally efficient data models.


Meet Your Instructor: Dr. Jon Krohn

Jon Krohn is Chief Data Scientist at the machine learning company untapt. He authored the 2019 book Deep Learning Illustrated, an instant #1 bestseller that was translated into six languages. Jon is renowned for his compelling lectures, which he offers in person at Columbia University, New York University, and the NYC Data Science Academy. Jon holds a Ph.D. in Neuroscience from Oxford and has been publishing on machine learning in leading academic journals since 2010; his papers have been cited over a thousand times.

Student Testimonials

A better experience with calculus than I have had in the past, certainly! Looking forward to the next topics – thanks Jon!

Kenyon Maree, Teaching Fellow

Excellent, no critics.

Sami Bahigr, Researcher

Jon did a great job explaining all of the math that underpins machine learning from linear algebra to calc to stats and pulled it all together at the end for a proper understanding of what’s going on under the hood.

Dr Philip Walsh, Data Scientist

How It Works

  • Each course in the Foundations series is available on demand as soon as you register.

  • Study the courses in order or skip the subjects you already know.

  • Each course includes exercises to improve learning outcomes.

  • Coding demos allow you to learn hands-on skills.

  • Learn at your own pace. Courses can be taken alongside additional Ai+ courses.

Interactive Sessions

Hands-On Coding Demos

Learning Comprehension Exercises

What You Will Learn

Not only will you learn the core mathematical concepts, but you will also learn how they are applied to machine learning. In addition, you will learn to apply your knowledge using some of the key machine learning and deep learning platforms, such as TensorFlow and PyTorch.

Linear Algebra

Data Structures for Algebra

  • What Linear Algebra Is, A Brief History of Algebra

  • Vectors and Vector Transposition

  • Norms and Unit Vectors

  • Basis, Orthogonal, and Orthonormal Vectors

  • Arrays in NumPy, Matrices

  • Tensors in TensorFlow and PyTorch

Common Tensor Operations

  • Tensors, Scalars

  • Tensor Transposition

  • Basic Tensor Arithmetic

  • Reduction

  • The Dot Product

  • Solving Linear Systems

Matrix Properties

  • The Frobenius Norm

  • Matrix Multiplication

  • Symmetric and Identity Matrices

  • Matrix Inversion

  • Diagonal Matrices

  • Orthogonal Matrices


  • Eigenvectors

  • Eigenvalues

  • Matrix Determinants

  • Matrix Decomposition

  • Application of Eigendecomposition

Matrix Operations for Machine Learning

  • Singular Value Decomposition (SVD)

  • The Moore-Penrose Pseudoinverse

  • The Trace Operator

  • Principal Component Analysis (PCA): A Simple Machine Learning Algorithm

  • Resources for Further Study of Linear Algebra



  • What Calculus Is

  • A Brief History of Calculus

  • The Method of Exhaustion

Computing Derivatives with Differentiation

  • The Delta Method

  • Basic Derivative Properties

  • The Power Rule

  • The Sum Rule

  • The Product Rule

  • The Quotient Rule & The Chain Rule

Automatic Differentiation

  • AutoDiff with PyTorch

  • AutoDiff with TensorFlow 2

  • Relating Differentiation to Machine Learning

  • Cost (or Loss) Functions

  • The Future: Differentiable Programming

Gradients Applied to Machine Learning

  • Partial Derivatives of Multivariate Functions

  • The Partial-Derivative Chain Rule

  • Cost (or Loss) Functions

  • Gradients

  • Gradient Descent

  • Backpropagation

  • Higher-Order Partial Derivatives


  • Binary Classification

  • The Confusion Matrix

  • The Receiver-Operating Characteristic (ROC) Curve

  • Calculating Integrals Manually

  • Numeric Integration with Python

  • Finding the Area Under the ROC Curve

  • Resources for Further Study of Calculus
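
As an illustration of how the ROC and integration topics above connect, here is a minimal sketch that builds ROC points from classifier scores and integrates under them with the trapezoidal rule. It uses only the Python standard library; the function names `roc_points` and `trapezoid_area`, and the toy labels and scores, are assumptions made for this example rather than course material. It also assumes no two examples share the same score.

```python
def roc_points(labels, scores):
    """Return (false-positive rate, true-positive rate) pairs, one per threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    tp = fp = 0
    # Lower the threshold one example at a time, from highest score to lowest.
    for _, label in sorted(zip(scores, labels), reverse=True):
        if label == 1:
            tp += 1
        else:
            fp += 1
        points.append((fp / negatives, tp / positives))
    return points


def trapezoid_area(points):
    """Numerically integrate y over x by summing trapezoids between points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical binary labels (1 = positive) and model scores.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
auc = trapezoid_area(roc_points(labels, scores))
```

In this toy data, eight of the nine positive-negative pairs are ranked correctly, so the area under the curve works out to 8/9.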

Probability and Statistics

Introduction to Probability

  • What Probability Theory Is

  • Applications of Probability to Machine Learning

  • Discrete vs Continuous Variables

  • Probability Density Function 

  • Expected Value

  • Measures of Central Tendency

  • Quantiles: Quartiles, Deciles, and Percentiles

  • Measures of Dispersion

  • Covariance and Correlation

  • Marginal and Conditional Probabilities

Distributions in Machine Learning

  • Uniform

  • Gaussian: Normal and Standard Normal

  • The Central Limit Theorem

  • Log-Normal

  • Binomial and Multinomial

  • Poisson

  • Mixture Distributions

  • Preprocessing Data for Model Input

Information Theory

  • What Information Theory Is

  • Self-Information

  • Nats, Bits and Shannons

  • Shannon and Differential Entropy

  • Kullback-Leibler Divergence

  • Cross-Entropy

Frequentist Statistics

  • Frequentist vs Bayesian Statistics

  • Review of Relevant Probability Theory

  • Z-scores and Outliers

  • P-values

  • Comparing Means with t-tests

  • Confidence Intervals

  • ANOVA: Analysis of Variance

  • Pearson Correlation Coefficient

  • R-Squared Coefficient 

  • Correlation vs Causation

  • Multiple Comparisons


  • Features: Independent vs Dependent Variables

  • Linear Regression to Predict Continuous Values

  • Fitting a Line to Points on a Cartesian Plane

  • Ordinary Least Squares

  • Logistic Regression to Predict Categories

  • (Deep) ML vs Frequentist Statistics

Bayesian Statistics

  • When to Use Bayesian Statistics

  • Prior Probabilities

  • Bayes’ Theorem

  • PyMC3 Notebook

  • Resources for Further Study of Probability and Statistics

Computer Science

Introduction to Data Structures and Algorithms

  • Introduction to Data Structures

  • Introduction to Computer Algorithms

  • A Brief History of Data

  • A Brief History of Algorithms

  • “Big O” Notation for Time and Space Complexity

Lists and Dictionaries

  • List-Based Data Structures: Arrays, Linked Lists, Stacks, Queues, and Deques

  • Searching and Sorting: Binary, Bubble, Merge, and Quick

  • Set-Based Data Structures: Maps and Dictionaries

  • Tables, Load Factors, and Maps

Trees and Graphs

  • Trees: Decision Trees, Random Forests, and Gradient-Boosting (XGBoost)

  • Graphs: Terminology, Directed Acyclic Graphs (DAGs)

  • Resources for Further Study of Data Structures & Algorithms

The Machine Learning Approach to Optimization & Fancy Deep Learning Optimizers

  • The Statistical Approach to Regression: Ordinary Least Squares

  • When Statistical Approaches to Optimization Break Down

  • The Machine Learning Solution

  • A Layer of Artificial Neurons in PyTorch

  • Jacobian Matrices

  • Hessian Matrices and Second-Order Optimization

  • Momentum

  • Nesterov Momentum

  • AdaGrad, AdaDelta, RMSProp, Adam, Nadam

  • Training a Deep Neural Net

  • Resources for Further Study

Gradient Descent

  • Objective Functions

  • Cost / Loss / Error Functions

  • Minimizing Cost with Gradient Descent

  • Learning Rate

  • Critical Points, incl. Saddle Points

  • Gradient Descent from Scratch with PyTorch

  • The Global Minimum and Local Minima

  • Mini-Batches and Stochastic Gradient Descent (SGD)

  • Learning Rate Scheduling

  • Maximizing Reward with Gradient Ascent
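
To make the gradient descent topics above concrete, here is a minimal sketch that fits a straight line by repeatedly stepping against the gradient of a mean-squared-error cost. The course outline mentions PyTorch; this example deliberately uses plain Python instead so it runs without extra libraries. The function name `fit_line`, the learning rate, the epoch count, and the toy data are all assumptions chosen for illustration.

```python
def fit_line(xs, ys, learning_rate=0.01, epochs=5000):
    """Fit y = m*x + b by batch gradient descent on mean squared error."""
    m, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Prediction error for every point under the current parameters.
        errors = [(m * x + b) - y for x, y in zip(xs, ys)]
        # Partial derivatives of the cost with respect to m and b.
        grad_m = (2 / n) * sum(e * x for e, x in zip(errors, xs))
        grad_b = (2 / n) * sum(errors)
        # Step downhill; the learning rate sets the step size.
        m -= learning_rate * grad_m
        b -= learning_rate * grad_b
    return m, b


# Toy data generated from y = 2x + 1, so the fit should approach m=2, b=1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(xs, ys)
```

A learning rate that is too large would make these updates diverge, which is one reason learning rate scheduling appears in the list above.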


Access to ONE selected Foundations course 

Certificate of completion

Course Assessments

Access to 4 Foundations courses 

Access to the entire Ai+ course library


Per Course


All 4 Courses




Programming: All code demos will be in Python, so experience with it, or another object-oriented programming language, would be helpful for following along with the code examples.

Mathematics: Familiarity with secondary school-level mathematics will make the class easier to follow. If you are comfortable dealing with quantitative information — such as understanding charts and rearranging simple equations — then you should be well prepared to follow along with all the mathematics.  

Learn More


Machine Learning Foundations: Linear Algebra

Through the measured exposition of theory paired with interactive examples, you’ll develop an understanding of how linear algebra is used to solve for unknown values in high-dimensional spaces, thereby enabling machines to recognize patterns and make predictions.
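
As a small illustration of "solving for unknown values", the sketch below solves a system of linear equations with Gaussian elimination. It is written in plain Python for readability; the function name `solve_linear_system` and the two-equation example are assumptions for this illustration, and in practice a numerical library would normally do this work.

```python
def solve_linear_system(a, b):
    """Solve a x = b for a square matrix `a` via Gaussian elimination."""
    n = len(a)
    # Build the augmented matrix [a | b] so row operations apply to both sides.
    m = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: use the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular or nearly singular")
        m[col], m[pivot] = m[pivot], m[col]
        # Eliminate this column from every row below the pivot row.
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back-substitute from the last row upward.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


# Two equations, two unknowns: 2x + y = 5 and x - y = 1.
solution = solve_linear_system([[2, 1], [1, -1]], [5, 1])
```

Here `solution` holds the values of x and y, which are 2 and 1 up to floating-point rounding.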


Machine Learning Foundations: Calculus

Through the measured exposition of theory paired with interactive examples, you’ll develop a working understanding of how calculus is used to compute limits and differentiate functions. You’ll also learn how to apply automatic differentiation within the popular TensorFlow 2 and PyTorch machine learning libraries. 
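
To give a feel for what automatic differentiation does, here is a minimal sketch of forward-mode autodiff using dual numbers, compared against a small-delta numerical estimate. This is not the TensorFlow or PyTorch API; the `Dual` class and the helper names `derivative_autodiff` and `derivative_delta` are hypothetical and exist only for this illustration.

```python
class Dual:
    """A value paired with its derivative, propagated through arithmetic."""

    def __init__(self, value, deriv=0.0):
        self.value = value
        self.deriv = deriv

    def _wrap(self, other):
        # Treat plain numbers as constants, whose derivative is zero.
        return other if isinstance(other, Dual) else Dual(other)

    def __add__(self, other):
        other = self._wrap(other)
        # Sum rule: (u + v)' = u' + v'
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        other = self._wrap(other)
        # Product rule: (u * v)' = u' * v + u * v'
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

    __rmul__ = __mul__


def f(x):
    return 3 * x * x + 2 * x + 1  # derivative is 6x + 2


def derivative_autodiff(func, x):
    # Seed the input with derivative 1 and read the derivative off the output.
    return func(Dual(x, 1.0)).deriv


def derivative_delta(func, x, delta=1e-6):
    # Slope between two nearby points, approximating the limit as delta -> 0.
    return (func(x + delta) - func(x)) / delta


exact = derivative_autodiff(f, 2.0)
approx = derivative_delta(f, 2.0)
```

The dual-number result matches the analytic derivative at x = 2, while the delta estimate is only close to it. Libraries such as TensorFlow and PyTorch typically use reverse-mode differentiation for training, which is a different strategy from the forward mode sketched here.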


Machine Learning Foundations: Probability and Statistics

Through the measured exposition of theory paired with interactive examples, you’ll develop a working understanding of variables, probability distributions, metrics for assessing distributions, and graphical models. You’ll also learn how to use information theory to measure how much meaningful signal there is within some given data. 
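
As one example of using information theory to quantify signal, the sketch below computes Shannon entropy, cross-entropy, and Kullback-Leibler divergence for discrete distributions, measured in bits. The function names and the two coin distributions are assumptions made for this illustration.

```python
import math


def shannon_entropy(p):
    """Average self-information of distribution p, in bits."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)


def cross_entropy(p, q):
    """Expected bits needed to encode samples from p using a code built for q."""
    return -sum(pi * math.log2(qi) for pi, qi in zip(p, q) if pi > 0)


def kl_divergence(p, q):
    """Extra bits paid for assuming q when the true distribution is p."""
    return cross_entropy(p, q) - shannon_entropy(p)


fair_coin = [0.5, 0.5]
biased_coin = [0.9, 0.1]

fair_bits = shannon_entropy(fair_coin)        # maximal uncertainty for two outcomes
biased_bits = shannon_entropy(biased_coin)    # less uncertain, so fewer bits
mismatch = kl_divergence(fair_coin, biased_coin)
```

A fair coin carries exactly one bit per flip, the biased coin carries less, and the divergence is positive whenever the two distributions differ.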


Machine Learning Foundations: Computer Science

Through the measured exposition of theory paired with interactive examples, you’ll develop a working understanding of all of the essential data structures across the list, dictionary, tree, and graph families. You’ll also learn the key algorithms for working with these structures, including those for searching, sorting, hashing, and traversing data.
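
As a small taste of the searching algorithms mentioned above, here is a minimal sketch of binary search over a sorted list, which halves the remaining search range at every step. The function name `binary_search` and the sample scores are assumptions for this illustration.

```python
def binary_search(sorted_items, target):
    """Return the index of target in sorted_items, or -1 if it is absent."""
    low, high = 0, len(sorted_items) - 1
    while low <= high:
        mid = (low + high) // 2
        if sorted_items[mid] == target:
            return mid
        if sorted_items[mid] < target:
            low = mid + 1   # target can only be in the upper half
        else:
            high = mid - 1  # target can only be in the lower half
    return -1


# Binary search requires sorted input, so sort the data first.
scores = sorted([42, 7, 19, 88, 63])
found_at = binary_search(scores, 63)
missing = binary_search(scores, 5)
```

Because the range halves each iteration, the number of comparisons grows with the logarithm of the list length rather than with the length itself.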

Open Data Science

Ai+ | ODSC
One Broadway, 14th Floor
Cambridge, MA 02142
