Foundations for Machine Learning
Learn the #1 skill required to succeed as a machine learning engineer or data scientist
Math and Statistics
Unlike other math and statistics courses, this foundations series is built from the ground up to boost your understanding of machine learning principles.

Linear Algebra for Machine Learning
Now available on-demand. The course includes the following modules:
 Intro to Linear Algebra
 Linear Algebra II: Matrix Operations
Over the course of studying this topic, you’ll:
 Understand the fundamentals of linear algebra, a ubiquitous approach for solving for unknowns within high-dimensional spaces.
 Develop a geometric intuition of what's going on beneath the hood of machine learning algorithms, including those used for deep learning.
 Be able to more intimately grasp the details of machine learning papers as well as all of the other subjects that underlie ML, including calculus, statistics, and optimization algorithms.
 Reduce the dimensionality of complex spaces down to their most informative elements with techniques such as eigendecomposition, singular value decomposition, and principal component analysis.
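To preview the kind of hands-on coding demo the course includes, here is a minimal NumPy sketch of two of the ideas above: solving for unknowns and eigendecomposition, the machinery behind PCA. (This is an illustrative toy example, not an excerpt from the course; the matrices and values are hypothetical.)

```python
import numpy as np

# Solving for unknowns: find the vector w such that X @ w = y.
X = np.array([[2.0, 1.0],
              [1.0, 3.0]])
y = np.array([5.0, 10.0])
w = np.linalg.solve(X, y)  # w = [1.0, 3.0]

# Eigendecomposition of a symmetric matrix A: find eigenvalues and
# eigenvectors satisfying A @ v = lambda * v.
A = np.array([[4.0, 2.0],
              [2.0, 3.0]])
eigenvalues, eigenvectors = np.linalg.eigh(A)
v = eigenvectors[:, 0]  # A @ v equals eigenvalues[0] * v
```

The same decomposition, applied to a data set's covariance matrix, is what reduces a high-dimensional space down to its most informative directions in PCA.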
This topic, Intro to Linear Algebra, is the first in the Machine Learning Foundations series. It is essential because linear algebra lies at the heart of most machine learning approaches and is especially predominant in deep learning, the branch of ML at the forefront of today's artificial intelligence advances.

Calculus for Machine Learning
Now available on-demand. The course includes the following modules:
 Calculus I: Limits & Derivatives
 Calculus II: Partial Derivatives & Integrals
Over the course of studying this topic, you’ll:
 Develop an understanding of what's going on beneath the hood of machine learning algorithms, including those used for deep learning.
 Be able to grasp the details of the partial-derivative, multivariate calculus that is common in machine learning papers, as well as many of the other subjects that underlie ML, including information theory, statistics, and optimization algorithms.
 Compute the derivatives of functions, including by using AutoDiff in the popular TensorFlow 2 and PyTorch libraries.
 Use integral calculus to determine the area under any given curve, a recurring task in ML applied, for example, to evaluate model performance by calculating the ROC AUC metric.
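Both halves of the calculus above can be previewed numerically in a few lines of NumPy. (An illustrative sketch of the concepts, not course material; the function and step sizes are hypothetical.)

```python
import numpy as np

# Differentiation via the delta method: the derivative of f(x) = x**2
# at x = 3 is the limit of the secant slope as delta shrinks toward
# zero (the true value is 6).
f = lambda x: x ** 2
delta = 1e-6
derivative = (f(3 + delta) - f(3)) / delta

# Integration: area under a curve via the trapezoidal rule, the same
# idea used when computing ROC AUC (the true area of x**2 on [0, 1]
# is 1/3).
x = np.linspace(0.0, 1.0, 1001)
y = x ** 2
area = np.sum((y[1:] + y[:-1]) / 2 * np.diff(x))
```

Shrinking delta (or adding more trapezoids) pushes each approximation toward the exact value, which is precisely the limit concept the course builds from the ground up.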
This topic, Calculus I: Limits & Derivatives, introduces the mathematical field of calculus, the study of rates of change, from the ground up. It is essential because computing derivatives via differentiation is the basis of optimizing most machine learning algorithms, including those used in deep learning.

Probability and Statistics
Now available on-demand. The course includes the following modules:
 Probability and Information Theory
 Intro to Statistics
Over the course of studying this topic, you’ll:
 Develop an understanding of what's going on beneath the hood of predictive statistical models and machine learning algorithms, including those used for deep learning.
 Understand the appropriate variable type and probability distribution for representing a given class of data, as well as the standard techniques for assessing the relationships between distributions.
 Apply information theory to quantify the proportion of valuable signal that's present amongst the noise of a given probability distribution.
 Hypothesize about and critically evaluate the inputs and outputs of machine learning algorithms using essential statistical tools such as the t-test, ANOVA, and R-squared.
 Use historical data to predict the future using regression models that take advantage of frequentist statistical theory (for smaller data sets) and modern machine learning theory (for larger data sets), including why we may want to consider applying deep learning to a given problem.
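To make the information-theory idea above concrete, here is a minimal Shannon-entropy calculation in plain Python. (An illustrative sketch; the distributions are hypothetical and the course itself goes much deeper.)

```python
import math

def entropy(p, base=2):
    """Shannon entropy of a discrete distribution, in bits by
    default (pass base=math.e for nats)."""
    return -sum(p_i * math.log(p_i, base) for p_i in p if p_i > 0)

fair_coin = [0.5, 0.5]    # maximally uncertain: 1 bit per flip
biased_coin = [0.9, 0.1]  # mostly predictable: about 0.47 bits per flip
```

The more uniform (noisier) a distribution is, the higher its entropy, which is exactly what it means to quantify how much informative signal is present in some data.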
Probability & Information Theory introduces the mathematical fields that enable us to quantify uncertainty as well as to make predictions despite uncertainty. These fields are essential because ML algorithms are both trained by imperfect data and deployed into noisy, real-world scenarios.

Computer Science
Now available on-demand. The course includes the following modules:
 Algorithms and Data Structures
 Optimization
Over the course of studying this topic, you’ll:
 Use "Big O" notation to characterize the time efficiency and space efficiency of a given algorithm, enabling you to select or devise the most sensible approach for tackling a particular machine learning problem with the hardware resources available to you.
 Get acquainted with the entire range of the most widely-used Python data structures, including list-, dictionary-, tree-, and graph-based structures.
 Develop an understanding of all of the essential algorithms for working with data, including those for searching, sorting, hashing, and traversing.
 Discover how the statistical and machine learning approaches to optimization differ, and why you would select one or the other for a given problem you're solving.
 Find out how the extremely versatile (stochastic) gradient descent optimization algorithm works, including how to apply it, both at a low, in-depth level and at a high, abstracted level, within the most popular deep learning libraries, TensorFlow and PyTorch.
 Get acquainted with the "fancy" optimizers that are available for advanced machine learning approaches (e.g., deep learning) and when you should consider using them.
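The gradient descent algorithm mentioned above fits in a few lines of pure Python. (A toy, single-variable sketch under simple assumptions, not the course's full TensorFlow/PyTorch treatment; the cost function and hyperparameters are hypothetical.)

```python
def gradient_descent(gradient, start, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient to minimize a cost."""
    x = start
    for _ in range(steps):
        x -= learning_rate * gradient(x)
    return x

# The cost C(x) = (x - 4)**2 has gradient 2 * (x - 4) and its
# minimum at x = 4; descent from x = 0 converges there.
minimum = gradient_descent(lambda x: 2 * (x - 4), start=0.0)
```

The "fancy" optimizers covered later (momentum, Adam, and friends) are refinements of exactly this update loop, adapting the step size and direction as training proceeds.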
This session, Algorithms & Data Structures, introduces the most important computer science topics for machine learning, enabling you to design and deploy computationally efficient data models.
Meet Your Instructor: Dr. Jon Krohn
Jon Krohn is Chief Data Scientist at the machine learning company untapt. He authored the 2019 book Deep Learning Illustrated, an instant #1 bestseller that was translated into six languages. Jon is renowned for his compelling lectures, which he offers in person at Columbia University, New York University, and the NYC Data Science Academy. Jon holds a Ph.D. in Neuroscience from Oxford and has been publishing on machine learning in leading academic journals since 2010; his papers have been cited over a thousand times.
Student Testimonials
How It Works

The foundations series is available on demand.

Each course is available on-demand as soon as you register.

Study the courses in order or skip the subjects you already know.

Each course includes exercises to improve learning outcomes.

Coding demos allow you to learn handson skills.

Learn at your own pace. Courses can be taken alongside additional Ai+ courses.
Interactive Sessions
Hands-On Coding Demos
Learning Comprehension Exercises
What You Will Learn
Not only will you learn the core mathematical concepts, but you will also learn how they are applied to machine learning. In addition, you will learn to apply your knowledge using some of the key machine learning and deep learning platforms, such as TensorFlow and PyTorch.
Linear Algebra
Data Structures for Algebra

What Linear Algebra Is, A Brief History of Algebra

Vectors and Vector Transposition

Norms and Unit Vectors

Basis, Orthogonal, and Orthonormal Vectors

Arrays in NumPy, Matrices

Tensors in TensorFlow and PyTorch
Common Tensor Operations

Tensors, Scalars

Tensor Transposition

Basic Tensor Arithmetic

Reduction

The Dot Product

Solving Linear Systems
Matrix Properties

The Frobenius Norm

Matrix Multiplication

Symmetric and Identity Matrices

Matrix Inversion

Diagonal Matrices

Orthogonal Matrices
Eigendecomposition

Eigenvectors

Eigenvalues

Matrix Determinants

Matrix Decomposition

Application of Eigendecomposition
Matrix Operations for Machine Learning

Singular Value Decomposition (SVD)

The Moore-Penrose Pseudoinverse

The Trace Operator

Principal Component Analysis (PCA): A Simple Machine Learning Algorithm

Resources for Further Study of Linear Algebra
Calculus
Limits

What Calculus Is

A Brief History of Calculus

The Method of Exhaustion

Computing Derivatives with Differentiation

The Delta Method

Basic Derivative Properties

The Power Rule

The Sum Rule

The Product Rule

The Quotient Rule & The Chain Rule
Automatic Differentiation

AutoDiff with PyTorch

AutoDiff with TensorFlow 2

Relating Differentiation to Machine Learning

Cost (or Loss) Functions

The Future: Differentiable Programming
Gradients Applied to Machine Learning

Partial Derivatives of Multivariate Functions

The PartialDerivative Chain Rule

Cost (or Loss) Functions

Gradients

Gradient Descent

Backpropagation

HigherOrder Partial Derivatives
Integrals

Binary Classification

The Confusion Matrix

The ReceiverOperating Characteristic (ROC) Curve

Calculating Integrals Manually

Numeric Integration with Python

Finding the Area Under the ROC Curve

Resources for Further Study of Calculus
Probability and Statistics
Introduction to Probability

What Probability Theory Is

Applications of Probability to Machine Learning

Discrete vs Continuous Variables

Probability Density Function

Expected Value

Measures of Central Tendency

Quantiles: Quartiles, Deciles, and Percentiles

Measures of Dispersion

Covariance and Correlation

Marginal and Conditional Probabilities
Distributions in Machine Learning

Uniform

Gaussian: Normal and Standard Normal

The Central Limit Theorem

LogNormal

Binomial and Multinomial

Poisson

Mixture Distributions

Preprocessing Data for Model Input
Information Theory

What Information Theory Is

Self-Information

Nats, Bits and Shannons

Shannon and Differential Entropy

Kullback-Leibler Divergence

Cross-Entropy
Frequentist Statistics

Frequentist vs Bayesian Statistics

Review of Relevant Probability Theory

Z-scores and Outliers

P-values

Comparing Means with t-tests

Confidence Intervals

ANOVA: Analysis of Variance

Pearson Correlation Coefficient

R-Squared Coefficient

Correlation vs Causation

Multiple Comparisons
Regression

Features: Independent vs Dependent Variables

Linear Regression to Predict Continuous Values

Fitting a Line to Points on a Cartesian Plane

Ordinary Least Squares

Logistic Regression to Predict Categories

(Deep) ML vs Frequentist Statistics
Bayesian Statistics

When to Use Bayesian Statistics

Prior Probabilities

Bayes' Theorem

PyMC3 Notebook

Resources for Further Study of Probability and Statistics
Computer Science
Introduction to Data Structures and Algorithms

Introduction to Data Structures

Introduction to Computer Algorithms

A Brief History of Data

A Brief History of Algorithms

"Big O" Notation for Time and Space Complexity
Lists and Dictionaries

List-Based Data Structures: Arrays, Linked Lists, Stacks, Queues, and Deques

Searching and Sorting: Binary, Bubble, Merge, and Quick

Set-Based Data Structures: Maps and Dictionaries

Tables, Load Factors, and Maps
Trees and Graphs

Trees: Decision Trees, Random Forests, and Gradient-Boosting (XGBoost)

Graphs: Terminology, Directed Acyclic Graphs (DAGs)

Resources for Further Study of Data Structures & Algorithms
The Machine Learning Approach to Optimization & Fancy Deep Learning Optimizers

The Statistical Approach to Regression: Ordinary Least Squares

When Statistical Approaches to Optimization Break Down

The Machine Learning Solution

A Layer of Artificial Neurons in PyTorch

Jacobian Matrices

Hessian Matrices and SecondOrder Optimization

Momentum

Nesterov Momentum

AdaGrad, AdaDelta, RMSProp, Adam, Nadam

Training a Deep Neural Net

Resources for Further Study
Gradient Descent

Objective Functions

Cost / Loss / Error Functions

Minimizing Cost with Gradient Descent

Learning Rate

Critical Points, incl. Saddle Points

Gradient Descent from Scratch with PyTorch

The Global Minimum and Local Minima

Mini-Batches and Stochastic Gradient Descent (SGD)

Learning Rate Scheduling

Maximizing Reward with Gradient Ascent
Prerequisites
Programming: All code demos will be in Python, so experience with it, or another object-oriented programming language, would be helpful for following along with the code examples.
Mathematics: Familiarity with secondary school-level mathematics will make the class easier to follow. If you are comfortable dealing with quantitative information, such as understanding charts and rearranging simple equations, then you should be well prepared to follow along with all of the mathematics.
Learn More
On-Demand
Machine Learning Foundations: Linear Algebra
Through the measured exposition of theory paired with interactive examples, you'll develop an understanding of how linear algebra is used to solve for unknown values in high-dimensional spaces, thereby enabling machines to recognize patterns and make predictions.
On-Demand
Machine Learning Foundations: Calculus
Through the measured exposition of theory paired with interactive examples, you'll develop a working understanding of how calculus is used to compute limits and differentiate functions. You'll also learn how to apply automatic differentiation within the popular TensorFlow 2 and PyTorch machine learning libraries.
On-Demand
Machine Learning Foundations: Probability and Statistics
Through the measured exposition of theory paired with interactive examples, you'll develop a working understanding of variables, probability distributions, metrics for assessing distributions, and graphical models. You'll also learn how to use information theory to measure how much meaningful signal there is within some given data.
On-Demand
Machine Learning Foundations: Computer Science
Through the measured exposition of theory paired with interactive examples, youâ€™ll develop a working understanding of all of the essential data structures across the list, dictionary, tree, and graph families. Youâ€™ll also learn the key algorithms for working with these structures, including those for searching, sorting, hashing, and traversing data.