Data Foundations for Machine Learning
Learn the #1 skill required to succeed as a machine learning engineer or data scientist
Math and Statistics
Unlike other math and statistics courses, this foundations series is built from the ground up to boost your understanding of machine learning principles.
Data Primer Course
Available On-Demand
Data is the essential building block of Data Science, Machine Learning, and AI. This course is the first in the series and is designed to teach you the foundational skills and knowledge required to understand, work with, and analyze data. It covers topics such as data collection, organization, profiling, and transformation as well as basic analysis.
The course is aimed at helping people begin their AI journey and gain valuable insights that the subsequent SQL, programming, and AI courses build on.
SQL Primer Course
Available On-Demand
This SQL coding course teaches students the basics of Structured Query Language (SQL), a standard language for managing and manipulating data and an essential tool in AI. The course covers topics such as database design and normalization, data wrangling, aggregate functions, subqueries, and join operations, and students will learn how to design and write SQL code to solve real-world problems. Upon completion, students will have a strong foundation in SQL and be able to use it effectively to extract insights from data. The ability to access, retrieve, and manipulate data using SQL is essential for data cleaning, pre-processing, and exploration, which are crucial steps in any data science or machine learning project. SQL is also widely used in industry, making it a valuable skill for professionals in the field. This course builds upon the earlier data course in the series.
Programming Primer Course with Python
April 6th, 2023
The Python language is one of the most popular programming languages in data science and machine learning as it offers a number of powerful and accessible libraries and frameworks specifically designed for these fields. This programming course is designed to give participants a quick introduction to the basics of coding using the Python language.
It covers topics such as data structures, control structures, functions, modules, and file handling. This course aims to provide a basic foundation in Python and help participants develop the skills needed to progress in the field of data science and machine learning.
AI Primer Course
April 26, 2023
This AI literacy course is designed to introduce participants to the basics of artificial intelligence (AI) and machine learning. We will first explore the various types of AI and then progress to fundamental concepts such as algorithms, features, and models. We will study the machine learning workflow and how it is used to design, build, and deploy models that can learn from data to make predictions. This will cover model training and types of machine learning, including supervised and unsupervised learning, as well as some of the most common models, such as regression and k-means clustering.
Upon completion, individuals will have a foundational understanding of machine learning and its capabilities and be well-positioned to take advantage of introductory-level hands-on training in machine learning and data science such as ODSC East’s Mini-Bootcamp.
How It Works
The foundations series is available on demand.
Each course is available on-demand as soon as you register.
Study the courses in order or skip the subjects you already know.
Each course includes exercises to improve learning outcomes.
Coding demos allow you to learn hands-on skills.
Learn at your own pace. Courses can be taken alongside additional Ai+ courses.
Interactive Sessions
Hands-On Coding Demos
Learning Comprehension Exercises
What You Will Learn
Not only will you learn the core mathematical concepts, but you will also learn how they are applied to machine learning. In addition, you will learn to apply your knowledge using some of the key machine learning and deep learning platforms, such as TensorFlow and PyTorch.
Linear Algebra
Data Structures for Algebra
What Linear Algebra Is, A Brief History of Algebra
Vectors and Vector Transposition
Norms and Unit Vectors
Basis, Orthogonal, and Orthonormal Vectors
Arrays in NumPy, Matrices
Tensors in TensorFlow and PyTorch
Common Tensor Operations
Tensors, Scalars
Tensor Transposition
Basic Tensor Arithmetic
Reduction
The Dot Product
Solving Linear Systems
Matrix Properties
The Frobenius Norm
Matrix Multiplication
Symmetric and Identity Matrices
Matrix Inversion
Diagonal Matrices
Orthogonal Matrices
Eigendecomposition
Eigenvectors
Eigenvalues
Matrix Determinants
Matrix Decomposition
Application of Eigendecomposition
Matrix Operations for Machine Learning
Singular Value Decomposition (SVD)
The Moore-Penrose Pseudoinverse
The Trace Operator
Principal Component Analysis (PCA): A Simple Machine Learning Algorithm
Resources for Further Study of Linear Algebra
Calculus
Limits
What Calculus is
A Brief History of Calculus
The Method of Exhaustion
Computing Derivatives with Differentiation
The Delta Method
Basic Derivative Properties
The Power Rule
The Sum Rule
The Product Rule
The Quotient Rule & The Chain Rule
Automatic Differentiation
AutoDiff with PyTorch
AutoDiff with TensorFlow 2
Relating Differentiation to Machine Learning
Cost (or Loss) Functions
The Future: Differentiable Programming
Gradients Applied to Machine Learning
Partial Derivatives of Multivariate Functions
The Partial-Derivative Chain Rule
Cost (or Loss) Functions
Gradients
Gradient Descent
Backpropagation
Higher-Order Partial Derivatives
Integrals
Binary Classification
The Confusion Matrix
The Receiver-Operating Characteristic (ROC) Curve
Calculating Integrals Manually
Numeric Integration with Python
Finding the Area Under the ROC Curve
Resources for Further Study of Calculus
Probability and Statistics
Introduction to Probability
What Probability Theory Is
Applications of Probability to Machine Learning
Discrete vs Continuous Variables
Probability Density Function
Expected Value
Measures of Central Tendency
Quantiles: Quartiles, Deciles, and Percentiles
Measures of Dispersion:
Covariance and Correlation
Marginal and Conditional Probabilities
Distributions in Machine Learning
Uniform
Gaussian: Normal and Standard Normal
The Central Limit Theorem
Log-Normal
Binomial and Multinomial
Poisson
Mixture Distributions
Preprocessing Data for Model Input
Information Theory
What Information Theory Is
Self-Information
Nats, Bits and Shannons
Shannon and Differential Entropy
Kullback-Leibler Divergence
Cross-Entropy
Frequentist Statistics
Frequentist vs Bayesian Statistics
Review of Relevant Probability Theory
Z-scores and Outliers
P-values
Comparing Means with t-tests
Confidence Intervals
ANOVA: Analysis of Variance
Pearson Correlation Coefficient
R-Squared Coefficient
Correlation vs Causation
Multiple Comparisons
Regression
Features: Independent vs Dependent Variables
Linear Regression to Predict Continuous Values
Fitting a Line to Points on a Cartesian Plane
Ordinary Least Squares
Logistic Regression to Predict Categories
(Deep) ML vs Frequentist Statistics
Bayesian Statistics
When to Use Bayesian Statistics
Prior Probabilities
Bayes’ Theorem
PyMC3 Notebook
Resources for Further Study of Probability and Statistics
Computer Science
Introduction to Data Structures and Algorithms
Introduction to Data Structures
Introduction to Computer Algorithms
A Brief History of Data
A Brief History of Algorithms
“Big O” Notation for Time and Space Complexity
Lists and Dictionaries
List-Based Data Structures: Arrays, Linked Lists, Stacks, Queues, and Deques
Searching and Sorting: Binary, Bubble, Merge, and Quick
Set-Based Data Structures: Maps and Dictionaries
Tables, Load Factors, and Maps
Trees and Graphs
Trees: Decision Trees, Random Forests, and Gradient-Boosting (XGBoost)
Graphs: Terminology, Directed Acyclic Graphs (DAGs)
Resources for Further Study of Data Structures & Algorithms
The Machine Learning Approach to Optimization & Fancy Deep Learning Optimizers
The Statistical Approach to Regression: Ordinary Least Squares
When Statistical Approaches to Optimization Break Down
The Machine Learning Solution
A Layer of Artificial Neurons in PyTorch
Jacobian Matrices
Hessian Matrices and Second-Order Optimization
Momentum
Nesterov Momentum
AdaGrad, AdaDelta, RMSProp, Adam, Nadam
Training a Deep Neural Net
Resources for Further Study
Gradient Descent
Objective Functions
Cost / Loss / Error Functions
Minimizing Cost with Gradient Descent
Learning Rate
Critical Points, incl. Saddle Points
Gradient Descent from Scratch with PyTorch
The Global Minimum and Local Minima
Mini-Batches and Stochastic Gradient Descent (SGD)
Learning Rate Scheduling
Maximizing Reward with Gradient Ascent
What is Data Literacy?
Data Literacy builds the vocabulary and insights that allow you to “speak data”. It is the ability to ask and answer meaningful questions by collecting, analysing, and making sense of data.
Being data literate means you can:

Discover and take advantage of trends.

Understand how predictive models work.

Discover hidden patterns in data.

Identify opportunities for new products and services.

Data Literacy is at the heart of data science and machine learning.
It will provide a foundation to help you grasp and learn the math, programming, engineering, visualization, and modeling in machine learning.
Data for All
Anyone can tell a story with data, not just data scientists. Many professionals can benefit, including marketers, analysts, engineers, and journalists.
This beautifully crafted but simple multi-line chart adds impact and insight to a story.
Image credit: nyt.com
