Data Foundations for Machine Learning
Learn the #1 skill required to succeed as a machine learning engineer or data scientist
Math and Statistics
Unlike other math and statistics courses, this foundations series is built from the ground up to boost your understanding of machine learning principles.
Data Primer Course
August 17th, 2023
Data is the essential building block of Data Science, Machine Learning, and AI. This course is the first in the series and is designed to teach you the foundational skills and knowledge required to understand, work with, and analyze data. It covers topics such as data collection, organization, profiling, and transformation as well as basic analysis.
The course is aimed at helping people begin their AI journey and gain valuable insights that we will build on in the subsequent SQL, programming, and AI courses.
SQL Primer Course
September 7, 2023
This SQL coding course teaches students the basics of Structured Query Language (SQL), a standard language for managing and manipulating data and an essential tool in AI. The course covers topics such as database design and normalization, data wrangling, aggregate functions, subqueries, and join operations, and students will learn how to design and write SQL code to solve real-world problems. Upon completion, students will have a strong foundation in SQL and be able to use it effectively to extract insights from data. The ability to access, retrieve, and manipulate data using SQL is essential for data cleaning, pre-processing, and exploration, which are crucial steps in any data science or machine learning project. Additionally, SQL is widely used in industry, making it a valuable skill for professionals in the field. This course builds upon the earlier data course in the series.
Programming Primer Course with Python
September 21st, 2023
The Python language is one of the most popular programming languages in data science and machine learning as it offers a number of powerful and accessible libraries and frameworks specifically designed for these fields. This programming course is designed to give participants a quick introduction to the basics of coding using the Python language.
It covers topics such as data structures, control structures, functions, modules, and file handling. This course aims to provide a basic foundation in Python and help participants develop the skills needed to progress in the field of data science and machine learning.
AI Primer Course
October 5th, 2023
This AI literacy course is designed to introduce participants to the basics of artificial intelligence (AI) and machine learning. We will first explore the various types of AI and then progress to fundamental concepts such as algorithms, features, and models. We will study the machine learning workflow and how it is used to design, build, and deploy models that can learn from data to make predictions. This will cover model training and types of machine learning, including supervised and unsupervised learning, as well as some of the most common models such as regression and k-means clustering.
Upon completion, individuals will have a foundational understanding of machine learning and its capabilities and be well-positioned to take advantage of introductory-level hands-on training in machine learning and data science, such as ODSC East’s Mini-Bootcamp.
Data Wrangling with Python Course
October 19th, 2023
Data wrangling is the cornerstone of any data-driven project, and Python stands as one of the most powerful tools in this domain. In preparation for the ODSC conference, our specially designed course on “Data Wrangling with Python” offers attendees a hands-on experience to master the essential techniques. From cleaning and transforming raw data to making it ready for analysis, this course will equip you with the skills needed to handle real-world data challenges. As part of a comprehensive series leading up to the conference, this course not only lays the foundation for more advanced AI topics but also aligns with the industry’s most popular coding language.
Upon completion of this short course, attendees will be fully equipped with the knowledge and skills to manage the data lifecycle and turn raw data into actionable insights, setting the stage for advanced data analysis and AI applications.
LLMs, Prompt Engineering, and Generative AI
Coming Fall 2023
In the rapidly evolving field of AI, the “LLMs, Prompt Engineering, and Generative AI” course stands as a cutting-edge offering, designed to equip learners with the latest advancements in Large Language Models (LLMs), prompt engineering, and generative AI techniques. This course delves into the architecture and functioning of LLMs, the art of crafting effective prompts to guide AI responses, and the principles behind generating creative and coherent content. As these components are becoming integral to the AI stack, understanding them is essential for anyone looking to innovate, optimize, and excel in AI-driven applications.
Whether you’re a researcher, developer, or AI enthusiast, this course will provide you with the insights and hands-on experience needed to harness the power of these transformative technologies and stay at the forefront of the AI revolution.
How It Works
The foundations series is available on demand.
Each course is available on-demand as soon as you register.
Study the courses in order or skip the subjects you already know.
Each course includes exercises to improve learning outcomes.
Coding demos allow you to learn hands-on skills.
Learn at your own pace. Courses can be taken alongside additional Ai+ courses.
Interactive Sessions
Hands-On Coding Demos
Learning Comprehension Exercises
What You Will Learn
Not only will you learn the core mathematical concepts, but you will also learn how they are applied to machine learning. In addition, you will learn to apply your knowledge using some of the key machine learning and deep learning platforms, such as TensorFlow and PyTorch.
Linear Algebra
Data Structures for Algebra
What Linear Algebra Is, A Brief History of Algebra
Vectors and Vector Transposition
Norms and Unit Vectors
Basis, Orthogonal, and Orthonormal Vectors
Arrays in NumPy, Matrices
Tensors in TensorFlow and PyTorch
Common Tensor Operations
Tensors, Scalars
Tensor Transposition
Basic Tensor Arithmetic
Reduction
The Dot Product
Solving Linear Systems
Matrix Properties
The Frobenius Norm
Matrix Multiplication
Symmetric and Identity Matrices
Matrix Inversion
Diagonal Matrices
Orthogonal Matrices
Eigendecomposition
Eigenvectors
Eigenvalues
Matrix Determinants
Matrix Decomposition
Application of Eigendecomposition
Matrix Operations for Machine Learning
Singular Value Decomposition (SVD)
The Moore-Penrose Pseudoinverse
The Trace Operator
Principal Component Analysis (PCA): A Simple Machine Learning Algorithm
Resources for Further Study of Linear Algebra
Calculus
Limits
What Calculus Is
A Brief History of Calculus
The Method of Exhaustion
Computing Derivatives with Differentiation
The Delta Method
Basic Derivative Properties
The Power Rule
The Sum Rule
The Product Rule
The Quotient Rule & The Chain Rule
Automatic Differentiation
AutoDiff with PyTorch
AutoDiff with TensorFlow 2
Relating Differentiation to Machine Learning
Cost (or Loss) Functions
The Future: Differentiable Programming
Gradients Applied to Machine Learning
Partial Derivatives of Multivariate Functions
The Partial-Derivative Chain Rule
Cost (or Loss) Functions
Gradients
Gradient Descent
Backpropagation
Higher-Order Partial Derivatives
Integrals
Binary Classification
The Confusion Matrix
The Receiver-Operating Characteristic (ROC) Curve
Calculating Integrals Manually
Numeric Integration with Python
Finding the Area Under the ROC Curve
Resources for Further Study of Calculus
Probability and Statistics
Introduction to Probability
What Probability Theory Is
Applications of Probability to Machine Learning
Discrete vs Continuous Variables
Probability Density Function
Expected Value
Measures of Central Tendency
Quantiles: Quartiles, Deciles, and Percentiles
Measures of Dispersion:
Covariance and Correlation
Marginal and Conditional Probabilities
Distributions in Machine Learning
Uniform
Gaussian: Normal and Standard Normal
The Central Limit Theorem
Log-Normal
Binomial and Multinomial
Poisson
Mixture Distributions
Preprocessing Data for Model Input
Information Theory
What Information Theory Is
Self-Information
Nats, Bits and Shannons
Shannon and Differential Entropy
Kullback-Leibler Divergence
Cross-Entropy
Frequentist Statistics
Frequentist vs Bayesian Statistics
Review of Relevant Probability Theory
Z-scores and Outliers
P-values
Comparing Means with t-tests
Confidence Intervals
ANOVA: Analysis of Variance
Pearson Correlation Coefficient
R-Squared Coefficient
Correlation vs Causation
Multiple Comparisons
Regression
Features: Independent vs Dependent Variables
Linear Regression to Predict Continuous Values
Fitting a Line to Points on a Cartesian Plane
Ordinary Least Squares
Logistic Regression to Predict Categories
(Deep) ML vs Frequentist Statistics
Bayesian Statistics
When to Use Bayesian Statistics
Prior Probabilities
Bayes’ Theorem
PyMC3 Notebook
Resources for Further Study of Probability and Statistics
Computer Science
Introduction to Data Structures and Algorithms
Introduction to Data Structures
Introduction to Computer Algorithms
A Brief History of Data
A Brief History of Algorithms
“Big O” Notation for Time and Space Complexity
Lists and Dictionaries
List-Based Data Structures: Arrays, Linked Lists, Stacks, Queues, and Deques
Searching and Sorting: Binary, Bubble, Merge, and Quick
Set-Based Data Structures: Maps and Dictionaries
Tables, Load Factors, and Maps
Trees and Graphs
Trees: Decision Trees, Random Forests, and Gradient-Boosting (XGBoost)
Graphs: Terminology, Directed Acyclic Graphs (DAGs)
Resources for Further Study of Data Structures & Algorithms
The Machine Learning Approach to Optimization & Fancy Deep Learning Optimizers
The Statistical Approach to Regression: Ordinary Least Squares
When Statistical Approaches to Optimization Break Down
The Machine Learning Solution
A Layer of Artificial Neurons in PyTorch
Jacobian Matrices
Hessian Matrices and Second-Order Optimization
Momentum
Nesterov Momentum
AdaGrad, AdaDelta, RMSProp, Adam, Nadam
Training a Deep Neural Net
Resources for Further Study
Gradient Descent
Objective Functions
Cost / Loss / Error Functions
Minimizing Cost with Gradient Descent
Learning Rate
Critical Points, incl. Saddle Points
Gradient Descent from Scratch with PyTorch
The Global Minimum and Local Minima
Mini-Batches and Stochastic Gradient Descent (SGD)
Learning Rate Scheduling
Maximizing Reward with Gradient Ascent
What is Data Literacy?
Data Literacy builds the vocabulary and insights that allow you to “speak data”. It is the ability to ask and answer meaningful questions by collecting, analyzing, and making sense of data.
Being data literate means you can:

Discover and take advantage of trends.

Understand how predictive models work.

Discover hidden patterns in data.

Identify opportunities for new products and services.

Data Literacy is at the heart of data science and machine learning.
It will provide a foundation to help you grasp and learn the math, programming, engineering, visualization, and modeling in machine learning.
Data for All
Anyone can tell a story with data, not just data scientists. Many professionals can benefit, including marketers, analysts, engineers, and journalists.
This beautifully crafted but simple multi-line chart adds impact and insight to a story.
Image credit: nyt.com
