
LIVE TRAINING: September 21st
12 PM EST
Price: $189
Regular price $210, discounted 10%
4 hour immersive session
Hands-on training with Q&A
Recording available on-demand
Certification of Completion
LAST CHANCE TO JOIN!
Subscribe and get an additional 10% to 35% off ALL live training sessions
Meet your Instructor
Matt Brems
Matt is currently Managing Partner and Principal Data Scientist at BetaVector. His full-time professional data work spans finance, education, consumer-packaged goods, and politics, and he earned General Assembly’s 2019 “Distinguished Faculty Member of the Year” award. Matt earned his Master’s degree in statistics from Ohio State. He is passionate about responsibly putting the power of machine learning into the hands of as many people as possible and about mentoring folx in data and tech careers. Matt also volunteers with Statistics Without Borders and currently serves on their Executive Committee as the Marketing & Communications Director.
Course Overview
If you’ve never heard of the good, fast, cheap dilemma, it goes something like this: You can have something good and fast, but it won’t be cheap. You can have something good and cheap, but it won’t be fast. You can have something fast and cheap, but it won’t be good. In short, you can pick two of the three, but you can’t have all three.
If you’ve tackled a data science problem before, I can all but guarantee that you’ve run into missing data. How do we handle it? Well, we can avoid, ignore, or try to account for missing data. The problem is, none of these strategies is good, fast, *and* cheap.
We’ll start by visualizing missing data and identifying the three different types of missing data, which will allow us to see how each type affects whether we should avoid, ignore, or account for the missing data. We will walk through the advantages and disadvantages of each approach, as well as how to visualize and implement each one. We’ll wrap up with practical tips for working with missing data and recommendations for integrating these techniques into your workflow!
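To make the three types of missing data concrete, here is a minimal simulation sketch. It is not taken from the course materials. It assumes the usual labels (missing completely at random, missing at random, and not missing at random), and the age and income data, the helper names `sigmoid` and `observed_mean`, and every number in it are hypothetical. It uses only the Python standard library.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical data: income rises with age, plus noise.
n = 20_000
ages = [random.uniform(20, 70) for _ in range(n)]
incomes = [30_000 + 800 * (a - 20) + random.gauss(0, 8_000) for a in ages]


def sigmoid(x):
    """Map any real number to a probability between 0 and 1."""
    return 1 / (1 + math.exp(-x))


def observed_mean(values, p_missing):
    """Mean of the values that remain after deleting each one with its own probability."""
    kept = [v for v, p in zip(values, p_missing) if random.random() >= p]
    return statistics.fmean(kept)


# MCAR: every income has the same 30% chance of being missing.
p_mcar = [0.3] * n
# MAR: missingness depends only on age, which is fully observed.
p_mar = [sigmoid((a - 45) / 8) for a in ages]
# NMAR: missingness depends on the income value that goes missing.
p_nmar = [sigmoid((y - 50_000) / 10_000) for y in incomes]

true_mean = statistics.fmean(incomes)
mean_mcar = observed_mean(incomes, p_mcar)
mean_mar = observed_mean(incomes, p_mar)
mean_nmar = observed_mean(incomes, p_nmar)

print(f"true mean of income:       {true_mean:,.0f}")
print(f"observed mean under MCAR:  {mean_mcar:,.0f}")
print(f"observed mean under MAR:   {mean_mar:,.0f}")
print(f"observed mean under NMAR:  {mean_nmar:,.0f}")
```

In this setup, simply ignoring the missing rows leaves the mean roughly intact under MCAR but pulls it downward under MAR and NMAR, because older or higher-earning people are more likely to be missing.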
Why Enroll?
By the end of the course, participants will be able to:
Describe the impact of missing data using simulations, identify techniques for avoiding missing data, and give specific examples of how to avoid it.
Define unit and item missingness, identify when each occurs, implement weight class adjustments, and identify the advantages and disadvantages of this technique.
Define and give examples of data that are missing completely at random (MCAR), missing at random (MAR), and not missing at random (NMAR), and describe a workflow for doing data science with missing data.
Describe proper regression imputation and the pattern submodel method, and select the best missing data technique given your situation and real-world constraints.
10% discount is ending soon
Course Outline
Introduction to Missing Data
Strategies for doing Data Science with Missing Data
- Avoid Missing Data
- Ignore Missing Data
Account for Missing Data
- Unit missingness vs. item missingness
- Weight class adjustments for unit missingness
- The three types of missing data
- Imputation techniques (deductive, single, multiple)
- Pattern submodel method
Putting it together in a workflow
Practical Considerations and Warnings
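As an illustration of the weight class adjustment listed in the outline, here is a minimal sketch, not taken from the course materials. It assumes the common survey-weighting reading of the technique: group everyone who was sampled into classes using a variable known for all of them, then weight each respondent by (number sampled in the class) / (number who responded in the class). The survey data and variable names are hypothetical.

```python
from collections import Counter

# Hypothetical survey: (class known for everyone sampled, answer or None).
# None marks unit missingness: the person did not respond at all.
sample = [
    ("undergrad", 3.0), ("undergrad", 4.0), ("undergrad", None), ("undergrad", None),
    ("grad", 5.0), ("grad", 4.0), ("grad", 5.0), ("grad", None),
]

# Count how many people were sampled and how many responded in each class.
sampled = Counter(group for group, _ in sample)
responded = Counter(group for group, answer in sample if answer is not None)

# Each respondent stands in for the nonrespondents in the same class.
weights = {group: sampled[group] / responded[group] for group in responded}

respondents = [(group, answer) for group, answer in sample if answer is not None]

# Naive estimate: average the answers we happen to have.
unweighted_mean = sum(answer for _, answer in respondents) / len(respondents)

# Adjusted estimate: weighted average using the class weights.
weighted_mean = (
    sum(weights[group] * answer for group, answer in respondents)
    / sum(weights[group] for group, _ in respondents)
)

print(f"weights:         {weights}")
print(f"unweighted mean: {unweighted_mean:.3f}")
print(f"weighted mean:   {weighted_mean:.3f}")
```

The adjustment only helps to the extent that respondents and nonrespondents within the same class are similar, which is one of the trade-offs to weigh when choosing this technique.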
Key Details
DATE: SEPTEMBER 21ST, 2021
TIME: 12 PM EST, 9 AM PST
DURATION: 4 HOURS
LEVEL: BEGINNER – INTERMEDIATE
Prerequisites
No background knowledge is required.
It is helpful if you have an understanding of linear regression and basic statistical concepts (like standard deviation and p-values), but you should be able to follow along without this background.
It can also be helpful if you know Python, but all of the code is given for you to follow along. Even if your coding background is in R or SAS or you have no coding background at all, you should be able to easily follow along and understand!
Upcoming Live Training

September 30th
Network Analysis Made Simple
Have you ever wondered about how those data scientists at Facebook and LinkedIn make friend recommendations? Or how epidemiologists track down patient zero in an outbreak? If so, then this tutorial is for you. In this tutorial, we will use a variety of datasets to help you understand the fundamentals of network thinking, with a particular focus on constructing, summarizing, and visualizing complex networks.