LIVE TRAINING: September 21st


Save 10%: Register Now

Price: $189

Regular price $210, discounted 10%

  • 4-hour immersive session

  • Hands-on training with Q&A

  • Recording available on-demand

  • Certificate of Completion


Subscribe and get an additional 10% to 35% off ALL live training sessions


Meet your Instructor

Matt Brems

Matt is currently Managing Partner and Principal Data Scientist at BetaVector. His full-time professional data work spans finance, education, consumer-packaged goods, and politics, and he earned General Assembly’s 2019 “Distinguished Faculty Member of the Year” award. Matt earned his Master’s degree in statistics from Ohio State. Matt is passionate about responsibly putting the power of machine learning into the hands of as many people as possible and mentoring folx in data and tech careers. Matt also volunteers with Statistics Without Borders and currently serves on their Executive Committee as the Marketing & Communications Director.

Course Overview

If you’ve never heard of the good, fast, cheap dilemma, it goes something like this: You can have something good and fast, but it won’t be cheap. You can have something good and cheap, but it won’t be fast. You can have something fast and cheap, but it won’t be good. In short, you can pick two of the three, but you can’t have all three.

If you’ve tackled a data science problem before, I can all but guarantee that you’ve run into missing data. How do we handle it? Well, we can avoid, ignore, or try to account for missing data. The problem is, none of these strategies is good, fast, *and* cheap.

We’ll start by visualizing missing data and identifying the three different types of missing data, which will allow us to see how they affect whether we should avoid, ignore, or account for the missing data. We will walk through the advantages and disadvantages of each approach, as well as how to visualize and implement each one. We’ll wrap up with practical tips for working with missing data and recommendations for integrating these techniques into your workflow!

Why Enroll?

By the end of the course, participants will be able to:

  • Describe the impact of missing data using simulations, identify techniques for avoiding missing data, and give specific examples of how to apply them.

  • Define unit and item missingness, identify when each occurs, implement weight class adjustments, and identify the advantages and disadvantages of this technique.

  • Define and give examples of data that are missing completely at random (MCAR), missing at random (MAR), and not missing at random (NMAR), and describe a workflow for doing data science with missing data.

  • Describe proper regression imputation and the pattern submodel method, and select the best missing data technique given your situation and real-world constraints.
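
As a taste of the simulation-based approach mentioned in the objectives above, here is a minimal sketch (not taken from the course materials) of how the three types of missing data can affect a simple average. The variables, missingness rates, and the age/income relationship are all made-up assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100_000

# Hypothetical data: age is always observed; income rises with age.
age = rng.uniform(20, 70, size=n)
income = 20_000 + 1_000 * age + rng.normal(0, 10_000, size=n)

# Each mask marks which income values go missing under one mechanism.
# MCAR: everyone has the same 30% chance of a missing income.
mcar = rng.random(n) < 0.30
# MAR: the chance depends only on age, which we always observe.
mar = rng.random(n) < np.where(age > 45, 0.50, 0.10)
# NMAR: the chance depends on the income value itself.
nmar = rng.random(n) < np.where(income > np.median(income), 0.50, 0.10)

def observed_mean(values, missing):
    """Mean of the values that were actually observed (missing rows dropped)."""
    return values[~missing].mean()

true_mean = income.mean()
bias = {
    name: observed_mean(income, mask) - true_mean
    for name, mask in [("MCAR", mcar), ("MAR", mar), ("NMAR", nmar)]
}
for name, value in bias.items():
    print(f"{name}: bias in mean income = {value:,.0f}")
```

In this toy setup, simply dropping the missing rows leaves the MCAR average close to the truth, while the MAR and NMAR averages come out too low because higher earners are more likely to be missing.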



Course Outline

Introduction to Missing Data

Strategies for doing Data Science with Missing Data

  • Avoid Missing Data
  • Ignore Missing Data
  • Account for Missing Data
      • Unit missingness vs. item missingness
      • Weight class adjustments for unit missingness
      • The three types of missing data
      • Imputation techniques (deductive, single, multiple)
      • Pattern submodel method

Putting it together in a workflow

Practical Considerations and Warnings
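
To give a flavor of one outline item, the sketch below illustrates the general idea behind a weight class adjustment for unit missingness. It is an illustration under stated assumptions, not the course's own code: the classes, counts, and scores are invented, and it assumes that nonrespondents within a class resemble the respondents in that class.

```python
# Hypothetical survey: how many people were sampled and how many responded,
# by class, plus the mean score among those who responded.
sampled = {"under_40": 500, "40_plus": 500}
responded = {"under_40": 400, "40_plus": 100}
mean_score = {"under_40": 6.0, "40_plus": 8.0}

# Each respondent stands in for (sampled / responded) people in their class.
weights = {c: sampled[c] / responded[c] for c in sampled}

# Unweighted mean: classes with higher response rates dominate.
unweighted = (
    sum(responded[c] * mean_score[c] for c in sampled) / sum(responded.values())
)

# Weighted mean: each class counts in proportion to how many were sampled.
weighted = (
    sum(weights[c] * responded[c] * mean_score[c] for c in sampled)
    / sum(weights[c] * responded[c] for c in sampled)
)

print(f"weights: {weights}")
print(f"unweighted mean: {unweighted:.2f}")
print(f"weighted mean:   {weighted:.2f}")
```

Here the under-represented class is weighted up, so the adjusted estimate moves toward that class's average.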

Key Details

No background knowledge is required.

It is helpful if you have an understanding of linear regression and basic statistical concepts (like standard deviation and p-values), but you should be able to follow along without this background.

It can also be helpful if you know Python, but all of the code is provided so you can follow along. Even if your coding background is in R or SAS, or you have no coding background at all, you should be able to follow along and understand!

Upcoming Live Training

September 30th

Network Analysis Made Simple

Have you ever wondered about how those data scientists at Facebook and LinkedIn make friend recommendations? Or how epidemiologists track down patient zero in an outbreak? If so, then this tutorial is for you. In this tutorial, we will use a variety of datasets to help you understand the fundamentals of network thinking, with a particular focus on constructing, summarizing, and visualizing complex networks.

Open Data Science

Ai+ | ODSC
One Broadway, 14th Floor
Cambridge, MA 02142
