LIVE TRAINING: April 6th:

1 PM EST*

 

Save 30%: Register Now

Price: $147

Regular price $210, discounted 30%

  • 4-hour immersive session

  • Hands-on training with Q&A

  • Recording available on-demand

  • Certificate of Completion

Subscribe and get an additional 10% to 35% off ALL live training sessions

View Plans

Meet Your Instructor

Boris Paskhaver

Boris Paskhaver is a full-stack web developer based in New York City with experience building apps in React / Redux and Ruby on Rails. His favorite part of programming is the never-ending sense that there’s always something new to master — a secret language feature, a popular design pattern, an emerging library or — most importantly — a different way of looking at a problem.

Why Enroll?

By the end of the course, participants will be able to:

  • Watch each concept demonstrated, practice it, and apply it to a NEW dataset.
  • Have a solid grasp of the capabilities of the pandas library.
  • Perform various data manipulations – sorting, joining, cleaning, aggregating, deduping, and more.

Course Overview

This tutorial offers a comprehensive introduction to pandas, a powerful data analysis library built on top of the Python programming language. The library represents a great step forward for graphical spreadsheet users looking to grow their data manipulation skills. I like to call it “Excel on steroids”. By completing this workshop, you’ll have a strong foundation for using pandas in your day-to-day data analysis. We’ll start with the basics (importing datasets, selecting rows and columns, filtering rows by criteria) and progress to advanced concepts like grouping values, joining multiple datasets together, and cleaning text. Students will be exposed to diverse datasets across different disciplines: sports, finance, entertainment, and more. The training is open to all industries and is aimed at beginners; basic Python knowledge is preferred but not required.

Course Outline

Lesson 1. Foundations of pandas

Discover the 1-dimensional Series and the 2-dimensional DataFrame, the two core data structures in pandas.
Learn how to sort values across one or more columns, identify missing values, remove duplicates, count occurrences of values, filter rows based on one or more criteria, and more.
At the end of this lesson, you’ll have knowledge of the most popular features of pandas.
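
As a rough preview of the operations named in this lesson, here is a minimal sketch. The `players` table and its column names are invented for illustration and are not part of the course materials.

```python
import pandas as pd

# Hypothetical sample data, invented for illustration only.
players = pd.DataFrame({
    "name": ["Ana", "Ben", "Cara", "Ben", "Dev"],
    "team": ["Red", "Blue", "Red", "Blue", "Blue"],
    "points": [31.0, 18.0, None, 18.0, 24.0],
})

# A single column of a 2-dimensional DataFrame is a 1-dimensional Series.
points = players["points"]

# Identify missing values: isna() flags each row, sum() counts the flags.
missing_count = int(points.isna().sum())

# Remove fully duplicated rows (the second "Ben" row repeats the first).
unique_players = players.drop_duplicates()

# Sort by team ascending, then by points descending within each team.
ranked = unique_players.sort_values(["team", "points"], ascending=[True, False])

# Count occurrences of each team name.
team_counts = unique_players["team"].value_counts()

# Filter rows on two criteria; each condition needs its own parentheses.
blue_scorers = unique_players[
    (unique_players["team"] == "Blue") & (unique_players["points"] > 20)
]

print(ranked)
print(team_counts)
print(blue_scorers)
```

Missing values sort to the end by default, which is why the row without a score lands last within its team.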

Lesson 2. Working with Data of Different Types

Data can come in a variety of formats (and be messy to boot)! In this lesson, we’ll explore how to convert columns from one data type to another.
We’ll optimize our dataset to reduce memory consumption.
We’ll discover how to clean messy text data and how to extract date-time information from text.
Finally, we’ll have a chance to review all concepts from the previous lesson with new datasets.
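
The conversions described in this lesson might look something like the sketch below. The `raw` table is hypothetical, and the actual memory savings depend on the data and the pandas version, so treat the printed numbers as indicative only.

```python
import pandas as pd

# Hypothetical sample data; every column starts out as text.
raw = pd.DataFrame({
    "ticker": ["AAA", "BBB", "AAA", "BBB"],
    "price": ["10.5", "20.25", "11.0", "19.75"],
    "traded_on": ["2021-04-01", "2021-04-02", "2021-04-05", "2021-04-06"],
})

# Memory used before any conversion (deep=True also measures string contents).
before_bytes = raw.memory_usage(deep=True).sum()

clean = raw.copy()

# Convert numeric text to floats so arithmetic works.
clean["price"] = clean["price"].astype(float)

# Store a column with few distinct values as a category.
clean["ticker"] = clean["ticker"].astype("category")

# Parse date text into datetime values.
clean["traded_on"] = pd.to_datetime(clean["traded_on"])

after_bytes = clean.memory_usage(deep=True).sum()

# Extract date-time information through the .dt accessor.
weekdays = clean["traded_on"].dt.day_name()

print(clean.dtypes)
print(before_bytes, after_bytes)
print(weekdays.tolist())
```

On a table this small the category conversion may save little or nothing; it tends to pay off on long columns with many repeated values.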

Lesson 3. Working with Text Data

Real-world text data can be riddled with issues — whitespace, letter casings, inconsistent formats, and more.
In this lesson, we’ll learn how to clean text data in pandas.
We’ll apply text operations like splitting, replacing, and joining to whole columns of data.
We’ll conclude with a quick discussion on regular expressions, which allow us to define search patterns for text.
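
A short sketch of the text operations mentioned above, using the `.str` accessor on whole columns. The sample names and order codes are made up for illustration.

```python
import pandas as pd

# Hypothetical messy text data, invented for illustration.
people = pd.Series(["  alice SMITH ", "Bob  jones", "CARA_lee  "])

# Strip edge whitespace, turn underscores into spaces,
# collapse repeated whitespace, and normalize letter casing.
cleaned = (
    people.str.strip()
    .str.replace("_", " ", regex=False)
    .str.replace(r"\s+", " ", regex=True)
    .str.title()
)

# Split each value into separate first-name and last-name columns.
parts = cleaned.str.split(" ", expand=True)
parts.columns = ["first", "last"]

# Join the pieces back together in "Last, First" form.
directory = parts["last"].str.cat(parts["first"], sep=", ")

# Regular expression: capture the first run of digits, if any.
codes = pd.Series(["order-1042", "ORDER-77", "no code"])
order_numbers = codes.str.extract(r"(\d+)", expand=False)

print(cleaned.tolist())
print(directory.tolist())
print(order_numbers.tolist())
```

Values that do not match the pattern come back as missing rather than raising an error, so they can be handled with the usual missing-value tools.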

Lesson 4. Aggregating and Joining Datasets

In this section, we’ll learn how to merge data across multiple datasets.
We’ll explore the pandas equivalent of common SQL operations like inner joins, outer joins, left joins, and right joins.
We’ll also introduce the GroupBy object for grouping rows by shared values across one or more columns.
Finally, we’ll walk through common aggregation and reshaping operations like pivoting, melting, stacking, unstacking, and more.
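
To make the join types and grouping concrete, here is a minimal sketch. The `films` and `sales` tables and their columns are hypothetical and chosen so that each join type returns a different number of rows.

```python
import pandas as pd

# Hypothetical sample tables, invented for illustration.
films = pd.DataFrame({
    "film_id": [1, 2, 3],
    "title": ["Alpha", "Beta", "Gamma"],
    "genre": ["Drama", "Comedy", "Drama"],
})
sales = pd.DataFrame({
    "film_id": [1, 1, 2, 4],
    "region": ["US", "EU", "US", "US"],
    "tickets": [100, 80, 50, 20],
})

# SQL-style joins on the shared film_id column.
inner = films.merge(sales, on="film_id", how="inner")  # ids in both tables
left = films.merge(sales, on="film_id", how="left")    # every film kept
right = films.merge(sales, on="film_id", how="right")  # every sale kept
outer = films.merge(sales, on="film_id", how="outer")  # everything kept

# GroupBy: total tickets for each genre.
tickets_by_genre = inner.groupby("genre")["tickets"].sum()

# Pivot: one row per title, one column per region.
wide = inner.pivot(index="title", columns="region", values="tickets")

# Melt: back to one row per (title, region) pair.
long = wide.reset_index().melt(
    id_vars="title", var_name="region", value_name="tickets"
)

print(len(inner), len(left), len(right), len(outer))
print(tickets_by_genre)
print(wide)
print(long)
```

Film 3 has no sales and sale 4 has no matching film, which is why the left, right, and outer joins keep rows that the inner join drops.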

Key Details

DATE: APRIL 16TH, 2021

TIME: 1 PM EST, 10 AM PST

DURATION: 3 HOURS

LEVEL: BEGINNER TO INTERMEDIATE

Prerequisites

Basic/intermediate experience with spreadsheet software (Excel, Google Sheets, etc.)

Basic experience with the Python programming language is helpful but not required. We also recommend these AI+ courses

Upcoming Live Training

March 10th

Part 1: Probability and Statistics Course

This class, Probability & Information Theory, introduces the mathematical fields that enable us to quantify uncertainty as well as to make predictions despite uncertainty. You’ll develop a working understanding of variables, probability distributions, metrics for assessing distributions, and graphical models. 

Open Data Science

Ai+ | ODSC
One Broadway, 14th Floor
Cambridge, MA 02142
admin_aiplus@odsc.com
