LIVE TRAINING
SAVE THE DATE: October 28th, 2021 @1pm EST
Brian Lucena holds an A.B. from Harvard and a Ph.D. in Applied Mathematics from Brown, and is a data scientist with expertise in healthcare and finance. He has held teaching and research positions at the University of Washington, UC-Berkeley, and the American University in Cairo.

Pricing: $139 (regular price $189, discounted 30%)
Use voucher code october2021 for an additional $50 off
4 hour immersive session
Hands-on training with Q&A
Recording available on-demand
Certification of Completion
Subscribe and get an additional 10% to 35% off ALL live training sessions
Meet Your Instructor
Brian Lucena
Brian Lucena is Principal at Numeristical and the creator of StructureBoost, ML-Insights, and SplineCalib. His mission is to enhance the understanding and application of modern machine learning and statistical techniques. He does this through academic research, open-source software development, and educational content such as live stream classes and interactive Jupyter notebooks. Additionally, he consults for organizations of all sizes from small startups to large public enterprises. In previous roles, he has served as SVP of Analytics at PCCI, Principal Data Scientist at Clover Health, and Chief Mathematician at Guardian Analytics. He has taught at numerous institutions including UC-Berkeley, Brown, USF, and the Metis Data Science Bootcamp.
Course Overview
What’s the plan?
By the end of this live, hands-on, online course, you’ll understand in detail how Gradient Boosting models are fit as an ensemble of decision trees, and you’ll be able to apply that understanding to the feature engineering process. You’ll learn the various parameters of Gradient Boosting, their relative importance, and how to choose them appropriately, and you’ll gain familiarity with the major Gradient Boosting packages and the capabilities, strengths, and weaknesses of each. You will also learn how to interpret, understand, and evaluate a model, both qualitatively and quantitatively.
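As a taste of the ensemble-of-trees fitting described above, here is a minimal, illustrative sketch (not course material) using scikit-learn’s GradientBoostingClassifier on synthetic data:

```python
# Minimal sketch: Gradient Boosting fits an ensemble of small decision trees,
# each new tree correcting the errors of the trees before it.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# Synthetic data stands in for a real dataset.
X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = GradientBoostingClassifier(
    n_estimators=100,   # number of trees in the ensemble
    learning_rate=0.1,  # shrinkage applied to each tree's contribution
    max_depth=3,        # depth of each base-learner tree
)
model.fit(X_train, y_train)
print(model.score(X_test, y_test))  # accuracy on held-out data
```

The three parameters shown (number of trees, learning rate, tree depth) are among those whose roles and trade-offs the course examines in depth.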
Join the live session with Brian Lucena
Course Outline
Module 1: Decision Trees and Random Forests
- Decision Trees: Fitting Step Functions
- Decision Trees vs Linear Regression
- Random Forests: Definition and Motivation
- Weaknesses of Random Forests
- Why Feature Engineering Matters
- Exercise: NBA Winner Prediction

Module 2: Gradient Boosting: Definition and History
- Boosting and Base Learners
- Boosting as Gradient Descent
- Role of the Loss Function
- Gradient Boosting vs Random Forest
- Gradient Boosting in Scikit-Learn
- Parameters of Gradient Boosting: Which Are Most Important?
- Intro to Parameter Tuning and Early Stopping

Module 3: Review of Gradient Boosting Packages
- Gradient Boosting with XGBoost
- Parameter Tuning: Grid Search
- Exercise: Write Your Own Grid Search
- Parameter Tuning: Bayesian Optimization
- LightGBM and CatBoost
- Handling of Missing Values
- Handling of Categorical Variables
- StructureBoost for Structured Categorical Variables
- Example: Predicting House Prices

Module 4: Interpreting and Understanding Gradient Boosting Models
- Global vs Local Explanations
- What “Feature Importances” Actually Measure
- ICE-plots for Global Interpretations
- ICE-plots to Assess Model Quality
- SHAP and the Shapley Value
- Exploring Interactivity
- Caveats to Interpreting Models
- Exercise: Explaining the House Prediction Model

Module 5: Application to Medical Data
- Data Exploration
- The Histogram Pair Function
- Building a Model
- Tuning Parameters
- Evaluating Quantitatively and Qualitatively
- Gaining Insights from a Model
- Exercise: You Build the Model!
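To give a sense of the parameter-tuning material, here is a hedged sketch of the write-your-own-grid-search idea from Module 3: loop over parameter combinations, fit each candidate, and keep the best validation score. The data and grid are illustrative; the course exercise may differ.

```python
# Hand-rolled grid search over two Gradient Boosting parameters.
from itertools import product
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=600, n_features=8, random_state=1)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, random_state=1)

grid = {"learning_rate": [0.05, 0.1], "max_depth": [2, 3]}
best_score, best_params = -1.0, None
for lr, depth in product(grid["learning_rate"], grid["max_depth"]):
    model = GradientBoostingClassifier(
        learning_rate=lr, max_depth=depth, random_state=1
    )
    score = model.fit(X_tr, y_tr).score(X_val, y_val)  # validation accuracy
    if score > best_score:
        best_score = score
        best_params = {"learning_rate": lr, "max_depth": depth}

print(best_params, best_score)
```

Exhaustive search like this grows multiplicatively with each added parameter, which is one motivation for the Bayesian optimization approach covered in the same module.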
Key Details
DATE: October 28th, 2021
TIME: 1 PM EST / 10 AM PST
DURATION: 4 hours
LEVEL: Intermediate
Prerequisites
This course is geared toward data scientists of all levels who wish to gain a deep understanding of Gradient Boosting and how to apply it to real-world situations. The ideal participant will have some experience building models, know the Python data science toolkit (numpy, pandas, scikit-learn, matplotlib), and have experience fitting models on training sets, making predictions on test sets, and evaluating model quality with metrics.
Upcoming Live Training
December 1st and 8th
Data Visualization – Seaborn
The Seaborn data visualization library in Python provides a simple and intuitive interface for making beautiful plots directly from a Pandas DataFrame. When users arrange their data in tidy form, the Seaborn plotting functions perform the heavy lifting by grouping, splitting, aggregating, and plotting data, often with a single line of code. This course provides comprehensive coverage of how to use all of the Seaborn plotting functions with real-life data.
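The single-line workflow described above can be sketched as follows; the small sales dataset is made up for illustration and is not from the course:

```python
# Seaborn's tidy-data interface: one row per observation, one column per
# variable, and the plotting function does the grouping and aggregating.
import matplotlib
matplotlib.use("Agg")  # render off-screen (no display needed)
import pandas as pd
import seaborn as sns

df = pd.DataFrame({
    "day": ["Mon", "Mon", "Tue", "Tue", "Wed", "Wed"],
    "group": ["A", "B", "A", "B", "A", "B"],
    "sales": [10, 12, 9, 15, 11, 14],
})

# One line: Seaborn groups by `day`, splits by `group`,
# aggregates `sales`, and draws the bars.
ax = sns.barplot(data=df, x="day", y="sales", hue="group")
```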