reticulate

SageMaker + RStudio to Predict Home Prices w/ Multi-class XGBoost; Explaining Model Behavior with Geospatial Plots and SHAP

In this post I explore an Austin housing dataset and predict binned housing prices. EDA includes static and interactive geospatial feature maps, along with feature engineering using natural language processing (NLP). After training and tuning multi-class XGBoost models, I run batch inference to predict the binned price of Austin, TX houses. I then submit the predictions to the Kaggle competition, where they scored 0.8876 (mlogloss), which would have placed 6th in the live competition. After submission, I generate SHapley Additive exPlanations (SHAP) plots to understand how XGBoost made its predictions.

Predicting Bank Customer Churn using AWS SageMaker and XGBoost in Local RStudio

For this post, I experimented with using AWS SageMaker and its built-in XGBoost algorithm from within my local RStudio to predict whether a bank customer has churned. The data comes from the SLICED season 1 episode 7 Kaggle competition. SLICED is a data science competition where contestants are given a never-before-seen dataset and two hours to code a solution to a prediction challenge.