Goal
This is a binary classification assignment
- Engineering exactly 5 features.
- Predict
Churn
Submission
- One Model 1 you replied to email “Group \(n\)” with a link to a GitHub repository
- Update this repository with a
model_2.rds
dataframe and amodel_2.qmd
or rmd that created the dataframe. - You may update this submission as many times as you like until class starts on 14 Apr.
- NOT Midnight AOE
Setup
- You may use any libraries, but as a feature engineering assignment
tidyverse
may be sufficient.- The next most likely are dummy column and textual libraries.
- If you wish, you may use Python, Julia, or Observable in any manner you see fit and I will figure out how to assess it.
- For assessment, we will use “caret”.
library(tidyverse)
library(caret)
Dataframe
- We use the
BankChurners
dataframe.
<- readRDS(gzcon(url("https://cd-public.github.io/D505/dat/BankChurners.rds"))) bank
- You will necessarily perform some feature engineering as you see fit.
- Exactly five (5) features.
Save the dataframe
- In addition to a document like this, you will also need to submit your dataframe.
write_rds(bank, file="model_2.rds")
Assessment
- Assessments will be evaluated as follows:
- Note that the first five (5) features are selected.
- This ensures no more than 5 features are used.
- It ensures “Churn” is included.
- It reports the \(\kappa\) value
Kappa
train(Churn ~ .,
data = bank |> select(1,2,3,4,5,grep("Churn", colnames(bank))),
trControl = trainControl(method = "cv", number = 5),
method = "glm",
family = "binomial",
maxit = 5)$results['Kappa']