Goal
This is a generalized ML assignment
- Predict the profit of future products developed at CravenSpeed
- Using any model (or ensemble) you’d like
- Evaluated using RMSE on a holdout sample that you won’t see
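For reference, RMSE is the square root of the mean squared difference between actual and predicted values. A minimal base R sketch follows; the `rmse` helper name and the numbers are made up for illustration and are not part of the assignment or the CravenSpeed data.

```r
# Root mean squared error: sqrt(mean((actual - predicted)^2)).
# `rmse` is a hypothetical helper name, not something the assignment provides.
rmse <- function(actual, predicted) {
  sqrt(mean((actual - predicted)^2))
}

# Made-up values for illustration only.
actual    <- c(100, 200, 300)
predicted <- c(110, 190, 330)
rmse(actual, predicted)
```

Because the errors are squared before averaging, a few large misses raise RMSE more than many small ones.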
Submission
- A maximum of 15 slides, without any code, that demonstrates:
- How and why you created/selected the features used,
- The choice and design of your model, and
- Results and insights.
- You must present from the Ford 102 “Teaching Machine”, with no login, from a public URL.
Presentations
- It is a simple matter to create a presentation within Quarto.
- Simply specify the “revealjs” format.
The page
final_page.qmd
---
title: "Final Page"
author: "Team $i$"
date: "04/21/2025"
---
# Goal
...
The presentation
final_present.qmd
---
title: "Final Presentation"
author: "Team $i$"
date: "04/21/2025"
format: revealjs
---
# Goal
...
Criteria
- Every group member must participate in the presentation
- Maximum 10 features including interactions
Setup
- You may use any libraries, but tidyverse and caret may be sufficient.
- If you wish, you may use Python, Julia, or Observable in any manner you see fit and I will figure out how to assess it.
- Recall - no code on slides! So it won’t matter.
library(tidyverse)
library(caret)
Dataframe
- We use the craven_train dataframe.
fast <- readRDS(gzcon(url("https://github.com/cd-public/D505/raw/refs/heads/master/dat/craven_train.rds")))
- You will necessarily perform some feature engineering as you see fit.
- Exactly ten (10) features.
- No relation to “Crazy Train”
An “engineer” function
- Besides the presentation
- Submit a .qmd or .rmd file that includes an “engineer” function
- It engineers your features over a data frame with the same columns as “craven_train.rds”.
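As a rough sketch of the shape such a function could take, the example below builds exactly ten columns from a toy data frame. The columns `price`, `weight`, and `age` are hypothetical stand-ins; the real columns in craven_train will differ, and the particular transformations are only illustrative, not recommended features.

```r
library(dplyr)

# Toy data with hypothetical columns; craven_train has different columns.
toy <- tibble(
  price  = c(10, 25, 40, 55),
  weight = c(1.0, 2.5, 0.8, 3.2),
  age    = c(1, 3, 5, 7)
)

# Sketch of an engineer function: takes a data frame and returns
# exactly ten columns, with interactions counted toward the limit.
engineer <- function(df) {
  df |>
    transmute(
      price,
      weight,
      age,
      log_price    = log1p(price),
      log_weight   = log1p(weight),
      price_sq     = price^2,
      price_weight = price * weight,  # interaction
      price_age    = price * age,     # interaction
      weight_age   = weight * age,    # interaction
      price_per_wt = price / weight
    )
}

ncol(engineer(toy))  # number of engineered columns
```

Using `transmute()` returns only the listed columns, so the column count stays under the author's control regardless of how many columns the input has.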
fast <- readRDS("secret.rds") # I have "secret" data
fast <- fast %>% engineer() # I will apply your function.
A bad example
# Engineer 10 features
engineer <- function(df) {
  df |> select(1:10)
}
Setup
- Assessments will be set up as follows:
- “Profit” is engineered.
- Note that the first ten (10) features are selected.
- This ensures no more than 10 features are used.
- “Profit” is incorporated into the data frame.
profit <- fast["Revenue 2019 to present"] - fast["BOM Cost"] * fast["Units Sold"]
fast <- fast %>% engineer()
fast <- fast |> select(1:10) # Max 10 features
fast["Profit"] = profit
Assessment
- Assessments will be evaluated via RMSE over the secret data as follows:
train(Profit ~ .,
      data = fast,
      method = "lm",
      trControl = trainControl(method = "cv", number = 5))$results$RMSE
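To try this evaluation locally before submitting, the same call can be run on any data frame that has a `Profit` column. The sketch below uses synthetic data with hypothetical features `x1` and `x2`, and an arbitrary seed. It is only a stand-in for the engineered craven_train data, so the resulting number says nothing about the real task.

```r
library(caret)

set.seed(505)  # arbitrary seed so the cross-validation folds are reproducible

# Synthetic stand-in for engineered features plus Profit; not CravenSpeed data.
n <- 100
toy <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
toy$Profit <- 3 * toy$x1 - 2 * toy$x2 + rnorm(n)

# Same call shape as the assessment: linear model, 5-fold cross-validation.
fit <- train(Profit ~ .,
             data = toy,
             method = "lm",
             trControl = trainControl(method = "cv", number = 5))

fit$results$RMSE  # RMSE averaged over the five held-out folds
```

Since the model is fixed to `lm` in this check, differences in the reported RMSE come from the engineered features rather than from model choice.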