Abstract:

This is a technical blog post, available as both an HTML file and a .qmd file, hosted on GitHub Pages.

0. Quarto Type-setting

This document is rendered with Quarto and configured to embed images using the embed-resources option in the header.
If you wish to use a similar header, here is the format specification for this document:

format: 
  html:
    embed-resources: true

1. Setup

Set Up Code:

sh <- suppressPackageStartupMessages
sh(library(tidyverse))
sh(library(caret))
sh(library(naivebayes))
wine <- readRDS(gzcon(url("https://github.com/cd-public/D505/raw/master/dat/pinot.rds")))

2. Logistic Concepts

Why do we call it Logistic Regression even though we are using the technique for classification?

TODO: Explain.
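
As a starting point for the question above, here is a minimal sketch on simulated data (the variables `x` and `y` and their coefficients are made up for illustration and are not part of the wine data). It shows that `glm` fits a regression on the log-odds scale, and that a class label only appears once we threshold the predicted probability.

```r
set.seed(505)

# Hypothetical data: one numeric predictor and a 0/1 outcome.
x <- rnorm(200)
y <- rbinom(200, 1, plogis(-0.5 + 1.5 * x))

# The "regression" part: a linear model for the log-odds of y = 1.
fit <- glm(y ~ x, family = binomial)

log_odds <- predict(fit, type = "link")      # linear in x, unbounded
prob     <- predict(fit, type = "response")  # logistic transform, in (0, 1)

# The "classification" part is a separate step: pick a cutoff.
pred_class <- as.integer(prob > 0.5)

# The probabilities are just the logistic function of the linear predictor.
all.equal(unname(prob), unname(plogis(log_odds)))
```

The 0.5 cutoff is a choice, not something the model estimates; a different cutoff gives different classes from the same fitted regression.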

3. Modeling

We train a logistic regression algorithm to classify whether a wine comes from Marlborough using:

An 80-20 train-test split.
Three features engineered from the description.
5-fold cross-validation.

We report Kappa after using the model to predict provinces in the holdout sample.

# TODO
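
One possible shape for this pipeline is sketched below. It assumes the data has `province` and `description` columns and that the province is spelled `"Marlborough"`; the three keyword features (`cherry`, `earth`, `bright`) are arbitrary placeholders rather than tuned choices, so the resulting Kappa may well be close to zero.

```r
library(tidyverse)
library(caret)

wine <- readRDS(gzcon(url("https://github.com/cd-public/D505/raw/master/dat/pinot.rds")))

set.seed(505)

# Binary target plus three placeholder features from the description text.
wino <- wine %>%
  mutate(
    description = tolower(description),
    marlborough = factor(if_else(province == "Marlborough", "yes", "no"),
                         levels = c("no", "yes")),
    cherry = str_detect(description, "cherry"),
    earth  = str_detect(description, "earth"),
    bright = str_detect(description, "bright")
  ) %>%
  select(marlborough, cherry, earth, bright)

# 80-20 train-test split, stratified on the target.
idx   <- createDataPartition(wino$marlborough, p = 0.8, list = FALSE)[, 1]
train <- wino[idx, ]
test  <- wino[-idx, ]

# Logistic regression with 5-fold cross-validation on the training set.
fit <- train(marlborough ~ ., data = train,
             method = "glm", family = "binomial",
             metric = "Kappa",
             trControl = trainControl(method = "cv", number = 5))

# Kappa on the holdout sample.
pred  <- predict(fit, newdata = test)
cm    <- confusionMatrix(pred, test$marlborough, positive = "yes")
kappa <- cm$overall[["Kappa"]]
kappa
```

Swapping in better features only means editing the `mutate` and `select` calls; the split, the cross-validation setup, and the Kappa calculation stay the same.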

4. Binary vs Other Classification

What is the difference between classifying with logistic regression and classifying with methods like \(K\)-NN and Naive Bayes?

TODO: Explain.
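
One way to explore this question is with the built-in `iris` data, used here as a stand-in because it has three classes. This is only a sketch: the split, `k = 5`, and the two sepal predictors are arbitrary choices. \(K\)-NN predicts all three species directly, while a binomial `glm` answers one yes/no question and returns coefficients and probabilities.

```r
library(class)  # provides knn(); a recommended package shipped with R

set.seed(505)
idx   <- sample(nrow(iris), 120)
train <- iris[idx, ]
test  <- iris[-idx, ]

# K-NN: neighbours vote, and all three species are handled at once.
knn_pred <- knn(train[, 1:4], test[, 1:4], cl = train$Species, k = 5)
table(knn_pred)

# Logistic regression: the outcome is recoded to a single binary question.
train$is_virginica <- as.integer(train$Species == "virginica")
fit <- glm(is_virginica ~ Sepal.Length + Sepal.Width,
           data = train, family = binomial)

# It returns a probability per row, which we then threshold ...
prob       <- predict(fit, newdata = test, type = "response")
logit_pred <- ifelse(prob > 0.5, "virginica", "other")

# ... and coefficients that can be read as odds ratios.
exp(coef(fit))
```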

5. ROC Curves

We can display an ROC curve for the model to explain its quality.

# You can find a tutorial on ROC curves here: https://towardsdatascience.com/understanding-the-roc-curve-and-auc-dd4f9a192ecb/

TODO: Explain.
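
To make the mechanics of an ROC curve concrete, here is a base R sketch on simulated labels and scores (`truth`, `score`, and the helper `roc_points` are hypothetical and do not come from the wine model). It sweeps the classification threshold, records the true and false positive rates at each step, and computes the area under the curve with the trapezoid rule.

```r
set.seed(505)

# Hypothetical data: true 0/1 labels and predicted probabilities.
truth <- rbinom(200, 1, 0.3)
score <- plogis(-1 + 2 * truth + rnorm(200))

# True and false positive rates at every threshold, from strict to lenient.
roc_points <- function(truth, score) {
  thresholds <- c(Inf, sort(unique(score), decreasing = TRUE), -Inf)
  tpr <- sapply(thresholds, function(t) mean(score[truth == 1] >= t))
  fpr <- sapply(thresholds, function(t) mean(score[truth == 0] >= t))
  data.frame(threshold = thresholds, fpr = fpr, tpr = tpr)
}

roc <- roc_points(truth, score)

# Area under the curve by the trapezoid rule.
auc <- sum(diff(roc$fpr) * (head(roc$tpr, -1) + tail(roc$tpr, -1)) / 2)
auc

plot(roc$fpr, roc$tpr, type = "l",
     xlab = "False positive rate", ylab = "True positive rate")
abline(0, 1, lty = 2)  # reference line for a random classifier
```

For the wine model, the same helper could be fed the holdout labels and the predicted probabilities of the positive class in place of the simulated `truth` and `score`.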