---
title: Classification
---
# Classification
## Why classification?
- Definitely an AI topic!
- Recall ImageNet
- "Is this image a cat or dog?"
- It is the basis for some more advanced ML/AI stuff.
- We'll use it within our labs.
## Carolus Linnæus

## High School Biology
- The first time I heard of classification it was high school biology.
- Y'all *already know* I love a dichotomous key.
- I was told Carolus Linnæus invented the idea of categorizing things.
- This seems unlikely to be true but we can go with it.
- Specifically, Carl (we're on a first name basis) classified living things into species.
## Two Species
:::: {.columns}
::: {.column width="50%"}
*Felis catus*

:::
::: {.column width="50%"}
*Homo sapiens*

:::
::::
## Life is complicated
- Classification into species is a complex and multi-faceted process.
- In that respect, it is the negation of American action horror film production.
- So we study the simpler system: specifically, the *Alien vs. Predator* films.
- Specifically, from 1976 to 2024 (since I haven't been keeping up the past few years).
# Example
## We Begin...
:::: {.columns}
::: {.column width="50%"}
...in 1979 with the release of *Alien*, widely regarded as the greatest film of the post-silent era by instructors of this class.
:::
::: {.column width="50%"}

:::
::::
## Classification
- In order to differentiate *Alien* from later installments, we make a few notes.
- *Alien* starred Sigourney Weaver
- *Alien* was directed by Ridley Scott
- *Alien* contains the term "Alien" in its name.
## Timeline
- We can place *Alien* on a timeline as follows:
```{python}
#| code-fold: true
import matplotlib.pyplot as plt

def plot(films):
    years = list(range(1976, 2027))  # Years from 1976 through 2026
    plt.figure(figsize=(9, 1))  # Wide, short figure suits a timeline
    plt.plot(years, [0] * len(years), color='black', linewidth=2)  # Horizontal timeline

    def add_film(name, year, color="green"):
        # Label the film above the line, with an arrow pointing at its release year
        plt.annotate(
            name,
            xy=(year, 0.0),
            xytext=(year, 0.2),
            arrowprops=dict(color=color, shrink=0.05, width=1, headwidth=8),
            horizontalalignment='center',
            verticalalignment='bottom',
            fontsize=10,
            color=color
        )

    # Each film is [name, year] or [name, year, color]
    for film in films:
        add_film(*film)
    plt.yticks([])  # The y-axis carries no information on a timeline
    plt.grid(True, axis='x', linestyle='--', alpha=0.7)
    plt.xlim(1976, 2027)  # Match the range of plotted years
    plt.show()

films = [
    ["Alien", 1979],
]
plot(films)
```
## An Aside
- *Alien* featured not one but **two** high-profile depictions of artificial intelligence:
    - MU/TH/UR, the ship computer
    - Ash, a seemingly human crew member
- In fact, it raises important questions about *AI ethics*, especially related to labor issues.
## Special Order 937
- At one point in the film, the crew of the spacecraft brings aboard a dangerous alien organism.
- The corporate ownership of the vessel wishes to retrieve the organism, despite the danger it poses to the crew.
- This leads to a famous sequence in cinema and AI history:

## Special Order 937
> Priority one — Ensure return of organism for analysis. All other considerations secondary. Crew expendable
## Commands
- This command, issued by MU/TH/UR to Ash, a "synthetic" passing as human, led to Ash bamboozling the human crew into various unsafe circumstances, and ultimately killing many of them.
- This plot point raises interesting questions that remain unanswered!
## Think, Pair, Share
- Find a partner and discuss:
- Can you trust an AI to do what is best for humans?
- Can you trust a computer to do what is best for humans?
- Can you trust a corporation to do what is best for humans?
- Can you trust humans to do what is best for humans?
## Up Next
:::: {.columns}
::: {.column width="50%"}
After a series of studio mishaps, the next official franchise film is *Aliens* in 1986.
> *Aliens* is about how director James Cameron (*Titanic*, *Avatar*, *Terminator*) does not understand art.
:::
::: {.column width="50%"}

:::
::::
## Classification
- In order to differentiate *Aliens* from other installments, we make a few notes.
- *Aliens* starred Sigourney Weaver
- *Aliens* was *not* directed by Ridley Scott
- *Aliens* contains the term "Alien" in its name.
## Timeline
- We can place *Aliens* on a timeline as follows:
```{python}
#| code-fold: true
films = [
["Alien", 1979],
["Aliens", 1986],
]
plot(films)
```
## More
- *Alien 3* and *Alien Resurrection* were the next films in the franchise
- Both even worse and less interesting than *Aliens*, so we'll just add them and move on.
- Both starred Sigourney Weaver
- Both were *not* directed by Ridley Scott
- Both contained the term "Alien" in their name.
## Timeline
- We can place the Sigourney Weaver *Alien* films on a timeline as follows:
```{python}
#| code-fold: true
films = [
["Alien", 1979],
["Aliens", 1986],
["Alien 3", 1992],
["Alien Res.", 1997],
]
plot(films)
```
## The Big Mistake
- The 2000s were a time of silliness and goofery.
- And from the glorious excess came a blissful union of two great franchises.
- *Alien*, a four-film franchise with one good film.
- *Predator*, a two-film franchise starring two future governors.
- Arnold Schwarzenegger (R-CA)
- Jesse Ventura (Reform-MN)
## More
- *Alien vs. Predator* and *AVPR: Alien vs. Predator - Requiem* were released in 2004 and 2007, respectively.
- While playful romps, they broke from tradition and switched from women to men as leads, and were set in the present rather than the future (both borrowing from *Predator*).
- Both *did not* star Sigourney Weaver
- Both were *not* directed by Ridley Scott
- Both contained the term "Alien" in their name.
## Timeline
- We can add the *AVP* films to the timeline as follows:
```{python}
#| code-fold: true
films = [
["Alien", 1979],
["Aliens", 1986],
["Alien 3", 1992],
["Alien Res.", 1997],
["AVP", 2004, "darkgoldenrod"],
["AVPR", 2007, "darkgoldenrod"],
]
plot(films)
```
## Reverse Reverse
- While both AVP films would soon come into canon conflict with the Ridley Scott films, for a brief shining moment they retroactively brought two additional films into the now combined franchise.
- *Predator* (1987)
- *Predator 2* (1990)
- The less said about them the better.
## Classification
- Both *did not* star Sigourney Weaver
- Both were *not* directed by Ridley Scott
- Both *did not* contain the term "Alien" in their name.
- Both do contain the term "Predator" in their name.
- Like AVP films but unlike the other *Alien* films.
## Timeline
- We can add the pre-*AVP* *Predator* films.
```{python}
#| code-fold: true
films = [
["Alien", 1979],
["A's", 1986],
["A3", 1992],
["Res.", 1997],
["AVP", 2004, "darkgoldenrod"],
["AVPR", 2007, "darkgoldenrod"],
["P", 1987, "red"],
["P2", 1990, "red"],
]
plot(films)
```
## Modern Era
- The AVP films renewed interest in the *Predator* sub-franchise, with several new releases.
- There are post-2024 releases I haven't considered, as I haven't watched them.
- They are:
- *Predators* (2010) - the first "off-Earth" franchise film.
- *The Predator* (2018) - a soft reboot
- *Prey* (2022) - a prequel film and the first *Predator* film to star a woman in the leading role.
## Timeline
- We can add the "modern" *Predator* films.
```{python}
#| code-fold: true
films = [
["Alien", 1979],
["A's", 1986],
["A3", 1992],
["Res.", 1997],
["AVP", 2004, "darkgoldenrod"],
["AVPR", 2007, "darkgoldenrod"],
["P", 1987, "red"],
["P2", 1990, "red"],
["P's", 2010, "darkred"],
["The", 2018, "darkred"],
["Prey", 2022, "darkred"],
]
plot(films)
```
## Return of the King 👑
:::: {.columns}
::: {.column width="50%"}
2012 marked the triumphant return of Ridley Scott to the *Alien* franchise with *Prometheus*, a retelling of *Frankenstein* in the *Alien* universe raising essential questions about life, control, and - again - AI.
:::
::: {.column width="50%"}

:::
::::
## David
:::: {.columns}
::: {.column width="50%"}
David is an android serving as a butler, maintenance man, and surrogate son to his creator... David is obsessed with the concept of creating life of his own. After Weyland is killed, David is freed from servitude...
:::
::: {.column width="50%"}

:::
::::
## Inflection Point
A major plot point in *Prometheus* is the radicalization of David against his human creators, portrayed through a repartee with a human scientist.
- David : Why do you think your people made me?
- Human : We made you because we could.
- David : Can you imagine how disappointing it would be for you to hear the same thing from your creator?
- Human : I guess it's good you can't be disappointed.
## Frankenstein
:::: {.columns}
::: {.column width="50%"}
*Frankenstein; or, The Modern Prometheus* written by Mary Shelley in 1818 is widely regarded as the greatest original English text by instructors of this class.
:::
::: {.column width="50%"}

:::
::::
## Think, Pair, Share
- Centuries ahead of its time, in *Frankenstein* Mary Shelley approached questions like:
- Where is the line between humans and creations?
- Is there a line and does it matter?
- Why are creatures, including humans, created?
- Find a partner and discuss.
## Classification
- *Prometheus* *did not* star Sigourney Weaver...
- ...but did feature a woman in a leading role (Noomi Rapace)
- Was directed by Ridley Scott
- *Did not* contain the term "Alien" in its name.
- Or "Predator"
## A Wrinkle in Time
- While both the AVP films and *Prometheus* may exist in the same universe as *Alien*, they cannot both exist together.
- In the AVP films, "aliens" are present on Earth in the modern day.
- In *Prometheus*, the "aliens" are genetically engineered off-planet by David.
## Timeline
- We can add *Prometheus*
```{python}
#| code-fold: true
films = [
["Alien", 1979],
["A's", 1986],
["A3", 1992],
["Res.", 1997],
["AVP", 2004, "darkgoldenrod"],
["AVPR", 2007, "darkgoldenrod"],
["P", 1987, "red"],
["P2", 1990, "red"],
["P's", 2010, "darkred"],
["The", 2018, "darkred"],
["Prey", 2022, "darkred"],
["Prom.", 2012, "darkgreen"],
]
plot(films)
```
## Canon Timeline
- Though, properly, *Prometheus* is mutually exclusive with all *Predator* films.
```{python}
#| code-fold: true
films = [
["Alien", 1979],
["A's", 1986],
["A3", 1992],
["Res.", 1997],
["Prom.", 2012, "darkgreen"],
]
plot(films)
```
## More
- Unfortunately, *Prometheus* was subject to fairly extreme studio interference (as was *Alien* but not quite as grievously).
- The fallout led to the interesting but imperfect *Alien: Covenant*, also from Scott, and *Alien: Romulus* (from director Fede Álvarez).
- Each featured leading roles for women, "Alien" in title, and a canon conflict with *Predator* films.
## Canon Timeline
- We use the "canon" timeline just for clarity.
```{python}
#| code-fold: true
films = [
["Alien", 1979],
["A's", 1986],
["A3", 1992],
["Res.", 1997],
["Prom.", 2012, "darkgreen"],
["Cov.", 2017, "darkgreen"],
["Rom.", 2024, "darkgreen"],
]
plot(films)
```
## Full Timeline
- And the full timeline.
```{python}
#| code-fold: true
films = [
["Alien", 1979],
["A's", 1986],
["A3", 1992],
["Res.", 1997],
["Prom.", 2012, "darkgreen"],
["Cov.", 2017, "darkgreen"],
["Rom.", 2024, "darkgreen"],
["AVP", 2004, "darkgoldenrod"],
["AVPR", 2007, "darkgoldenrod"],
["P", 1987, "red"],
["P2", 1990, "red"],
["P's", 2010, "darkred"],
["The", 2018, "darkred"],
["Prey", 2022, "darkred"],
]
plot(films)
```
# Classification
## Background
> [Classification is a **supervised** machine learning technique used to predict labels or categories based on input data. The goal is to assign each data point to a predefined class, such as spam vs. non-spam emails or diseased vs. healthy patients.](https://www.geeksforgeeks.org/machine-learning/getting-started-with-classification/)
> [For example, a classification model might be trained on dataset of images labeled as either dogs or cats and it can be used to predict the class of new and unseen images as dogs or cats based on their features such as colour, texture or shape.](https://www.geeksforgeeks.org/machine-learning/getting-started-with-classification/)
## Supervision
- We recall supervision:
- Uses *labeled* data.
- Maps inputs to outputs.
- Goal: Predict outcomes.
- Examples: Classification, regression.
## Types of Classification
1. Binary Classification (Good vs. Bad)
2. Multi-class (*Alien* Quadrilogy vs. *Prometheus* Trilogy vs. AVP)
3. Multi-label (*Alien* vs. Sigourney vs. Ridley)
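
The three types differ mainly in the shape of the label attached to each film. A minimal sketch, where the dictionary names and the exact label strings are illustrative assumptions rather than a fixed format:

```python
# Hypothetical label encodings for the three problem types,
# using films and categories named on the slides.

# Binary: each film gets exactly one of two labels.
binary = {"Alien": "good", "Aliens": "bad"}

# Multi-class: each film gets exactly one of several labels.
multi_class = {
    "Alien": "Alien Quadrilogy",
    "Prometheus": "Prometheus Trilogy",
    "AVP": "AVP",
}

# Multi-label: each film gets any subset of the labels (possibly none).
multi_label = {
    "Alien": {"Alien", "Sigourney", "Ridley"},
    "Aliens": {"Alien", "Sigourney"},
    "Prometheus": {"Ridley"},
    "Predator": set(),
}

for title, labels in multi_label.items():
    print(title, sorted(labels))
```
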
# Process
## Stages
1. Data Collection
2. Feature Extraction
3. Model Training
4. Iterate
    1. Model Evaluation
    2. Prediction
## Data Collection
- We start with some training data.
- For example, the films of the *Alien* franchise.
```{python}
#| code-fold: true
films = [
["Alien", 1979],
["A's", 1986],
["A3", 1992],
["Res.", 1997],
["Prom.", 2012, "darkgreen"],
["Cov.", 2017, "darkgreen"],
["Rom.", 2024, "darkgreen"],
["AVP", 2004, "darkgoldenrod"],
["AVPR", 2007, "darkgoldenrod"],
["P", 1987, "red"],
["P2", 1990, "red"],
["P's", 2010, "darkred"],
["The", 2018, "darkred"],
["Prey", 2022, "darkred"],
]
plot(films)
```
## Feature Extraction
- We learn some information about the training data.
- For example, whether each film features Sigourney Weaver
```{python}
#| code-fold: true
films = [
["Alien", 1979],
["A's", 1986],
["A3", 1992],
["Res.", 1997],
["Prom.", 2012, "black"],
["Cov.", 2017, "black"],
["Rom.", 2024, "black"],
["AVP", 2004, "black"],
["AVPR", 2007, "black"],
["P", 1987, "black"],
["P2", 1990, "black"],
["P's", 2010, "black"],
["The", 2018, "black"],
["Prey", 2022, "black"],
]
plot(films)
```
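
Feature extraction can be sketched as a function from a raw record to a set of measurable properties. The function name `extract_features` and the feature names below are assumptions made for illustration; the features themselves (Weaver, "Alien" in the name, released before 2000) come from the slides.

```python
# Raw training data: (full title, release year).
FILMS = [
    ("Alien", 1979), ("Aliens", 1986), ("Predator", 1987),
    ("Predator 2", 1990), ("Alien 3", 1992), ("Alien Resurrection", 1997),
    ("Alien vs. Predator", 2004), ("AVPR: Alien vs. Predator - Requiem", 2007),
    ("Predators", 2010), ("Prometheus", 2012), ("Alien: Covenant", 2017),
    ("The Predator", 2018), ("Prey", 2022), ("Alien: Romulus", 2024),
]

# Background knowledge the timeline encodes with color.
WEAVER_FILMS = {"Alien", "Aliens", "Alien 3", "Alien Resurrection"}

def extract_features(title, year):
    """Turn one raw record into a dict of boolean features."""
    return {
        "stars_weaver": title in WEAVER_FILMS,
        "alien_in_name": "alien" in title.lower(),
        "before_2000": year < 2000,
    }

features = {title: extract_features(title, year) for title, year in FILMS}
print(features["Alien"])
print(features["Prey"])
```
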
## Aside: Features
> [In machine learning and pattern recognition, a feature is an individual measurable property or characteristic of a data set.](https://en.wikipedia.org/wiki/Feature_(machine_learning))
- Can be: Released before 2000.
- Can't be: Is art.
## Model Training
- Pretend you are a model.
- Here are the films I consider "good" in red.
- How can we "predict" if a film is good?
```{python}
#| code-fold: true
films = [
["Alien", 1979,"red"],
["A's", 1986],
["A3", 1992],
["Res.", 1997],
["Prom.", 2012,"red"],
["Cov.", 2017],
["Rom.", 2024,"red"],
["AVP", 2004],
["AVPR", 2007],
["P", 1987],
["P2", 1990],
["P's", 2010],
["The", 2018],
["Prey", 2022,"red"],
]
plot(films)
```
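
One very simple way to "be a model" is to try each feature as a one-line rule and keep whichever agrees with the labels most often. This is a sketch of that idea only, not a standard library routine; the `train` function and the choice of three candidate features are assumptions, and "good" is the instructor's label from the slide.

```python
# (title, directed_by_scott, stars_weaver, alien_in_name, good)
FILMS = [
    ("Alien", True, True, True, True),
    ("Aliens", False, True, True, False),
    ("Alien 3", False, True, True, False),
    ("Alien Resurrection", False, True, True, False),
    ("Predator", False, False, False, False),
    ("Predator 2", False, False, False, False),
    ("AVP", False, False, True, False),
    ("AVPR", False, False, True, False),
    ("Predators", False, False, False, False),
    ("Prometheus", True, False, False, True),
    ("Alien: Covenant", True, False, True, False),
    ("The Predator", False, False, False, False),
    ("Prey", False, False, False, True),
    ("Alien: Romulus", False, False, True, True),
]

# Candidate features and the tuple column each one lives in.
FEATURES = {"directed_by_scott": 1, "stars_weaver": 2, "alien_in_name": 3}

def score(name, films):
    """Count films where the feature value equals the 'good' label."""
    column = FEATURES[name]
    return sum(film[column] == film[4] for film in films)

def train(films):
    """Return the single feature that agrees with the labels most often."""
    return max(FEATURES, key=lambda name: score(name, films))

for name in FEATURES:
    print(name, score(name, FILMS), "of", len(FILMS))
print("chosen rule:", train(FILMS))
```
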
## Model Evaluation
- Imagine we use "Directed by Ridley Scott" as our model for a good film.
- Hits (true positives) in red, mispredicts (false positives) in blue, missed predictions (false negatives) in purple, correct omits (true negatives) in green.
```{python}
#| code-fold: true
films = [
["Alien", 1979,"red"],
["A's", 1986],
["A3", 1992],
["Res.", 1997],
["Prom.", 2012,"red"],
["Cov.", 2017,"blue"],
["Rom.", 2024,"purple"],
["AVP", 2004],
["AVPR", 2007],
["P", 1987],
["P2", 1990],
["P's", 2010],
["The", 2018],
["Prey", 2022,"purple"],
]
plot(films)
```
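
The four colors on the timeline correspond to the four cells of a confusion matrix. A minimal sketch that tallies them for the "Directed by Ridley Scott" rule; the `outcome` helper is a hypothetical name, and "good" is again the instructor's label.

```python
from collections import Counter

# (title, directed_by_scott, good)
FILMS = [
    ("Alien", True, True),
    ("Aliens", False, False),
    ("Alien 3", False, False),
    ("Alien Resurrection", False, False),
    ("Predator", False, False),
    ("Predator 2", False, False),
    ("AVP", False, False),
    ("AVPR", False, False),
    ("Predators", False, False),
    ("Prometheus", True, True),
    ("Alien: Covenant", True, False),
    ("The Predator", False, False),
    ("Prey", False, True),
    ("Alien: Romulus", False, True),
]

def outcome(predicted, actual):
    """Name the confusion-matrix cell for one prediction."""
    if predicted and actual:
        return "hit"            # true positive (red)
    if predicted and not actual:
        return "mispredict"     # false positive (blue)
    if actual:
        return "miss"           # false negative (purple)
    return "correct omit"       # true negative (green)

counts = Counter(outcome(scott, good) for _, scott, good in FILMS)
accuracy = (counts["hit"] + counts["correct omit"]) / len(FILMS)

print(dict(counts))
print(f"accuracy: {accuracy:.2f}")
```
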
## Prediction
- This was accurate until 2017!
- It wouldn't have been a very good predictive model!
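
The gap between "accurate so far" and "good at predicting" shows up when the films are split by release year. A sketch under the assumption that everything before 2017 counts as data the rule was fit to and everything from 2017 onward counts as new, unseen data:

```python
# (title, year, directed_by_scott, good); "good" is the instructor's label.
FILMS = [
    ("Alien", 1979, True, True),
    ("Aliens", 1986, False, False),
    ("Predator", 1987, False, False),
    ("Predator 2", 1990, False, False),
    ("Alien 3", 1992, False, False),
    ("Alien Resurrection", 1997, False, False),
    ("AVP", 2004, False, False),
    ("AVPR", 2007, False, False),
    ("Predators", 2010, False, False),
    ("Prometheus", 2012, True, True),
    ("Alien: Covenant", 2017, True, False),
    ("The Predator", 2018, False, False),
    ("Prey", 2022, False, True),
    ("Alien: Romulus", 2024, False, True),
]

def accuracy(films):
    """Fraction of films where 'directed by Scott' equals the label."""
    return sum(scott == good for _, _, scott, good in films) / len(films)

seen = [film for film in FILMS if film[1] < 2017]     # data the rule was fit to
unseen = [film for film in FILMS if film[1] >= 2017]  # data it had to predict

print(f"before 2017: {accuracy(seen):.0%} of {len(seen)} films")
print(f"2017 onward: {accuracy(unseen):.0%} of {len(unseen)} films")
```
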
## Iteration
- In 2017
    - Studios pushed Ridley Scott to release a middling *Alien: Covenant*
- Out of nowhere, *Prey* was an incredible film in the *Predator* franchise.
    - And the sole *Predator* film not in canon conflict with *Prometheus*
- *Alien: Romulus* was a superb *Alien* film, and the first good one not directed by Ridley Scott
## Iteration
- However!
- *Alien: Covenant* was Ridley Scott's first direct sequel.
- *Prey* is the sole *Predator* film to feature a woman in a leading role.
- *Alien: Romulus* was the second standalone prequel - after *Prometheus*.
- It comes down to feature selection!
##
|Film|Woman-led|Canon|Not a direct sequel|*"Good"*|
|-|-|-|-|-|
|Alien|✓|✓|✓|**✓**|
|Aliens|✓|✓|||
|Predator|||✓||
|AVP|||✓||
|Prometheus|✓|✓|✓|**✓**|
|Covenant|✓|✓|||
|Prey|✓|✓|✓|**✓**|
|Romulus|✓|✓|✓|**✓**|
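
The table can be searched mechanically for the smallest set of features that explains the "good" column. A sketch, assuming a rule of the form "good if every chosen feature is checked"; the function names are illustrative, and the rows are copied from the table as-is.

```python
from itertools import combinations

FEATURES = ["woman_led", "canon", "not_direct_sequel"]

# title -> (woman_led, canon, not_direct_sequel, good)
TABLE = {
    "Alien":      (True,  True,  True,  True),
    "Aliens":     (True,  True,  False, False),
    "Predator":   (False, False, True,  False),
    "AVP":        (False, False, True,  False),
    "Prometheus": (True,  True,  True,  True),
    "Covenant":   (True,  True,  False, False),
    "Prey":       (True,  True,  True,  True),
    "Romulus":    (True,  True,  True,  True),
}

def accuracy(columns):
    """Score the rule 'good if every chosen feature is true'."""
    hits = sum(all(row[c] for c in columns) == row[3] for row in TABLE.values())
    return hits / len(TABLE)

def smallest_perfect_subsets():
    """Return the smallest feature subsets whose rule matches every label."""
    for size in range(1, len(FEATURES) + 1):
        perfect = [
            tuple(FEATURES[c] for c in columns)
            for columns in combinations(range(len(FEATURES)), size)
            if accuracy(columns) == 1.0
        ]
        if perfect:
            return perfect
    return []

for index, name in enumerate(FEATURES):
    print(name, accuracy((index,)))
print(smallest_perfect_subsets())
```

On these eight rows no single feature is enough, but pairing "not a direct sequel" with either of the other two features matches every label.
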
## Simple Example
- It's not always complicated.
- How to capture the *Alien* original quadrilogy?
- Stars Sigourney Weaver.
- Easy!
##
|Film|Weaver|*Alien* in name|*"Original"*|
|-|-|-|-|
|Alien|✓|✓|**✓**|
|Alien 2[^1]||✓||
|Aliens|✓|✓|**✓**|
|Predator||||
|AVP||✓||
|Romulus||✓||
|Avatar[^2]|✓|||

[^1]: *Alien 2: On Earth* was an unauthorized 1980 "sequel".
[^2]: Some speculated Dir. J. Cameron (*Aliens* and *Avatar*) set both in the same universe.
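
The rows of the table can be used to check each candidate rule and list the films it gets wrong. A minimal sketch; the `RULES` dictionary and `mistakes` helper are hypothetical names, and the rows are copied from the table.

```python
# title -> (weaver, alien_in_name, original)
TABLE = {
    "Alien":    (True,  True,  True),
    "Alien 2":  (False, True,  False),
    "Aliens":   (True,  True,  True),
    "Predator": (False, False, False),
    "AVP":      (False, True,  False),
    "Romulus":  (False, True,  False),
    "Avatar":   (True,  False, False),
}

# Candidate rules: each maps the two features to a predicted "original".
RULES = {
    "weaver": lambda weaver, alien: weaver,
    "alien_in_name": lambda weaver, alien: alien,
    "weaver and alien_in_name": lambda weaver, alien: weaver and alien,
}

def mistakes(rule):
    """List the films where the rule disagrees with the 'original' column."""
    return [
        title
        for title, (weaver, alien, original) in TABLE.items()
        if rule(weaver, alien) != original
    ]

for name, rule in RULES.items():
    print(name, "->", mistakes(rule))
```

On this table the Weaver rule alone trips on *Avatar*, while requiring both features gets every row right.
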