Falsify

AI 101

Background

Connections

In our lecture, we introduced learning within our matrix framework. The lecture packed in a lot of content, so this exercise gives you a chance to go over it and try things out yourself.

Definitions

  • Today, we will work with code cells in Colab to create a multi-class perceptron that can correctly classify dice when run.

Setup

The Dice

  • You will need a code cell that includes at least the following:
import numpy as np
one = [0,0,0,0,1,0,0,0,0]
two = [0,0,1,0,0,0,1,0,0]
thr = [0,0,1,0,1,0,1,0,0]
fou = [1,0,1,0,0,0,1,0,1]
fiv = [1,0,1,0,1,0,1,0,1]
six = [1,0,1,1,0,1,1,0,1]
dice = np.array([one, two, thr, fou, fiv, six])
print(dice)
[[0 0 0 0 1 0 0 0 0]
 [0 0 1 0 0 0 1 0 0]
 [0 0 1 0 1 0 1 0 0]
 [1 0 1 0 0 0 1 0 1]
 [1 0 1 0 1 0 1 0 1]
 [1 0 1 1 0 1 1 0 1]]
  • You will probably also want “fair” dice.
    • Each row sums to one…
    • …even though each row has a different number of non-zero values.
divider = np.array([1,2,3,4,5,6]).reshape(1,6)
fair_dice = (dice.transpose() / divider).transpose()
print(fair_dice)
[[0.         0.         0.         0.         1.         0.
  0.         0.         0.        ]
 [0.         0.         0.5        0.         0.         0.
  0.5        0.         0.        ]
 [0.         0.         0.33333333 0.         0.33333333 0.
  0.33333333 0.         0.        ]
 [0.25       0.         0.25       0.         0.         0.
  0.25       0.         0.25      ]
 [0.2        0.         0.2        0.         0.2        0.
  0.2        0.         0.2       ]
 [0.16666667 0.         0.16666667 0.16666667 0.         0.16666667
  0.16666667 0.         0.16666667]]
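The two “fair” claims above can be checked directly; a sketch that rebuilds the arrays and tests them:

```python
import numpy as np

# Rebuild the dice and fair_dice arrays from above.
dice = np.array([
    [0,0,0,0,1,0,0,0,0],
    [0,0,1,0,0,0,1,0,0],
    [0,0,1,0,1,0,1,0,0],
    [1,0,1,0,0,0,1,0,1],
    [1,0,1,0,1,0,1,0,1],
    [1,0,1,1,0,1,1,0,1],
])
divider = np.array([1,2,3,4,5,6]).reshape(1,6)
fair_dice = (dice.transpose() / divider).transpose()

# Every row should sum to one...
print(np.allclose(fair_dice.sum(axis=1), 1))
# ...even though each row has a different count of non-zero values.
print((fair_dice != 0).sum(axis=1))
```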
  • You should have some familiarity with what these code blocks do, even if you couldn’t have written them from scratch yourself.
    • Run them a few times.
    • Try to make some minor edits and predict what will change.
    • See if you are correct.
    • Get them working again.

The Neural Network

  • We made a neural network with random “edge weights”.
learn = np.random.rand(6,9)
  • We also made a backup copy so we could easily start over at any time.
backup = learn.copy()
  • It should look like this:
print(learn)
[[0.79905459 0.90807795 0.64882604 0.2891742  0.08023946 0.40492137
  0.68131149 0.29313589 0.43091928]
 [0.08478128 0.81795651 0.23087639 0.65886414 0.97435858 0.98247183
  0.59866117 0.29200513 0.76870662]
 [0.37112311 0.96460858 0.79185725 0.60330096 0.13559981 0.93244911
  0.67072557 0.07258409 0.89131832]
 [0.26224806 0.32607417 0.87612109 0.37124917 0.189376   0.11007376
  0.88541619 0.12534034 0.36361595]
 [0.21851785 0.30719779 0.30366815 0.37589443 0.12214581 0.68600188
  0.8202339  0.72848908 0.09124466]
 [0.27977661 0.36640022 0.54602639 0.49274091 0.22379799 0.72878082
  0.27838677 0.82448433 0.07645653]]

Supervision

  • We compared against an answer key, which was just a diagonal.
    • This is the supervisor in supervised learning.
      • It knows the answers.
      • It can provide feedback to the learning framework.
key = np.array([
       [ True, False, False, False, False, False],
       [False,  True, False, False, False, False],
       [False, False,  True, False, False, False],
       [False, False, False,  True, False, False],
       [False, False, False, False,  True, False],
       [False, False, False, False, False,  True]
])
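For reference, the same answer key can be generated rather than typed out; a sketch using numpy’s identity helper:

```python
import numpy as np

# A 6x6 boolean identity: True on the diagonal, False elsewhere.
key = np.eye(6, dtype=bool)
print(key)
```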

Learning

  • Learning occurred in a few stages.
  • First, we made sure to start with a fresh “learn” matrix by copying from the backup.
learn = backup.copy()
print(learn)
[[0.79905459 0.90807795 0.64882604 0.2891742  0.08023946 0.40492137
  0.68131149 0.29313589 0.43091928]
 [0.08478128 0.81795651 0.23087639 0.65886414 0.97435858 0.98247183
  0.59866117 0.29200513 0.76870662]
 [0.37112311 0.96460858 0.79185725 0.60330096 0.13559981 0.93244911
  0.67072557 0.07258409 0.89131832]
 [0.26224806 0.32607417 0.87612109 0.37124917 0.189376   0.11007376
  0.88541619 0.12534034 0.36361595]
 [0.21851785 0.30719779 0.30366815 0.37589443 0.12214581 0.68600188
  0.8202339  0.72848908 0.09124466]
 [0.27977661 0.36640022 0.54602639 0.49274091 0.22379799 0.72878082
  0.27838677 0.82448433 0.07645653]]
  • Then, we would compare against the answer key to see how accurate the current neural network was at predicting dice.
grade = np.equal(key, 1 <= fair_dice @ learn.transpose())
print(grade)
[[False  True  True  True  True  True]
 [ True False  True  True  True  True]
 [ True  True False  True  True  True]
 [ True  True  True False  True  True]
 [ True  True  True  True False  True]
 [ True  True  True  True  True False]]
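Since `grade` marks where the prediction matches the key, its mean gives an overall accuracy; a sketch, assuming the grade pattern printed above (False on the diagonal, True elsewhere):

```python
import numpy as np

# The grade matrix from above: False on the diagonal, True elsewhere.
grade = ~np.eye(6, dtype=bool)

# Fraction of the 36 (die, prediction) pairs that matched the key.
print(grade.mean())  # 30 of 36
```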
  • Then, we used a loop.
    • For every die,
      • For every value it could predict, correct or otherwise.
        • If correct, reward the row with increased weight.
        • Else, penalize the row with reduced weight.
for loc, row in enumerate(grade):
    for correct in row:
        if correct:
            learn[loc] = learn[loc] + dice[loc]
        else: 
            learn[loc] = learn[loc] - dice[loc]
  • Finally, we can see how well we did.
    • Recall, we want a diagonal here.
(1 <= fair_dice @ learn.transpose()) + 0
array([[1, 0, 1, 0, 1, 0],
       [0, 1, 1, 1, 1, 1],
       [1, 1, 1, 1, 1, 1],
       [0, 1, 1, 1, 1, 1],
       [1, 1, 1, 1, 1, 1],
       [0, 1, 1, 1, 1, 1]])
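The grade-then-update step above can also be run repeatedly rather than once; a sketch that rebuilds the arrays and loops a few times (results will vary, since the starting matrix is random):

```python
import numpy as np

dice = np.array([
    [0,0,0,0,1,0,0,0,0],
    [0,0,1,0,0,0,1,0,0],
    [0,0,1,0,1,0,1,0,0],
    [1,0,1,0,0,0,1,0,1],
    [1,0,1,0,1,0,1,0,1],
    [1,0,1,1,0,1,1,0,1],
])
fair_dice = (dice.transpose() / np.array([1,2,3,4,5,6]).reshape(1,6)).transpose()
key = np.eye(6, dtype=bool)
learn = np.random.rand(6, 9)

# Repeat the compare-and-update step a few times.
for step in range(5):
    grade = np.equal(key, 1 <= fair_dice @ learn.transpose())
    for loc, row in enumerate(grade):
        for correct in row:
            if correct:
                learn[loc] = learn[loc] + dice[loc]
            else:
                learn[loc] = learn[loc] - dice[loc]

# Ideally this approaches a diagonal.
print((1 <= fair_dice @ learn.transpose()) + 0)
```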

The Problem

In Lecture

  • In lecture we dealt with the following problem:
    • We could predict “no” for every die being any number.
    • This was highly accurate (30 of 36 correct, about 83%).
    • This was highly useless (it captured 0% of true positives).
    • Any attempt to learn past it was complicated by this imbalance.
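The 83% figure can be checked directly; a sketch of the always-“no” predictor’s accuracy:

```python
import numpy as np

key = np.eye(6, dtype=bool)

# Predicting "no" for every (die, number) pair...
all_no = np.zeros((6, 6), dtype=bool)

# ...matches the key everywhere except the diagonal: 30 of 36.
accuracy = (all_no == key).mean()
print(round(accuracy, 2))  # 0.83
```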

The “Bias”

  • Even with negative weights, we have a problem: The Threshold.
  • We need a way to decide exactly when the sum is “enough” to trigger a fire.
  • We need to shift the “goalposts” without rewriting all our weights.

\(b\)

  • The Bias is a number we add to the sum before deciding to fire.
  • It represents how “easy” it is to make the neuron fire.
  • High Bias: The neuron is “trigger happy” and fires easily.
    • My “movie good” neuron is highly biased toward Ridley Scott films.
  • Negative Bias: The neuron is “stubborn” and needs a very high positive sum to fire.
    • My “movie good” neuron is negatively biased against Predator films.

The Completed Formula

  • Include the bias term \(b\):

\[\sum_{i=1}^{n} w_i x_i + b \]

  • If the result is \(> 0\), the neuron fires (1).
  • If the result is \(\le 0\), it stays silent (0).
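The completed formula translates directly into code; a minimal sketch with made-up weights, inputs, and bias:

```python
import numpy as np

w = np.array([0.5, -0.2, 0.8])  # made-up weights
x = np.array([1, 0, 1])         # made-up inputs
b = -1.0                        # made-up bias

# Fire (1) if the weighted sum plus bias is positive, else stay silent (0).
output = int(np.sum(w * x) + b > 0)
print(output)  # 1, since 0.5 + 0.8 - 1.0 = 0.3 > 0
```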

Visualizing the Bias

  • In a graph, the bias is often shown as a special input node that is always 1, multiplied by its own weight \(b\).

[Diagram: inputs X1, X2, X3 feed the Neuron with weights w1, w2, w3; a bias node fixed at 1 feeds it with weight b.]
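That always-1 trick can be verified in code; a small sketch with made-up numbers showing the augmented dot product equals the weighted sum plus bias:

```python
import numpy as np

w = np.array([0.5, -0.2, 0.8])  # made-up weights
x = np.array([1, 0, 1])         # made-up inputs
b = -1.0                        # made-up bias

# Append a constant 1 to the inputs and the bias to the weights...
x_aug = np.append(x, 1)
w_aug = np.append(w, b)

# ...and the augmented dot product equals the original sum plus bias.
print(np.isclose(w_aug @ x_aug, np.sum(w * x) + b))  # True
```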

Trying it out

  • We tried different biases with our learning framework.
    • (Here we raised the firing threshold, which is equivalent to adding a bias of the opposite sign.)
for bias in [3.5,4,4.5]:
    print("Bias of ", bias)
    print((bias <= fair_dice @ learn.transpose()) + 0)
Bias of  3.5
[[1 0 1 0 1 0]
 [0 1 1 1 1 1]
 [0 0 1 0 1 0]
 [0 0 0 1 1 1]
 [0 0 0 1 1 0]
 [0 0 0 0 0 1]]
Bias of  4
[[1 0 1 0 1 0]
 [0 1 1 1 1 1]
 [0 0 1 0 1 0]
 [0 0 0 1 1 1]
 [0 0 0 0 1 0]
 [0 0 0 0 0 1]]
Bias of  4.5
[[0 0 0 0 0 0]
 [0 0 1 1 1 0]
 [0 0 1 0 0 0]
 [0 0 0 1 0 0]
 [0 0 0 0 0 0]
 [0 0 0 0 0 0]]
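Rather than eyeballing the grids, true and false positives can be counted for each bias; a sketch (with a random matrix standing in for the trained one, so counts will vary):

```python
import numpy as np

dice = np.array([
    [0,0,0,0,1,0,0,0,0],
    [0,0,1,0,0,0,1,0,0],
    [0,0,1,0,1,0,1,0,0],
    [1,0,1,0,0,0,1,0,1],
    [1,0,1,0,1,0,1,0,1],
    [1,0,1,1,0,1,1,0,1],
])
fair_dice = (dice.transpose() / np.array([1,2,3,4,5,6]).reshape(1,6)).transpose()
key = np.eye(6, dtype=bool)
learn = np.random.rand(6, 9)  # stand-in for the trained matrix

for bias in [3.5, 4, 4.5]:
    fired = bias <= fair_dice @ learn.transpose()
    true_pos = (fired & key).sum()    # fired on the diagonal
    false_pos = (fired & ~key).sum()  # fired off the diagonal
    print(bias, true_pos, false_pos)
```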

The Solution

  • Or part of it.

Previously…

  • Before we messed with the bias, we modified how learning took place.
  • Specifically, we multiplied all values in a row by some fixed factor.
    • Versus adding or subtracting along the relevant edges.
  • It looked like this.
for loc, row in enumerate(grade):
    for correct in row:
        if correct:
            learn[loc] = learn[loc] * 1.1
        else: 
            learn[loc] = learn[loc] * .9
  • Instead of this:
for loc, row in enumerate(grade):
    for correct in row:
        if correct:
            learn[loc] = learn[loc] + dice[loc]
        else: 
            learn[loc] = learn[loc] - dice[loc]

Mathematically

  • We can express this mathematically by considering:
    • Old weights \(w\)
    • New weights \(w'\) (w “prime”)
    • A scalar multiplier \(n\)
    • Dice value \(d\)
  • Initially, we multiplied:

\[ w' = n \times w \]

  • Later, we added:

\[ w' = w + d \]

  • We note these are both linear transformations.
    • Plotting \(w'\) against \(w\), they create lines.
    • Say, imagine \(n = 2\) and \(d = 5\)
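For example, with \(n = 2\) and \(d = 5\), both rules trace out straight lines; a quick numeric sketch:

```python
import numpy as np

w = np.arange(0, 5)  # sample old weights: [0 1 2 3 4]

n, d = 2, 5
multiplied = n * w   # a line through the origin with slope n
added = w + d        # a line with slope 1, shifted up by d

print(multiplied)  # [0 2 4 6 8]
print(added)       # [5 6 7 8 9]
```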

Your Task

Steps

  • Construct a series of additions or multiplications.
  • They may be by scalars
    • Like 1.1 or 2
  • They may be by vectors
    • Like [0,0,0,0,1,0,0,0,0]
  • You do not have to use the same kind of transformation for correct and incorrect.
    • You could scalar-add in one place and vector-multiply in another.
  • As a bonus, you may also change the bias however you like.
  • See how close you can get to achieving a diagonal.
    • Don’t forget to check your code more than once.
    • It is randomized, and you don’t want to just get lucky.
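One possible skeleton for the task, with the transformations left as placeholders (the factors and offsets below deliberately change nothing; they are yours to replace):

```python
import numpy as np

dice = np.array([
    [0,0,0,0,1,0,0,0,0],
    [0,0,1,0,0,0,1,0,0],
    [0,0,1,0,1,0,1,0,0],
    [1,0,1,0,0,0,1,0,1],
    [1,0,1,0,1,0,1,0,1],
    [1,0,1,1,0,1,1,0,1],
])
fair_dice = (dice.transpose() / np.array([1,2,3,4,5,6]).reshape(1,6)).transpose()
key = np.eye(6, dtype=bool)
learn = np.random.rand(6, 9)
bias = 1  # bonus: adjust however you like

grade = np.equal(key, bias <= fair_dice @ learn.transpose())
for loc, row in enumerate(grade):
    for correct in row:
        if correct:
            learn[loc] = learn[loc] * 1.0 + 0.0  # replace with your reward
        else:
            learn[loc] = learn[loc] * 1.0 - 0.0  # replace with your penalty

print((bias <= fair_dice @ learn.transpose()) + 0)
```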

Final Product

  • You should have a Colab document which, when run:
    • Creates a random matrix.
    • Modifies that random matrix to better detect dice.
    • Captures 6 true positives.
    • Has the minimum number of false positives possible.
    • Has a note of 100-200 words explaining what changes you made, and why.
      • “I just tried something” is fine, but do state, e.g., what “something” means to you.