Learning

AI 101

Setting the Stage

  • We’ve explored the perceptron
  • We’ve used binary classification to differentiate odds from evens.
  • We approach multi-class classification.
  • And finally… we placed the entire framework within a matrix.

Setup

import numpy as np
one = [0,0,0,0,1,0,0,0,0]
two = [0,0,1,0,0,0,1,0,0]
thr = [0,0,1,0,1,0,1,0,0]
fou = [1,0,1,0,0,0,1,0,1]
fiv = [1,0,1,0,1,0,1,0,1]
six = [1,0,1,1,0,1,1,0,1]
dice = np.array([one, two, thr, fou, fiv, six])

Recall

Representing Reality

  • How does a computer “see” a simple object?
  • Let’s take a standard six-sided die.
  • We can represent the face of a die as a 3x3 grid.
  • Or, alternatively, a 1x9 (or 9x1) “vector”
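As a quick sketch (using the “five” pattern from the lab), the 3x3 grid and the length-9 vector hold the same nine values:

```python
import numpy as np

# The face of a die as a 3x3 grid (here, a "five").
grid = np.array([
    [1, 0, 1],
    [0, 1, 0],
    [1, 0, 1],
])

# The same face flattened into a length-9 "vector".
vector = grid.reshape(9)
print(vector)  # [1 0 1 0 1 0 1 0 1]
```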

NumPy

  • We used NumPy to perform element-wise multiplication over vectors.
  • This is also called the “Hadamard product” or “element-wise product”.
  • We step through the example from the lab, briefly.
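A minimal sketch of the element-wise (Hadamard) product, with made-up values:

```python
import numpy as np

a = np.array([1, 0, 1])
b = np.array([0.5, 0.5, 0.5])

# Each position is multiplied independently - no summing involved.
print(a * b)  # [0.5 0.  0.5]
```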

Is One

  • Last week, we found it was easy enough to classify one.
  • Briefly, we looked for a center dot with a positive weight, and gave everything else a negative weight.
  • Then we multiplied “element-wise” the visual data (a vector of length 9) by the weights (a vector of length 9).

\[ \sum_{i=1}^{n} w_i \times x_i \]

Is One Vector

  • We used, perhaps, this “is one?” vector.
    • Recall, we cannot include spaces in our names of things in Colab.
    • We use “single equals assignment” to assign a value to some name.
    • In this case, the variable is the “is one?” vector.
is_one_vector = np.array([-1,-1,-1,-1,1,-1,-1,-1,-1])

Compare

  • Find out which die we have in a single line using vector arithmetic.
0 < np.sum(dice * is_one_vector, 1)
array([ True, False, False, False, False, False])

Matrices

  • Imagine that, instead of six dice that we check to see whether each is a one…
  • …we have one die and wish to see which value it represents.
    • We represent the die as a 1 by 9 (or 9 by 1) vector
    • We represent its value (1 to 6) as a 1 by 6 (or 6 by 1) vector.
    • We can perform a single matrix multiplication using a 9 by 6 matrix to get the result.
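A shape-only sketch of that idea (zeros as stand-in values, just to show the dimensions line up):

```python
import numpy as np

die = np.zeros((1, 9))      # one die as a 1 by 9 vector
weights = np.zeros((9, 6))  # a 9 by 6 weight matrix

# 1x9 times 9x6 gives 1x6: one score per possible value.
result = die @ weights
print(result.shape)  # (1, 6)
```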

For example

  • We wish to find out what “class” (what number) fiv belongs to.
  • We would start with fiv
fiv = np.array(fiv)
fiv
array([1, 0, 1, 0, 1, 0, 1, 0, 1])
  • We would want to get back something with a 1 in the fifth position, and zeroes elsewhere.
[0, 0, 0, 0, 1, 0]

That matrix

  • This matrix is the precise matrix made in the lab.
multi = np.array([
    [-1/1, -1/1, -1/1, -1/1,  1/1, -1/1, -1/1, -1/1, -1/1],
    [-1/2, -1/2,  1/2, -1/2, -1/2, -1/2,  1/2, -1/2, -1/2],
    [-1/3, -1/3,  1/3, -1/3,  1/3, -1/3,  1/3, -1/3, -1/3],
    [ 1/4, -1/4,  1/4, -1/4, -1/4, -1/4,  1/4, -1/4,  1/4],
    [ 1/5, -1/5,  1/5, -1/5,  1/5, -1/5,  1/5, -1/5,  1/5],
    [ 0/6,  3/6,  0/6,  3/6, -6/6,  3/6,  0/6,  3/6,  0/6],
])

Compare to 1

  • Once again, we can classify simply by seeing if there’s enough “weight” to make a neuron fire, or not.
1 <= np.sum(fiv * multi,1)
array([False, False, False, False,  True, False])
  • Only the neuron representing “five” will fire!

Step Back

Collect Our Thoughts

  • Is this intelligent?
print(multi)
[[-1.         -1.         -1.         -1.          1.         -1.
  -1.         -1.         -1.        ]
 [-0.5        -0.5         0.5        -0.5        -0.5        -0.5
   0.5        -0.5        -0.5       ]
 [-0.33333333 -0.33333333  0.33333333 -0.33333333  0.33333333 -0.33333333
   0.33333333 -0.33333333 -0.33333333]
 [ 0.25       -0.25        0.25       -0.25       -0.25       -0.25
   0.25       -0.25        0.25      ]
 [ 0.2        -0.2         0.2        -0.2         0.2        -0.2
   0.2        -0.2         0.2       ]
 [ 0.          0.5         0.          0.5        -1.          0.5
   0.          0.5         0.        ]]

Our Task

  • We were trying to do a computer vision task.
  • Specifically, we wished to classify dice.

Snake eyes dice

Encoding

  • To do so, we recognized we could view dice with as few as nine (9) “sensory neurons”.
1 2 3
4 5 6
7 8 9
  • Each of these is either 0 or 1 or perhaps True or False etc.

Vectors

  • We then recognized we could place these in any order, and in a single “dimension”.
    • We termed this a vector.
1 2 3 4 5 6 7 8 9

Graphs

  • We then discussed that these vectors could be treated as part of a graph.
    • The top, input layer representing the visual data.
    • The bottom, output layer representing the classification

(Figure: a bipartite graph connecting each of the nine cells, (0,0) through (2,2), to each of the six value classes 1 through 6.)

Edges

  • Edges in this graph could connect sensed dots to numeric meaning.
    • Middle square to odd numbers.

(Figure: the same bipartite graph, with the middle cell’s edges to the odd classes 1, 3, and 5 drawn with extra weight.)

Weights

  • Some edges may matter more than others.
    • Center is very important for 1
    • Kinda important for 3 and 5 - where corners matter too.

(Figure: a pruned graph keeping only the heavily weighted edges - the center cell to classes 1, 3, and 5, and corner cells to classes 3 and 5.)

Matrices

  • Then, we note that we can express these weights in a matrix.
multi = np.array([
    [-1/1, -1/1, -1/1, -1/1,  1/1, -1/1, -1/1, -1/1, -1/1],
    [-1/2, -1/2,  1/2, -1/2, -1/2, -1/2,  1/2, -1/2, -1/2],
    [-1/3, -1/3,  1/3, -1/3,  1/3, -1/3,  1/3, -1/3, -1/3],
    [ 1/4, -1/4,  1/4, -1/4, -1/4, -1/4,  1/4, -1/4,  1/4],
    [ 1/5, -1/5,  1/5, -1/5,  1/5, -1/5,  1/5, -1/5,  1/5],
    [ 0/6,  3/6,  0/6,  3/6, -6/6,  3/6,  0/6,  3/6,  0/6],
])

Simpler

  • Trying to make this easier to see and think about.
multi = np.array([
    [ 0,  0,  0,  0,  1,  0,  0,  0,  0],
    ...
    [ 0,  0, .3,  0, .3,  0, .3,  0,  0],
    ...
    [.2,  0, .2,  0, .2,  0, .2,  0, .2],
    ...
])

Learning

My claim

  • I claim this is not intelligent, though it can do an intelligent task (vision).
multi = np.array([
    [-1/1, -1/1, -1/1, -1/1,  1/1, -1/1, -1/1, -1/1, -1/1],
    [-1/2, -1/2,  1/2, -1/2, -1/2, -1/2,  1/2, -1/2, -1/2],
    [-1/3, -1/3,  1/3, -1/3,  1/3, -1/3,  1/3, -1/3, -1/3],
    [ 1/4, -1/4,  1/4, -1/4, -1/4, -1/4,  1/4, -1/4,  1/4],
    [ 1/5, -1/5,  1/5, -1/5,  1/5, -1/5,  1/5, -1/5,  1/5],
    [ 0/6,  3/6,  0/6,  3/6, -6/6,  3/6,  0/6,  3/6,  0/6],
])
  • Rather, I am intelligent, since I made it.

My claim

  • I do, however, claim, that this is pretty close to being intelligent:
1 <= np.sum(fiv * multi,1)
array([False, False, False, False,  True, False])
  • As a method, it could theoretically recognize anything.
    • Art
    • Love
    • Beauty
    • Truth

Generalize

  • The trouble is, how do we get the values in the matrix?
  • Somehow, to go from “just arithmetic” to intelligence, we need to learn.
  • This is where machine learning comes into artificial intelligence.

Setup

Our Goals

  • We will:
    • Start with “nothing”
      • An arbitrary matrix with random values.
    • Construct some learning process
      • We will represent this with Colab code, but the ideas are independent of how we write them.
    • Show that we can recognize dice.
      • And therefore (at least theoretically) anything

Example

  • We started with this:
multi = np.array([
    [-1/1, -1/1, -1/1, -1/1,  1/1, -1/1, -1/1, -1/1, -1/1],
    [-1/2, -1/2,  1/2, -1/2, -1/2, -1/2,  1/2, -1/2, -1/2],
    [-1/3, -1/3,  1/3, -1/3,  1/3, -1/3,  1/3, -1/3, -1/3],
    [ 1/4, -1/4,  1/4, -1/4, -1/4, -1/4,  1/4, -1/4,  1/4],
    [ 1/5, -1/5,  1/5, -1/5,  1/5, -1/5,  1/5, -1/5,  1/5],
    [ 0/6,  3/6,  0/6,  3/6, -6/6,  3/6,  0/6,  3/6,  0/6],
])

Size

  • As discussed, to look at:
    • 9 locations, and produce
    • 6 possible characterizations
  • We used a \(6 \times 9\) matrix (each row holds one class’s nine weights).
multi.shape
(6, 9)

Randomize

  • Rather than using our own intelligence to create the matrix…
  • …we can randomize everything.
  • Imagine, say, when we first try to understand something.
    • About the same as random guessing.
  • We use np.random.rand and tell it what shape we want.
learn = np.random.rand(6,9)

View it

  • It looks like this:
    • All between 0 and 1 by default.
print(learn)
[[0.13819755 0.858514   0.89387389 0.97953215 0.95078702 0.91339183
  0.55505456 0.05400497 0.76328784]
 [0.61992175 0.60798114 0.47795787 0.99881325 0.21490773 0.1434384
  0.24203858 0.96896522 0.44601858]
 [0.32932856 0.81046388 0.98930022 0.72729313 0.19500454 0.34640125
  0.48504912 0.72482006 0.13982303]
 [0.86507991 0.15626278 0.73353581 0.0778394  0.23396891 0.53146514
  0.62610878 0.30177633 0.65322222]
 [0.30151646 0.17771519 0.31744043 0.70558929 0.92332405 0.86750924
  0.06348737 0.10571363 0.8207625 ]
 [0.18535997 0.81312843 0.86835174 0.80318644 0.5531322  0.4545896
  0.96877004 0.93137333 0.08946593]]

Check it out!

  • Mine:
for die in dice:
    print(1 <= np.sum(die * multi,1))
[ True False False False False False]
[False  True False False False False]
[False False  True False False False]
[False False False  True False False]
[False False False False  True False]
[False False False False False  True]
  • Random:
for die in dice:
    print(1 <= np.sum(die * learn,1))
[False False False False False False]
[ True False  True  True False  True]
[ True False  True  True  True  True]
[ True  True  True  True  True  True]
[ True  True  True  True  True  True]
[ True  True  True  True  True  True]

What happened?

  • Well, we multiplied each of the nine possible places for a dot in a die by something.
  • Actually, we did this six times - once for each possible classification.
    • We classify a die as a “one” or a “three”
  • For each of these six classes, we summed up the nine products
    • The presence of dot multiplied by importance of dot, for all dots.
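In code, one class’s score is just nine products summed; a sketch using the “five” pattern and an illustrative row of weights:

```python
import numpy as np

die = np.array([1, 0, 1, 0, 1, 0, 1, 0, 1])               # the "five" pattern
row = np.array([.2, -.2, .2, -.2, .2, -.2, .2, -.2, .2])  # one class's weights

# Presence of dot times importance of dot, summed over all nine dots.
score = np.sum(die * row)
print(score)  # roughly 1.0 - enough to fire
```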

Examine the results

  • For each prediction, we can determine which row (or column) of the matrix made that prediction.
  • If that prediction is correct, we can “strengthen” the connections.
    • Perhaps increase them by 10% or by .1 or something.
  • If the prediction is incorrect, we can “penalize” the connections.
    • Perhaps decrease them by 10% or by .1 or something.
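That update rule as a sketch (the 10% factor and the row values here are just assumptions):

```python
import numpy as np

weights = np.array([0.5, 0.1, 0.4])  # one row of a weight matrix

strengthened = weights * 1.1  # reward: increase by 10%
penalized = weights * 0.9     # penalize: decrease by 10%
print(strengthened)  # [0.55 0.11 0.44]
print(penalized)     # [0.45 0.09 0.36]
```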

“Supervised” Learning

  • In this case, we engage in supervised learning.
    • We know the answer.
    • We can compare against the answer.
  • This is the answer:
for die in dice:
    print(1 <= np.sum(die * multi,1))
[ True False False False False False]
[False  True False False False False]
[False False  True False False False]
[False False False  True False False]
[False False False False  True False]
[False False False False False  True]

The key

  • I make an answer key quickly.
1 <= dice @ multi.transpose()
array([[ True, False, False, False, False, False],
       [False,  True, False, False, False, False],
       [False, False,  True, False, False, False],
       [False, False, False,  True, False, False],
       [False, False, False, False,  True, False],
       [False, False, False, False, False,  True]])
  • We “transpose” the weight matrix to make the dimensions line up.
  • @ is the special NumPy “operator” for matrix multiplication.
    • Multiply rows by columns, then sum the products.
key = 1 <= dice @ multi.transpose()
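As a sanity check (a sketch with made-up values, not the lab data), @ with a transpose produces the same numbers as the element-wise product summed along each row:

```python
import numpy as np

a = np.array([[1., 0., 1.]])     # one "die", 1 by 3 for brevity
b = np.array([[0.5, 0.5, 0.5],
              [1.0, 2.0, 3.0]])  # two "classes", 2 by 3

# Element-wise then sum, versus matrix multiplication.
by_hand = np.sum(a * b, 1)
by_matmul = a @ b.transpose()
print(by_hand)    # [1. 4.]
print(by_matmul)  # [[1. 4.]]
```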

Compare

  • We can compare the answer key to the current results from random guessing.
  • We use np.equal to compare all of the matrix positions to see if they are equal.
np.equal(key, 1 <= dice @ learn.transpose())
array([[False,  True,  True,  True,  True,  True],
       [False, False, False, False,  True, False],
       [False,  True,  True, False, False, False],
       [False, False, False,  True, False, False],
       [False, False, False, False,  True, False],
       [False, False, False, False, False,  True]])
  • We want all of these to be true - the learned matrix should be just as good as our own intelligence!

Insight

  • We note that it is often the case that there are many correct guesses for one.
    • After all, it is unlikely that the randomization produces values high enough to sum to one given a single dot.
  • We note that it is often the case that there are many incorrect guesses for six.
    • After all, with six dots, six random weights easily sum past a threshold of one.

Problem

  • Just for this class and to make things easier, we modify the dice so they all sum to one.
    • So six, which previously had six dots valued at one each, now has six dots valued at \(\frac{1}{6}\) each.
  • We started with this:
dice
array([[0, 0, 0, 0, 1, 0, 0, 0, 0],
       [0, 0, 1, 0, 0, 0, 1, 0, 0],
       [0, 0, 1, 0, 1, 0, 1, 0, 0],
       [1, 0, 1, 0, 0, 0, 1, 0, 1],
       [1, 0, 1, 0, 1, 0, 1, 0, 1],
       [1, 0, 1, 1, 0, 1, 1, 0, 1]])

Solution

  • I just want to divide each die by how many dots it has.
  • I frustratingly have to transpose there and back to get the multiplication to work…
divider = np.array([1,2,3,4,5,6]).reshape(1,6)
fair_dice = (dice.transpose() / divider).transpose()
fair_dice
array([[0.        , 0.        , 0.        , 0.        , 1.        ,
        0.        , 0.        , 0.        , 0.        ],
       [0.        , 0.        , 0.5       , 0.        , 0.        ,
        0.        , 0.5       , 0.        , 0.        ],
       [0.        , 0.        , 0.33333333, 0.        , 0.33333333,
        0.        , 0.33333333, 0.        , 0.        ],
       [0.25      , 0.        , 0.25      , 0.        , 0.        ,
        0.        , 0.25      , 0.        , 0.25      ],
       [0.2       , 0.        , 0.2       , 0.        , 0.2       ,
        0.        , 0.2       , 0.        , 0.2       ],
       [0.16666667, 0.        , 0.16666667, 0.16666667, 0.        ,
        0.16666667, 0.16666667, 0.        , 0.16666667]])
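As an aside (a sketch, not the lab code), the transpose-there-and-back can be avoided by shaping the divider as a 6 by 1 column, which broadcasting then divides into each row:

```python
import numpy as np

# The six dice from earlier in the deck.
dice = np.array([[0,0,0,0,1,0,0,0,0],
                 [0,0,1,0,0,0,1,0,0],
                 [0,0,1,0,1,0,1,0,0],
                 [1,0,1,0,0,0,1,0,1],
                 [1,0,1,0,1,0,1,0,1],
                 [1,0,1,1,0,1,1,0,1]])

# A 6x1 column divides each row element-wise - no transposing needed.
divider = np.array([1, 2, 3, 4, 5, 6]).reshape(6, 1)
fair_dice = dice / divider
print(fair_dice.sum(1))  # each row sums to (roughly) 1
```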

Check it now

  • These should make the random guessing more, well, random…
np.equal(key, 1 <= fair_dice @ learn.transpose())
array([[False,  True,  True,  True,  True,  True],
       [ True, False,  True,  True,  True,  True],
       [ True,  True, False,  True,  True,  True],
       [ True,  True,  True, False,  True,  True],
       [ True,  True,  True,  True, False,  True],
       [ True,  True,  True,  True,  True, False]])

Learning

We’re ready

  • Now we are ready to go from nothing (random guessing) to something (intelligence)
  • We will go over every classification.
    • If it is the same as the answer key, we will “reward” the row - increasing its weights.
    • If it differs, we will decrease the weights.
  • I will do this by 10%.

Writing it out

  • No real way around using code for this as far as I know.
  • First, we’ll make a copy of our original random thing to compare against.
  • Then we’ll loop - using for - over all the classifications and update accordingly.
backup = learn.copy()

Steps

  • Check the answer.
    • Print them for now, to see them.
grade = np.equal(key, 1 <= fair_dice @ learn.transpose())
print(grade)
[[False  True  True  True  True  True]
 [ True False  True  True  True  True]
 [ True  True False  True  True  True]
 [ True  True  True False  True  True]
 [ True  True  True  True False  True]
 [ True  True  True  True  True False]]

Steps

  • Check the answer.
  • Loop over rows
    • Print them for now, to see them.
    • The first row shows how the “one” die is classified.
    • The first entry in each row shows whether a die is classified as a “one”
grade = np.equal(key, 1 <= fair_dice @ learn.transpose())
for row in grade:
    print(row)
[False  True  True  True  True  True]
[ True False  True  True  True  True]
[ True  True False  True  True  True]
[ True  True  True False  True  True]
[ True  True  True  True False  True]
[ True  True  True  True  True False]

Steps

  • Check the answer.
  • Loop over rows
  • Loop over correctnesses.
grade = np.equal(key, 1 <= fair_dice @ learn.transpose())
for row in grade:
    for correct in row:
        print(correct)
False
True
True
True
True
True
True
False
True
True
True
True
True
True
False
True
True
True
True
True
True
False
True
True
True
True
True
True
False
True
True
True
True
True
True
False

Steps

  • Check the answer.
  • Loop over rows
  • Loop over correctnesses.
  • If correct…
    • Increase… something?
grade = np.equal(key, 1 <= fair_dice @ learn.transpose())
for row in grade:
    for correct in row:
        if correct:
            print("What goes here?")
What goes here?
What goes here?
What goes here?
What goes here?
What goes here?
What goes here?
What goes here?
What goes here?
What goes here?
What goes here?
What goes here?
What goes here?
What goes here?
What goes here?
What goes here?
What goes here?
What goes here?
What goes here?
What goes here?
What goes here?
What goes here?
What goes here?
What goes here?
What goes here?
What goes here?
What goes here?
What goes here?
What goes here?
What goes here?
What goes here?

We know

  • If a guess in a certain row is correct:
    • It was predicted by something.
    • That something was the random guesses.
    • But not all of the random guesses…
    • Just values that are multiplied to get that row.
  • So we want to increase (or decrease) the row at the same address.
  • How?

Enumerate

  • We can use the special enumerate function to get back a row and its location within the matrix.
for loc, row in enumerate(grade):
    print("Location is", loc, "and row is", row)
Location is 0 and row is [False  True  True  True  True  True]
Location is 1 and row is [ True False  True  True  True  True]
Location is 2 and row is [ True  True False  True  True  True]
Location is 3 and row is [ True  True  True False  True  True]
Location is 4 and row is [ True  True  True  True False  True]
Location is 5 and row is [ True  True  True  True  True False]

Enumerate

  • We can use the special [] notation to look up the part of the learning matrix that corresponds to the same row.
for loc, row in enumerate(grade):
    print("Location is", loc, "and row is", row, "and weights are", learn[loc])
Location is 0 and row is [False  True  True  True  True  True] and weights are [0.13819755 0.858514   0.89387389 0.97953215 0.95078702 0.91339183
 0.55505456 0.05400497 0.76328784]
Location is 1 and row is [ True False  True  True  True  True] and weights are [0.61992175 0.60798114 0.47795787 0.99881325 0.21490773 0.1434384
 0.24203858 0.96896522 0.44601858]
Location is 2 and row is [ True  True False  True  True  True] and weights are [0.32932856 0.81046388 0.98930022 0.72729313 0.19500454 0.34640125
 0.48504912 0.72482006 0.13982303]
Location is 3 and row is [ True  True  True False  True  True] and weights are [0.86507991 0.15626278 0.73353581 0.0778394  0.23396891 0.53146514
 0.62610878 0.30177633 0.65322222]
Location is 4 and row is [ True  True  True  True False  True] and weights are [0.30151646 0.17771519 0.31744043 0.70558929 0.92332405 0.86750924
 0.06348737 0.10571363 0.8207625 ]
Location is 5 and row is [ True  True  True  True  True False] and weights are [0.18535997 0.81312843 0.86835174 0.80318644 0.5531322  0.4545896
 0.96877004 0.93137333 0.08946593]

Steps

  • Check the answer.
  • Loop over rows
  • Loop over correctnesses.
  • If correct…
    • Increase that row.
grade = np.equal(key, 1 <= fair_dice @ learn.transpose())
for loc, row in enumerate(grade):
    for correct in row:
        if correct:
            learn[loc] * 1.1  # computes a product but never stores it - learn is unchanged

Check it out

  • Old (“backup”)
np.equal(key, 1 <= fair_dice @ backup.transpose())
array([[False,  True,  True,  True,  True,  True],
       [ True, False,  True,  True,  True,  True],
       [ True,  True, False,  True,  True,  True],
       [ True,  True,  True, False,  True,  True],
       [ True,  True,  True,  True, False,  True],
       [ True,  True,  True,  True,  True, False]])
  • New (“learn”)
np.equal(key, 1 <= fair_dice @ learn.transpose())
array([[False,  True,  True,  True,  True,  True],
       [ True, False,  True,  True,  True,  True],
       [ True,  True, False,  True,  True,  True],
       [ True,  True,  True, False,  True,  True],
       [ True,  True,  True,  True, False,  True],
       [ True,  True,  True,  True,  True, False]])

Not much better

  • We should do more - and actually assign the updated weights back to learn.
  • But first, we begin again from our backup.
  • We don’t want to mix different experiments together.
learn = backup.copy()

Steps

  • Check the answer.
  • Loop over rows
  • Loop over correctnesses.
  • If correct…
    • Increase that row.
  • Otherwise…
    • Decrease that row.
learn = backup.copy()
grade = np.equal(key, 1 <= fair_dice @ learn.transpose())
for loc, row in enumerate(grade):
    for correct in row:
        if correct:
            learn[loc] = learn[loc] * 1.1
        else: 
            learn[loc] = learn[loc] * .9

Better?

(1 <= fair_dice @ learn.transpose()) + 0
array([[1, 0, 0, 0, 1, 0],
       [1, 0, 1, 0, 0, 1],
       [1, 0, 0, 0, 0, 1],
       [0, 0, 0, 1, 0, 0],
       [0, 0, 0, 0, 0, 0],
       [1, 0, 0, 0, 0, 0]])
  • Recall we want a diagonal of 1 here.
  • (I use + 0 to turn into numbers so the array is smaller and easier to read)
  • (True + 0 is 1 and False + 0 is 0 - don’t worry about it)
  • Doesn’t look intelligent to me.

We recognize

  • We don’t need to increase or decrease entire rows
  • Rather, we know which of the nine dots has a nonzero value…
  • And therefore which of the nine weights is contributing to a classification.
  • So, rather than increase a row by 10%, increase or decrease the relevant weights only.

Steps

  • Check the answer.
  • Loop over rows
  • Loop over correctnesses.
  • If correct…
    • Increase relevant.
  • Otherwise…
    • Decrease relevant.
learn = backup.copy()
grade = np.equal(key, 1 <= fair_dice @ learn.transpose())
for loc, row in enumerate(grade):
    for correct in row:
        if correct:
            learn[loc] = learn[loc] + dice[loc]
        else: 
            learn[loc] = learn[loc] - dice[loc]
(1 <= fair_dice @ learn.transpose()) + 0
array([[1, 0, 1, 0, 1, 0],
       [0, 1, 1, 1, 1, 1],
       [1, 1, 1, 1, 1, 1],
       [0, 1, 1, 1, 1, 1],
       [1, 1, 1, 1, 1, 1],
       [0, 1, 1, 1, 1, 1]])
  • Recall we want a diagonal of 1 here.

One Problem

  • We may guess correctly too often.
    • Given two equally likely choices, 50% chance.
    • But we only want one classification of six - ~17% chance.
  • We can bias against guessing by… increasing the bias!
(3 <= fair_dice @ learn.transpose()) + 0
array([[1, 0, 1, 0, 1, 0],
       [0, 1, 1, 1, 1, 1],
       [0, 0, 1, 1, 1, 1],
       [0, 0, 0, 1, 1, 1],
       [0, 0, 0, 1, 1, 1],
       [0, 0, 0, 1, 1, 1]])
(4 <= fair_dice @ learn.transpose()) + 0
array([[1, 0, 1, 0, 1, 0],
       [0, 1, 1, 1, 1, 1],
       [0, 0, 1, 0, 1, 0],
       [0, 0, 0, 1, 1, 1],
       [0, 0, 0, 0, 1, 0],
       [0, 0, 0, 0, 0, 1]])
(5 <= fair_dice @ learn.transpose()) + 0
array([[0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0]])

Looking good?

  • This looked awfully good to me.
(4 <= fair_dice @ learn.transpose()) + 0
array([[1, 0, 1, 0, 1, 0],
       [0, 1, 1, 1, 1, 1],
       [0, 0, 1, 0, 1, 0],
       [0, 0, 0, 1, 1, 1],
       [0, 0, 0, 0, 1, 0],
       [0, 0, 0, 0, 0, 1]])
  • If this were an exam, the percentage would be a…
compare_to_key = np.equal(4 <= fair_dice @ learn.transpose(), key)
np.count_nonzero(compare_to_key) / compare_to_key.size
0.75

Compare

  • Initially…
compare_to_random = np.equal(1 <= fair_dice @ backup.transpose(), key)
print(np.count_nonzero(compare_to_random) / compare_to_random.size)
0.8333333333333334
  • After learning…
compare_to_key = np.equal(4 <= fair_dice @ learn.transpose(), key)
print(np.count_nonzero(compare_to_key) / compare_to_key.size)
0.75
  • Wait… isn’t that worse?

Precision and Recall

  • With true positives (tp), false positives (fp), and false negatives (fn), precision and recall are defined as:

\[ \begin{align} \text{Precision} &= \frac{tp}{tp + fp} \\ \text{Recall} &= \frac{tp}{tp + fn} \end{align} \]

  • Precision is the fraction of predicted positives that are true positives.
  • Recall is the fraction of actual positives that we predicted.
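A sketch of computing both from boolean matrices (the variable names and toy values here are my own):

```python
import numpy as np

# Toy predictions versus truth.
predicted = np.array([[True, False], [True, True]])
actual    = np.array([[True, False], [False, True]])

tp = np.count_nonzero(predicted & actual)   # predicted and real
fp = np.count_nonzero(predicted & ~actual)  # predicted but not real
fn = np.count_nonzero(~predicted & actual)  # real but missed

precision = tp / (tp + fp)
recall = tp / (tp + fn)
print(precision, recall)  # 2 of 3 predictions correct; both positives found
```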

True Positive Detection

  • Initially…
guesses = 1 <= fair_dice @ backup.transpose()
true_positive = 0
for loc, row in enumerate(guesses):
    if row[loc]:
        true_positive = true_positive + 1
print(true_positive / 6)
0.0
  • After learning…
guesses = 4 <= fair_dice @ learn.transpose()
true_positive = 0
for loc, row in enumerate(guesses):
    if row[loc]:
        true_positive = true_positive + 1
print(true_positive / 6)
1.0
  • Count along the diagonal.
  • We go from zero to all true positives.
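The counting loops above can also be written with np.diag, which pulls out the diagonal directly (a sketch on a perfect-guess matrix):

```python
import numpy as np

# A toy guess matrix with a perfect diagonal (illustrative).
guesses = np.eye(6, dtype=bool)

# True positives sit on the diagonal: die i classified as value i.
true_positive = np.count_nonzero(np.diag(guesses))
print(true_positive / 6)  # 1.0
```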

How?

  • Initially, randomization simply predicted no classes at all for anything.
1 <= fair_dice @ backup.transpose()
array([[False, False, False, False, False, False],
       [False, False, False, False, False, False],
       [False, False, False, False, False, False],
       [False, False, False, False, False, False],
       [False, False, False, False, False, False],
       [False, False, False, False, False, False]])
  • Now, at least, we capture the diagonal and aren’t too far off from here.

More to Come

  • I think today was a lot.
  • We will play around with this a bit in the lab, then return next week…
  • …to reduce false positives.

FIN

(3.5 <= fair_dice @ learn.transpose()) + 0
array([[1, 0, 1, 0, 1, 0],
       [0, 1, 1, 1, 1, 1],
       [0, 0, 1, 0, 1, 0],
       [0, 0, 0, 1, 1, 1],
       [0, 0, 0, 1, 1, 1],
       [0, 0, 0, 0, 0, 1]])
(4.0 <= fair_dice @ learn.transpose()) + 0
array([[1, 0, 1, 0, 1, 0],
       [0, 1, 1, 1, 1, 1],
       [0, 0, 1, 0, 1, 0],
       [0, 0, 0, 1, 1, 1],
       [0, 0, 0, 0, 1, 0],
       [0, 0, 0, 0, 0, 1]])
(4.5 <= fair_dice @ learn.transpose()) + 0
array([[1, 0, 0, 0, 1, 0],
       [0, 0, 1, 1, 0, 1],
       [0, 0, 1, 0, 0, 0],
       [0, 0, 0, 1, 0, 1],
       [0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 1]])
  • Bias is not enough!