Matrix

AI 101

Recall

  • We’ve explored the perceptron
  • We’ve used binary classification to differentiate odds and evens.
  • We approach multi-class classification using:
    • The perceptron lab “multi-class”
    • The “tree tree”

To Recognize

Representing Reality

  • How does a computer “see” a simple object?
  • Let’s take a standard six-sided die.
  • We can represent the face of a die as a 3x3 grid.
  • Or, alternatively, a 1x9 (or 9x1) “vector”
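  • A minimal sketch of both views in NumPy, assuming 1 marks a dot and 0 a blank. The names three_grid and three_vector are made up for illustration.

```python
import numpy as np

# A 3x3 grid for the face showing three: dots along one diagonal.
three_grid = np.array([
    [0, 0, 1],
    [0, 1, 0],
    [1, 0, 0],
])

# Reading the grid row by row gives the length-9 vector.
three_vector = three_grid.reshape(9)
print(three_vector)  # [0 0 1 0 1 0 1 0 0]
```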

Terminology

  • We introduce the following terms:
    • Scalar
    • Array
    • Vector
    • Matrix
  • We have approached matrices previously, but will do so more completely now.

Scalar

  • A scalar is “just a number”
7
  • However, when we use the term scalar we think of it not as just a number, but as a “zero-dimensional array”.
    • It is a single point, with no dimensionality.
    • No height, no width.
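  • One way to see this for ourselves, as a small sketch: NumPy lets us wrap a plain number as an array and ask how many dimensions it has. The name scalar is only for illustration.

```python
import numpy as np

scalar = np.array(7)   # wrap the plain number 7 as a NumPy array
print(scalar.ndim)     # 0: no dimensions at all
print(scalar.shape)    # (): an empty shape, so no height and no width
```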

Array

  • Zero-dimensional what now?

In computer science, an array is a data structure consisting of a collection of elements (values or variables)… each identified by at least one array index or key

  • We additionally note that if you have an array of something, the elements must all be the same kind of thing.

Same things

  • This is an ordered collection of elements but not an array.
[1, "red", print, [2]]
  • This is an array.
[1, 2, 3, 4]
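  • A sketch of how NumPy handles this: every NumPy array reports a single element type (its dtype). If we hand it mixed kinds of things, it converts them all to one common kind. Exact dtype names can vary by platform, so we only look at the general kind here.

```python
import numpy as np

same = np.array([1, 2, 3, 4])
print(same.dtype.kind)    # 'i': every element is an integer

mixed = np.array([1, "red"])
print(mixed.dtype.kind)   # 'U': NumPy turned both elements into text
print(mixed[0] == "1")    # True: the first element is now the text "1"
```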

Vector

  • A vector is an ordered collection, usually of numbers.
[0, 0, 0, 0, 1, 0, 0, 0, 0]
  • It is a one-dimensional array.
    • It has some length.
    • But only length! It doesn’t also have height, for example.

Usefulness

  • We have used vectors in three ways now in the perceptron.
  • We have used a vector to represent visual data, like ⚂
[0,0,1,0,1,0,1,0,0]
  • Why is this of length 9?

  • We have used a vector to represent importance, like which visual data matters when determining whether a die represents an odd value.
[0,0,0,0,1,0,0,0,0]
  • Why is this of length 9?

  • We have used a vector to represent which class some visual data corresponds to, such as what value is represented by a die.
[0,0,1,0,0,0]
  • Why is this of length 6?
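  • A minimal sketch of building such a class vector, a technique commonly called one-hot encoding. The helper name class_vector is hypothetical, not something from the lab.

```python
import numpy as np

def class_vector(value, num_classes=6):
    """Return a vector with a 1 in the position for `value` (1 to 6)."""
    vector = np.zeros(num_classes, dtype=int)
    vector[value - 1] = 1   # positions count from 0, die values from 1
    return vector

three = class_vector(3)
print(three)                  # [0 0 1 0 0 0]
print(np.argmax(three) + 1)   # 3: recover the value from the vector
```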

Matrix

  • Thus far, we have not really used matrices for much.
  • We could have used them to store the visual data of a die, but instead we used a vector.
  • In some ways it is easier, and in other ways harder!
  • Today we use matrices directly, and most importantly, matrix multiplication.

NumPy

Previously

  • We previously saw our first use of NumPy in the lab.

The fundamental package for scientific computing with Python

Recall

  • We used NumPy to perform element-wise multiplication over vectors.
  • This is also called the “Hadamard product” or “element-wise product”.
  • We step through the example from the lab, briefly.

Is One

  • Last week, we found it was easy enough to classify one.
  • Briefly, we looked for a center dot with a positive weight, and gave everything else a negative weight.
  • Then we multiplied the visual data (a vector of length 9) “element-wise” by the weights (a vector of length 9), and summed the products.

\[ \sum_{i=1}^{n} w_i \times x_i \]
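
  • The sum can be spelled out with a plain loop. This is a sketch, assuming w holds the weights and x the visual data (here, a three); note the formula counts from 1 while Python counts from 0. NumPy's np.dot performs the same multiply-then-sum.

```python
import numpy as np

w = [-1, -1, -1, -1, 1, -1, -1, -1, -1]   # weights for "is one?"
x = [0, 0, 1, 0, 1, 0, 1, 0, 0]           # visual data for a three

total = 0
for i in range(len(w)):     # i visits all n = 9 positions
    total += w[i] * x[i]    # add w_i times x_i
print(total)                # -1

# The dot product does the same multiply-then-sum in one call.
print(np.dot(w, x))         # -1
```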

Is One Vector

  • We used, perhaps, this “is one?” vector.
    • Recall, we cannot include spaces in the names of things in Colab.
    • We use “single equals assignment” to assign a value to some name.
    • In this case, the value is the “is one?” vector.
is_one_vector = [-1,-1,-1,-1,1,-1,-1,-1,-1]

Explanation

  • If we are checking if some die represents the value 1, the only thing we care about is the center dot.
  • Well, that’s not entirely true - we care a lot that nothing other than the center dot is set.
  • So we apply a positive weight to the center dot, and negative weights to all other dots.
is_one_vector = [-1,-1,-1,-1,1,-1,-1,-1,-1]

Test it

  • In the lab, we were helpfully furnished with vectors representing all the dice.
one = [0,0,0,0,1,0,0,0,0]
two = [0,0,1,0,0,0,1,0,0]
thr = [0,0,1,0,1,0,1,0,0]
fou = [1,0,1,0,0,0,1,0,1]
fiv = [1,0,1,0,1,0,1,0,1]
six = [1,0,1,1,0,1,1,0,1]
  • We can check what happens when we multiply any of these - or all of these - by is_one_vector.

Pack it up

  • I quickly make another vector - a vector of vectors - that contains one through six
dice = [one, two, thr, fou, fiv, six]
  • We can take a look at it:
print(dice)
[[0, 0, 0, 0, 1, 0, 0, 0, 0], [0, 0, 1, 0, 0, 0, 1, 0, 0], [0, 0, 1, 0, 1, 0, 1, 0, 0], [1, 0, 1, 0, 0, 0, 1, 0, 1], [1, 0, 1, 0, 1, 0, 1, 0, 1], [1, 0, 1, 1, 0, 1, 1, 0, 1]]

Look at it

  • We could use a for loop to multiply each die by is_one_vector
    • We need NumPy for this!
import numpy as np

for die in dice:
    print(np.array(die) * np.array(is_one_vector))
[0 0 0 0 1 0 0 0 0]
[ 0  0 -1  0  0  0 -1  0  0]
[ 0  0 -1  0  1  0 -1  0  0]
[-1  0 -1  0  0  0 -1  0 -1]
[-1  0 -1  0  1  0 -1  0 -1]
[-1  0 -1 -1  0 -1 -1  0 -1]

Sum it up

  • We can also take the sum over all values in a vector using sum
    • We already did import numpy as np so we don’t need to repeat that.
for die in dice:
    print(sum(np.array(die) * np.array(is_one_vector)))
1
-2
-1
-4
-3
-6

Piecewise/Indicator

  • Remember, neurons can only (1) fire or (2) not fire.
    • They cannot “-6”.
  • So we check if the sum is positive, or not.
for die in dice:
    print(0 < sum(np.array(die) * np.array(is_one_vector)))
True
False
False
False
False
False
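  • The same check can be wrapped up as a small function. This is only a sketch: the name fires and the threshold parameter are made up for illustration.

```python
import numpy as np

def fires(inputs, weights, threshold=0):
    """Return True if the weighted sum of the inputs exceeds the threshold."""
    weighted_sum = np.sum(np.array(inputs) * np.array(weights))
    return bool(weighted_sum > threshold)

is_one_vector = [-1, -1, -1, -1, 1, -1, -1, -1, -1]
print(fires([0, 0, 0, 0, 1, 0, 0, 0, 0], is_one_vector))   # True (a one)
print(fires([1, 0, 1, 1, 0, 1, 1, 0, 1], is_one_vector))   # False (a six)
```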

For What?

  • Perhaps we regard that for loop as confusing.
    • I expect, for example, that novice programmers might think the name of the for variable, die, matters. It does not.
for definitely_not_a_die in dice:
    print(0 < sum(np.array(definitely_not_a_die) * np.array(is_one_vector)))
True
False
False
False
False
False

Instead

  • Instead we recognize that we have something that looks an awful lot like a matrix.
np.array(dice)
array([[0, 0, 0, 0, 1, 0, 0, 0, 0],
       [0, 0, 1, 0, 0, 0, 1, 0, 0],
       [0, 0, 1, 0, 1, 0, 1, 0, 0],
       [1, 0, 1, 0, 0, 0, 1, 0, 1],
       [1, 0, 1, 0, 1, 0, 1, 0, 1],
       [1, 0, 1, 1, 0, 1, 1, 0, 1]])

NumPy Away!

  • And NumPy can multiply a matrix by a vector, element-wise: each row of the matrix is multiplied by the vector.
np.array(dice) * np.array(is_one_vector)
array([[ 0,  0,  0,  0,  1,  0,  0,  0,  0],
       [ 0,  0, -1,  0,  0,  0, -1,  0,  0],
       [ 0,  0, -1,  0,  1,  0, -1,  0,  0],
       [-1,  0, -1,  0,  0,  0, -1,  0, -1],
       [-1,  0, -1,  0,  1,  0, -1,  0, -1],
       [-1,  0, -1, -1,  0, -1, -1,  0, -1]])

And sum

  • And NumPy can sum a matrix: over every value at once, or across just its rows or columns…
np.sum(np.array(dice) * np.array(is_one_vector))
-15
  • We can specify the axis (vertical or horizontal) along which we want to sum: axis 1 gives one sum per row.
np.sum(np.array(dice) * np.array(is_one_vector), 1)
array([ 1, -2, -1, -4, -3, -6])
  • Oh wow - that’s what we wanted!
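  • To see what the axis does, here is a sketch on a smaller matrix where the sums are easy to check by hand. The matrix small is made up for illustration.

```python
import numpy as np

small = np.array([
    [1, 2, 3],
    [4, 5, 6],
])

print(np.sum(small))           # 21: every value at once
print(np.sum(small, axis=0))   # [5 7 9]: one sum per column
print(np.sum(small, axis=1))   # [ 6 15]: one sum per row
```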

Elementwise

  • We can even use < or > on NumPy vectors.
np.sum(np.array(dice) * np.array(is_one_vector), 1) > 0
array([ True, False, False, False, False, False])
  • “Wow that is so cool” - Everyone

Matrices

Mechanics

  • What is happening here, internally?
np.array(dice) * np.array(is_one_vector)
array([[ 0,  0,  0,  0,  1,  0,  0,  0,  0],
       [ 0,  0, -1,  0,  0,  0, -1,  0,  0],
       [ 0,  0, -1,  0,  1,  0, -1,  0,  0],
       [-1,  0, -1,  0,  0,  0, -1,  0, -1],
       [-1,  0, -1,  0,  1,  0, -1,  0, -1],
       [-1,  0, -1, -1,  0, -1, -1,  0, -1]])
  • Let’s take it apart.

Dice

  • First, look at dice
    • I’ll convert it to a NumPy array so it looks the same.
np.array(dice)
array([[0, 0, 0, 0, 1, 0, 0, 0, 0],
       [0, 0, 1, 0, 0, 0, 1, 0, 0],
       [0, 0, 1, 0, 1, 0, 1, 0, 0],
       [1, 0, 1, 0, 0, 0, 1, 0, 1],
       [1, 0, 1, 0, 1, 0, 1, 0, 1],
       [1, 0, 1, 1, 0, 1, 1, 0, 1]])
  • We note that this is a 6 by 9 two-dimensional array, or matrix.
np.array(dice).shape
(6, 9)

Is One?

  • Then, look at is_one_vector
np.array(is_one_vector)
array([-1, -1, -1, -1,  1, -1, -1, -1, -1])
  • We note this is a length 9 one-dimensional array, or vector.
np.array(is_one_vector).shape
(9,)

Annoyance

  • It is annoying to keep typing this:
np.array(is_one_vector)
array([-1, -1, -1, -1,  1, -1, -1, -1, -1])
  • I use “single equals assignment” so that each name always refers to its NumPy version.
dice = np.array(dice)
is_one_vector = np.array(is_one_vector)
0 < np.sum(dice * is_one_vector, 1)
array([ True, False, False, False, False, False])

Result

  • We note that when we multiply and then sum each row, we get back a length 6 one-dimensional array, or vector.
np.sum(dice * is_one_vector, 1).shape
(6,)

So…

  • If we take a
    • 6 by 9, times
    • 9 by 1, equals
    • 6 by 1.
  • Or perhaps more generally
    • \(m\) by \(n\), times
    • \(n\) by \(p\), equals
    • \(m\) by \(p\)
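  • NumPy can do the multiply-and-sum in one step with the @ operator (matrix multiplication). A sketch, assuming we reshape the weights into a 9 by 1 column so the shapes match the list above; the name is_one_column is made up for illustration.

```python
import numpy as np

dice = np.array([
    [0, 0, 0, 0, 1, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 0, 1, 0, 0],
    [1, 0, 1, 0, 0, 0, 1, 0, 1],
    [1, 0, 1, 0, 1, 0, 1, 0, 1],
    [1, 0, 1, 1, 0, 1, 1, 0, 1],
])
is_one_column = np.array([-1, -1, -1, -1, 1, -1, -1, -1, -1]).reshape(9, 1)

result = dice @ is_one_column   # (6, 9) times (9, 1)
print(result.shape)             # (6, 1)
print(result.reshape(6))        # [ 1 -2 -1 -4 -3 -6]
```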

Multi-class

Matrices

  • Previously, we had six dice and checked whether each one was a one.
  • Now imagine we instead have one die, and we wish to see what value it represents.
    • We represent the die as a 1 by 9 (or 9 by 1) vector
    • We represent its value (1 to 6) as a 1 by 6 (or 6 by 1) vector.
    • We can perform a single matrix multiplication using a 9 by 6 matrix to get the result.

For example

  • We wish to find out what “class” (what number) fiv belongs to.
  • We would start with fiv
fiv = np.array(fiv)
fiv
array([1, 0, 1, 0, 1, 0, 1, 0, 1])
  • We would want to get back something with a 1 in the fifth position, and zeroes elsewhere.
[0, 0, 0, 0, 1, 0,]
[0, 0, 0, 0, 1, 0]

Emphasis

  • We can take a vector of length \(m\) and get a vector of length \(n\) by using an intermediate multiplication by a matrix of size \(m\) by \(n\)
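  • A tiny sketch of the same idea with made-up numbers, using \(m = 3\) and \(n = 2\) so the arithmetic can be checked by hand.

```python
import numpy as np

vector = np.array([1, 2, 3])   # length m = 3
matrix = np.array([            # m by n, here 3 by 2
    [1, 0],
    [0, 1],
    [1, 1],
])

result = vector @ matrix       # length n = 2
print(result)                  # [4 5]
```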

That matrix

  • This is exactly the matrix made in the lab, stored as 6 by 9: one row of weights per class.
multi = np.array([
    [-1/1, -1/1, -1/1, -1/1,  1/1, -1/1, -1/1, -1/1, -1/1],
    [-1/2, -1/2,  1/2, -1/2, -1/2, -1/2,  1/2, -1/2, -1/2],
    [-1/3, -1/3,  1/3, -1/3,  1/3, -1/3,  1/3, -1/3, -1/3],
    [ 1/4, -1/4,  1/4, -1/4, -1/4, -1/4,  1/4, -1/4,  1/4],
    [ 1/5, -1/5,  1/5, -1/5,  1/5, -1/5,  1/5, -1/5,  1/5],
    [ 0/6,  3/6,  0/6,  3/6, -6/6,  3/6,  0/6,  3/6,  0/6],
])

Test it

  • We can try fiv
fiv * multi
array([[-1.        , -0.        , -1.        , -0.        ,  1.        ,
        -0.        , -1.        , -0.        , -1.        ],
       [-0.5       , -0.        ,  0.5       , -0.        , -0.5       ,
        -0.        ,  0.5       , -0.        , -0.5       ],
       [-0.33333333, -0.        ,  0.33333333, -0.        ,  0.33333333,
        -0.        ,  0.33333333, -0.        , -0.33333333],
       [ 0.25      , -0.        ,  0.25      , -0.        , -0.25      ,
        -0.        ,  0.25      , -0.        ,  0.25      ],
       [ 0.2       , -0.        ,  0.2       , -0.        ,  0.2       ,
        -0.        ,  0.2       , -0.        ,  0.2       ],
       [ 0.        ,  0.        ,  0.        ,  0.        , -1.        ,
         0.        ,  0.        ,  0.        ,  0.        ]])
  • What a mess!

Sum it up

  • It’s hard for me to wrap my head around that many numbers!
    • Especially when they aren’t nice round numbers, since I was using fractions.
  • We’ll sum things up, as before.
np.sum(fiv * multi,1)
array([-3.        , -0.5       ,  0.33333333,  0.75      ,  1.        ,
       -1.        ])
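  • Multiplying element-wise and then summing each row is the same computation as a matrix multiplication. A sketch checking this with the @ operator; the names by_hand and at_once are made up, and np.allclose is used because these are fractions stored with limited precision.

```python
import numpy as np

multi = np.array([
    [-1/1, -1/1, -1/1, -1/1,  1/1, -1/1, -1/1, -1/1, -1/1],
    [-1/2, -1/2,  1/2, -1/2, -1/2, -1/2,  1/2, -1/2, -1/2],
    [-1/3, -1/3,  1/3, -1/3,  1/3, -1/3,  1/3, -1/3, -1/3],
    [ 1/4, -1/4,  1/4, -1/4, -1/4, -1/4,  1/4, -1/4,  1/4],
    [ 1/5, -1/5,  1/5, -1/5,  1/5, -1/5,  1/5, -1/5,  1/5],
    [ 0/6,  3/6,  0/6,  3/6, -6/6,  3/6,  0/6,  3/6,  0/6],
])
fiv = np.array([1, 0, 1, 0, 1, 0, 1, 0, 1])

by_hand = np.sum(fiv * multi, 1)   # multiply element-wise, then sum each row
at_once = multi @ fiv              # (6, 9) times length 9 gives length 6

print(at_once.shape)                   # (6,)
print(np.allclose(by_hand, at_once))   # True
```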

Compare, now to 1

  • Once again, we can simplify by just seeing if there’s enough “weight” to make a neuron fire, or not.
1 <= np.sum(fiv * multi,1)
array([False, False, False, False,  True, False])
  • Only the neuron representing “five” will fire!

All at once

  • We can check all of the dice.
for die in dice:
    print(die * multi)
[[-0.         -0.         -0.         -0.          1.         -0.
  -0.         -0.         -0.        ]
 [-0.         -0.          0.         -0.         -0.5        -0.
   0.         -0.         -0.        ]
 [-0.         -0.          0.         -0.          0.33333333 -0.
   0.         -0.         -0.        ]
 [ 0.         -0.          0.         -0.         -0.25       -0.
   0.         -0.          0.        ]
 [ 0.         -0.          0.         -0.          0.2        -0.
   0.         -0.          0.        ]
 [ 0.          0.          0.          0.         -1.          0.
   0.          0.          0.        ]]
[[-0.         -0.         -1.         -0.          0.         -0.
  -1.         -0.         -0.        ]
 [-0.         -0.          0.5        -0.         -0.         -0.
   0.5        -0.         -0.        ]
 [-0.         -0.          0.33333333 -0.          0.         -0.
   0.33333333 -0.         -0.        ]
 [ 0.         -0.          0.25       -0.         -0.         -0.
   0.25       -0.          0.        ]
 [ 0.         -0.          0.2        -0.          0.         -0.
   0.2        -0.          0.        ]
 [ 0.          0.          0.          0.         -0.          0.
   0.          0.          0.        ]]
[[-0.         -0.         -1.         -0.          1.         -0.
  -1.         -0.         -0.        ]
 [-0.         -0.          0.5        -0.         -0.5        -0.
   0.5        -0.         -0.        ]
 [-0.         -0.          0.33333333 -0.          0.33333333 -0.
   0.33333333 -0.         -0.        ]
 [ 0.         -0.          0.25       -0.         -0.25       -0.
   0.25       -0.          0.        ]
 [ 0.         -0.          0.2        -0.          0.2        -0.
   0.2        -0.          0.        ]
 [ 0.          0.          0.          0.         -1.          0.
   0.          0.          0.        ]]
[[-1.         -0.         -1.         -0.          0.         -0.
  -1.         -0.         -1.        ]
 [-0.5        -0.          0.5        -0.         -0.         -0.
   0.5        -0.         -0.5       ]
 [-0.33333333 -0.          0.33333333 -0.          0.         -0.
   0.33333333 -0.         -0.33333333]
 [ 0.25       -0.          0.25       -0.         -0.         -0.
   0.25       -0.          0.25      ]
 [ 0.2        -0.          0.2        -0.          0.         -0.
   0.2        -0.          0.2       ]
 [ 0.          0.          0.          0.         -0.          0.
   0.          0.          0.        ]]
[[-1.         -0.         -1.         -0.          1.         -0.
  -1.         -0.         -1.        ]
 [-0.5        -0.          0.5        -0.         -0.5        -0.
   0.5        -0.         -0.5       ]
 [-0.33333333 -0.          0.33333333 -0.          0.33333333 -0.
   0.33333333 -0.         -0.33333333]
 [ 0.25       -0.          0.25       -0.         -0.25       -0.
   0.25       -0.          0.25      ]
 [ 0.2        -0.          0.2        -0.          0.2        -0.
   0.2        -0.          0.2       ]
 [ 0.          0.          0.          0.         -1.          0.
   0.          0.          0.        ]]
[[-1.         -0.         -1.         -1.          0.         -1.
  -1.         -0.         -1.        ]
 [-0.5        -0.          0.5        -0.5        -0.         -0.5
   0.5        -0.         -0.5       ]
 [-0.33333333 -0.          0.33333333 -0.33333333  0.         -0.33333333
   0.33333333 -0.         -0.33333333]
 [ 0.25       -0.          0.25       -0.25       -0.         -0.25
   0.25       -0.          0.25      ]
 [ 0.2        -0.          0.2        -0.2         0.         -0.2
   0.2        -0.          0.2       ]
 [ 0.          0.          0.          0.5        -0.          0.5
   0.          0.          0.        ]]

Sum it up

  • We can sum again.
for die in dice:
    print(np.sum(die * multi,1))
[ 1.         -0.5         0.33333333 -0.25        0.2        -1.        ]
[-2.          1.          0.66666667  0.5         0.4         0.        ]
[-1.    0.5   1.    0.25  0.6  -1.  ]
[-4.   0.   0.   1.   0.8  0. ]
[-3.         -0.5         0.33333333  0.75        1.         -1.        ]
[-6.         -1.         -0.66666667  0.5         0.4         1.        ]

Compare

  • Compare versus… probably 1?
for die in dice:
    print(1 <= np.sum(die * multi,1))
[ True False False False False False]
[False  True False False False False]
[False False  True False False False]
[False False False  True False False]
[False False False False  True False]
[False False False False False  True]
  • That looks like a diagonal of True to me.
    • Which should be just what we want, since our dice were in order.
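  • The for loop can also be replaced by one matrix multiplication. A sketch, with two choices that are mine rather than the lab's: the weights are flipped with .T so the shapes line up as 6 by 9 times 9 by 6, and np.argmax picks the highest-scoring class instead of comparing against 1.

```python
import numpy as np

dice = np.array([
    [0, 0, 0, 0, 1, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 0, 1, 0, 0],
    [1, 0, 1, 0, 0, 0, 1, 0, 1],
    [1, 0, 1, 0, 1, 0, 1, 0, 1],
    [1, 0, 1, 1, 0, 1, 1, 0, 1],
])
multi = np.array([
    [-1/1, -1/1, -1/1, -1/1,  1/1, -1/1, -1/1, -1/1, -1/1],
    [-1/2, -1/2,  1/2, -1/2, -1/2, -1/2,  1/2, -1/2, -1/2],
    [-1/3, -1/3,  1/3, -1/3,  1/3, -1/3,  1/3, -1/3, -1/3],
    [ 1/4, -1/4,  1/4, -1/4, -1/4, -1/4,  1/4, -1/4,  1/4],
    [ 1/5, -1/5,  1/5, -1/5,  1/5, -1/5,  1/5, -1/5,  1/5],
    [ 0/6,  3/6,  0/6,  3/6, -6/6,  3/6,  0/6,  3/6,  0/6],
])

scores = dice @ multi.T                      # one row of six scores per die
predicted = np.argmax(scores, axis=1) + 1    # best class per die, counted from 1
print(scores.shape)                          # (6, 6)
print(predicted)                             # [1 2 3 4 5 6]
```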

Step Back

Collect Our Thoughts

  • Is this intelligent:
print(multi)
[[-1.         -1.         -1.         -1.          1.         -1.
  -1.         -1.         -1.        ]
 [-0.5        -0.5         0.5        -0.5        -0.5        -0.5
   0.5        -0.5        -0.5       ]
 [-0.33333333 -0.33333333  0.33333333 -0.33333333  0.33333333 -0.33333333
   0.33333333 -0.33333333 -0.33333333]
 [ 0.25       -0.25        0.25       -0.25       -0.25       -0.25
   0.25       -0.25        0.25      ]
 [ 0.2        -0.2         0.2        -0.2         0.2        -0.2
   0.2        -0.2         0.2       ]
 [ 0.          0.5         0.          0.5        -1.          0.5
   0.          0.5         0.        ]]

Our Task

  • We were trying to do a computer vision task.
  • Specifically, we wished to classify dice.

Snake eyes dice

Encoding

  • To do so, we recognized we could view dice with as few as nine (9) “sensory neurons”.
1 2 3
4 5 6
7 8 9
  • Each of these is either 0 or 1 or perhaps True or False etc.

Vectors

  • We then recognized we could place these in any order, and in a single “dimension”.
    • We termed this a vector.
1 2 3 4 5 6 7 8 9

Graphs

  • We then discussed that these vectors could be treated as part of a graph.
    • The top, input layer representing the visual data.
    • The bottom, output layer representing the classification

[Figure: bipartite graph. Each of the nine grid cells, labeled (0,0) through (2,2), is connected by an edge to each of the six values 1 through 6.]

Edges

  • Edges in this graph could connect sensed dots to numeric meaning.
    • Middle square to odd numbers.

[Figure: the same bipartite graph of nine grid cells and six values, with an extra edge drawn from the center cell (1,1) to each of the odd values 1, 3, and 5.]

Weights

  • Some edges may matter more than others.
    • Center is very important for 1
    • Kinda important for 3 and 5 - where corners matter too.

[Figure: a reduced graph keeping only selected edges. The center cell (1,1) connects to 1, 3, and 5; the corner cells (0,2) and (2,0) connect to 3 and 5; the corner cells (0,0) and (2,2) connect to 5.]

Matrices

  • Then, we note that we can express these weights in a matrix.
multi = np.array([
    [-1/1, -1/1, -1/1, -1/1,  1/1, -1/1, -1/1, -1/1, -1/1],
    [-1/2, -1/2,  1/2, -1/2, -1/2, -1/2,  1/2, -1/2, -1/2],
    [-1/3, -1/3,  1/3, -1/3,  1/3, -1/3,  1/3, -1/3, -1/3],
    [ 1/4, -1/4,  1/4, -1/4, -1/4, -1/4,  1/4, -1/4,  1/4],
    [ 1/5, -1/5,  1/5, -1/5,  1/5, -1/5,  1/5, -1/5,  1/5],
    [ 0/6,  3/6,  0/6,  3/6, -6/6,  3/6,  0/6,  3/6,  0/6],
])
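  • Each entry of the matrix is the weight on one edge of the graph. A sketch of looking one up, assuming rows are the classes 1 to 6 and columns are the nine cells, both counted from 0; the name center is made up for illustration.

```python
import numpy as np

multi = np.array([
    [-1/1, -1/1, -1/1, -1/1,  1/1, -1/1, -1/1, -1/1, -1/1],
    [-1/2, -1/2,  1/2, -1/2, -1/2, -1/2,  1/2, -1/2, -1/2],
    [-1/3, -1/3,  1/3, -1/3,  1/3, -1/3,  1/3, -1/3, -1/3],
    [ 1/4, -1/4,  1/4, -1/4, -1/4, -1/4,  1/4, -1/4,  1/4],
    [ 1/5, -1/5,  1/5, -1/5,  1/5, -1/5,  1/5, -1/5,  1/5],
    [ 0/6,  3/6,  0/6,  3/6, -6/6,  3/6,  0/6,  3/6,  0/6],
])

center = 4                  # the middle cell of the 3x3 grid, counting from 0
print(multi[0, center])     # 1.0: the center strongly supports "one"
print(multi[2, center])     # about 0.33: it supports "three" less strongly
print(multi[5, center])     # -1.0: it counts against "six"
```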

Simpler

  • Trying to make this easier to see and think about.
multi = np.array([
    [ 0,  0,  0,  0,  1,  0,  0,  0,  0],
    ...
    [ 0,  0, .3,  0, .3,  0, .3,  0,  0],
    ...
    [.2,  0, .2,  0, .2,  0, .2,  0, .2],
    ...
])

Closing

Think, Pair, Share

  • Talk about
    • Vectors, and encoding.
    • Graphs
    • Edges
    • Weights
    • Matrices
  • And also: classification.