Euclid

Week 0x7

Prof. Calvin

Crypto

Announcements

  • Welcome to variously CS 276/CS 540
    • We introduce an algorithm that makes key generation viable.
  • Action Items:
    • RSAinC due this week
    • BigRSA due 31 Mar
    • Midterm next week

Today

  • Euclidean division
  • Euclidean algorithm
  • Extended Euclidean algorithm
  • Bézout’s identity

Review

3. \(\lambda(n)\)

  • This one is kept secret by the way.
  • It so happens: \[ \forall p,q \in \mathbb{P} : \lambda(pq) = \text{lcm}(p-1, q-1) \]
  • There’s fast ways to this, but anything goes this week.

LCM

  • Least common multiple
def gcd(a, b):
    """Calculate the Greatest Common Divisor of a and b."""
    while b:
        a, b = b, a % b
    return a

def lcm(a, b):
    """Calculate the Least Common Multiple of a and b."""
    return a * b // gcd(a, b)

MMI

  • Multiplicative Modular Inverse
def find_d():
  d = 1
  while 1 != (d * e % hide_λ()):
    d += 1
  return d

ModExp

  • Exponentation modulo \(n\).
def modexp(m, e, n):
  if e == 0:
      return 1
  if e == 1:
      return m % n
  if e % 2:
      return (m * modexp(m*m % n, e//2, n)) % n
  return  modexp(m*m % n, e//2, n) % n

Context

  • Each of these operations are dealing with, fundamentally, the same problem.
  • We have big numbers we want to work with but,
  • None are bigger than \(n\), and
  • The bigger the number, the harder the work.

Euclidean Division

  • Division as implemented by “bigdiv”:
uint64_t bigdiv(uint64_t *num, uint64_t *den, uint64_t *quo, uint64_t *rem) {
    return 0;
}

uint64_t bigquo(uint64_t *num, uint64_t *den, uint64_t *quo) {
    uint64_t rem[S];
    bigdiv(num,den,quo,rem);
    return 0;
}

uint64_t bigrem(uint64_t *num, uint64_t *den, uint64_t *rem) {
    uint64_t quo[S];
    bigdiv(num,den,quo,rem);
    return 0;
}

Naive Division

  • Entry-level categorization, term it “naive”
    • often over \(\mathbb{Q}\) or \(\mathbb{R}\)
    • quotient:float = division(dividided:int|float,divisor:int|float)
    • In Python \(\mathbb{Z} \times \mathbb{Z} \rightarrow \mathbb{Q}\)
    • Python / for all values, C / for float types.

Integer Division

  • Integer division - often over \(\mathbb{N}\) or \(\mathbb{Z}\)
    • quotient:int = division(dividided:int,divisor:int)
    • Python // for int types, C / for int types.

We note

  • Many sets of numbers are not closed under division
    • \(\exists x,y \in \mathbb{N} : x/y \not\in \mathbb{N}\)
  • These sets are closed under integer division but then lack multiplicative inverses.
    • \(\exists n \in \mathbb{N} : \nexists m \in \mathbb{N} : n \times m \in \mathbb{N}\)
  • These are all bad!

Euclidean Division

  • Euclidean division solves this problem by changing the type of the function
    • NOT by changing the types of the operands (inputs, divisor, divisend, numerator, denominator)
    • YES by changing the number of return values
  • We will simply have two return values.
    • We recall this impossible in C (loud cheers)

Euclidean Division

  • Also called “division with remainder”.
    • Generally regarded as being applied to the integers, and works well for unsigned ints.
    • Often termed div, /, \, and %, I used div for “bigdiv” since the others were taken.
  • Named as a result of “Euclidean Algorithm”, which is used in cryptography!

Euclidean Division

  • In Python:
def div(x:int, y:int) -> tuple[int, int]:
    # Euclidean Division
    return x // y, x % y
  • In Latex: \(\mathbb{Z} \times \mathbb{Z} \rightarrow \mathbb{Z} \times \mathbb{N}\)

Terminology

  • Dividend / Divisor = Quotient
  • Dividend % Divisor = Remainder
  • Dividend ~= Numerator
  • Divisor ~= Denominator
  • Quotient ~= Fraction ~= Ratio
  • Remainder is modulo if dividend and divisor are naturals.

Equality

  • Divisor \(\times\) Quotient + Remainder = Dividend

Constraints

  • The remainder is restricted to the naturals, even for division over the integers.
  • The remainder is restricted to be smaller than divisor - the thing by which we divide.

Constraints

  • The quotient is the number of largest number absolute by which the divisor may be multipled such the product is absolutely less than the divisend.
  • The remainder is the smallest positive value such that the sum of the remainder and the product of the quotient and divisor is equal to the divisend.

Takeaways

  • Remainders are small and positive (good).
  • We get to keep using integers, and through some effort can stick to naturals.
  • This is good if we want to work with unsigned integers.

Today

  • ✓ Euclidean division
  • Euclidean algorithm
  • Extended Euclidean algorithm
  • Bézout’s identity

Goals

  • Given two values, the Euclidean algorithm finds the greatest common denominator.
  • Generally, it proceeds either iteratively or recursively.
    • It is an algorithm, not an implementation.
  • Historically used to synthesize integer division from integer subtraction, now primary use in cryptography.

Example

  • Suppose we have
a, b = 1071, 462
  • We wish to find their greatest common denominator

Initialize

  1. Take the larger value
x, y = max(a,b), min(a,b)

Substract

  1. Subtract the smaller value from the larger value until it is no longer larger.
while x >= y:
    x -= y

Loop

  1. Repeat…
x, y = max(x,y), min(x,y)

Terminate

  1. Stop when either x or y is zero
x, y = x != 0 and y != 0 ? max(x,y), min(x,y) : exit()

Iteratively

# i am not about to try to spell "uelcidan lagorhtim"
def ea(a:int, b:int) -> int: 
    while min(a,b) != 0:
        a, b = max(a,b), min(a,b)
        while a >= b:
            a -= b
    return max(a,b)

Modulo

  • Essential we calculate a modulo or remainder (depending)
  • Then… do that again.
# i am not about to try to spell "udelcian altoghtn"
def ea(a:int, b:int) -> int: 
    while min(a,b) != 0:
        a, b = max(a,b), min(a,b)
        a %= b
    return max(a,b)
  • Recall - this invented division and modulo, so it can be done without assuming division and modulo.
    • We are using it to do gcd, and exist in a world with division and modulo.

Recursively

  • Easier for me to understand recursively, either with mod or without.
    • Here’s one mod implementation.
# Have to make a lot of design decisions here, this is one variant
recmod = lambda a, b: recmod (a - b, b) if a - b >= 0 else a

Recursively

  • Easier for me to understand recursively, either with mod or without.
    • Here’s recursive EA
# Much easier if we assume argument order, of course
rec_ea = lambda a, b : rec_ea(max(a,b) % min(a,b), min(a,b)) if min(a,b) > 0 else max(a,b)

# I much prefer
helper = lambda a, b : helper(b, a % b) if b else a
rec_ea = lambda a, b : helper(max(a,b), min(a,b))

With def

  • We do not all love inlining, and that’s okay!
def ea(a:int, b:int) -> int:
    a, b = max(a,b), min(a,b)
    def inner(a, b):
        if b:
            return inner(b, a % b)
        else: 
            return a
    return inner(a,b)

Simplest

  • We can remove type hints and just trust bigger first.
def ea(a, b):
    if b:
        return ea(b, a % b)
    return a

Example

  • ea(1071, 462)
  • I will use the following:
def ea(a, b):
    print(f"a = {a:4d}, b = {b:4d}")
    if b:
        return ea(b, a % b)
    return a
  • We get:

Test it

  • Both are divisible by 21!
>>> 1071/21
51.0
>>> 462/21
22.0

Anyways

  • This is what GitHub Copilot did, I don’t know if we understood it at the time.
def gcd(a, b):
    """Calculate the Greatest Common Divisor of a and b."""
    while b:
        a, b = b, a % b
    return a

def lcm(a, b):
    """Calculate the Least Common Multiple of a and b."""
    return a * b // gcd(a, b)

Subtraction-based animation of the Euclidean algorithm. The initial rectangle has dimensions a = 1071 and b = 462. Squares of size 462×462 are placed within it leaving a 462×147 rectangle. This rectangle is tiled with 147×147 squares until a 21×147 rectangle is left, which in turn is tiled with 21×21 squares, leaving no uncovered area. The smallest square size, 21, is the GCD of 1071 and 462.

Today

  • ✓ Euclidean division
  • ✓ Euclidean algorithm
  • Extended Euclidean algorithm
  • Bézout’s identity

Reorder

  • ✓ Euclidean division
  • ✓ Euclidean algorithm
  • Bézout’s identity
  • Extended Euclidean algorithm

MMI

  • Multiplicative Modular Inverse
def find_d():
  d = 1
  while 1 != (d * e % hide_λ()):
    d += 1
  return d
  • This is bad!
    • This could be a loop over a 4096 bit integer!

Slow

  • It could loop this many times.
>>> 1 << 4096
1044388881413152506691752710716624382579964249047383780384233483283953907971557456848826811934997558340890106714439262837987573438185793607263236087851365277945956976543709998340361590134383718314428070011855946226376318839397712745672334684344586617496807908705803704071284048740118609114467977783598029006686938976881787785946905630190260940599579453432823469303026696443059025015972399867714215541693835559885291486318237914434496734087811872639496475100189041349008417061675093668333850551032972088269550769983616369411933015213796825837188091833656751221318492846368125550225998300412344784862595674492194617023806505913245610825731835380087608622102834270197698202313169017678006675195485079921636419370285375124784014907159135459982790513399611551794271106831134090584272884279791554849782954323534517065223269061394905987693002122963395687782878948440616007412945674919823050571642377154816321380631045902916136926708342856440730447899971901781465763473223850267253059899795996090799469201774624817718449867455659250178329070473119433165550807568221846571746373296884912819520317457002440926616910874148385078411929804522981857338977648103126085903001302413467189726673216491511131602920781738033436090243804708340403154190336

We note

  • In the Euclidean algorithm, we were tracking only remainders.
    • What if we also tracked quotients?
  • We introduce normal expression “Bézout’s identity”
    • Named after a 1700s French scholar
    • Of note: subordinate result to 300s Chinese scholarship

Bézout’s identity

\[ \forall a, b \in \mathbb{Z} : \exists x, y \in \mathbb{Z} : a \times x + b \times y = \gcd(a,b) \]

  • The linear combinations of \(a\) and \(b\) are exactly the multiples of \(\gcd(a,b)\)

Recall

  • To get \((m^e)^d \equiv m \pmod{n}\)
  • We needed some \(d\) that was the:
    • Multiplicative inverse of \(e\)
    • Mod the totient, the lcm of one less than each prime.

Goals

  • So we need that \(e\), times something, equals 1.
    • That’s a linear combination!
    • \(e\) and \(\lambda(n)\) were coprime by construction.
      • So their gcd is 1

\[ \forall e, \lambda(n) \in \mathbb{Z} : \exists x, y \in \mathbb{Z} : e \times x + \lambda(n) \times y = 1 \]

  • And we are taking this \(\pmod{\lambda(n)}\) so

Put it all together

\[ \forall e, \lambda(n) \in \mathbb{Z} : \exists x \in \mathbb{Z} : e \times x \equiv 1 \pmod{\lambda(n)} \]

  • And this can be calculated with a little thing I like to call the “Extended Euclidean Algorithm”.
    • Perhaps prefer “extended GCD algorithm” which I think the terminology will converge on, in time.
    • But if you need to look it up, mention Euclid.

Reorder

  • ✓ Euclidean division
  • ✓ Euclidean algorithm
  • ✓ Bézout’s identity
  • Extended Euclidean algorithm

MMI

  • Multiplicative Modular Inverse
def find_d():
  d = 1
  while 1 != (d * e % hide_λ()):
    d += 1
  return d
  • This is bad!
    • This could be a loop over a 4096 bit integer!

EEA

  • The EEA is akin to the EA but tracks quotients as well as remainders.
    • This dramatically increases overhead.

Ground truth

  • I regard the Wikipedia simplifed pseudocode as ground thrust for eea.
    • They call it extended_gcd, which I do like.

\(x\) and \(y\)

  • We don’t need this:
  • In Bézout’s identity: \[ \forall a, b \in \mathbb{Z} : \exists x, y \in \mathbb{Z} : a \times x + b \times y = \gcd(a,b) \]
  • We need only \(x\), not \(y\).

Modulo

  • This implementation does not calculate modulo
  • That is equal to

Recursion

  • The usage of “old s” and “old r” a holdovers from a conversion from recursion to iteration.
    • Recall the gcd is calculable by any implementation
    • As is the algorithm

We can recurse

To Python

def extended_gcd(a, b):
    if b == 0:
        return a, 1, 0
    r, s, t = extended_gcd(b, a % b)
    return r, t, s - (a // b) * t # no graceful mod here

Example

>>> extended_gcd(1071, 462)
(21, -3, 7)
>>> 1071 * -3 + 462 * 7
21

Wait a minute!

  • Negatives are a problem
  • We deal with this modwise
  • Just add the modular base.

In Practice

  • The modular base is the totient.
  • That is also the b value we use in the eea.
def find_d():
    start = extended_gcd(e,hide_λ())[1]
    while start < 0:
        start += hide_λ()
    return start

Reorder

  • ✓ Euclidean division
  • ✓ Euclidean algorithm
  • ✓ Bézout’s identity
  • ✓ Extended Euclidean algorithm

Closing

  • You have have noticed bigmul & co cannot be used as infix operators.
r = m % n # this is a little mul
bigmul(m, n, r) # this is the format of a big mul - not possible in Py
  • You will need to:
    • Track the sign of values
    • Use big operations.
  • Start thinking about it now!

All

def find_large_prime(k):
    is_prime = lambda n : any([n % i for i in range(2, int(n **.5))])
    candidate = 6 * k + 1
    # you need a prime tester
    while not is_prime(candidate):
        candidate += 6
    return candidate

def extended_gcd(a, b):
    if b == 0:
        return a, 1, 0
    r, s, t = extended_gcd(b, a % b)
    return r, t, s - (a // b) * t # no graceful mod here


def lcm(a, b):
    return a * b // extended_gcd(a,b)[0]

s = "C" # a random string
m = ord(s) # to number

hide_p = lambda: find_large_prime(1 << 6)
hide_q = lambda: find_large_prime(1 << 7)
n = hide_p() * hide_q()

hide_λ = lambda: lcm(hide_p() - 1,  hide_q() - 1)

e = 65537 # encryptor

def find_d():
    start = extended_gcd(e,hide_λ())[1]
    while start < 0:
        start += hide_λ()
    return start

def modexp(m, e, n):
    if e == 0:
        return 1
    if e == 1:
        return m % n
    if e % 2:
        return (m * modexp(m*m % n, e//2, n)) % n
    return  modexp(m*m % n, e//2, n) % n

c = modexp(m, e, n) # ciphertext

new_m = modexp(c % n, find_d(), n)

print(chr(modexp(modexp(ord("C"), e, n), find_d(), n)))