Time

Calvin (Deutschbein)

W14Mon: 24 Nov

Announcements

  • Adventure Ongoing
    • You should be making API calls
  • Advising ongoing
    • If you encounter any problem, email me immediately
    • I'll be doing triage:
      • If you don't get an email back quickly, either
        • I'm on a multi-hour trail run and/or asleep, or
        • I will be able to solve any problems you encounter non-urgently.
      • Either way, once you send the email it is not your problem.

Today

  • Computer science as an experimental science.
  • Is write-good-code relevant?

Refresh Binary Search

  • Altogether:
      def check_word(my_word, word_list):
          while len(word_list) > 1:
              half_length = len(word_list) // 2
              if my_word < word_list[half_length]:
                  # keep only the first half
                  word_list = word_list[:half_length]
              else:
                  # keep only the second half
                  word_list = word_list[half_length:]
          return my_word == word_list[0]

Refresh Binary Search

  • This is algorithmically efficient!
      def check_word(my_word, word_list):
          while len(word_list) > 1:
              half_length = len(word_list) // 2
              if my_word < word_list[half_length]:
                  word_list = word_list[:half_length]
              else:
                  word_list = word_list[half_length:]
          return my_word == word_list[0]
  • It is a special new type of inefficient: technically inefficient.

Refresh Binary Search

  • Examine these lines:
      ...
      word_list = word_list[:half_length]
      ...
      word_list = word_list[half_length:]
  • Let's remind ourselves of how strings work within Python.
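  • A small sketch of what a slice does, using a hypothetical four-word list in place of the real one. It suggests that each slice is a new list object, while the strings inside are shared rather than duplicated:

```python
# Hypothetical stand-in for the real word list.
words = ["apple", "banana", "cherry", "date"]

first_half = words[:2]   # slicing builds a brand-new list object

# The slice is a different list from the original...
print(first_half is words)        # False
# ...so changing the slice leaves the original untouched.
first_half[0] = "avocado"
print(words[0])                   # apple
# The strings themselves are shared; only the list of references is copied.
print(words[1] is first_half[1])  # True
```

  • So the cost of a slice is the cost of building a new list of references, one slot per word kept.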

The word list is big!

  • We can request it.
      import requests, json
      URL = "https://gist.githubusercontent.com/cd-public/0a09043d500a9bc3397ebcfeb5f7a4f5/raw/41d37a509cc19e0609a1637370096f30ff4a1ea3/english.json"
      r = requests.get(URL)
      ENGLISH_WORDS = r.json()

The word list is big!

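  • How big?
      import sys
      sys.getsizeof(ENGLISH_WORDS)
  • 1140568: about one million bytes (a megabyte). In CPython this counts only the list's own storage, that is, its references to the words, and not the letters in the strings themselves.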
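  • A rough sketch of the difference between the size of a list and the size of what it holds. The `words` list here is a made-up stand-in so the example runs without a download, and the exact byte counts depend on the Python build:

```python
import sys

# Hypothetical stand-in for ENGLISH_WORDS so the sketch runs offline.
words = ["word%05d" % i for i in range(1000)]

list_only = sys.getsizeof(words)                      # the list's own storage
strings_only = sum(sys.getsizeof(w) for w in words)   # each string object, measured separately

print(list_only, strings_only, list_only + strings_only)
```

  • Slicing copies only the first number's worth of storage, since the strings are shared between the original list and the slice.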

The word list is big!

  • Then how much memory do we need to do this...
      word_list = word_list[:half_length]
  • At least the first time, half a megabyte.
  • It probably doesn't feel like a lot (it isn't), but your computer may slow noticeably while loading the list of words.
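  • To put a number on the copying, here is a sketch that counts how many list slots get copied over one whole search. The helper `copied_elements` is hypothetical and always keeps the first half, which is enough to count the work:

```python
def copied_elements(n):
    """Count list slots copied by repeatedly halving a list of length n."""
    word_list = list(range(n))   # stand-in data; only the length matters here
    copied = 0
    while len(word_list) > 1:
        half_length = len(word_list) // 2
        word_list = word_list[:half_length]   # keep the first half, as in the slide
        copied += len(word_list)
    return copied

print(copied_elements(1024))   # 512 + 256 + ... + 1 = 1023
```

  • Under these assumptions, one search copies about as many slots as the list has words, so the copying alone costs roughly as much as walking the whole list once.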

Don't copy!

  • But, you see, we don't need a new list.
  • We can just keep track of the closest point to the beginning and closest point to the end that the word can be at!
      head = 0
      tail = len(word_list)  # one past the last index, so the final word can still be found
      while tail - head > 1:
          midl = head + (tail - head) // 2
          if my_word < word_list[midl]:
              # keep only the first half
              tail = midl
          else:
              # keep only the second half
              head = midl
  • Rather than making copies of the big list of words, we just refine which part of the big list we are looking at.

Let's watch!

  • But, you see, we don't need a new list.
  • We can just keep track of the closest point to the beginning and closest point to the end that the word can be at!
      def check_word(my_word):
          # tail is one past the last index, so the final word can still be found
          head, tail = 0, len(ENGLISH_WORDS)
          while tail - head > 1:
              midl = head + (tail - head) // 2
              if my_word < ENGLISH_WORDS[midl]:
                  tail = midl
              else:
                  head = midl
          return my_word == ENGLISH_WORDS[head]
  • Rather than making copies of the big list of words, we just home in on a smaller and smaller part of the list.
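  • For comparison, Python's standard library has a `bisect` module that does the same index-only search. This is a swapped-in alternative, not the version from the slides, and `check_word_bisect` and the three-word list are names made up for this sketch:

```python
from bisect import bisect_left

def check_word_bisect(my_word, sorted_words):
    """Return True if my_word is in sorted_words, which must already be sorted."""
    i = bisect_left(sorted_words, my_word)   # leftmost index where my_word could be inserted
    return i < len(sorted_words) and sorted_words[i] == my_word

words = ["apple", "banana", "cherry"]       # hypothetical stand-in list
print(check_word_bisect("cherry", words))   # True
print(check_word_bisect("zebra", words))    # False
print(check_word_bisect("apple", []))       # False
```

  • The bounds check before indexing also covers the empty list and words that sort after every entry.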

Let's check it

  • Let's check:
    1. Using Python "in"
    2. Writing a "for" loop checking every word.
    3. The end of last class version, with copies.
    4. This version, using an index.
  • We'll introduce another new library - time!
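  • A sketch of what the first two options might look like. The names `by_in` and `by_for` match the ones used in the timing code later, but the bodies here are assumptions, and `WORDS` is a tiny stand-in for ENGLISH_WORDS:

```python
# Hypothetical stand-in for ENGLISH_WORDS so the sketch runs offline.
WORDS = ["apple", "banana", "cherry", "date"]

def by_in(my_word):
    """Option 1: let Python's `in` operator scan the list."""
    return my_word in WORDS

def by_for(my_word):
    """Option 2: compare against every word with an explicit loop."""
    for word in WORDS:
        if word == my_word:
            return True
    return False

print(by_in("cherry"), by_for("cherry"))   # True True
print(by_in("elppa"), by_for("elppa"))     # False False
```

  • Both walk the list from the front, so neither takes advantage of the list being sorted.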

Time

  • We can measure how long it takes for someone to input something.
      from time import time
      start = time()
      print("After you press enter, the amount of time since this text appeared will be printed.")
      input()
      end = time()
      print(end-start)
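  • The same library also offers `perf_counter`, a clock meant for measuring short durations. This is an optional alternative to `time()`, sketched here around some made-up work instead of waiting for input:

```python
from time import perf_counter

start = perf_counter()
total = sum(range(1_000_000))   # some work worth timing
elapsed = perf_counter() - start

print(total)     # 499999500000
print(elapsed)   # seconds as a float; varies by machine
```

  • Only the difference between two `perf_counter()` readings is meaningful, which is exactly how the start and end readings are used here.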

Timing

  • We can:
    • Import time.
    • Create a list of words, say every hundredth word.
    • Create a list of the four functions.
    • Loop over each function:
      • Start the clock.
      • Check each word and its reversal.
      • Stop the clock and then print the result.

Sample Code

    from time import time

    test_set = ENGLISH_WORDS[50::100]
    funcs = [by_in, by_for, by_copy, by_index]
    for f in funcs:
        start = time()
        for word in test_set:
            f(word), f(word[::-1])
        print(f, time()-start)

Results

  • These are the results I got...
      <function by_in at 0x7aa67a763d90> 1.5537760257720947
      <function by_for at 0x7aa679869c60> 4.493199586868286
      <function by_copy at 0x7aa679915630> 1.0388541221618652
      <function by_index at 0x7aa6799156c0> 0.0052835941314697266
  • Takeaway: If you know a list is sorted, you can be (much) faster than the built-in way of doing things.
  • You can do things both (1) algorithmically correctly and (2) technically efficiently.
  • Computer science is both a mathematical and natural science.

Today

  • Computer science as an experimental science.
  • Is write-good-code relevant?

Announcements

  • Adventure Ongoing
    • You should be thinking about how to navigate between scenes
  • Advising ongoing
    • If you encounter any problem, email me immediately
    • I'll be doing triage:
      • If you don't get an email back quickly, either
        • I'm on a multi-hour trail run and/or asleep, or
        • I will be able to solve any problems you encounter non-urgently.
      • Either way, once you send the email it is not your problem.