The actual speed of an algorithm depends on exactly how it is converted to code and on details of the hardware that executes that code. Despite that, we can still obtain a rough estimate of the running time. Our goal here is to compare two different algorithms, and if the rough estimate is enough to show that one is much faster than the other, we are happy.
Consider this pseudocode, which finds the smallest element in a list L:

1   min_val = L[0]
2   for i in range(1, len(L)):
3       if L[i] < min_val:
4           min_val = L[i]
5   print(min_val)
To estimate the time, we decide for each line of code how long it takes to execute and how many times it is executed. For "basic" instructions we just say each one takes a constant amount of time. For the discussion, let n = len(L). First, line one takes time c_1 for some constant which depends on the implementation and execution environment; I can't say anything more specific than that. Line one is also executed once, so the total time "spent" on line one is just c_1. Line two consists of an increment of i and a test that i is less than len(L); together those take some constant amount of time, say c_2. Also, line two is executed n times, for a total time "spent" on line two of c_2 n. Similarly, lines 3 and 4 take constant times c_3 and c_4 and are executed n times. Therefore the total time is

c_1 + (c_2 + c_3 + c_4) n.

By the definition of big-O, this is O(n), so we say that finding the minimum element in a list runs in time O(n). We mean there is some constant c (which depends on the details of how the algorithm is implemented and executed) so that the time is at most c n.
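To see this linear behavior in practice, here is a quick sketch (my own illustration, not part of the analysis above) that times Python's built-in min, which does the same left-to-right scan as the pseudocode; doubling the length of the list should roughly double the time:

import time
import random

# Time the linear scan on lists of increasing size; the times should
# grow roughly in proportion to n.
for n in [10**6, 2 * 10**6, 4 * 10**6]:
    L = [random.randint(0, 10000) for _ in range(n)]
    start = time.time()
    min(L)  # built-in min does the same scan as the pseudocode above
    end = time.time()
    print(n, "%f seconds" % (end - start))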
Here is the selection sort from last time, in Python. Selection sort runs in time O(n^2), where n is the length of the list. The analysis is similar: each line takes some constant amount of time, but the body of the inner loop over j is run n(n-1)/2 times in total, which is O(n^2).
def selection_sort(L):
for i in range(0, len(L)):
# Find the smallest among i...len(L)
min_idx = i
for j in range(i+1,len(L)):
if L[j] < L[min_idx]:
min_idx = j
# Move smallest into position i
temp = L[min_idx]
L[min_idx] = L[i]
L[i] = temp
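As a sanity check on the n(n-1)/2 count, here is a sketch (the helper name selection_sort_count is mine, not from last time) that counts how many times the inner loop body is executed and checks it against the formula:

def selection_sort_count(L):
    # Same algorithm as above, but counts executions of the inner loop body.
    count = 0
    for i in range(0, len(L)):
        min_idx = i
        for j in range(i + 1, len(L)):
            count += 1  # one execution of the inner loop body
            if L[j] < L[min_idx]:
                min_idx = j
        L[i], L[min_idx] = L[min_idx], L[i]
    return count

import random
for n in [10, 100, 1000]:
    L = [random.randint(0, 10000) for _ in range(n)]
    print(n, selection_sort_count(L), n * (n - 1) // 2)  # the two counts match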
To see how fast this is in practice, we can time it on random lists of various sizes:

import time
import random

L = [random.randint(0, 10000) for i in range(10)]  # increase 10 to try bigger lists
start = time.time()
selection_sort(L)
end = time.time()
print("%f seconds" % (end - start))
Do the same timing with the same sizes, but with Python's built-in sorting algorithm L.sort(). Python uses an algorithm called Timsort. How big do you need to make the list before Python takes more than a second to sort it? For me it seems to be around 2 million.
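One way to set up that experiment is a sketch like this (adjust the sizes to your machine):

import time
import random

for n in [10**5, 10**6, 2 * 10**6]:
    L = [random.randint(0, 10000) for _ in range(n)]
    start = time.time()
    L.sort()  # Python's built-in Timsort
    end = time.time()
    print(n, "%f seconds" % (end - start))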
Selection sort is O(n^2) and Timsort is O(n log n). Explain using big-O notation why Timsort is much better than selection sort on large lists, no matter how the algorithms are actually implemented in code.
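To get a feel for the gap, ignoring the hidden constants, here is a back-of-the-envelope comparison of n^2 with n log2 n at n = 2 million:

import math

n = 2_000_000
print(n * n)             # about 4e12 steps for an O(n^2) algorithm
print(n * math.log2(n))  # about 4.2e7 steps for an O(n log n) algorithm
print(n / math.log2(n))  # the ratio: roughly a factor of 100,000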
For real timing of Python code, one should use the timeit module. (See here for some examples.) Convert the timing code to timeit and also explain what additional information the timeit module gives you over just using time.time().
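Here is one possible conversion (a sketch; note the L[:] copy, so each run sorts a fresh unsorted list instead of an already-sorted one):

import timeit
import random

L = [random.randint(0, 10000) for _ in range(1000)]

# timeit.repeat runs the statement `number` times per trial and returns a
# list of trial times, so you can see the run-to-run variation instead of a
# single measurement; timeit also turns off garbage collection by default.
times = timeit.repeat("selection_sort(L[:])", number=10, repeat=5,
                      globals=globals())
print(times)
print(min(times), "seconds for the fastest trial of 10 sorts")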
If you are bored, watch some of the videos from this page, or on YouTube or their website. You can find a video of a selection sort dance there, plus comparisons of some of the other sorting algorithms.