An Introduction to Threads
==========================

Although multithreading in Python provides no true parallelism,
concurrency is supported in an object oriented fashion.

Concurrency and Parallelism
---------------------------

We start with some terminology.

* **concurrency**

  Concurrent programs execute multiple tasks independently.
  For example, a drawing application, with tasks:

  * receiving user input from the mouse pointer,
  * updating the displayed image.

* **parallelism**

  A parallel program executes two or more tasks in parallel
  with the explicit goal of increasing the overall performance.
  For example: a parallel Monte Carlo simulation for :math:`\pi`,
  written with the ``multiprocessing`` module of Python.

Every parallel program is concurrent,
but not every concurrent program executes in parallel.

Another dichotomy is between processes and threads.

* At any given time, many processes are running simultaneously
  on a computer. The operating system employs *time sharing*
  to allocate a percentage of the CPU time to each process.
  Consider for example the downloading of an audio file.
  Instead of having to wait until the download is complete,
  we would like to listen sooner.

* Processes have their own memory space,
  whereas threads share memory and other data.
  Threads are often called lightweight processes.
  A thread is short for *a thread of execution*;
  it typically consists of one function.
  A program with more than one thread is *multithreaded*.

The state diagram illustrating the life cycle of a thread
is shown in :numref:`figlifethread`.

.. _figlifethread:

.. figure:: ./figlifethread.png
   :align: center

   The life cycle of a thread.

A thread is created (born) and then scheduled to run (ready).
In the running state, it can be waiting for input,
be put to sleep by the OS, or be blocked from execution,
before it terminates (dead).

Multithreading in Python
------------------------

Python provides low-level multithreading in the ``_thread`` module.
Our first multithreaded Python code will

1. Import the ``_thread`` module.

2. Start three threads using ``_thread.start_new_thread``

   * each thread will say hello and sleep for ``n`` seconds,
   * after starting the threads we must wait long enough
     for all threads to finish.

The code for our first multithreaded Python script
``hello_threads.py`` is listed below.

::

    import _thread
    from time import sleep

    def say_hello(name, nsec):
        """
        Says hello and sleeps nsec seconds.
        """
        print("hello from " + name)
        sleep(nsec)
        print(name + " slept %d seconds" % nsec)

    print("starting three threads")
    _thread.start_new_thread(say_hello, ("1st thread", 3))
    _thread.start_new_thread(say_hello, ("2nd thread", 2))
    _thread.start_new_thread(say_hello, ("3rd thread", 1))
    sleep(4) # we must wait for all to finish!
    print("done running the threads")

A session running ``hello_threads`` at the command prompt
can go as follows.

::

    $ python hello_threads.py
    starting three threads
    hello from 1st thread
    hello from 2nd thread
    hello from 3rd thread
    3rd thread slept 1 seconds
    2nd thread slept 2 seconds
    1st thread slept 3 seconds
    done running the threads
    $

As the underscore before the name ``_thread`` indicates,
working directly with this module is neither encouraged nor recommended.
To write concurrent programs in a proper object oriented manner,
consider the ``Thread`` class, exported by the ``threading`` module:

* We create new threads by inheriting from ``threading.Thread``,
  overriding the methods ``__init__`` and ``run``.
* After creating a thread object, a new thread is born.
* With ``start``, we start the thread, which then executes ``run``.

The main difference with the ``_thread`` module is the explicit
distinction between the born and the running state.
Consider running ``hello_threading``, as shown below.

::

    $ python hello_threading.py
    first thread is born
    second thread is born
    third thread is born
    starting threads
    hello from first thread
    hello from second thread
    hello from third thread
    threads started
    third thread slept 1 seconds
    second thread slept 4 seconds
    first thread slept 5 seconds
    $

The script ``hello_threading.py`` defines the class ``HelloThread``
as outlined below.

::

    import threading
    from random import randint
    from time import sleep

    class HelloThread(threading.Thread):
        """
        hello world with threads
        """
        def __init__(self, t):
            """
            initializes thread with name t
            """
        def run(self):
            """
            says hello and sleeps awhile
            """

    def main():
        """
        Starts three threads.
        """

The constructor and the run method are defined below.

::

    def __init__(self, t):
        """
        initializes thread with name t
        """
        threading.Thread.__init__(self, name=t)
        print(t + " is born ")

    def run(self):
        """
        says hello and sleeps awhile
        """
        name = self.name  # the name given to the constructor
        print("hello from " + name)
        nbr = randint(1, 6)
        sleep(nbr)
        print(name + " slept %d seconds" % nbr)

and the main function is

::

    def main():
        """
        Starts three threads.
        """
        first = HelloThread("first thread")
        second = HelloThread("second thread")
        third = HelloThread("third thread")
        print("starting threads")
        first.start()
        second.start()
        third.start()
        print("threads started")

    if __name__ == "__main__":
        main()

Producer/Consumer Relation
--------------------------

To illustrate running two different algorithms concurrently,
consider the producer/consumer relation,
a very common relation between two threads.
For example, the downloading of an audio file is production,
while listening is consumption.

Our producer/consumer relation with threads uses two objects:

* an object of the class ``Producer`` is a thread that will append
  to a queue consecutive integers in a given range and at a given pace;
* an object of the class ``Consumer`` is a thread that will pop integers
  from the queue and print them, at a given pace.

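The relation described in the two items above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the code of ``prodcons.py``: the function names ``produce``, ``consume`` and ``run_sketch`` are hypothetical, plain functions passed as ``target`` to ``threading.Thread`` take the place of subclasses, and the paces are fixed fractions of a second instead of random numbers of seconds.

```python
import threading
from time import sleep

def produce(queue, first, last, pace):
    """
    Appends the integers first, ..., last to the list queue,
    sleeping pace seconds before each append.
    """
    for i in range(first, last + 1):
        sleep(pace)
        queue.append(i)

def consume(queue, amount, pace, received):
    """
    Pops amount integers from the front of the list queue
    and stores them in received, sleeping pace seconds
    whenever the queue is empty.
    """
    for _ in range(amount):
        while True:
            try:
                received.append(queue.pop(0))
                break
            except IndexError:  # empty queue, the producer is slower
                sleep(pace)

def run_sketch():
    """
    Runs one producer and one consumer on a shared list
    and returns the integers in the order they were consumed.
    """
    shared = []    # the queue both threads refer to
    received = []  # filled by the consumer
    prod = threading.Thread(target=produce, args=(shared, 1, 3, 0.02))
    cons = threading.Thread(target=consume, args=(shared, 3, 0.01, received))
    prod.start()
    cons.start()
    prod.join()  # wait for both threads to finish
    cons.join()
    return received

if __name__ == "__main__":
    print(run_sketch())
```

Calling ``run_sketch()`` returns ``[1, 2, 3]``: the consumer polls the shared list and receives the integers in the order in which the producer appended them.
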
If the pace of the producer is slower than the pace of the consumer,
then the consumer will wait.
An illustration of the running of the code is below.

::

    $ python prodcons.py
    producer starts...
    producer sleeps 1 seconds
    consumption starts...
    consumer sleeps 1 seconds
    appending 1 to queue
    producer sleeps 4 seconds
    popped 1 from queue
    consumer sleeps 1 seconds
    wait a second...
    wait a second...
    wait a second...
    appending 2 to queue
    producer sleeps 2 seconds
    popped 2 from queue
    consumer sleeps 1 seconds
    wait a second...
    appending 3 to queue
    production terminated
    popped 3 from queue
    consumption terminated

The UML diagrams of producer and consumer are drawn
in :numref:`figumlprodcons`.

.. _figumlprodcons:

.. figure:: ./figumlprodcons.png
   :align: center

   UML diagrams for the producer and consumer classes.
   The queue in both classes refers to the same list.

The ``queue`` in each class refers to the same list:

* The producer appends to the queue.
* The consumer pops from the queue.

The class ``Producer`` is structured as follows.

::

    import threading
    from random import randint
    from time import sleep

    class Producer(threading.Thread):
        """
        Appends integers to a queue.
        """
        def __init__(self, t, q, a, b, p):
            """
            Thread t to add integers in [a, b] to q,
            sleeping between 1 and p seconds.
            """
        def run(self):
            """
            Produces integers at some pace.
            """

The constructor method is defined below.

::

    def __init__(self, t, q, a, b, p):
        """
        Thread t to add integers in [a, b] to q,
        sleeping between 1 and p seconds.
        """
        threading.Thread.__init__(self, name=t)
        self.queue = q
        self.begin = a
        self.end = b
        self.pace = p

The production is defined by the ``run`` method.

::

    def run(self):
        """
        Produces integers at some pace.
        """
        print(self.name + " starts...")
        for i in range(self.begin, self.end+1):
            nbr = randint(1, self.pace)
            print(self.name + " sleeps %d seconds" % nbr)
            sleep(nbr)
            print("appending %d to queue" % i)
            self.queue.append(i)
        print("production terminated")

The constructor and run method of the class ``Consumer``
are documented below.

::

    import threading
    from random import randint
    from time import sleep

    class Consumer(threading.Thread):
        """
        Pops integers from a queue.
        """
        def __init__(self, t, q, n, p):
            """
            Thread t to pop n integers from q,
            sleeping between 1 and p seconds.
            """
        def run(self):
            """
            Pops integers at some pace.
            """

The constructor of the class ``Consumer`` is defined below.

::

    def __init__(self, t, q, n, p):
        """
        Thread t to pop n integers from q,
        sleeping between 1 and p seconds.
        """
        threading.Thread.__init__(self, name=t)
        self.queue = q
        self.amount = n
        self.pace = p

Consuming elements is defined by the method ``run``.

::

    def run(self):
        """
        Pops integers at some pace.
        """
        print("consumption starts...")
        for _ in range(self.amount):
            nbr = randint(1, self.pace)
            print(self.name + " sleeps %d seconds" % nbr)
            sleep(nbr)
            while True:  # retry until the producer has appended
                try:
                    i = self.queue.pop(0)
                    print("popped %d from queue" % i)
                    break
                except IndexError:  # the queue is empty
                    print("wait a second...")
                    sleep(1)
        print("consumption terminated")

The code for the classes ``Producer`` and ``Consumer`` is in the modules
``classproducer`` and ``classconsumer`` respectively.
The main program is in the file ``prodcons.py``, listed below.

::

    from classproducer import Producer
    from classconsumer import Consumer

    QUE = [] # queue is shared list
    PROD = Producer("producer", QUE, 1, 3, 4)
    CONS = Consumer("consumer", QUE, 3, 1)
    PROD.start() # start threads
    CONS.start()
    PROD.join() # wait for thread to finish
    CONS.join()

Exercises
---------

1. Implement the secret guessing with client/server network programming
   of section 6.1.3 using threads.

2. Modify the producer/consumer relationship into card dealing.
   The producer is the card dealer,
   the consumer stores the received cards in a hand.

3. When running a large simulation, e.g., testing the distribution
   of a random number generator, it is useful to consider
   the evolution of the histogram.
   Design a multithreaded program where the producer generates
   random numbers that are then classified by the consumer.

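A remark that may help with the exercises: with the list-based queue of this section, the consumer has to poll and sleep while the list is empty. The standard library class ``queue.Queue`` offers a blocking ``get`` instead. The sketch below swaps in that class; it is not the approach of ``prodcons.py``. The names ``produce``, ``consume`` and ``run_with_queue``, as well as the use of ``None`` to mark the end of production, are assumptions made for this illustration.

```python
import threading
from queue import Queue
from time import sleep

def produce(que, first, last, pace):
    """
    Puts the integers first, ..., last on que,
    sleeping pace seconds before each put.
    """
    for i in range(first, last + 1):
        sleep(pace)
        que.put(i)
    que.put(None)  # assumed convention: None marks the end

def consume(que, received):
    """
    Gets items from que until the end marker arrives,
    storing them in the list received.
    """
    while True:
        item = que.get()  # blocks while the queue is empty
        if item is None:
            break
        received.append(item)

def run_with_queue(first=1, last=3, pace=0.01):
    """
    Runs one producer and one consumer on a shared Queue
    and returns the consumed integers in order.
    """
    que = Queue()
    received = []
    prod = threading.Thread(target=produce, args=(que, first, last, pace))
    cons = threading.Thread(target=consume, args=(que, received))
    prod.start()
    cons.start()
    prod.join()  # wait for both threads to finish
    cons.join()
    return received

if __name__ == "__main__":
    print(run_with_queue())
```

Because ``get`` blocks, the consumer needs neither the ``try``/``except`` loop nor a count of the items to expect; the price is the extra end marker that the producer has to send.
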