An Introduction to Threads

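Although threads in the standard CPython interpreter do not run Python code in parallel (the global interpreter lock lets only one thread at a time execute Python bytecode), concurrency is supported in an object-oriented fashion.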

Concurrency and Parallelism

We start with some terminology.

  • concurrency

    Concurrent programs execute multiple tasks independently.

    For example, a drawing application, with tasks:

    • receiving user input from the mouse pointer,
    • updating the displayed image.
  • parallelism

    A parallel program executes two or more tasks in parallel with the explicit goal of increasing the overall performance.

    For example: a parallel Monte Carlo simulation for \(\pi\), written with the multiprocessing module of Python.

Every parallel program is concurrent, but not every concurrent program executes in parallel.

Another dichotomy is between processes and threads.

  • At any given time, many processes are running simultaneously on a computer. The operating system employs time sharing to allocate a percentage of the CPU time to each process.

    Consider for example the downloading of an audio file. Instead of having to wait till the download is complete, we would like to listen sooner.

  • Processes have their own memory space, whereas threads share memory and other data. Threads are often called lightweight processes. The word thread is short for thread of execution; a thread typically consists of one function.

    A program with more than one thread is multithreaded.
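To illustrate that threads share memory, the following minimal sketch lets three threads append to one and the same list. The names shared and record are chosen for this illustration only. The order in which the threads append may differ between runs, so the list is sorted before printing.

```python
import threading

shared = []  # one list object, visible to every thread in this process

def record(label):
    """Appends label to the shared list."""
    shared.append(label)

workers = [threading.Thread(target=record, args=("thread %d" % i,))
           for i in range(3)]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()   # wait until each thread has finished

print(sorted(shared))
```

Separate processes would each work on their own copy of the list, so the same experiment with processes would need explicit communication between them.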

The state diagram illustrating the life cycle of a thread is shown in Fig. 90.

_images/figlifethread.png

Fig. 90 The life cycle of a thread. A thread is created (born) and then scheduled to run (ready). In the running state, it can be waiting for input, be put to sleep by the OS, or be blocked from execution, before it terminates (dead).
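Part of this life cycle can be observed from Python. The sketch below uses the is_alive method of threading.Thread, which reports only whether a thread has been started and has not yet terminated; it does not distinguish the ready, running, and waiting states of the diagram. The function nap and the variable names are hypothetical.

```python
import threading
from time import sleep

def nap():
    """Sleeps briefly so the thread stays alive for a moment."""
    sleep(0.2)

worker = threading.Thread(target=nap)
born = worker.is_alive()      # created but not yet started
worker.start()
running = worker.is_alive()   # started and still sleeping
worker.join()
dead = worker.is_alive()      # the function nap has returned
print(born, running, dead)
```

With the short sleep used here, the three values should be False, True, and False.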

Multithreading in Python

Python provides low-level multithreading in the _thread module. Our first multithreaded Python script will

  1. Import the _thread module.
  2. Start three threads using _thread.start_new_thread:
    • each thread will say hello and sleep for n seconds,
    • after starting the threads, we must wait long enough for all threads to finish.

The code for our first multithreaded Python script hello_threads.py is listed below.

import _thread
from time import sleep

def say_hello(name, nsec):
    """
    Says hello and sleeps nsec seconds.
    """
    print("hello from " + name)
    sleep(nsec)
    print(name + " slept %d seconds" % nsec)

print("starting three threads")
_thread.start_new_thread(say_hello, ("1st thread", 3))
_thread.start_new_thread(say_hello, ("2nd thread", 2))
_thread.start_new_thread(say_hello, ("3rd thread", 1))
sleep(4)  # we must wait for all to finish!
print("done running the threads")

A session running hello_threads at the command prompt can go as follows.

$ python hello_threads.py
starting three threads
hello from 1st thread
hello from 2nd thread
hello from 3rd thread
3rd thread slept 1 seconds
2nd thread slept 2 seconds
1st thread slept 3 seconds
done running the threads
$
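Waiting a fixed four seconds works only because the longest sleep is known in advance. One alternative, sketched here under the assumption that each thread can be handed a lock as an extra argument, is to let every thread release its lock when it is done. The main thread then acquires each lock again, which blocks until the corresponding thread has finished. The extra parameter done and the shortened sleep times are choices made for this illustration.

```python
import _thread
from time import sleep

def say_hello(name, nsec, done):
    """
    Says hello, sleeps nsec seconds, and releases
    the lock done to signal that it has finished.
    """
    print("hello from " + name)
    sleep(nsec)
    print(name + " slept %.1f seconds" % nsec)
    done.release()

locks = []
for (name, nsec) in [("1st thread", 0.3), ("2nd thread", 0.2),
                     ("3rd thread", 0.1)]:
    done = _thread.allocate_lock()
    done.acquire()              # held until the thread releases it
    locks.append(done)
    _thread.start_new_thread(say_hello, (name, nsec, done))

for done in locks:
    done.acquire()              # blocks until that thread has released
print("done running the threads")
```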

As the underscore before the name _thread indicates, working directly with this module is discouraged. To write concurrent programs in a proper object-oriented manner, consider the Thread class, exported by the threading module:

  • We create new threads by inheriting from threading.Thread, overriding the methods __init__ and run.
  • After creating a thread object, a new thread is born.
  • With start, we start the thread; start arranges for run to be executed in the new thread.

The main difference from the _thread module is the explicit distinction between the born and the running state. Consider running hello_threading, as shown below.

$ python hello_threading.py
first thread is born
second thread is born
third thread is born
starting threads
hello from first thread
hello from second thread
hello from third thread
threads started
third thread slept 1 seconds
second thread slept 4 seconds
first thread slept 5 seconds
$
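The born and running states seen in this session correspond to creating a thread object and calling its start method. Calling run directly does not start a thread: it is an ordinary method call, executed by the calling thread. The minimal sketch below makes this visible with a hypothetical class WhereAmI, which is not part of hello_threading.py and only records the name of the thread that executes run.

```python
import threading

class WhereAmI(threading.Thread):
    """Records the name of the thread that executes run."""
    def __init__(self, t):
        threading.Thread.__init__(self, name=t)
        self.ran_in = None

    def run(self):
        self.ran_in = threading.current_thread().name

direct = WhereAmI("worker A")
direct.run()      # plain method call, executes in the calling thread
print(direct.ran_in)

started = WhereAmI("worker B")
started.start()   # creates a new thread, which then calls run
started.join()
print(started.ran_in)
```

The first line printed is the name of the calling thread (MainThread when the code runs as a script), the second is worker B.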

The script hello_threading.py defines the class HelloThread as outlined below.

import threading
from random import randint  # used in run
from time import sleep      # used in run

class HelloThread(threading.Thread):
    """
    hello world with threads
    """
    def __init__(self, t):
        """
        Initializes the thread with name t.
        """
    def run(self):
        """
        Says hello and sleeps a while.
        """

def main():
    """
    Starts three threads.
    """

The constructor and the run method are defined below.

def __init__(self, t):
    """
    Initializes the thread with name t.
    """
    threading.Thread.__init__(self, name=t)
    print(t + " is born")

def run(self):
    """
    Says hello and sleeps a while.
    """
    name = self.name  # getName() is deprecated since Python 3.10
    print("hello from " + name)
    nbr = randint(1, 6)
    sleep(nbr)
    print(name + " slept %d seconds" % nbr)

The main function is defined below.

def main():
    """
    Starts three threads.
    """
    first = HelloThread("first thread")
    second = HelloThread("second thread")
    third = HelloThread("third thread")
    print("starting threads")
    first.start()
    second.start()
    third.start()
    print("threads started")

if __name__ == "__main__":
    main()
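In the session shown earlier, "threads started" is printed while the threads are still sleeping, because main does not wait for them. If the main thread has to continue only after all threads have terminated, it can call join on each thread. The sketch below assumes a hypothetical class Napper as a stand-in for HelloThread, with naps of a few tenths of a second to keep the run short.

```python
import threading
from random import randint
from time import sleep

class Napper(threading.Thread):
    """Hypothetical stand-in for HelloThread with short naps."""
    def __init__(self, t):
        threading.Thread.__init__(self, name=t)
        self.nap = None

    def run(self):
        self.nap = randint(1, 3) / 10.0   # tenths of a second
        sleep(self.nap)

def main():
    """Starts three threads and waits for all of them."""
    threads = [Napper(t) for t in ("first", "second", "third")]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()   # returns once that thread has terminated
    print("all threads finished")
    return threads

if __name__ == "__main__":
    main()
```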

Producer/Consumer Relation

As an illustration of running two different algorithms concurrently, consider the producer/consumer relation, a very common relation between two threads. For example, the downloading of an audio file is production, while listening is consumption.

Our producer/consumer relation with threads uses two classes:

  • an object of the class Producer is a thread that appends consecutive integers in a given range to a queue, at a given pace;
  • an object of the class Consumer is a thread that pops integers from the queue and prints them, at a given pace.

If the pace of the producer is slower than the pace of the consumer, then the consumer will wait. An illustration of a run of the code is below.

$ python prodcons.py
producer starts...
producer sleeps 1 seconds
consumption starts...
consumer sleeps 1 seconds
appending 1 to queue
producer sleeps 4 seconds
popped 1 from queue
consumer sleeps 1 seconds
wait a second...
wait a second...
wait a second...
appending 2 to queue
producer sleeps 2 seconds
popped 2 from queue
consumer sleeps 1 seconds
wait a second...
appending 3 to queue
production terminated
popped 3 from queue
consumption terminated

The UML diagrams of the producer and the consumer are drawn in Fig. 91.

_images/figumlprodcons.png

Fig. 91 UML diagrams for the producer and consumer classes. The queue in both classes refers to the same list.

The queue in each class refers to the same list:

  • The producer appends to the queue.
  • The consumer pops from the queue.
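Popping from the front of a Python list with pop(0) shifts all remaining elements, so its cost grows with the length of the list. For long queues, a collections.deque is a possible replacement: it supports append and popleft in constant time, and its documentation describes appends and pops from either end as thread-safe. The sketch below is single-threaded and only shows the two operations; it is an alternative to, not a description of, the list used in prodcons.py.

```python
from collections import deque

queue = deque()            # would be shared by producer and consumer
for i in range(1, 4):
    queue.append(i)        # what the producer does

popped = []
while True:
    try:
        popped.append(queue.popleft())   # what the consumer does
    except IndexError:     # raised when the deque is empty
        break
print(popped)   # [1, 2, 3]
```

Like list.pop(0), popleft raises an IndexError on an empty queue, so a consumer that waits by catching that exception would need no other change.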

The class Producer is structured as follows.

import threading
from random import randint  # for the random pace in run
from time import sleep

class Producer(threading.Thread):
    """
    Appends integers to a queue.
    """
    def __init__(self, t, q, a, b, p):
        """
        Thread t to add integers in [a, b] to q,
        sleeping between 1 and p seconds.
        """
    def run(self):
        """
        Produces integers at some pace.
        """

The constructor method is defined below.

def __init__(self, t, q, a, b, p):
    """
    Thread t to add integers in [a, b] to q,
    sleeping between 1 and p seconds.
    """
    threading.Thread.__init__(self, name=t)
    self.queue = q
    self.begin = a
    self.end = b
    self.pace = p

Production is defined by the run method.

def run(self):
    """
    Produces integers at some pace.
    """
    # self.name replaces getName(), deprecated since Python 3.10
    print(self.name + " starts...")
    for i in range(self.begin, self.end+1):
        nbr = randint(1, self.pace)
        print(self.name + " sleeps %d seconds" % nbr)
        sleep(nbr)
        print("appending %d to queue" % i)
        self.queue.append(i)
    print("production terminated")

The constructor and run method of the class Consumer are documented below.

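import threading
from random import randint  # for the random pace in run
from time import sleep

class Consumer(threading.Thread):
    """
    Pops integers from a queue.
    """
    def __init__(self, t, q, n, p):
        """
        Thread t to pop n integers from q,
        sleeping between 1 and p seconds.
        """
    def run(self):
        """
        Pops integers at some pace.
        """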

The constructor of the class Consumer is defined below.

def __init__(self, t, q, n, p):
    """
    Thread t to pop n integers from q,
    sleeping between 1 and p seconds.
    """
    threading.Thread.__init__(self, name=t)
    self.queue = q
    self.amount = n
    self.pace = p

Consuming elements is defined by the method run.

def run(self):
    """
    Pops integers at some pace.
    """
    print("consumption starts...")
    for _ in range(self.amount):
        nbr = randint(1, self.pace)
        print(self.name + " sleeps %d seconds" % nbr)
        sleep(nbr)
        while True:
            try:
                item = self.queue.pop(0)
                print("popped %d from queue" % item)
                break
            except IndexError:  # the queue is empty
                print("wait a second...")
                sleep(1)
    print("consumption terminated")
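The loop above polls the list and sleeps whenever it is empty. As an alternative, the standard library class queue.Queue offers a get method that blocks until an item is available, so no polling is needed. The following is a minimal sketch of that alternative, not the design used in prodcons.py; the functions produce and consume, their parameters, and the fixed sleep of 0.1 seconds are assumptions made for the illustration.

```python
import queue
import threading
from time import sleep

def produce(que, begin, end):
    """Puts the integers begin, ..., end on que."""
    for i in range(begin, end + 1):
        sleep(0.1)              # stand-in for the random pace
        que.put(i)

def consume(que, amount, result):
    """Gets amount integers from que and stores them in result."""
    for _ in range(amount):
        result.append(que.get())   # blocks until an item is available

que = queue.Queue()
result = []
prod = threading.Thread(target=produce, args=(que, 1, 3))
cons = threading.Thread(target=consume, args=(que, 3, result))
prod.start()
cons.start()
prod.join()
cons.join()
print(result)   # [1, 2, 3]
```

With one producer and one consumer, the first-in first-out order of the queue guarantees that the integers arrive in the order in which they were produced.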

The code for the classes Producer and Consumer is in the modules classproducer and classconsumer, respectively. The main program is in the file prodcons.py, listed below.

from classproducer import Producer
from classconsumer import Consumer

QUE = []     # the queue is a list shared by both threads
PROD = Producer("producer", QUE, 1, 3, 4)
CONS = Consumer("consumer", QUE, 3, 1)
PROD.start()  # start threads
CONS.start()
PROD.join()   # wait for thread to finish
CONS.join()

Exercises

  1. Implement the secret guessing with client/server network programming of section 6.1.3 using threads.
  2. Modify the producer/consumer relationship into card dealing. The producer is the card dealer, the consumer stores the received cards in a hand.
  3. When running a large simulation, e.g. testing the distribution of a random number generator, it is useful to consider the evolution of the histogram. Design a multithreaded program where the producer generates random numbers that are then classified by the consumer.