A parallel program is one that uses multiple pieces of computational hardware to perform a computation more quickly. The aim is to finish sooner by delegating parts of the computation to different pieces of hardware that execute at the same time. Parallel programming is primarily focused on efficiency. This page goes into a ton of detail (way more than I discussed in class). I suggest you at least glance through the overview at the beginning and then the examples; the heat equation example, for instance, is instructive.
Concurrency is a programming technique in which there are multiple blocks of control within a single program. In our minds we mostly think of these blocks of code as executing "at the same time", meaning that the effects of the blocks are interleaved. Whether the blocks actually execute at the same time is an implementation detail; the other option is that the blocks alternate every 1/1000 of a second, giving the appearance of "at the same time". Concurrent programming is primarily focused on structuring a program that has to deal with multiple independent external agents: for example a database, the user, and network sites.
I want to stress the difference between parallel programming and concurrent programming. Parallel programming is interested in the deterministic evaluation of a problem as quickly as possible. Deterministic here means that the same result should be computed every time the program runs. Concurrency, by contrast, is a programming technique to manage non-determinism: a concurrent program will execute differently and may produce different results every time it is run, because it depends on external actors or agents.
One simple method to implement parallelism is to use multiple physical computers communicating with each other over the network. Each computer runs a program designed using the features we have discussed so far this semester. See Distributed computing.
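As a taste of what this looks like, here is a minimal sketch of two cooperating programs using python's standard socket module. The port number and the one-number "protocol" are made up for illustration; real distributed programs need far more care with errors and message formats.

```python
# worker.py -- a separate program (possibly on another machine) that waits
# for a number, squares it, and sends the result back.
import socket

with socket.create_server(("", 9999)) as server:   # port 9999 is arbitrary
    conn, addr = server.accept()
    with conn:
        n = int(conn.recv(1024).decode())
        conn.sendall(str(n * n).encode())

# coordinator.py -- a second program that delegates a piece of the
# computation to the worker over the network.
import socket

with socket.create_connection(("localhost", 9999)) as conn:
    conn.sendall(b"12")
    print("worker computed:", conn.recv(1024).decode())
```

Each program is an ordinary sequential python program; the parallelism comes entirely from running the two of them at the same time.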
Two events are concurrent if we cannot tell by looking at the program which will happen first. Note that it is tempting to think of concurrent events as happening at the same time, but this is incorrect.
Because of OS scheduling, the statements in two different programs running on the same computer are concurrent: we cannot tell by looking at the programs how their statements will interleave.
We can also introduce concurrency within a single program. Normally there is no concurrency inside a program: statements are executed one after another. But sometimes we have statements that could be executed concurrently (that is, we do not care about the order in which they run).
For example, GIMP has some statements that control the user interface and other statements that implement image operations like red-eye removal, resizing, etc. Another example is a web server, where the statements processing a single request must happen in order, but multiple requests can be handled concurrently (i.e. we do not care how the processing of different requests is interleaved).
By telling python (which tells the kernel) about some concurrency, we obtain the appearance that the blocks happen at the same time thanks to operating system scheduling. This creates the appearance to the user that the UI and the image manipulation are happening at the same time: the image manipulation runs for a short time, is interrupted, the UI code runs for a short time, is interrupted, the image manipulation runs again, and so on.
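Here is a minimal sketch of telling python about concurrency with the standard threading module (we will see better techniques below). The two functions are hypothetical stand-ins for GIMP-style code.

```python
# A sketch of declaring concurrency in python with the threading module.
# The function names are made-up stand-ins for UI and image-editing code.
import threading
import time

def manipulate_image():
    for step in range(5):
        time.sleep(0.1)          # pretend to do a slice of red-eye removal
        print("image work, step", step)

def run_user_interface():
    for event in range(5):
        time.sleep(0.1)          # pretend to handle a UI event
        print("UI event", event)

# Tell python (and through it, the kernel) that these two blocks are
# concurrent: we do not care how their statements interleave.
worker = threading.Thread(target=manipulate_image)
worker.start()
run_user_interface()             # the UI keeps responding while work proceeds
worker.join()
```

Run it and the two sets of print statements interleave; run it again and they may interleave differently. That unpredictability is exactly the non-determinism concurrency asks us to manage.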
We can implement distributed computing without any major new features: we just write programs that communicate with each other over the network. But for other types of parallel computing, and for concurrency, we need some assistance from the kernel and/or the CPU.
Until a few years ago, in popular languages like C, C++, Java, C#, Objective-C, Perl, Python, and others, the main way to tell the language about concurrency was a technique called threaded programming. Threaded programming is tremendously difficult and is easily the number one source of bugs (the most infamous is the Therac-25, a radiation therapy machine whose concurrency bugs caused six massive overdoses, some of them fatal).
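To get a feel for why threads are so error-prone, here is a minimal sketch of the classic "lost update" race condition. The function name is made up, and the time.sleep(0) call is contrived: it simply invites the operating system to switch threads at the worst possible moment, which makes the bug show up reliably instead of once in a million runs.

```python
# A sketch of the classic "lost update" race that makes threads so hard.
import threading
import time

counter = 0

def deposit_many_times():
    global counter
    for _ in range(1000):
        current = counter        # read the shared value ...
        time.sleep(0)            # ... get interrupted by another thread ...
        counter = current + 1    # ... and write back a stale result

threads = [threading.Thread(target=deposit_many_times) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# 4 threads x 1000 increments "should" give 4000, but the interleaved
# reads and writes overwrite each other and the total comes out short.
print("expected 4000, got", counter)
```

The frightening part is that without the sleep(0) the code looks correct, usually runs correctly, and only fails under rare timing conditions.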
Over the years, safety-critical code (like code run by NASA, medical devices, fighter jets, telecommunications for 911, etc.) has been written in programming languages like Ada, Erlang, and Haskell, which have much better ways of specifying concurrency than threads. In the last four or five years, the popular programming languages (including python) have (finally) been adding language features which replace threaded programming: they are easier to use but still provide similar benefits. So as new programmers, I suggest you avoid threads if possible and learn these new techniques. There are too many to mention here, but in MCS 275 we spend a few weeks on this. If you are interested in threads, this guide is very good (and also highlights the complexity of threads).
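In python's case, the main such feature is the async/await syntax built on the asyncio standard-library module. Here is a minimal sketch; the fetch_page function is a hypothetical stand-in for real network code.

```python
# A minimal sketch of python's newer concurrency features: async/await.
# The two tasks interleave only at the marked await points, so there are
# no surprise thread switches in the middle of a statement.
import asyncio

async def fetch_page(name, delay):
    await asyncio.sleep(delay)   # stand-in for waiting on the network
    print("finished", name)

async def main():
    # Run both "downloads" concurrently; the total time is about 2
    # seconds, not 3, because the waits overlap.
    await asyncio.gather(fetch_page("page A", 2), fetch_page("page B", 1))

asyncio.run(main())
```

The key improvement over threads is that the possible interleaving points are written explicitly in the code (the await keywords), rather than being able to occur between any two statements.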
As transistors have gotten smaller, chip designers like Intel and AMD (and others) have realized that they can pack more transistors into a given area, but there isn't a way to make a single fetch/decode/execute circuit any faster: adding more transistors to a fetch/decode/execute circuit actually slows it down, because the electrical signals need to travel further. Therefore, they have simply put multiple fetch/decode/execute circuits into the same package. Each fetch/decode/execute circuit is designed like the ones we discussed and is called a core.
We can take advantage of multiple cores in the following ways:
Just as in distributed computing, we can simply run multiple programs. For example, if we are doing a weather simulation on a computer with 8 cores, we can run 8 copies of the program. These 8 programs can communicate with each other over the network to distribute the work. The benefit here is that the same techniques used for writing a single program continue to apply.
If a program has told the kernel about some concurrency within itself (like my example of a photo manipulation program), the kernel can, at its discretion, run the multiple blocks at the same time. But multiple blocks executing at the same time on different cores should be viewed as a side benefit; concurrency is equally useful on a computer with only a single core. In fact, the implementation in python does not run concurrent blocks at the same time (this is python's "global interpreter lock"): only a single block is ever running at any point in time, it is just that the blocks might switch back and forth every 1/1000 of a second. Therefore, a single python program cannot take advantage of multiple cores; you must use more than one program to do so, as in the sketch below. Other languages besides python relax this restriction.
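The standard library's multiprocessing module automates this "more than one program" approach: it launches several separate python processes behind the scenes and farms work out to them. A minimal sketch, where slow_square is a made-up CPU-bound computation:

```python
# A sketch of using more than one python process to occupy multiple cores.
# multiprocessing starts separate python programs behind the scenes, so
# each has its own interpreter and can run on its own core.
from multiprocessing import Pool

def slow_square(n):
    return sum(i * i for i in range(n))  # CPU-bound stand-in computation

if __name__ == "__main__":
    with Pool() as pool:                 # one worker process per core by default
        results = pool.map(slow_square, [10**6] * 8)
    print(results)
```

On an 8-core machine the eight computations can genuinely run at the same time, something eight threads inside a single python program cannot do.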
Read the first chapter of The Little Book of Semaphores (6 pages, available for free online). You don't have to turn anything in, and we won't check that you actually read it, but the content might appear on the final exam.