It’s easy to think about code execution on a single timeline. Code executes in the order it appears in the source. It’s harder to think about code which executes concurrently. What are the benefits? What are the trade-offs? What extra steps need consideration? When should you write concurrent code instead of single threaded code? In this article, I will attempt to answer these questions, and I will attempt to explain my answers as if I were talking to a five year old.

Basic Concept

In order to explain concurrency, let’s imagine that there is some large job to be done: counting toys in a house. Let’s also imagine that there’s a person in charge of doing this job. Let’s call her Tina. Tina is really good at counting toys, so she can count all the toys in a house in ten minutes. If she starts counting a house at 1:00, she’ll be done at 1:10.

Now let’s imagine that there are two houses full of toys. If Tina starts at 1:00, she’ll take ten minutes to count each house and be done at 1:20. If there are three houses, she’ll be done at 1:30.

Tina can only count one house at a time, and it always takes her ten minutes. But what if she has a friend help her? Tina has a friend named Tamara. Tamara can also count all the toys in a house in ten minutes. So, if there are two houses and they each start counting toys in a different house at 1:00, they’ll both be done at 1:10. If Tina did it by herself, she’d be done at 1:20, but with Tamara, she can finish at 1:10. It’s faster when she works with a friend.

Tina and Tamara are analogous to threads. Some jobs are divisible into parts. (Each house is a “part” here. In a computer, it might be processing a chunk of data.) When jobs are divisible, sometimes it makes sense to split the work among multiple threads in order to speed up the completion time overall. (Note that not ALL jobs are divisible. Think of baking a cake. Some things have to be done before the others. If you bake the ingredients before you mix them, no one will like the resulting cake. Baking and mixing simultaneously will also be a bad time for all. Part of what makes a job divisible is that the parts are order-independent.)

Synchronization and Immutability

Let’s add in a supervisor who constantly asks for updates on the toy counting, Tabatha. Tabatha shows up once and a while at unexpected times and wants to know the count of toys.

Let’s also state that Tina and Tamara have a strong distrust of their own short term memories and so when they count toys, they always use a piece of paper and a pencil, immediately marking the papers when they count a new toy. They both use the same method, writing a two digit number on the paper, and then erasing it when it needs to change. Before one of them even starts counting, she writes in “00” since that’s how many toys she has counted so far. Every time she finds a new toy, she erases the number on the page (one digit at a time) and writes in a new number. When she counts the first toy, she erases “00” and writes “01”. When she counts the second toy, she erases “01” and writes “02”.

Tabatha is amenable to this method. She used to count houses herself before becoming a supervisor, and used the same method back then.

Let’s imagine that there are 22 toys in the house that Tina is counting. Tina started counting at 1:00. Tabatha shows up at 1:09, when Tina is mostly done, and wants to see how many toys Tina has counted so far. On the paper, there is the number “19”. But at the moment Tabatha showed up, Tina counted the 20th toy. Tabatha reads slowly and Tina writes quickly. So, Tabatha reads the first digit of “19”, which is “1” for the tens digit. But before she reads the “9” in “19”, Tina changes both digits so that the number says “20”. Then Tabatha reads the “0” in “20”. Tabatha has read a “1” (in “19”) and a “0” (in “20”). She puts the two digits together and thinks that Tina has counted only 10 toys. Tabatha is now upset because she knows there is something wrong with this, but doesn’t know what happened. We’ll call this the “19/20 problem”.

Here’s an even worse situation. Tina and Tamara are counting their two respective houses, but today they are sharing one piece of paper. The paper says “20” because together, Tina and Tamara have counted 20 toys so far. Tina then counts 1 new toy and Tamara counts 1 new toy at the same time. They both go to write to the paper at the same time, but they don’t wait for each other. Tamara gets there first, sees the “20”, adds 1 in her head, and writes “21”. But before she finishes writing the “1” in “21”, Tina sees the “20”, adds 1 in her head, and writes “21”. Since Tamara started first, she finishes writing first. When Tina finishes writing, the paper says “21”, but it should say “22”. The paper is wrong and since Tina and Tamara are not keeping totals in their heads, the overall total will be wrong. We’ll call this the “21/22 problem”.

Both the 19/20 and 21/22 problems are the sorts of thing that can happen when two threads access the same region of memory at the same time. There are basically two solutions to this.

Synchronization is where everyone agrees in advance that only one person can be using the totals paper at the same time. This resolves the data integrity problems, but can also result in deadlocks if implemented incorrectly.

What is a deadlock? Suppose that in order to avoid the 21/22 scenario, Tina and Tamara agree that only one person can hold the pencil at a time, and the same for the paper: only one person at a time. This way, no one can make the mistake of looking at a partially written number. Suppose Tina picks up the pencil and at the same time, Tamara picks up the paper. Neither of them will be able to finish her own respective task, and they will wait for each other indefinitely because of their agreement. In this example, the deadlock could have been avoided by treating the paper and pencil as a single unit.

There is another problem with synchronization: it causes contention. What if Tabatha wants to know how many toys have been counted before Tina and Tamara are finished? Using synchronization (properly), only one person is allowed to read or write to a paper at a time. So, if Tabatha asks Tina and Tamara for their papers, they must both wait for Tabatha to finish doing her totals before they can start counting again. This wastes time.

Another solution is immutability. Instead of handing Tabatha their papers, Tina and Tamara each stop counting for a moment, make a(n ink) copy of their counting paper, and give it to Tabatha so she can do her totals. Making copies instead of using the originals allows Tina and Tamara to resume counting faster since they don’t have to wait for Tabatha to do her totals. Making the copies in ink instead of pencil is called immutability, and doing this ensures that the 21/22 and 19/20 -type errors don’t happen since no one is overwriting anything on the same paper. It also avoids deadlock, since again, no one is writing on the same paper.

In programmer-speak, any mutability should be isolated to a single process.

Trade-Offs

What if when Tina calls for Tamara, Tamara takes 60 minutes to arrive? In that case, it doesn’t make any sense to even call Tamara unless there are 8 or more houses, since Tina will have counted the first 6 by herself by the time Tamara gets there.

Likewise, what if making an ink copy uses a whole pen and Tina has to run to the store for a new pen each time Tabatha asks for an ink copy? Tabatha would be better off just waiting until the end to receive the subtotals unless there are very many houses.

There is overhead to setting up and maintaining threads. It is not always better to use many threads because the overhead may be heavier than the task itself.

Summary

Threads are like little people in a computer who can only do one sequential thing at a time. “Thread safety” means handling race conditions like 19/20 and 21/22 via synchronization or immutability. Neither solution is inherently better (as discussed in the trade-offs section). Determining which to use depends on the situation.

Different tasks are better off single threaded or multithreaded, depending on their characteristics. Tasks which are divisible, heavy enough to outweigh thread overhead, and order-independent are the best candidates for multithreading.

Concurrency can be a little bit tricky, but when done right, the rewards are substantial.