Multithreading and Synchronization
What is that?I assume that most of you know what a thread is. Basically it's a mechanism that allows code to be executed in a parallel fashion which helps speed up programs by making use of hardware intended for this purpose (such as multi-core CPUs).
How do I use it?It depends on the programming language and what kind of API you have at your disposal, but the basic idea is that you create a thread which will execute a portion of your code. There is something called the "main thread" which is basically the thread your code is running on initially. Usually code does not run on more than one thread without you specifying it explicitely, but some OSes try to make use of threads by default (like Android) in order to speed things up on their end. From that "main thread" you can then spawn new threads and execute code in a parallel fashion.
What are the problems?The first and obvious problem is that you have to design your program in a different way in order to make use of multi-threading. Often code is interdependent and you can't run stuff in a parallel fashion.
In this code example the value of y is directly influenced by x, so they cannot run in parallel.
This second example is different. y still depends on x and so does a, but a does not depend on y. That means that technically line 2 and 3 could run in parallel.
int x = 1;
int y = x + 1;
int a = x + 1;
Keep in mind that this is a very simplified model to explain what kind of code can be made into parallel code. Usually people put audio streaming on another thread so there are no nasty problems with I/O and slow downs of the main thread, because the audio needs to be kept updated. Even though you may not know it, but threading and parallelism in computing has been around for a long time already. If you take a look at some older audio APIs, you will notice that often audio data is fed to an audio driver which then takes care of actually playing.
What is synchronization and why do we need it?Imagine this: We have a program that reads a file in another thread and we'd like to display a progress bar.
progress = 0.0; // variable used by both threads!
start_file_reading_thread();
while (progress < 1.0)
{
render_progress_bar(progress);
}
file = open_file();
while (can_read_file(f))
{
read_data(f);
progress = get_progress(f);
}
This example is not a problem, because one thread writes the value of progress while the other one keeps reading it. As long as you have only one thread writing a value, there is no real need for synchronization if the values are "simple". Reading a progress like this is ok.
Things get more complicated when more than one thread wants to change a value or do something like that (or change complex data which I will explain in the second example).
Imagine now that we have a complex program with lots of threads. The simplest example where synchronization is needed would be if all threads write some sort of output log on how the program is doing.
print("Hello, I am thread #X"); // let's say that X get substituted with a thread ID
Now let's say that due to how the OS manages CPU time one thread is a bit faster and gets through some code, but then gets stuck for a moment, because the CPU time is allocated to another program while your other thread keeps doing its thing. The output ends up looking like this:
Quote from: output
Hello, I am threHello, I am thread #12ad #11.
Why did this happen? Because there was no synchronization. Often times you don't want things to get executed in parallel, because the threads are meddling with shared data at that point and you need to make sure that there is some sort of queue. First one thread does its thing and then another one is allowed to do its thing and all of that while the other threads are waiting. This is often called either "entering a critical section" or "accessing shared data". While in the critical section, a thread has to be guaranteed that this code block will be executed "atomically". This basically means "all at once without interruption".
There is a somewhat different example, but let's say you have some sort of complex array structure and you call a function which adds a new element to the array and updates the internal size variable accordingly. Imagine that the thread gets interrupted after adding the value, but before updating the size value. If another thread tries to read that data, it will be inconsistent and wrong. Adding synchronization here would be overkill and can slow down a program a lot, because the synchronization mechanism takes time. But synchronization is still needed. A possible solution is to use a mutex lock around adding a new element to the array and when reading it in the other place to prevent inconsistent data. And this takes us to...
How do we synchronize threads?There is something called a "mutex". I might be talking out of my ass here, but I think it was derived from the term "mutual access" which basically means shared data. Let's take the printing example from before and sync it properly.
mutex.lock();
print("Hello, I am thread #X");
mutex.unlock();
The first thread will lock the mutex and then continue. If it finishes and unlocks the mutex before the second thread tries to lock it, the second thread will do the same as the first thread (lock, print, unlock). If not, the second thread will be suspended until the first thread unlocks the mutex thus ensuring atomic execution.
What else is there to consider?When your code is compiled and executed, it looks a lot different than you may think. One line in your code can end up being several instructions on the CPU and a thread can be switched at any time, even during the execution of an instruction. Mutex locks and unlocks are guaranteed to be atomic and they will not be interrupted by another thread. So you can safely assume that if you use mutexes, the threads will be synchronized properly.
Another thing to consider is a deadlock. The simplest example of a deadlock is when a thread locks itself out.
mutex.lock();
mutex.lock(); // will get stuck here forever
A more complex example with 2 threads would be this:
mutex_for_x.lock();
mutex_for_y.lock(); // can get stuck here if thread 2 executes its line 1 first
x += 1;
y -= 1;
mutex_for_y.unlock();
mutex_for_x.unlock();
mutex_for_y.lock();
mutex_for_x.lock();
x += 1;
y -= 1;
mutex_for_x.unlock();
mutex_for_y.unlock();
If the execution order is:
- thread 1 line 1
- thread 2 line 1
- thread 1 line 2
- thread 2 line 2
then both threads will get stuck, because they locked each other out with the 2 mutexes.
Well, that seems simple enoughIn theory, yes. But in practice you often have to consider a lot of other different things. One bug we used to get is a deadlock, because we threw an exception without unlocking the mutex first. The code would jump to a different place on the stack, the mutex would stay locked and suddenly there was no audio. We also had a deadlock within a class when we called the wrong method. Basically we had to model our entire class structure in a way that there is an external interface which uses a mutex lock/unlock pair while internally we had to call internal functions to avoid a deadlock.
void AudioManager::stop(bool pause)
{
this->mutex.lock();
this->_stop(pause);
this->mutex.unlock();
}
void AudioManager::pause()
{
this->mutex.lock();
this->_stop(true); // if we called stop() directly, it would result in a deadlock
this->mutex.unlock();
}
void AudioManager::_stop(bool pause)
{
...
}
This would be about what we had to do back then. While this may not look so complicated, that's actually not what we did wrong. The problem was that we have 2 classes that interoperate within the critical section. Since we can't expose internal methods to the outside to prevent bugs, we basically had to use the "friend class" feature of C++ to get these two classes (AudioManager and AudioPlayer) to meddle with each other's internal methods and data.
Things get even more complicated when you introduce callback functions or delegates that are called from another thread, especially if you have to synchronize everything, because they change shared data. One way to make things easier is queuing calls or data in threads in a synchronized manner so that the main thread can poll for data (which is obviously also synchronized) and hide the implementation detail that the other stuff might be running in another thread which makes it easier to make single-threaded applications that depend on external multi-threading functionality.
Last few wordsDebugging threads is nasty, because it's not deterministic. During each execution the threads may be executed in a different manner and order. It pays off to sit down and think things through very well before starting the implementation, because bugs are inevitable and the fewer bugs there are in the initial threading code, the better. This is one thing I kept postponing while working on RMX-OS which ended up causing occasional deadlocks of the server. In v2.0 I redesigned the entire threading model and put mutexes everywhere to ensure that shared data is accessed properly.
This is just a guide to help understand threading and mutexes and how important synchronization is. You should look up on other resources about this topic if you are interested.