Author Topic: Multithreading and Synchronization  (Read 10308 times)

Offline Blizzard

  • This sexy
  • Administrator
  • has over 9000 posts
  • *****
  • Posts: 19982
  • LV: 648
  • Gender: Male
  • Magic midgets.
    • View Profile
    • You're already on it. (-_-')
Multithreading and Synchronization
« on: December 29, 2013, 03:24:27 PM »
Multithreading and Synchronization



What is that?

I assume that most of you know what a thread is. Basically it's a mechanism that allows code to be executed in a parallel fashion which helps speed up programs by making use of hardware intended for this purpose (such as multi-core CPUs).



How do I use it?

It depends on the programming language and what kind of API you have at your disposal, but the basic idea is that you create a thread which will execute a portion of your code. There is something called the "main thread" which is basically the thread your code is running on initially. Usually code does not run on more than one thread without you specifying it explicitely, but some OSes try to make use of threads by default (like Android) in order to speed things up on their end. From that "main thread" you can then spawn new threads and execute code in a parallel fashion.



What are the problems?

The first and obvious problem is that you have to design your program in a different way in order to make use of multi-threading. Often code is interdependent and you can't run stuff in a parallel fashion.

In this code example the value of y is directly influenced by x, so they cannot run in parallel.

Code: (cpp) [Select]
int x = 1;
int y = x + 1;
This second example is different. y still depends on x and so does a, but a does not depend on y. That means that technically line 2 and 3 could run in parallel.

Code: (cpp) [Select]
int x = 1;
int y = x + 1;
int a = x + 1;
Keep in mind that this is a very simplified model to explain what kind of code can be made into parallel code. Usually people put audio streaming on another thread so there are no nasty problems with I/O and slow downs of the main thread, because the audio needs to be kept updated. Even though you may not know it, but threading and parallelism in computing has been around for a long time already. If you take a look at some older audio APIs, you will notice that often audio data is fed to an audio driver which then takes care of actually playing.



What is synchronization and why do we need it?

Imagine this: We have a program that reads a file in another thread and we'd like to display a progress bar.

Code: (main thread, pseudo code) [Select]
progress = 0.0; // variable used by both threads!
start_file_reading_thread();
while (progress < 1.0)
{
    render_progress_bar(progress);
}
Code: (reader thread, pseudo code) [Select]
file = open_file();
while (can_read_file(f))
{
    read_data(f);
    progress = get_progress(f);
}
This example is not a problem, because one thread writes the value of progress while the other one keeps reading it. As long as you have only one thread writing a value, there is no real need for synchronization if the values are "simple". Reading a progress like this is ok.

Things get more complicated when more than one thread wants to change a value or do something like that (or change complex data which I will explain in the second example).
Imagine now that we have a complex program with lots of threads. The simplest example where synchronization is needed would be if all threads write some sort of output log on how the program is doing.

Code: (pseudo code) [Select]
print("Hello, I am thread #X"); // let's say that X get substituted with a thread ID
Now let's say that due to how the OS manages CPU time one thread is a bit faster and gets through some code, but then gets stuck for a moment, because the CPU time is allocated to another program while your other thread keeps doing its thing. The output ends up looking like this:

Quote from: output
Hello, I am threHello, I am thread #12ad #11.

Why did this happen? Because there was no synchronization. Often times you don't want things to get executed in parallel, because the threads are meddling with shared data at that point and you need to make sure that there is some sort of queue. First one thread does its thing and then another one is allowed to do its thing and all of that while the other threads are waiting. This is often called either "entering a critical section" or "accessing shared data". While in the critical section, a thread has to be guaranteed that this code block will be executed "atomically". This basically means "all at once without interruption".

There is a somewhat different example, but let's say you have some sort of complex array structure and you call a function which adds a new element to the array and updates the internal size variable accordingly. Imagine that the thread gets interrupted after adding the value, but before updating the size value. If another thread tries to read that data, it will be inconsistent and wrong. Adding synchronization here would be overkill and can slow down a program a lot, because the synchronization mechanism takes time. But synchronization is still needed. A possible solution is to use a mutex lock around adding a new element to the array and when reading it in the other place to prevent inconsistent data. And this takes us to...



How do we synchronize threads?

There is something called a "mutex". I might be talking out of my ass here, but I think it was derived from the term "mutual access" which basically means shared data. Let's take the printing example from before and sync it properly.

Code: (pseudo code) [Select]
mutex.lock();
print("Hello, I am thread #X");
mutex.unlock();
The first thread will lock the mutex and then continue. If it finishes and unlocks the mutex before the second thread tries to lock it, the second thread will do the same as the first thread (lock, print, unlock). If not, the second thread will be suspended until the first thread unlocks the mutex thus ensuring atomic execution.



What else is there to consider?

When your code is compiled and executed, it looks a lot different than you may think. One line in your code can end up being several instructions on the CPU and a thread can be switched at any time, even during the execution of an instruction. Mutex locks and unlocks are guaranteed to be atomic and they will not be interrupted by another thread. So you can safely assume that if you use mutexes, the threads will be synchronized properly.

Another thing to consider is a deadlock. The simplest example of a deadlock is when a thread locks itself out.

Code: (pseudo code) [Select]
mutex.lock();
mutex.lock(); // will get stuck here forever
A more complex example with 2 threads would be this:

Code: (thread 1, pseudo code) [Select]
mutex_for_x.lock();
mutex_for_y.lock(); // can get stuck here if thread 2 executes its line 1 first
x += 1;
y -= 1;
mutex_for_y.unlock();
mutex_for_x.unlock();
Code: (thread 2, pseudo code) [Select]
mutex_for_y.lock();
mutex_for_x.lock();
x += 1;
y -= 1;
mutex_for_x.unlock();
mutex_for_y.unlock();
If the execution order is:
  • thread 1 line 1
  • thread 2 line 1
  • thread 1 line 2
  • thread 2 line 2
then both threads will get stuck, because they locked each other out with the 2 mutexes.



Well, that seems simple enough

In theory, yes. But in practice you often have to consider a lot of other different things. One bug we used to get is a deadlock, because we threw an exception without unlocking the mutex first. The code would jump to a different place on the stack, the mutex would stay locked and suddenly there was no audio. We also had a deadlock within a class when we called the wrong method. Basically we had to model our entire class structure in a way that there is an external interface which uses a mutex lock/unlock pair while internally we had to call internal functions to avoid a deadlock.

Code: (cpp) [Select]
void AudioManager::stop(bool pause)
{
    this->mutex.lock();
    this->_stop(pause);
    this->mutex.unlock();
}

void AudioManager::pause()
{
    this->mutex.lock();
    this->_stop(true); // if we called stop() directly, it would result in a deadlock
    this->mutex.unlock();
}
void AudioManager::_stop(bool pause)
{
    ...
}
This would be about what we had to do back then. While this may not look so complicated, that's actually not what we did wrong. The problem was that we have 2 classes that interoperate within the critical section. Since we can't expose internal methods to the outside to prevent bugs, we basically had to use the "friend class" feature of C++ to get these two classes (AudioManager and AudioPlayer) to meddle with each other's internal methods and data.

Things get even more complicated when you introduce callback functions or delegates that are called from another thread, especially if you have to synchronize everything, because they change shared data. One way to make things easier is queuing calls or data in threads in a synchronized manner so that the main thread can poll for data (which is obviously also synchronized) and hide the implementation detail that the other stuff might be running in another thread which makes it easier to make single-threaded applications that depend on external multi-threading functionality.



Last few words

Debugging threads is nasty, because it's not deterministic. During each execution the threads may be executed in a different manner and order. It pays off to sit down and think things through very well before starting the implementation, because bugs are inevitable and the fewer bugs there are in the initial threading code, the better. This is one thing I kept postponing while working on RMX-OS which ended up causing occasional deadlocks of the server. In v2.0 I redesigned the entire threading model and put mutexes everywhere to ensure that shared data is accessed properly.

This is just a guide to help understand threading and mutexes and how important synchronization is. You should look up on other resources about this topic if you are interested.
« Last Edit: December 29, 2013, 03:36:22 PM by Blizzard »
Check out Daygames and our games:

King of Booze      King of Booze: Never Ever      Pet Bots
Drinking Game for Android      Never have I ever for Android      Pet Bots for Android
Drinking Game for iOS      Never have I ever for iOS      Pet Bots for iOS
Drinking Game on Steam


Quote from: winkio
I do not speak to bricks, either as individuals or in wall form.

Quote from: Barney Stinson
When I get sad, I stop being sad and be awesome instead. True story.

Offline locowhiteknight

  • Awakened Visionist
  • **
  • Posts: 91
  • LV: 5
  • Gender: Male
    • View Profile
Re: Multithreading and Synchronization
« Reply #1 on: December 31, 2013, 07:09:02 PM »
Great post, Blizzard! I remember reading about multithreading in my Dietel book when I was taking a C# class back in the day. I never got around to really learning much about them, but I'm sure that's soon to change. Right now I working my way through this video tutorial for game programming:
(click to show/hide)
This post kind of goes "hand in hand" with some of things he's covering in first few videos of the series.
"Let's get down to brass tacks. How much for the ape?"

Offline Blizzard

  • This sexy
  • Administrator
  • has over 9000 posts
  • *****
  • Posts: 19982
  • LV: 648
  • Gender: Male
  • Magic midgets.
    • View Profile
    • You're already on it. (-_-')
Re: Multithreading and Synchronization
« Reply #2 on: January 01, 2014, 03:02:52 PM »
At first I was afraid (I was petrified!) that it might be too abstract or not detailed enough, but I wanted to keep it simple and easy to understand even if I sacrificed some important implementation details. The general idea is to make it easier for people to start implementing parallel code. I'm glad that it turned out good. :)
Check out Daygames and our games:

King of Booze      King of Booze: Never Ever      Pet Bots
Drinking Game for Android      Never have I ever for Android      Pet Bots for Android
Drinking Game for iOS      Never have I ever for iOS      Pet Bots for iOS
Drinking Game on Steam


Quote from: winkio
I do not speak to bricks, either as individuals or in wall form.

Quote from: Barney Stinson
When I get sad, I stop being sad and be awesome instead. True story.

Offline locowhiteknight

  • Awakened Visionist
  • **
  • Posts: 91
  • LV: 5
  • Gender: Male
    • View Profile
Re: Multithreading and Synchronization
« Reply #3 on: January 16, 2014, 03:08:46 AM »
Is there any chance you could do one covering hash tables/functions in the future?
"Let's get down to brass tacks. How much for the ape?"

Offline Blizzard

  • This sexy
  • Administrator
  • has over 9000 posts
  • *****
  • Posts: 19982
  • LV: 648
  • Gender: Male
  • Magic midgets.
    • View Profile
    • You're already on it. (-_-')
Re: Multithreading and Synchronization
« Reply #4 on: January 16, 2014, 08:43:22 AM »
Yes, of course. Remind me this weekend. I was going to make one about how server and client communicate, explain a bit what a "handshake" is, etc. the last weekend, but I was a bit too hungover from Friday night. But if there's more interest for something more more practical (and more often used), I can explain how hash tables work on an abstract level and how they compare to dynamic arrays.
« Last Edit: January 16, 2014, 08:44:27 AM by Blizzard »
Check out Daygames and our games:

King of Booze      King of Booze: Never Ever      Pet Bots
Drinking Game for Android      Never have I ever for Android      Pet Bots for Android
Drinking Game for iOS      Never have I ever for iOS      Pet Bots for iOS
Drinking Game on Steam


Quote from: winkio
I do not speak to bricks, either as individuals or in wall form.

Quote from: Barney Stinson
When I get sad, I stop being sad and be awesome instead. True story.

Offline locowhiteknight

  • Awakened Visionist
  • **
  • Posts: 91
  • LV: 5
  • Gender: Male
    • View Profile
Re: Multithreading and Synchronization
« Reply #5 on: January 19, 2014, 11:37:53 PM »
Quote
I can explain how hash tables work on an abstract level and how they compare to dynamic arrays.


That's what I'm trying to wrap my head around I guess. Would it be possible to show the implementation in C or C++? I have a feeling it might be more intuitive for me to follow in one of those languages, preferably the latter. Thanks for considering making a topic on this.

Quote
I was going to make one about how server and client communicate, explain a bit what a "handshake" is, etc.
 

This one sounds interesting as well. I got into learning a little bit about this in C#, while covering how sockets work.
"Let's get down to brass tacks. How much for the ape?"

Offline Blizzard

  • This sexy
  • Administrator
  • has over 9000 posts
  • *****
  • Posts: 19982
  • LV: 648
  • Gender: Male
  • Magic midgets.
    • View Profile
    • You're already on it. (-_-')
Re: Multithreading and Synchronization
« Reply #6 on: January 20, 2014, 01:19:38 AM »
Trust me, it's conceptually simpler than its implementation, especially in C. xD

Sockets are not that complicated. But because of that, they are not "powerful" either. So when you use them, you usually have to implement most stuff. e.g. sending/receiving data in a different thread, timeout mechanisms, block-by-block reading mechanisms (often you don't get all data at once), etc.
Check out Daygames and our games:

King of Booze      King of Booze: Never Ever      Pet Bots
Drinking Game for Android      Never have I ever for Android      Pet Bots for Android
Drinking Game for iOS      Never have I ever for iOS      Pet Bots for iOS
Drinking Game on Steam


Quote from: winkio
I do not speak to bricks, either as individuals or in wall form.

Quote from: Barney Stinson
When I get sad, I stop being sad and be awesome instead. True story.