Create a Thread Pool in C++

In this article, a thread pool is a set of threads in which each thread has a particular kind of task to carry out, so different threads carry out different kinds of tasks. A task is basically a function. One thread handles one set of similar functions, another thread handles a different set of similar functions, and so on. Although an executing thread runs a top-level function, a thread in C++ is an object instantiated from the thread class. Different threads receive different arguments, so it is convenient for a particular thread to attend to one set of similar functions.

In C++, a thread pool has to be managed by the programmer. The C++ standard library does not have a facility for creating a thread pool and its management. This is probably because there are different ways of creating a thread pool. So, a C++ programmer has to create a thread pool based on the needs of the program.

What is a thread? A thread is an object instantiated from the thread class. In normal instantiation, the first argument of the thread constructor is the name of a top-level function, and the rest of the arguments are the arguments for that function. As soon as the thread is instantiated, the function starts executing. The C++ main() function is a top-level function, and so are the other functions in the global scope. The main() function runs in a thread of its own, which does not need a formal declaration as other threads do. Consider the following program:

#include <iostream>
#include <thread>
using namespace std;

void func() {
    cout << "code for first output" << endl;
    cout << "code for second output" << endl;
}

int main()
{
    thread thr(func);
    thr.join();
    /* other statements */

    return 0;
}

The output is:

code for first output
code for second output

Note the inclusion of the thread library, which has the thread class. func() is a top-level function. The first statement in the main() function uses it in the instantiation of the thread, thr. The next statement in main() is a join statement. It makes the main() function thread wait, at the position where it is coded, until the thread thr completes. If this statement is absent, the thread object is destroyed at the end of main() while it is still joinable, and the program is terminated through std::terminate(), possibly before the thread function has completed.

A command similar to the following should be used to compile a C++20 program with threads, using the g++ compiler:

g++ -std=c++2a temp.cpp -lpthread -o temp

This article explains one way of creating and managing a thread pool in C++.

Thread Pool Example Requirements

The requirements for this illustrative thread pool are simple: there are three threads and one master thread. The three threads are subordinate to the master thread. Each subordinate thread works with a queue data structure, so there are three queues: qu1, qu2, and qu3. The queue library, as well as the thread library, has to be included in the program.

Each queue can hold more than one function call, but all the calls in a queue are for the same top-level function. That is, each element of a queue stands for a call of one particular top-level function. So, there are three different top-level functions, one per thread. The function names are fn1, fn2, and fn3.

The function calls in each queue differ only in their arguments. For simplicity, in this program example, the function calls take no arguments. In fact, every element of a given queue in this example is the same integer: 1 for all the qu1 elements, 2 for all the qu2 elements, and 3 for all the qu3 elements.

A queue is a first-in, first-out (FIFO) structure. So the first call (number) to enter a queue is the first to leave. When a call (number) leaves, the corresponding function is executed in its thread.

The main() function is responsible for feeding each of the three queues with calls for the appropriate functions, and hence for the appropriate threads.

The master thread is responsible for checking if there is a call in any queue, and if there is a call, it calls the appropriate function through its thread. In this program example, when no queue has any call left, the program ends.

The top-level functions are simple for this pedagogic example. They are:

void fn1() {
    cout << "fn1" << endl;
}

void fn2() {
    cout << "fn2" << endl;
}

void fn3() {
    cout << "fn3" << endl;
}

The corresponding threads will be thr1, thr2, and thr3. The master thread has its own master function. Here, each function has just one statement. The output of the function fn1() is “fn1”. The output of the function fn2() is “fn2”. The output of the function fn3() is “fn3”.

At the end of this article, the reader can put together all of the code segments in this article to form a thread pool program.

Global Variables

The top of the program, with the global variables, is:

#include <iostream>
#include <thread>
#include <queue>
using namespace std;

queue<int> qu1;
queue<int> qu2;
queue<int> qu3;

thread thr1;
thread thr2;
thread thr3;

The queue and thread variables are global variables. They have been declared without explicit initialization, so each queue starts empty, and each thread object does not yet represent a running thread. After this, in the program, should be the three subordinate top-level functions, as shown above.

The iostream library is included for the cout object. The thread library is included for the threads. The names of the threads are thr1, thr2, and thr3. The queue library is included for the queues. The names of the queues are qu1, qu2, and qu3. qu1 corresponds to thr1, qu2 corresponds to thr2, and qu3 corresponds to thr3. A queue is like a vector, but it is for FIFO (first-in, first-out) access.

The Master Thread Function

After the three subordinate top-level functions in the program, comes the master function. It is:

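void masterFn() {
    work:
    // Start a thread for each queue that has a pending call.
    if (qu1.size() > 0) thr1 = thread(fn1);
    if (qu2.size() > 0) thr2 = thread(fn2);
    if (qu3.size() > 0) thr3 = thread(fn3);

    // Remove the call just served and wait for its thread to complete.
    if (qu1.size() > 0) {
        qu1.pop();
        thr1.join();
    }
    if (qu2.size() > 0) {
        qu2.pop();
        thr2.join();
    }
    if (qu3.size() > 0) {
        qu3.pop();
        thr3.join();
    }

    // Return when all three queues are empty; otherwise, go round again.
    if (qu1.size() == 0 && qu2.size() == 0 && qu3.size() == 0)
        return;
    goto work;
}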

The goto-loop embodies all the code of the function. When all the queues are empty, the function returns with the statement, “return;”.

The first code segment in the goto-loop has three statements: one for each queue and the corresponding thread. Here, if a queue is not empty, its thread (and corresponding subordinate top-level function) is executed.

The next code segment consists of three if-constructs, each corresponding to a subordinate thread. Each if-construct has two statements. The first statement removes the number that stands for the call started in the first code segment. The next is a join statement, which makes sure the corresponding thread works to completion before the next round begins.

The last statement in the goto-loop ends the function, going out of the loop if all the queues are empty.

The main() Function

After the master thread function in the program should be the main() function, whose content is:

qu1.push(1);
qu1.push(1);
qu1.push(1);

qu2.push(2);
qu2.push(2);

qu3.push(3);

thread masterThr(masterFn);
cout << "Program has started:" << endl;
masterThr.join();
cout << "Program has ended." << endl;

The main() function is responsible for putting numbers that represent calls into the queues. qu1 has three values of 1, qu2 has two values of 2, and qu3 has one value of 3. The main() function then starts the master thread and joins it. An output from the author’s computer is:

Program has started:
fn2
fn3
fn1
fn1
fn2
fn1
Program has ended.

The output shows the irregular concurrent operations of threads. Before the main() function joins its master thread, it displays "Program has started:". The master thread calls thr1 for fn1(), thr2 for fn2(), and thr3 for fn3(), in that order. However, the corresponding output begins with “fn2”, then “fn3”, and then “fn1”. There is nothing wrong with this initial order. That is how concurrency operates: the order in which threads started together produce their output is not fixed, and it can differ from run to run. The rest of the output strings appear in the order in which their functions were called.

After the main() function joined the master thread, it waited for the master thread to complete. For the master thread to complete, all the queues have to be empty. Each queue value corresponds to one execution of its corresponding thread. So, for each queue to become empty, its thread has to execute as many times as there are elements in the queue.

When the master thread and its subordinate threads have executed and ended, the main() function continues to execute, and it displays “Program has ended.”

Conclusion

A thread pool is a set of threads. Each thread is responsible for carrying out its own tasks. Tasks are functions. In theory, the tasks are always coming; they do not really end, as they do in the above example. In some practical examples, data is shared between threads. To share data, the programmer needs knowledge of condition_variable, the asynchronous function (async), promise, and future. That is a discussion for some other time.