Linux System Call Tutorial with C

In our last article on Linux System Calls, I defined a system call, discussed the reasons one might use them in a program, and delved into their advantages and disadvantages. I even gave a brief example in assembly within C. It showed how to make the call but did nothing productive. Not exactly a thrilling development exercise, but it illustrated the point.

In this article, we’re going to use actual system calls to do real work in our C program. First, we’ll review whether you need a system call at all, then provide an example using the sendfile() call that can dramatically improve file copy performance. Finally, we’ll go over some points to remember while using Linux system calls.

Do You Need a System Call?

While it’s inevitable you’ll use a system call at some point in your C development career, unless you are targeting high performance or a particular type of functionality, glibc and the other basic libraries included in major Linux distributions will take care of the majority of your needs.

The glibc standard library provides a portable, well-tested framework for functions that would otherwise require system-specific system calls. For example, you can read a file with fscanf(), fread(), getc(), etc., or you can use the read() Linux system call. The glibc functions provide more features (e.g., buffering, formatted I/O, and friendlier error handling) and will work on any system glibc supports.

On the other hand, there are times when uncompromising performance and exact execution are critical. The wrapper that fread() provides adds overhead, and although that overhead is minor, it isn’t entirely transparent. Additionally, you may not want or need the extra features the wrapper provides. In that case, you’re best served with a system call.

You can also use system calls to perform functions not yet supported by glibc. If your copy of glibc is up to date, this will hardly be an issue, but developing on older distributions with newer kernels might require this technique.

Now that you’ve read the disclaimers, warnings, and potential detours, let’s dig into some practical examples.

What CPU Are We On?

A question that most programs probably don’t think to ask, but a valid one nonetheless. getcpu() is an example of a system call that has no standard C library equivalent and, on glibc versions before 2.29, no glibc wrapper either. In this code, we’ll invoke getcpu() directly via the syscall() function. The syscall() function works as follows:

syscall(SYS_call, arg1, arg2, ...);

The first argument, SYS_call, is a macro that expands to the number of the system call. These macros are defined when you include sys/syscall.h. Each name consists of SYS_ followed by the name of the system call.

Arguments for the call go into arg1, arg2 above. Some calls require more arguments, which follow in the order given on the call’s man page. Remember that values returned through arguments require pointers to memory you own, such as a local variable, an array, or memory allocated via the malloc function.

example1.c

#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <sys/types.h>
 
int main() {
 
    unsigned cpu, node;
 
    // Get current CPU core and NUMA node via system call
    // Called through syscall() because glibc before 2.29 has no wrapper
    if (syscall(SYS_getcpu, &cpu, &node, NULL) == -1) {
        perror("getcpu");
        return 1;
    }
 
    // Display information
    printf("This program is running on CPU core %u and NUMA node %u.\n\n", cpu, node);
 
    return 0;
 
}
 
To compile and run:
 
gcc example1.c -o example1
./example1

For more interesting results, you could spin threads via the pthreads library and then call this function to see on which processor your thread is running.

Sendfile: Superior Performance

Sendfile provides an excellent example of enhancing performance through system calls. The sendfile() function copies data from one file descriptor to another. Rather than making multiple fread() and fwrite() calls that shuttle data through a user-space buffer, sendfile() performs the transfer in kernel space, reducing overhead and thereby increasing performance.

In this example, we’re going to copy 64 MB of data from one file to another. In one test, we’re going to use the buffered fread() and fwrite() functions from the standard library. In the other, we’ll use system calls and the sendfile() call to blast this data from one location to another.

test1.c (glibc)

#include <stdio.h>
#include <stdlib.h>
 
#define BUFFER_SIZE 67108864
#define BUFFER_1 "buffer1"
#define BUFFER_2 "buffer2"
 
int main() {
 
    FILE *fOut, *fIn;
 
    printf("\nI/O test with traditional glibc functions.\n\n");
 
    // Grab a BUFFER_SIZE buffer.
    // Its contents are uninitialized, but we don't care about that.
    printf("Allocating 64 MB buffer:                     ");
    char *buffer = malloc(BUFFER_SIZE);
    if (buffer == NULL) { perror("malloc"); return 1; }
    printf("DONE\n");
 
    // Write the buffer to fOut
    printf("Writing data to first buffer:                ");
    fOut = fopen(BUFFER_1, "wb");
    fwrite(buffer, sizeof(char), BUFFER_SIZE, fOut);
    fclose(fOut);
    printf("DONE\n");
 
    printf("Copying data from first file to second:      ");
    fIn = fopen(BUFFER_1, "rb");
    fOut = fopen(BUFFER_2, "wb");
    fread(buffer, sizeof(char), BUFFER_SIZE, fIn);
    fwrite(buffer, sizeof(char), BUFFER_SIZE, fOut);
    fclose(fIn);
    fclose(fOut);
    printf("DONE\n");
 
    printf("Freeing buffer:                              ");
    free(buffer);
    printf("DONE\n");
 
    printf("Deleting files:                              ");
    remove(BUFFER_1);
    remove(BUFFER_2);
    printf("DONE\n");
 
    return 0;
 
}

test2.c (system calls)

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/sendfile.h>
#include <sys/types.h>
#include <sys/stat.h>
 
#define BUFFER_SIZE 67108864
 
int main() {
 
    int fOut, fIn;
 
    printf("\nI/O test with sendfile() and related system calls.\n\n");
 
    // Grab a BUFFER_SIZE buffer.
    // Its contents are uninitialized, but we don't care about that.
    printf("Allocating 64 MB buffer:                     ");
    char *buffer = malloc(BUFFER_SIZE);
    if (buffer == NULL) { perror("malloc"); return 1; }
    printf("DONE\n");
 

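    // Write the buffer to fOut, creating the file if it does not exist
    printf("Writing data to first buffer:                ");
    fOut = open("buffer1", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fOut < 0) { perror("open buffer1"); return 1; }
    // Pass the buffer itself, not the address of the pointer variable
    if (write(fOut, buffer, BUFFER_SIZE) < 0) perror("write");
    close(fOut);
    printf("DONE\n");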
 
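    printf("Copying data from first file to second:      ");
    fIn = open("buffer1", O_RDONLY);
    fOut = open("buffer2", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fIn < 0 || fOut < 0) { perror("open"); return 1; }
    // A NULL offset means "start at fIn's current file position"
    if (sendfile(fOut, fIn, NULL, BUFFER_SIZE) < 0) perror("sendfile");
    close(fIn);
    close(fOut);
    printf("DONE\n");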
 
    printf("Freeing buffer:                              ");
    free(buffer);
    printf("DONE\n");
 
    printf("Deleting files:                              ");
    unlink("buffer1");
    unlink("buffer2");
    printf("DONE\n");
 
    return 0;
 
}

Compiling and Running Tests 1 & 2

To build these examples, you will need the development tools installed on your distribution. On Debian and Ubuntu, you can install them with:

sudo apt install build-essential

Then compile with:

gcc test1.c -o test1 && gcc test2.c -o test2

To run both and test the performance, run:

time ./test1 && time ./test2

Exact timings depend on your hardware and filesystem, but the output should look something like this:

I/O test with traditional glibc functions.

Allocating 64 MB buffer:                     DONE
Writing data to first buffer:                DONE
Copying data from first file to second:      DONE
Freeing buffer:                              DONE
Deleting files:                              DONE
real    0m0.397s
user    0m0.000s
sys     0m0.203s
I/O test with sendfile() and related system calls.
Allocating 64 MB buffer:                     DONE
Writing data to first buffer:                DONE
Copying data from first file to second:      DONE
Freeing buffer:                              DONE
Deleting files:                              DONE
real    0m0.019s
user    0m0.000s
sys     0m0.016s

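In this sample run, the code that uses the system calls finished much faster than the glibc equivalent. Treat the numbers as illustrative rather than definitive: timings depend on hardware, filesystem, and page-cache state, and a program whose open(), write(), or sendfile() calls fail will also finish almost instantly, so check return values before trusting a benchmark.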

Things to Remember

System calls can increase performance and provide additional functionality, but they are not without their disadvantages. You’ll have to weigh the benefits system calls provide against the lack of platform portability and sometimes reduced functionality compared to library functions.

When using some system calls, you must take care to use resources returned from system calls rather than library functions. For example, the FILE structure used by glibc’s fopen(), fread(), fwrite(), and fclose() functions is not the same as the integer file descriptor returned by the open() system call. Mixing these can lead to issues.

In general, Linux system calls have fewer bumper lanes than glibc functions. While it’s true that system calls have some error handling and reporting, you’ll get more conveniences, such as buffering and automatic retries, from a glibc function.

And finally, a word on security. System calls directly interface with the kernel. The Linux kernel does have extensive protections against shenanigans from user land, but undiscovered bugs exist. Don’t trust that a system call will validate your input or isolate you from security issues. It is wise to ensure the data you hand to a system call is sanitized. Naturally, this is good advice for any API call, but you cannot be too careful when working with the kernel.

I hope you enjoyed this deeper dive into the land of Linux system calls.

About the author

Robert Oliver

Writer, System Admin, Full Stack Developer, Philosopher.
https://rwo2.com