
Audit Logs in Apache Kafka

When working with an Apache Kafka cluster, you will encounter scenarios where you need to troubleshoot problems or optimize the functionality of the cluster.

This is where log auditing comes into play. In this tutorial, we will walk you through the process of manually auditing logs in Apache Kafka. Learning how to read and interpret the Kafka logs by hand can help you understand how your cluster works. It can also prepare you to automate the log audit process later using tools such as Elasticsearch.

What Is Log Auditing?

Let us start with the basics and discuss what log auditing is. We can define log auditing as the practice of collecting and monitoring log data, typically from various sources such as servers, applications, and network devices, to identify security incidents, troubleshoot issues, and maintain compliance with regulatory requirements.

Once we collect the log data, we can analyze it and look for issues or inconsistencies based on a given set of rules. This also allows us to detect suspicious or malicious activities on the server, such as password brute-forcing attempts.
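To make this idea concrete, the following is a minimal Python sketch of rule-based log analysis. The file path and the audit rules are illustrative assumptions; real rules depend on your environment and the threats you care about:

import re

# Hypothetical audit rules: each label maps to a regex that flags
# a potentially interesting log line.
AUDIT_RULES = {
    "error": re.compile(r"\bERROR\b"),
    "failed_auth": re.compile(r"failed authentication", re.IGNORECASE),
}

def audit_log(path):
    """Scan a plain-text log file and report lines matching any rule."""
    findings = []
    with open(path, encoding="utf-8") as log_file:
        for line_no, line in enumerate(log_file, start=1):
            for label, pattern in AUDIT_RULES.items():
                if pattern.search(line):
                    findings.append((line_no, label, line.strip()))
    return findings

# The path is an assumption; adjust it to your installation.
for line_no, label, text in audit_log("/opt/kafka/logs/server.log"):
    print(f"line {line_no} [{label}]: {text}")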

Log Auditing in Kafka

In Kafka, log auditing can be used for several purposes, including the following:

Security monitoring – We can use log auditing to collect and analyze the logs from Kafka and its components. We can then use this data to identify and respond to security threats and suspicious activities such as unauthorized access attempts or data breaches.

Troubleshooting – The second most common use case of log auditing is diagnosing problems with Kafka. The logs provide a detailed view of its operations and performance, which can aid in identifying the root cause of issues.

Performance tuning – Another typical use case of log auditing is gaining valuable insight into the performance of Kafka, including the rate of data production and consumption, the rate at which messages are processed, and the utilization of resources such as disk and memory. We can then leverage this information to enhance the performance of Kafka, as shown in the sketch after this list.
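As a small taste of the troubleshooting and performance use cases, the following Python sketch tallies log entries by severity so that a sudden spike in WARN or ERROR messages stands out. The path and the pattern are assumptions based on Kafka's default log layout, which we describe later in this tutorial:

from collections import Counter
import re

# A default Kafka log entry starts with "[timestamp] LEVEL ...".
# This pattern is an assumption based on that layout.
LEVEL_PATTERN = re.compile(r"^\[[^\]]+\]\s+(TRACE|DEBUG|INFO|WARN|ERROR|FATAL)\b")

def count_levels(path):
    """Tally log entries by severity to spot unusual error rates."""
    counts = Counter()
    with open(path, encoding="utf-8") as log_file:
        for line in log_file:
            match = LEVEL_PATTERN.match(line)
            if match:
                counts[match.group(1)] += 1
    return counts

# The path is an assumption; adjust it to your installation.
for level, count in count_levels("/opt/kafka/logs/server.log").most_common():
    print(f"{level}: {count}")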

Important Logs in Kafka

Although the log files that are available in your Kafka cluster will vary depending on the installed version, the following are some log files that you will find useful in your cluster:

server.log – As the name suggests, this log file contains the log messages for the main Kafka server, including the messages that are logged when the server starts and stops. Depending on the configured log level, it may also contain messages about the server's configuration and errors.

controller.log – The next log file that you need to be aware of is the “controller.log” file. This log file holds information about the actions of the Kafka controller, including changes to partitions and replicas.

kafka-request.log – This log file, on the other hand, holds information about the client requests that come to the Kafka broker. This includes information such as the client type, client ID, response code, etc.

state-change.log – This log file stores logs about the changes that are made to partitions, brokers, replicas, etc.

zookeeper.gc.log – This log file stores the garbage collection (GC) logs of the ZooKeeper process that ships with Kafka.

These are the main log files that you need to familiarize yourself with in order to read and audit the logs from the Kafka server.

By default, these log files are stored in the KAFKA_HOME/logs directory.
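To quickly see which of these files exist on your system, you can list the contents of that directory. Here is a short Python sketch; it assumes the KAFKA_HOME environment variable is set and falls back to /opt/kafka, which is an illustrative default:

import os
from pathlib import Path

# KAFKA_HOME is an assumption; set it to wherever Kafka is installed.
kafka_home = Path(os.environ.get("KAFKA_HOME", "/opt/kafka"))
log_dir = kafka_home / "logs"

# List the available log files together with their sizes.
for log_file in sorted(log_dir.glob("*.log")):
    size_kb = log_file.stat().st_size / 1024
    print(f"{log_file.name}: {size_kb:.1f} KiB")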

Reading the Log Files

The main component of log auditing is learning to read the log files. The good thing is that Apache Kafka uses Log4j for its logging. This means that we can easily understand the format from the open documentation.

For this post, we will focus on the “server.log” file as it holds the information from the main Kafka server. Let us take a look at an example entry from the “server.log” file:

[2023-02-07 02:53:09,718] INFO Registered broker 0 at path /brokers/ids/0 with addresses: PLAINTEXT://v.broker.io:9092, czxid (broker epoch): 25 (kafka.zk.KafkaZkClient)

The previous example is a single entry from the Kafka “server.log” file.

This entry follows Kafka’s default Log4j pattern layout.

This layout allows the application to provide clear and concise log information, including the time at which the log was generated, the severity of the log message, the content, and the source.

The previous entry contains information such as:

Timestamp – This denotes the time at which the log was written. It follows the yyyy-MM-dd HH:mm:ss,SSS format (in our example, 2023-02-07 02:53:09,718).

Log Level – The next section denotes the log level that is used to generate the message. In our example, the log level is set to INFO.

Message – The third section shows the log message.

Source – Finally, the last entry shows the source of the log message. In this case, it is kafka.zk.KafkaZkClient.
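Putting these pieces together, here is a small Python sketch that parses the sample entry into its four fields using a regular expression. The pattern is an assumption based on the default layout shown above; if your logging configuration uses a different pattern layout, adjust it accordingly:

import re

# Assumed pattern for Kafka's default Log4j layout:
# [timestamp] LEVEL message (source)
ENTRY_PATTERN = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\]\s+"
    r"(?P<level>[A-Z]+)\s+"
    r"(?P<message>.*)\s+"
    r"\((?P<source>[^)]+)\)\s*$"
)

entry = (
    "[2023-02-07 02:53:09,718] INFO Registered broker 0 at path "
    "/brokers/ids/0 with addresses: PLAINTEXT://v.broker.io:9092, "
    "czxid (broker epoch): 25 (kafka.zk.KafkaZkClient)"
)

match = ENTRY_PATTERN.match(entry)
if match:
    for field, value in match.groupdict().items():
        print(f"{field}: {value}")

Running this prints the timestamp, level, message, and source as separate fields, which is the first step toward feeding the entries into an automated analysis pipeline.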

Conclusion

There you have it! An easy way to dissect, read, and understand the log information that is produced by Kafka applications.
