How to Get the Producer Metrics in Apache Kafka

Kafka producers are among the most fundamental building blocks of a Kafka deployment. They are the client applications that write messages to Kafka topics. As administrators, we often need to gather information about the various components that interact with a Kafka cluster. We can use that information to troubleshoot problems or to plan an upgrade path.

One such source of information is the producer metrics. These are the measurements that a Kafka producer collects so that we can monitor and evaluate the performance and efficiency of its message production.

Once we gather these metrics, we can use them to identify potential bottlenecks, understand the rate at which messages are written to a Kafka broker, and track the success or failure of message delivery.

Key Producer Metrics

Some of the key producer metrics include:

Record Send Rate – This metric measures the rate at which records are sent to the broker. We can use this metric to determine the rate of message production.

Record Size – This metric measures the size of the records that are sent to the broker. If you need to estimate the network and disk I/O load that the producer generates, this is probably the metric that you are looking for.

Record Error Rate – This metric measures the rate at which record send operations result in errors. We can use it to determine the success rate of message delivery. If there are any issues between the broker and the producer application, this is a good metric to start with.

Batch Size – This metric allows us to determine the size of the record batches that are sent to the broker.

Compression Ratio – This metric reports the compression ratio that the producer achieves on the record batches that it sends.

Network I/O – This metric reports the amount of network I/O that is generated by the producer. If you need to track down network bottlenecks, this is probably the place to start.

Request Latency – This metric measures the time that the producer waits to receive a response from the broker for a record send request.

These metrics, along with others, provide valuable insights into the performance and efficiency of the Kafka producer. By monitoring them, organizations can check that their Kafka producers run well and use resources efficiently.
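The metrics in the list above can be picked out of a metrics dictionary with a small helper. The following is a minimal sketch that is not part of the original example. It assumes that the dictionary has the nested shape {group: {metric_name: value}}. The group name "producer-metrics" and the metric names in KEY_METRICS are assumptions modeled on Kafka's usual producer metric names, so check them against what your own client reports. The helper name extract_key_metrics and the sample values are made up for illustration.

```python
# Assumed mapping from the metrics discussed above to metric names.
# These names follow Kafka's usual producer naming; verify them against
# the output of your own producer before relying on them.
KEY_METRICS = {
    "Record Send Rate": "record-send-rate",
    "Record Size": "record-size-avg",
    "Record Error Rate": "record-error-rate",
    "Batch Size": "batch-size-avg",
    "Compression Ratio": "compression-rate-avg",
    "Network I/O": "outgoing-byte-rate",
    "Request Latency": "request-latency-avg",
}


def extract_key_metrics(metrics, group="producer-metrics"):
    """Return {label: value} for the key metrics found in one metric group.

    Metrics that are missing from the dictionary are reported as None.
    """
    group_metrics = metrics.get(group, {})
    return {label: group_metrics.get(name) for label, name in KEY_METRICS.items()}


if __name__ == "__main__":
    # Hypothetical sample data, shaped like a nested metrics dictionary.
    sample = {
        "producer-metrics": {
            "record-send-rate": 12.5,
            "record-size-avg": 210.0,
            "record-error-rate": 0.0,
        }
    }
    for label, value in extract_key_metrics(sample).items():
        print(f"{label}: {value}")
```

Reporting missing metrics as None, rather than raising an error, keeps the helper usable when a client exposes only some of these names.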

In this tutorial, we will build a simple Python application that gathers the producer metrics using the kafka-python library.

Kafka Producer Metrics

We can use the kafka-python library to gather the metrics of a producer and dump the result as JSON, as shown in the following source code:

import json

from kafka import KafkaProducer


def gather_producer_metrics(producer):
    # Get the metrics for the producer as a dictionary
    metrics = producer.metrics()

    # Convert the metrics dictionary to a JSON string
    metrics_json = json.dumps(metrics)

    # Print the producer metrics as JSON
    print(metrics_json)


# Usage
producer = KafkaProducer(
    bootstrap_servers='localhost:9092',
    client_id='my-producer-client'
)

gather_producer_metrics(producer)

# Close the producer to release its network resources
producer.close()

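This script sets up a simple Kafka producer and uses the metrics() method to show the various metrics of the producer client. Because this producer has not sent any records yet, most of the rate and size values are likely to be at their initial values. The numbers become more meaningful once the producer is actually sending messages.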
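The nested dictionary can be awkward to scan or to feed into a log line. As a minimal sketch, assuming that the metrics dictionary maps group names to dictionaries of metric names and values, the hypothetical helper flatten_metrics below turns it into a single level of "group.name" keys. The sample groups and values are invented for illustration and are not real producer output.

```python
def flatten_metrics(metrics, separator="."):
    """Flatten {group: {name: value}} into {"group<separator>name": value}."""
    flat = {}
    for group, values in metrics.items():
        for name, value in values.items():
            flat[f"{group}{separator}{name}"] = value
    return flat


if __name__ == "__main__":
    # Hypothetical sample data, not output captured from a real producer.
    sample = {
        "producer-metrics": {"record-send-rate": 12.5, "batch-size-avg": 4096.0},
        "kafka-metrics-count": {"count": 2.0},
    }
    # Sorting the keys keeps metrics from the same group next to each other.
    for key, value in sorted(flatten_metrics(sample).items()):
        print(f"{key} = {value}")
```

With a live producer, the same helper could be called as flatten_metrics(producer.metrics()), provided that the returned dictionary has the assumed shape.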

We can run the script as follows:

$ python producer_metrics.py

To view the results in a human-readable format, we can pipe the output to jq:

$ python producer_metrics.py | jq .
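As an alternative to jq, the output can be made readable in Python itself. By default, json.dumps writes non-finite floats as NaN, Infinity, and -Infinity, which are not valid standard JSON, so a strict parser may reject them. Whether a given producer reports such values is an assumption here. The following sketch uses a hypothetical helper, to_strict_json, which replaces them with null and indents the result.

```python
import json
import math


def to_strict_json(metrics, indent=2):
    """Serialize a nested metrics dictionary to strict, indented JSON.

    Non-finite floats (NaN, inf, -inf) are replaced with None because
    they have no representation in standard JSON.
    """
    def clean(value):
        if isinstance(value, dict):
            return {key: clean(item) for key, item in value.items()}
        if isinstance(value, float) and not math.isfinite(value):
            return None
        return value

    # allow_nan=False makes json.dumps raise an error if a non-finite
    # float slips through, instead of emitting invalid JSON.
    return json.dumps(clean(metrics), indent=indent, sort_keys=True, allow_nan=False)


if __name__ == "__main__":
    # Hypothetical sample data with one non-finite value.
    sample = {"producer-metrics": {"request-latency-max": float("-inf"),
                                   "record-send-rate": 12.5}}
    print(to_strict_json(sample))
```

In the script above, print(to_strict_json(producer.metrics())) could take the place of the json.dumps and print steps.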

Conclusion

There you have it! A simple way of using the kafka-python package to gather the metrics of a producer application in Python.

About the author

John Otieno
