Apache Kafka

Apache Kafka Streams Peek Transformations

Kafka Streams transformations refer to a set of operations that we can carry out on Kafka Streams data which allows us to modify and process the data to our liking. Transformations allow us to perform actions such as transforming, filtering, aggregating, joining, and analyzing the data from a specific Kafka stream.

The main advantage of Kafka Streams transformations is that we can carry them out in real-time. This allows us to read the data, make the modifications, and write it back to a Kafka topic.

Luckily, Kafka Streams support a wide range of transformations including:

Map Transformations – These allow us to apply a given function to each record in the stream, transforming it into a new record.

Filter Transformations – As the name suggests, filter transformations allow us to remove records from a stream that do or do not match a given condition.

GroupBy – We also have the GroupBy transformations. These types of transformations allow us to group the records in a given stream based on a specific key.

Aggregation – Yes, Kafka Streams do support aggregation transformations. These types of transformations allow us to group a set of records from a stream and perform aggregate actions such as sum, count, average, etc.

Join – Finally, the last type of transformation that is supported by Kafka streams is the join type. This allows us to combine two or more streams based on a given key or simply concatenate the values of the input streams to output a new stream.

As you can see, Kafka Streams transformations enable a real-time data processing and analysis which allows you to read and make decisions based on the most up-to-date data available.

However, in this tutorial, we will focus on one type of transformation in Kafka Streams called a “peek transformation”.

If you want to learn more about the other types of Kafka Streams transformation that are previously mentioned, you can check out the other tutorials to learn more.

Kafka Streams Peek Transformations

In Kafka Streams, the peek transformation allows us to intercept and inspect the records from a given stream without modifying them. One advantage of the peek transformation is that it is quick and efficient which allows you to perform actions such as logging and printing the records from a given stream without application overhead.

This makes it a fantastic feature for development and testing purposes.

In Kafka Streams DSL API, we use the peek() method to carry out a peek transformation. The method definition is as follows:

KStream<K, V> peek(ForeachAction<? super K, ? super V> action)

This method accepts a KStream type and an action which are performed on each record within the input stream. It is good to note that the action parameter is a lambda expression or method reference that takes the key and value of the record in the stream.

NOTE: It is good to keep in mind that the peek() operation does not modify the input stream. Instead, it returns a new stream which contains very similar records as the input stream.

Example Demonstration:

This example demonstrates how we can use the peek() transformation in Kafka Streams to print the records from an input stream to the console:

import org.apache.kafka.streams.*;
import org.apache.kafka.streams.kstream.*;
import java.util.Properties;

public class GeoDataPeekExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "geo-data-peek-example");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        StreamsBuilder builder = new StreamsBuilder();
        KStream geoDataStream = builder.stream("geodata");
        geoDataStream.peek((key, value) -> System.out.println("Record: key=" + key + ", value=" + value));
        Topology topology = builder.build();
        KafkaStreams streams = new KafkaStreams(topology, props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}

The provided example starts by setting up a Kafka Stream instance that reads from a Kafka topic called “geodata”. We then use the peek() transformation to intercept the data and print each record to the console.

Conclusion

We discussed what are Kafka transformations, the various types of Kafka transformations, their role, and more importantly, how to work with the peek() transformation in Apache Kafka.

About the author

John Otieno

My name is John and am a fellow geek like you. I am passionate about all things computers from Hardware, Operating systems to Programming. My dream is to share my knowledge with the world and help out fellow geeks. Follow my content by subscribing to LinuxHint mailing list