How to Change the Replication Factor in Apache Kafka

When working with a Kafka topic, you may need to modify its replication factor, for example after adding brokers to the cluster. In this tutorial, we will describe how to modify the replication factor of an existing Kafka topic.

Kafka Replication Factor

In Kafka, the replication factor refers to the number of replicas of a given partition in a Kafka cluster. It determines how many brokers in the cluster store a copy of the same partition, which provides fault tolerance and high data availability if a broker fails.

The replication factor is set per topic when the topic is created. Kafka uses it to decide how many brokers receive a copy of each partition of that topic.

The replication factor must be less than or equal to the total number of brokers in the cluster, because each replica of a partition is placed on a different broker. Kafka rejects a request to create a topic with more replicas than there are available brokers. A common practice is to set a replication factor of 2 or 3 for critical topics so that the data can be recovered even if a single broker goes down.
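The broker-count constraint can be checked before a topic is created or reassigned. The following is a minimal sketch: check_replication_factor is a hypothetical helper, not a Kafka tool, and it assumes that you already know how many brokers are in the cluster.

```shell
# Hypothetical helper (not part of Kafka): succeeds only when the requested
# replication factor is at least 1 and no larger than the broker count.
# Usage: check_replication_factor <replication-factor> <broker-count>
check_replication_factor() {
    if [ "$1" -ge 1 ] && [ "$1" -le "$2" ]; then
        echo "ok: replication factor $1 fits on $2 broker(s)"
    else
        echo "error: replication factor $1 is not valid for $2 broker(s)" >&2
        return 1
    fi
}

check_replication_factor 3 3           # fits: one replica per broker
check_replication_factor 3 1 || true   # rejected: only one broker available
```

The helper only compares two numbers. It does not contact the cluster, so the broker count you pass in has to come from somewhere you trust.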

NOTE: Setting the replication factor of a Kafka topic to 1 means that there is only one copy of each partition in the cluster. The data for a partition is stored on a single broker, with no redundancy or backup. If the broker that hosts the partition fails, the data for that partition becomes unavailable.

Using a replication factor of 1 in production environments is highly risky and heavily discouraged. This is because it can lead to data loss if a broker fails and can negatively impact the overall availability of the system.

A replication factor of 3 is the commonly recommended value, as it balances data redundancy and fault tolerance against the storage cost of keeping extra copies.

Change the Replication Factor in Kafka

Let us now explore how we can change the replication factor of a given topic. We will create a sample topic for demonstration purposes as shown in the following command:

kafka-topics.sh --create --bootstrap-server localhost:9092 --replication-factor 1 --partitions 1 --topic users

The previous command creates a Kafka topic called “users” with a replication factor of 1 and 1 partition.

Let us start by describing the topic with the following command:

kafka-topics.sh --bootstrap-server localhost:9092 --describe --topic users

This should return the details of the topic and its partitions as shown in the following:

Topic: users TopicId: 3O-eOlBFQGeGPJ1MmOd5cw PartitionCount: 1 ReplicationFactor: 1 Configs:
Topic: users Partition: 0 Leader: 0 Replicas: 0 Isr: 0

The kafka-topics.sh tool cannot alter the replication factor of an existing topic. To change it, we use the kafka-reassign-partitions.sh tool, which reads the reassignment details from a JSON file. The file syntax is as shown in the following:

{
    "version": 1,
    "partitions": [
        {
            "topic": "topic-name",
            "partition": 0,
            "replicas": [1, 2, 3]
        }
    ]
}

The partitions field contains an array of objects, each of which represents one partition of the topic. The replicas field lists the IDs of the brokers that should hold a replica of that partition after the reassignment. To change the replication factor of the whole topic, the partitions array should contain one object per partition, and each replicas array should contain as many broker IDs as the desired replication factor.
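For a topic with many partitions, writing this file by hand becomes tedious. The following is a minimal sketch of generating it from the shell. make_reassignment_json is a hypothetical helper, not part of Kafka, and it assumes that you pass the partition count and the broker IDs yourself.

```shell
# Hypothetical helper (not part of Kafka): prints a reassignment JSON document
# with one entry per partition of a topic.
# Usage: make_reassignment_json <topic> <partition-count> <comma-separated broker IDs>
make_reassignment_json() (
    topic="$1"; partitions="$2"; brokers="$3"
    p=0
    printf '{\n    "version": 1,\n    "partitions": [\n'
    while [ "$p" -lt "$partitions" ]; do
        # Every entry except the last one is followed by a comma.
        if [ "$p" -lt $((partitions - 1)) ]; then sep=','; else sep=''; fi
        printf '        {"topic": "%s", "partition": %d, "replicas": [%s]}%s\n' \
            "$topic" "$p" "$brokers" "$sep"
        p=$((p + 1))
    done
    printf '    ]\n}\n'
)

# Example: three partitions, each replicated on brokers 0, 1, and 2.
make_reassignment_json users 3 "0,1,2"
```

Redirect the output to a file, such as update_replicas.json, to use it with the reassignment tool. This sketch gives every partition the same broker order. Because the first broker in each replicas list is the preferred leader, you may want to vary the order per partition on a real cluster to spread leadership across brokers.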

For example, to increase the number of replicas of the “users” topic to 3, we can use the file format as shown in the following:

{
    "version": 1,
    "partitions": [
        {
            "topic": "users",
            "partition": 0,
            "replicas": [0,1,2]
        }
    ]
}

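Note that the values in the replicas array are broker IDs, not counts. The list starts with broker 0 because the describe command shows it as the current replica and leader of the partition. Brokers 1 and 2 must also exist in the cluster for the reassignment to succeed.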

Save the file as update_replicas.json. We can then apply the reassignment with the following command:

kafka-reassign-partitions.sh --bootstrap-server localhost:9092 --reassignment-json-file update_replicas.json --execute

This should increase the replication factor to 3 for the specified topic.

You can rerun the describe command to verify that the changes were applied to the target topic. The output should now report a ReplicationFactor of 3 and list three brokers under Replicas.
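If you want to check the result from a script instead of reading the output by eye, the replication factor can be pulled out of the describe output. The following is a minimal sketch: replication_factor_from_describe is a hypothetical helper, not part of Kafka, and the sample line is illustrative rather than captured from a real cluster.

```shell
# Hypothetical helper (not part of Kafka): reads `kafka-topics.sh --describe`
# output on stdin and prints the value that follows "ReplicationFactor:".
replication_factor_from_describe() {
    awk '{
        for (i = 1; i < NF; i++) {
            if ($i == "ReplicationFactor:") { print $(i + 1); exit }
        }
    }'
}

# Illustrative summary line in the format shown earlier. Against a live
# cluster, you would pipe the describe command into the helper instead:
#   kafka-topics.sh --bootstrap-server localhost:9092 --describe --topic users \
#       | replication_factor_from_describe
printf 'Topic: users PartitionCount: 1 ReplicationFactor: 3 Configs:\n' \
    | replication_factor_from_describe
```

The helper stops at the first match, so it reports the value from the topic summary line. It prints nothing if the label is missing from the input.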

Conclusion

You have now learned how to use a reassignment JSON file to update the replication factor of an existing topic in Apache Kafka.

About the author

John Otieno