Load balancing ensures that the workload is evenly distributed to prevent any single node from becoming overwhelmed while the others remain underutilized.
How Does Load Balancing Work?
Load balancing involves more technical detail than this tutorial can cover, so the following is only a high-level overview.
Milvus implements load balancing through a component called the proxy. The proxy is a gateway for client requests: it receives incoming queries and distributes them across the available Milvus instances. It analyzes each instance’s current state and load and routes each query to the most suitable node.
The load balancing mechanism in Milvus considers various factors, such as the computational capability of each node, the current workload of each node, the network latency between the proxy and the nodes, and more.
By weighing these factors, the load balancer distributes requests evenly, which prevents bottlenecks, maximizes overall throughput, and keeps response times low.
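To make the routing idea concrete, here is a minimal Python sketch of a load-aware selection step. It is not Milvus's actual algorithm: the NodeStats fields, the latency weight, and the score and pick_node functions are hypothetical names chosen for illustration. They only show one way that capacity, current workload, and latency could be combined into a single score.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class NodeStats:
    """Hypothetical per-node metrics; not a Milvus data structure."""
    node_id: int
    capacity: float       # relative compute capability (higher is better)
    active_queries: int   # current workload
    latency_ms: float     # proxy-to-node network latency


def score(node: NodeStats, latency_weight: float = 0.01) -> float:
    """Lower is better: load normalized by capacity, plus a latency penalty."""
    return node.active_queries / node.capacity + latency_weight * node.latency_ms


def pick_node(nodes: list[NodeStats]) -> NodeStats:
    """Return the node with the lowest score."""
    if not nodes:
        raise ValueError("no nodes available")
    return min(nodes, key=score)


# Illustrative values only.
nodes = [
    NodeStats(node_id=3, capacity=1.0, active_queries=8, latency_ms=2.0),
    NodeStats(node_id=4, capacity=2.0, active_queries=6, latency_ms=5.0),
]
chosen = pick_node(nodes)
print(chosen.node_id)
```

With these sample values, node 4 has fewer queries per unit of capacity, so it is selected even though its latency is slightly higher.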
In this tutorial, we will use the load_balance command from the Milvus CLI to trigger load balancing on a Milvus cluster manually.
Requirements:
Ensure that you have the following:
- Access to a Milvus cluster
- Milvus CLI installed on the server
Milvus CLI Load_Balance Command
The load_balance command in the Milvus CLI performs load balancing by transferring segments from a source query node to one or more destination query nodes.
The command syntax is as follows:
load_balance -s (int) -d (int) -ss (int) [-t (int)]
The command parameters are as follows:
- -s – The ID of the source query node to be balanced.
- -d – (Multiple) The ID of the destination query node to transfer the segments to.
- -ss – (Multiple) The ID of the sealed segment to be transferred.
- -t – The timeout in seconds.
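The flags above can be assembled programmatically, which helps when several destination nodes or segments are involved. The following Python sketch only builds the command text. The build_load_balance_command helper is hypothetical, the IDs are placeholders, and it assumes that an option marked (Multiple) is repeated once per value.

```python
from typing import Optional, Sequence


def build_load_balance_command(
    src_node_id: int,
    dst_node_ids: Sequence[int],
    sealed_segment_ids: Sequence[int],
    timeout: Optional[int] = None,
) -> str:
    """Assemble a load_balance command line from the flags listed above."""
    parts = ["load_balance", "-s", str(src_node_id)]
    # Assumption: options marked (Multiple) are repeated once per value.
    for node_id in dst_node_ids:
        parts += ["-d", str(node_id)]
    for segment_id in sealed_segment_ids:
        parts += ["-ss", str(segment_id)]
    if timeout is not None:
        parts += ["-t", str(timeout)]
    return " ".join(parts)


# Placeholder IDs for illustration only.
command = build_load_balance_command(1, [2, 5], [1001, 1002], timeout=30)
print(command)
```

The resulting string can then be entered in a Milvus CLI session.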
For example, we can transfer a sealed segment from query node 3 to query node 4 with the following command:
load_balance -s 3 -d 4 -ss 431067441441538050
PyMilvus Load Balance
You can also use the PyMilvus package to perform the load balancing on a Milvus cluster.
Start by fetching the ID of the sealed segment that you want to transfer and the nodeID of the query node to which you want to transfer it. The following call returns the segment information for a collection named "car":
from pymilvus import utility  # assumes a connection to Milvus is already open
utility.get_query_segment_info("car")
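Once the segment information is available, it helps to see which segments each query node holds before choosing what to move. The sketch below groups segment IDs by node. It assumes that each returned entry exposes segmentID and nodeID attributes, and the segments_by_node helper and the sample records are hypothetical stand-ins so that the snippet runs without a cluster.

```python
from collections import defaultdict
from types import SimpleNamespace


def segments_by_node(segment_infos):
    """Map each query node ID to the segment IDs it currently serves."""
    grouped = defaultdict(list)
    for info in segment_infos:
        grouped[info.nodeID].append(info.segmentID)
    return dict(grouped)


# Stand-in records for illustration; in practice, pass the list returned by
# utility.get_query_segment_info("car").
sample = [
    SimpleNamespace(segmentID=431067441441538050, nodeID=3),
    SimpleNamespace(segmentID=431067441441538051, nodeID=3),
    SimpleNamespace(segmentID=431067441441538052, nodeID=4),
]
print(segments_by_node(sample))
```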
Next, transfer the sealed segment(s) by passing the segment ID(s), the nodeID of the current query node, and the nodeID(s) of the new query node(s).
utility.load_balance(
    src_node_id=3,
    dst_node_ids=[4],
    sealed_segment_ids=[431067441441538050]
)
This should perform the load-balancing operation using the specified parameters.
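To confirm that the transfer took effect, you can fetch the segment information again and check where the segment now resides. The sketch below is one way to express that check. It assumes that each entry exposes segmentID and nodeID attributes, the segments_on_nodes helper is hypothetical, and the sample records stand in for real query results. On a live cluster, the move may take a moment to show up.

```python
from types import SimpleNamespace


def segments_on_nodes(segment_infos, segment_ids, node_ids):
    """Return True if every listed segment is reported on one of node_ids."""
    location = {info.segmentID: info.nodeID for info in segment_infos}
    return all(location.get(segment_id) in node_ids for segment_id in segment_ids)


# Stand-in records; in practice, use utility.get_query_segment_info("car")
# before and after the load_balance call.
before = [SimpleNamespace(segmentID=431067441441538050, nodeID=3)]
after = [SimpleNamespace(segmentID=431067441441538050, nodeID=4)]

print(segments_on_nodes(before, [431067441441538050], [4]))
print(segments_on_nodes(after, [431067441441538050], [4]))
```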
Conclusion
In this post, we discussed how to use the load_balance command from the Milvus CLI and the PyMilvus package to trigger load balancing on a Milvus cluster manually.