Load balancing


Load balancing is the practice of distributing incoming requests across several servers according to some algorithm, so that the overall load is spread fairly evenly and no single server carries a disproportionate share.

This is useful in environments that handle a very high number of calls per second, because each individual server then sees only a fraction of the total. It also avoids filling up one box before moving on to the next, so a failure of that first box does not take all of your calls with it.

There are several algorithms available for load balancing. The more common ones are:

Round robin, where you allocate requests to each server in turn, regardless of how many a given server is already handling. This can balance poorly when call durations vary: one server may end up with many long calls while the others get fairly short ones, leaving the allocation unbalanced.
Load-based routing, where you route each new request to the server with the lowest load. Load can be defined by actual CPU utilization, the number of concurrent connections already in place, or some other measure of how busy a given server is.
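
The two approaches above can be sketched in a few lines of Python. This is a minimal illustration rather than a production balancer: the class names, the pick and release methods, and the server labels are all hypothetical, and "load" is assumed here to mean the number of concurrent connections (one of the options mentioned above).

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out servers in a fixed rotation, ignoring current load."""

    def __init__(self, servers):
        self._rotation = cycle(list(servers))

    def pick(self):
        # Always the next server in turn, however busy it already is.
        return next(self._rotation)


class LeastConnectionsBalancer:
    """Routes each new request to the server with the fewest open connections."""

    def __init__(self, servers):
        # Assumption: load is tracked as a simple count of active connections.
        self.active = {server: 0 for server in servers}

    def pick(self):
        # min() returns the first server with the lowest count,
        # so ties go to the earliest-listed server.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        # Call this when a request (or call) on the server finishes.
        if self.active[server] > 0:
            self.active[server] -= 1


if __name__ == "__main__":
    rr = RoundRobinBalancer(["a", "b", "c"])
    print([rr.pick() for _ in range(4)])

    lc = LeastConnectionsBalancer(["a", "b"])
    print([lc.pick() for _ in range(3)])
```

The round-robin picker never looks at how long earlier calls are taking, which is why it can drift out of balance. The least-connections picker depends on release being called reliably; if completions are missed, its counts stop reflecting real load.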