A server failure always occurs unexpectedly and has serious consequences. At first, poor server performance under increased workload can be addressed by adding capacity or by optimizing algorithms, code and so on. Sooner or later, however, these measures become insufficient.
When server capacity is insufficient, the question arises: what should be done next? Upgrades often do not provide a proportionate increase in performance or the required resilience. If a site receives more requests than one server can handle, load balancing is needed to distribute the traffic across several servers. The more even the distribution, the fewer servers are needed and the better the user experience.
What is load balancing?
Load balancing improves the performance, resiliency and throughput of websites, applications, databases and other services by distributing the workload across multiple servers.
An infrastructure without load balancing looks like this:
The user connects directly to the server. If this single server stops working, the user will not be able to reach the site. In addition, if many users try to open the site at the same time, the server may simply not keep up with the load; the site will load very slowly, or the user will not be able to open it at all. The solution to this problem is a load balancer.
An infrastructure with load balancing looks like this:
Each request first goes to the balancer, which forwards it to one of the backend servers; this server, in turn, handles the request and sends the result to the user. The load balancer decides which server to send the request to using a combination of two factors. First, the balancer determines which servers can respond quickly and correctly, and then it selects one of those available servers, guided by pre-configured rules.
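The two-step decision described above can be sketched in Python. This is a minimal illustration: the server addresses, the `is_healthy` callback and the round-robin rule are assumptions for the example, not part of any specific product.

```python
import itertools

# Hypothetical backend pool; in a real deployment these would be host:port pairs.
SERVERS = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]

def pick_server(servers, is_healthy, rotation):
    """Step 1: keep only servers that can respond; step 2: apply the rule."""
    healthy = [s for s in servers if is_healthy(s)]
    if not healthy:
        raise RuntimeError("no backend available")
    # Pre-configured rule: simple round robin over the healthy subset.
    return healthy[next(rotation) % len(healthy)]

rotation = itertools.count()
# Pretend 10.0.0.2 has failed its health check and must be skipped.
choice = pick_server(SERVERS, lambda s: s != "10.0.0.2", rotation)  # -> "10.0.0.1"
```

The key point is the ordering: the health filter runs before the balancing rule, so a configured rule never sends traffic to a server that cannot serve it.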
What traffic does the load balancer process?
It processes four main types of traffic:
• HTTP: Standard HTTP balancing distributes requests at the application layer, using standard HTTP mechanisms.
• HTTPS: HTTPS balancing works much like HTTP, but with encryption support. Data encryption is handled in one of two ways: using SSL relay or SSL termination.
• TCP: For applications that don't use HTTP or HTTPS, the balancer can distribute raw TCP traffic. For example, traffic to a database cluster can be distributed this way.
• UDP: Some load balancers have added support for the main internet protocols that use UDP (for example, DNS and syslogd).
The balancer should only send traffic to active servers that are able to serve it. To make the right choice, the status of the backend servers is constantly monitored using the protocol and port specified in the rule. If a server fails the balancer's check, it is automatically removed from the server pool and receives no traffic until it passes the check again.

Load balancing algorithms
The three most common algorithms:
1. The load balancer selects the least loaded server to handle the traffic (the Least Connections approach). This algorithm is especially useful when long-lived sessions are involved.
2. According to the Round Robin algorithm, servers receive traffic sequentially. The balancer selects the first server on the list and sends the first request to it; the second server receives the second request, and so on.
3. The load balancer selects a server based on a hash of the request's source (for example, the visitor's IP address). In this case, all requests from a particular user will be served by the same backend server.
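The three strategies above can be sketched as follows. This is a simplified illustration: the backend names, the hash function and the connection counts are assumptions for the example.

```python
import hashlib
import itertools

SERVERS = ["backend-1", "backend-2", "backend-3"]  # hypothetical pool

# 1. Least loaded: pick the server with the fewest active connections.
def least_loaded(connections):
    """connections: mapping of server -> active connection count."""
    return min(connections, key=connections.get)

# 2. Round Robin: servers receive traffic sequentially.
round_robin = itertools.cycle(SERVERS)

# 3. Source hash: the same client IP always maps to the same backend.
def by_ip_hash(client_ip, servers=SERVERS):
    digest = hashlib.sha256(client_ip.encode()).digest()
    return servers[int.from_bytes(digest[:4], "big") % len(servers)]
```

Round robin needs no state beyond the cycle position; the source hash keeps a given visitor on one backend, which is what makes it suitable when requests from one user must land on the same server.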
Load balancing systems can redirect client traffic to a selected server in several ways, including Network Address Translation (NAT) and the TCP gateway. Let's review each of these methods.
NAT. With this method, the balancing system performs several operations on a packet received from the client before forwarding it to the designated server. First, it replaces the destination IP address with the address of the designated server; it also rewrites the source IP address on the reply, so that the response appears to come from the balancer.
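The NAT rewriting can be illustrated with a toy packet model. This is not real packet processing (actual NAT rewrites IP headers in the network stack), and all addresses here are made up for the example.

```python
# A packet modeled as (src_ip, dst_ip, payload).
BALANCER_IP = "203.0.113.10"   # assumed virtual IP that clients connect to
BACKEND_IP = "10.0.0.5"        # assumed designated server

def nat_inbound(packet):
    """Client -> balancer: replace the destination with the backend's address."""
    src, dst, payload = packet
    return (src, BACKEND_IP, payload)

def nat_outbound(packet):
    """Backend -> client: replace the source so the reply seems to come from the balancer."""
    src, dst, payload = packet
    return (BALANCER_IP, dst, payload)

forwarded = nat_inbound(("198.51.100.7", BALANCER_IP, b"GET /"))
# forwarded == ("198.51.100.7", "10.0.0.5", b"GET /")
```

The client only ever sees the balancer's address; the backend's real address stays hidden behind the rewrite.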
With a TCP gateway, the balancing system first establishes a TCP connection with the client, which lets it read application data before a server is assigned. The balancer then opens a second TCP connection to the designated server and forwards the client's request over it. Finally, it relays the server's response back to the client over the original connection. Because it terminates both connections, a TCP gateway can control traffic at the transport level (L4) and even at the application level (L7).
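A single-connection TCP gateway can be sketched with the standard socket library. This is a toy, blocking, one-shot relay; the listening address, port numbers and buffer size are arbitrary choices for the example.

```python
import socket

def tcp_gateway_once(listen_port, backend_host, backend_port):
    """Accept one client, read its request, relay it to the backend over a
    second TCP connection, and return the backend's response to the client."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind(("127.0.0.1", listen_port))
    srv.listen(1)
    client, _ = srv.accept()
    request = client.recv(4096)           # application data is visible here (L7)
    with socket.create_connection((backend_host, backend_port)) as backend:
        backend.sendall(request)          # second connection: balancer -> server
        response = backend.recv(4096)
    client.sendall(response)              # relay the answer over the client connection
    client.close()
    srv.close()
```

Because the gateway holds both connections, it can inspect or modify the request and response in between, which is exactly what enables L7 control.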
Network load balancing solutions only appear similar at first glance. In practice, many different technologies and algorithms are involved, so no two products are identical: they differ in which protocols and features they support and in many other parameters. For example, some solutions are delivered purely as software, which is a clear market advantage. There are also advanced load balancers, called Application Delivery Controllers, that can even protect against DDoS attacks. To learn more about this, see vADC.