Modern web applications must handle thousands or even millions of requests from users or clients and return the correct text, images, video, or application data quickly and reliably. To scale cost-effectively to these volumes, the usual best practice is to add more servers.
A load balancer is often described as a “traffic cop” because it sits in front of your servers and routes client requests across all servers capable of fulfilling them. It maximizes speed and capacity utilization and ensures that no single server is overworked, which could degrade performance. If a server goes down, the load balancer redirects traffic to the remaining online servers. Any high-traffic site needs load balancing, but to understand how it works, we first have to look at the network model it operates in.
The Open Systems Interconnection (OSI) Reference Model is a framework that divides data communication into seven layers. Load balancing operates at Layers 4 through 7, a distinction that matters later when we look at how routing decisions are made. For now, it’s enough to know what Layers 4 and 7 are:
L4 – The Transport layer. Load balancers at this layer direct traffic using the IP address (taken from the Layer 3 network header) and the TCP/UDP ports.
L7 – The Application layer. Load balancers at this layer direct traffic using the HTTP header, uniform resource identifier (URI), SSL session ID, Domain Name System (DNS), and HTML form data.
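The difference between the two layers can be sketched in a few lines of Python. This is only an illustration, not how any particular product works: the `Request` class, the pool names, and the routing rules are all hypothetical. The point is that the Layer 4 function looks only at addresses and ports, while the Layer 7 function reads application data such as the URI path and HTTP headers.

```python
import zlib
from dataclasses import dataclass, field


@dataclass
class Request:
    """Hypothetical request record; a real load balancer works on packets and streams."""
    client_ip: str
    dest_port: int
    path: str = "/"
    headers: dict = field(default_factory=dict)


def route_l4(req, servers):
    """Layer 4 style: choose a server from the IP address and port only.

    The path and headers are never inspected.
    """
    key = f"{req.client_ip}:{req.dest_port}".encode()
    return servers[zlib.crc32(key) % len(servers)]


def route_l7(req, pools):
    """Layer 7 style: choose a pool from application data (URI path, HTTP header).

    The pool names and rules here are assumptions made for the example.
    Choosing a server inside the pool is left out; the first one is returned.
    """
    if req.path.startswith("/images/"):
        pool = pools["static"]
    elif req.headers.get("Content-Type") == "application/json":
        pool = pools["api"]
    else:
        pool = pools["web"]
    return pool[0]
```

Two requests from the same client and port always get the same answer from `route_l4`, whatever they ask for, whereas `route_l7` can send an image request and an API call from that client to different pools.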
Benefits of Load Balancing for Applications
These are some big advantages associated with load balancing.
- Scalability – Because a load balancer spreads work across all available servers, you can add servers to handle more traffic without changing how clients reach the application.
- Redundancy – When application traffic is sent to two or more web servers and one of them fails, the load balancer automatically transfers its traffic to the remaining working servers, with no manual intervention.
- Flexibility – Because the load balancer can shift application load to the remaining servers, admins can take individual servers offline for maintenance. You can stagger maintenance so that at least one server is always available to pick up the workload, and the site’s users do not experience an outage.
- Security – Load balancers can also act as an extra layer of security. An application load balancer can help mitigate distributed denial-of-service (DDoS) attacks: network and application traffic is “offloaded” from the corporate server to a public cloud server or provider, which shields the corporate server from the attack traffic.
- Session Persistence – This is the ability to send all of a user’s requests to the same server for the duration of the user’s session. If the server changes midway and the session data is stored only on the original server, that data can be lost, and performance suffers while it is rebuilt. Keeping each session on one server lets an application hold large amounts of session data without losing it.
- Global Server Load Balancing – Global Server Load Balancing (GSLB) extends L4 and L7 capabilities to servers in different geographic locations. It matters more as enterprises deploy cloud-native applications across multiple data centers and public clouds and need to distribute load among them.
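Of the benefits above, session persistence is the easiest to show in code. The sketch below is a minimal illustration, assuming the load balancer can read a session ID from each request (for example from a cookie); the `StickyBalancer` class and its method names are hypothetical. New sessions are assigned to servers in rotation, and every later request with the same session ID goes back to the same server.

```python
import itertools


class StickyBalancer:
    """Pins each session ID to the server that handled its first request."""

    def __init__(self, servers):
        self._rotation = itertools.cycle(servers)  # used only for new sessions
        self._sessions = {}                        # session ID -> assigned server

    def route(self, session_id):
        # First request of a session: pick the next server in rotation.
        if session_id not in self._sessions:
            self._sessions[session_id] = next(self._rotation)
        # Every later request in the session returns to the same server.
        return self._sessions[session_id]
```

A production setup would also need to expire old sessions and reassign sessions whose server has failed; both are left out here to keep the idea visible.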
How Load Balancing Technology Works
Load balancing is the distribution of network traffic across multiple back-end servers. A load balancer makes sure that no single server is overloaded. Because the application load is spread across different servers, web applications become more responsive, which makes for a better user experience.
A load balancer manages the requests flowing between end users’ devices and the back-end servers. These servers could be on-premises, in a data center, or in a public cloud. Load balancers also conduct continuous health checks on servers to ensure they can handle requests. If necessary, the load balancer removes unhealthy servers from the server farm until they are restored. Some load balancers even trigger the creation of new virtualized application servers to cope with increased demand. They can also be incorporated into application delivery controllers (ADCs) to improve performance and security more broadly.
Load balancing performs several critical tasks that keep web applications stable under heavy network traffic: managing traffic spikes and preventing any one server from being overwhelmed, minimizing client request response time, and ensuring the performance and reliability of compute resources.
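The health-check behavior described above can be sketched as follows. This is a simplified illustration, not a real product's implementation: the `ServerPool` class is hypothetical, and the probe shown is a plain TCP connection attempt, which is only one of several ways a load balancer might test a server. A server that fails its probe is taken out of rotation and returns automatically once it passes again.

```python
import socket


def tcp_probe(address, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds within the timeout."""
    try:
        with socket.create_connection(address, timeout=timeout):
            return True
    except OSError:
        return False


class ServerPool:
    """Tracks which servers currently pass their health check."""

    def __init__(self, servers, probe=tcp_probe):
        self.servers = list(servers)
        self.probe = probe
        self.healthy = set(self.servers)  # assume healthy until a check fails

    def run_health_checks(self):
        """Probe every server; call this periodically, e.g. every few seconds."""
        for server in self.servers:
            if self.probe(server):
                self.healthy.add(server)      # restored servers rejoin the pool
            else:
                self.healthy.discard(server)  # unhealthy servers are removed

    def available(self):
        """Servers that may receive traffic, in their original order."""
        return [s for s in self.servers if s in self.healthy]
```

With the default probe, servers would be given as `(host, port)` tuples. Passing a different `probe` function makes it possible to check an HTTP status page instead, or to test the logic without a network.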
Software Load Balancers vs. Hardware Load Balancers
Load balancers run either as hardware appliances or as software. Hardware appliances often run proprietary software optimized for custom processors. As traffic increases, capacity is expanded by adding more load balancing appliances. Software-defined load balancers usually run on less expensive, standard x86 hardware. Installing the software in a cloud environment such as AWS EC2 eliminates the need for a physical appliance.
Load Balancing Method Techniques & Optimizations
Load balancing algorithms take into account whether traffic is being routed at the transport layer or the application layer of the OSI model mentioned earlier. Transport-level routing happens at Layer 4, while application-level routing happens at Layer 7. The layer determines what information the load balancer can use to decide which server will receive an incoming request.
Load Balancing Methods
Each load balancing method relies on a set of criteria, or algorithms, to determine which of the servers in a server farm gets the next request. Here are some of the most common load balancing methods:
- Round Robin Method – This method hands each incoming request to the server at the top of the server pool, then rotates that server to the bottom of the list, where it waits for its next turn.
- Weighted Round Robin – With this method, each server is assigned a weight, usually in proportion to its capacity. The higher the weight, the larger the share of requests the server receives.
- Least Connections – This method directs traffic to whichever server has the fewest active connections.
- Weighted Least Connections – Each server is assigned a weight according to its capacity. The weight is combined with the server’s active connection count to determine the load allocated to each server.
- Source IP Hash – This method combines the source IP address of the client and the destination IP address of the server to generate a hash key. The key is then used to allocate the client to a particular server, so the same client keeps reaching the same server.
- Least Response Time – This method selects the back-end server with the fewest active connections and the lowest average response time.
- Least Pending Request – The load balancer monitors pending requests and sends each new request to the server with the fewest requests waiting.
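Several of the methods above fit in a few lines each. The Python sketch below is illustrative only: the function names are hypothetical, the weighted round robin simply repeats each server according to its weight (real implementations usually interleave more smoothly), and weighted least connections is assumed to mean "fewest connections per unit of weight".

```python
import hashlib
import itertools


def round_robin(servers):
    """Yield servers in order, returning to the first after the last."""
    return itertools.cycle(servers)


def weighted_round_robin(weights):
    """weights maps server -> integer weight; heavier servers appear more often."""
    expanded = [s for s, w in weights.items() for _ in range(w)]
    return itertools.cycle(expanded)


def least_connections(active):
    """active maps server -> number of active connections."""
    return min(active, key=active.get)


def weighted_least_connections(active, weights):
    """Pick the server with the fewest active connections relative to its weight."""
    return min(active, key=lambda s: active[s] / weights[s])


def source_ip_hash(client_ip, dest_ip, servers):
    """Hash the source and destination IP so a client maps to a stable server."""
    digest = hashlib.sha256(f"{client_ip}|{dest_ip}".encode()).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]
```

For example, `least_connections({"a": 5, "b": 2, "c": 7})` returns `"b"`, and a generator from `weighted_round_robin({"a": 2, "b": 1})` yields `a, a, b, a, a, b` and so on. Note that the simple modulo in `source_ip_hash` remaps most clients whenever the server list changes size; consistent hashing is a common way around that, but it is beyond this sketch.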
Load Balancing Optimization Tips
If you want to make sure that your web application runs perfectly well with your load-balanced setup, you can make some optimizations:
- Network & Application Layer Optimizations – As mentioned earlier, load balancing methods base their decisions on the layer at which traffic is routed. L4 load balancing is generally faster than L7 because it does not inspect the content of the traffic passing through. L7 load balancing, however, can route on request content and can offload slow client connections from the back-end servers, which often improves overall application performance.
- Session Persistence – Least Connections works well with configurations that rely on traffic pinning and/or session persistence, because each user stays on one server and that server can keep the user’s session data cached.
- SSL Decryption – This is the process of decrypting traffic at scale and routing it to various inspection tools that identify threats inbound to applications, as well as outbound from users to the internet.
- DNS Load Balancing – The DNS server returns the list of IP addresses in a different order each time it responds to a new client request, using the round-robin method. Because clients typically connect to the first address in the list, requests are spread across the different servers to handle the overall load.
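The DNS round-robin behavior from the last tip can be modeled with a rotating list. The sketch below is a toy model rather than a DNS server: the `RoundRobinDNS` class is hypothetical, and the hostname and addresses are reserved documentation values. Each lookup returns the full address list, then rotates it so the next client sees a different address first.

```python
from collections import deque


class RoundRobinDNS:
    """Toy resolver that rotates the address list after every answer."""

    def __init__(self, records):
        # records maps hostname -> list of IP addresses
        self._records = {name: deque(ips) for name, ips in records.items()}

    def resolve(self, name):
        ips = self._records[name]
        answer = list(ips)  # snapshot of the current order for this client
        ips.rotate(-1)      # move the first address to the end for the next client
        return answer


dns = RoundRobinDNS({"app.example.com": ["192.0.2.1", "192.0.2.2", "192.0.2.3"]})
first = dns.resolve("app.example.com")   # starts with 192.0.2.1
second = dns.resolve("app.example.com")  # starts with 192.0.2.2
```

In practice, caching by resolvers and clients means the spread is less even than this model suggests, and plain DNS rotation does not notice when a server is down.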
If you want a taste of what a load balancer can do, it doesn’t hurt to try out a product from one of the companies bringing load balancing to the forefront. Among these is ScaleArc, database load balancing software that provides continuous availability at high performance levels for mission-critical database systems deployed at scale.
The ScaleArc software appliance is a database load balancer. It enables database administrators to create highly available, scalable database deployments that are easy to manage, maintain, and migrate. ScaleArc works with Microsoft SQL Server and MySQL as an on-premises solution, and within the cloud for corresponding PaaS and DBaaS solutions, including Amazon RDS and Azure SQL.
- Build high-availability database environments
- Ensure zero downtime during database maintenance, and reduce the risk of unplanned outages by automating failover processes and intelligently redirecting traffic to database replicas
- Balance read and write traffic effectively to improve overall database throughput
- Consolidate database analytics into a single platform, allowing administrators and production support to make more efficient and intelligent decisions, thus saving time and money
- Migrate seamlessly to the cloud and between platforms with zero downtime
Try ScaleArc or read our whitepaper to find out if ScaleArc is for you.