Load balancing in Cloud Computing


Cloud computing is an efficient alternative to the traditional information technology model, offering on-demand resource provisioning and high dependability. It has the potential to provide users with virtualized, distributed, and elastic resources as utilities. Cloud computing offers many kinds of resources by connecting to an enormous pool of systems, and it relies on resource scheduling and load balancing for seamless file sharing across the infrastructure. A cloud computing model is effective only if its resources are used in the best possible way, which is achieved by managing them properly. Resource management works best when all resources are handled efficiently and can scale as required.

Cloud computing is deliberately used to replace existing physically hosted infrastructure. The foremost expectation is equivalence: the services provided to the user should be equivalent to those of the platform they are migrating from. Equivalence can be described broadly in three types. Security equivalence means receiving the service with a level of security at least equivalent to that of a local host; this matters because a cloud provider may have servers around the globe and can face security breach risks. Availability equivalence means the service is available for at least as much of the time as it would be on a local host. Latency equivalence means receiving the service with latency no worse than on a local host, which ensures that the end user notices no additional performance lag while using the services.

However, because of the huge volume of data it manages, the foremost issue that cloud technology faces is load unbalancing. Load unbalancing is a multivariant, multi-constraint problem that degrades the performance and efficiency of resources, together with the quality of service (QoS) guaranteed in the service level agreement (SLA) between consumer and provider.

Load balancing refers to redistributing the workload of incoming network traffic across a distributed system such as a cloud, ensuring that no computing device or resource is overburdened, underloaded, or idle at any time. Load balancing techniques address the two undesirable outcomes of load unbalancing: overloading and underloading. Load balancing in cloud computing is independent of the platform it runs on; it is feasible at both the physical machine level and the VM level.

Today, many modern websites with a huge user base, and therefore very high traffic, must fulfill user requests quickly and reliably. To achieve these goals cost-effectively, more servers are generally added. To work best, those servers require load balancing.

Load balancer

Load balancers manage the flow of information between servers and endpoint devices or users. The servers can be in the same location, in a general data center, or in any public cloud, and they can be either physical or virtualized. A load balancer helps servers transfer data efficiently, optimizes how application delivery resources are used, and prevents server overload. Load balancers check the servers continuously to confirm that they can handle user requests. If necessary, a load balancer can remove underperforming servers from rotation until they are restored. Some load balancers can even trigger the creation of new virtualized application servers in response to increased demand.
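The health-checking behavior described above can be sketched in a few lines. This is only a minimal illustration: the server names, the `status` table, and the `is_healthy` probe are hypothetical, and a real load balancer would probe servers over the network rather than read a dictionary.

```python
from typing import Callable, Iterable, List


def healthy_pool(servers: Iterable[str], is_healthy: Callable[[str], bool]) -> List[str]:
    """Return only the servers that currently pass the health probe."""
    return [server for server in servers if is_healthy(server)]


# Hypothetical probe results; a real probe would issue a TCP or HTTP check.
status = {"app-1": True, "app-2": False, "app-3": True}

# "app-2" fails its check, so it is left out until it recovers.
pool = healthy_pool(status, lambda server: status[server])
print(pool)
```

Requests would then be distributed only among the servers in `pool`, and the check would be repeated periodically so that restored servers rejoin the rotation.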

A load balancer can be a physical device, a virtualized instance running on specialized hardware, or a software process. It can also be incorporated into an application delivery controller (ADC) (Fig 1) designed to improve the overall quality and security of three-tier web and microservices-based applications, regardless of where they are hosted.

Fig 1: Load balancers check the condition of servers and refrain from sending requests to those that are unable to function properly.

A load balancer acts mostly as a traffic cop that sits in front of your servers and routes client requests to the servers best suited to fulfill them. It does so in a way that maximizes speed, uses all resources efficiently, and ensures that no server is overloaded, so that performance doesn't degrade. Whether it is hardware-based or software-based, and whatever algorithm it uses, a load balancer aims to minimize server response time and maximize throughput.

In the seven-layer Open Systems Interconnection (OSI) model, network firewalls operate at the first three layers, while load balancing happens between layers four and seven (Fig 2). This helps keep server resources on hand to keep up with ever-evolving user requests.

L4: an L4 load balancer works at the transport layer, directing traffic based on the TCP or UDP protocols.

L7: an L7 load balancer works at the application layer and adds content switching, which lets it evaluate HTTP headers and SSL session IDs when distributing requests to different servers.

GSLB: global server load balancing extends the L4 and L7 capabilities to servers in multiple geographic locations.

Fig 2: Load balancers can even handle traffic based on the request. For instance, any request that features “contacts” is sent to the server that is responsible for the delivery of contacts.
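The content-based routing mentioned in the caption can be illustrated with a small sketch. The pool names, the `routes` table, and the simple substring match are assumptions made for this example; real L7 load balancers usually match on structured rules such as path prefixes or HTTP headers.

```python
from typing import Dict


def route(path: str, routes: Dict[str, str], default: str) -> str:
    """Pick a backend pool by looking for a keyword in the request path."""
    for keyword, backend in routes.items():
        if keyword in path:
            return backend
    return default  # no rule matched, so fall back to the general pool


# Hypothetical routing table: keyword -> backend pool.
routes = {"contacts": "contacts-pool", "images": "static-pool"}

print(route("/api/contacts/42", routes, "default-pool"))
print(route("/home", routes, "default-pool"))
```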

A load balancer runs either as a hardware appliance or as software. Hardware-based load balancers are typically high-performance appliances, and they may include built-in virtualization that consolidates numerous virtual load balancer instances on the same hardware. In contrast, a software-based load balancer can fully replace a hardware one while keeping all of its functionality. It can run on common hypervisors, reducing space overhead and hardware expenditure.

[Figure: pros and cons of software and hardware load balancers]

Load balancing algorithms

A load balancer, or the ADC that includes it, follows a set of rules to determine how requests are distributed among the servers. Different load balancing algorithms provide different benefits.

Round Robin algorithm for load balancing

The round-robin algorithm is one of the most practical and simple techniques for distributing user requests among geographically distributed cloud servers. It is a straightforward way to make sure every user request is forwarded to a virtual server based on a circular list. It starts from the first server in a group of servers and distributes requests to each server turn by turn (Fig 3). On reaching the end of the list, it returns to the start and repeats the process for the next round of requests, and so on.

The main advantage of this algorithm is that it is easy to implement. Its limitation is that it does not take into account how busy each server already is: it treats all servers equally, as if they had the same configuration and load. Improved versions of the algorithm address this problem, leading to more effective load balancing.
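As a rough sketch of the circular-list idea, the rotation can be expressed with Python's `itertools.cycle`. The server names are hypothetical, and the example deliberately ignores server load, which is exactly the limitation noted above.

```python
from itertools import cycle

servers = ["vm-a", "vm-b", "vm-c"]  # hypothetical server names
rotation = cycle(servers)           # endless iterator over the circular list


def next_server() -> str:
    """Return the next server in turn, wrapping around at the end of the list."""
    return next(rotation)


# Five incoming requests are assigned turn by turn.
assigned = [next_server() for _ in range(5)]
print(assigned)
```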

Fig 3: Round-robin scheduling strategy

Weighted round-robin algorithm (WRR): a modified version of the round-robin algorithm in which a numerical weight is assigned to each available server. The weight is assigned according to predefined criteria; the most common criterion is the server's load-handling capacity. The highest weight goes to the server that can handle the most user requests, and the lowest weight to the server that can handle the fewest.

Dynamic round-robin algorithm (DRR): another variant of the round-robin algorithm in which the weight is assigned to each server dynamically at runtime, based on the server's configuration and how much load it can handle. The weight is not constant and changes in real time; if a server no longer has adequate resources, its weight drops.
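The weighted variant can be sketched by repeating each server in the rotation as many times as its weight. This is the simplest possible form, assuming small integer weights and hypothetical server names; production implementations usually interleave servers more smoothly. A dynamic variant would rebuild the rotation whenever the weights are recalculated at runtime.

```python
from itertools import cycle
from typing import Dict, Iterator


def weighted_rotation(weights: Dict[str, int]) -> Iterator[str]:
    """Cycle through servers, each appearing in proportion to its integer weight."""
    expanded = [name for name, weight in weights.items() for _ in range(weight)]
    return cycle(expanded)


# Hypothetical weights: "big-vm" can take three times the load of "small-vm".
weights = {"big-vm": 3, "small-vm": 1}
rotation = weighted_rotation(weights)

first_eight = [next(rotation) for _ in range(8)]
print(first_eight)
```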

Active monitoring-based Load Balancing algorithm

The active monitoring-based algorithm is another load balancing algorithm. It differs from the others in that it maintains a statistics table containing information about each VM and the corresponding user requests that it is currently fulfilling (Fig 4).

Fig. 4: Active monitoring-based load balancing method

When a user request reaches the load balancer, it searches the table for the least-loaded VM; if several VMs qualify, the first one found is allotted the job. The load balancer records the VM id so that it knows which server is handling the present task, and the information table is updated accordingly.
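A minimal sketch of this lookup is shown below, assuming the statistics table is a dictionary that maps a VM id to its number of active requests. The VM ids and the `release` helper are hypothetical additions for the example.

```python
from typing import Dict


def assign_request(active: Dict[str, int]) -> str:
    """Give the request to the least-loaded VM; the first one found wins ties."""
    vm_id = min(active, key=active.get)  # min() keeps the first minimum it sees
    active[vm_id] += 1                   # update the statistics table
    return vm_id


def release(active: Dict[str, int], vm_id: str) -> None:
    """Record that a VM has finished one request."""
    active[vm_id] -= 1


# Hypothetical table: VM id -> requests currently being served.
table = {"vm-0": 2, "vm-1": 0, "vm-2": 0}
chosen = [assign_request(table) for _ in range(3)]
print(chosen, table)
```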

Throttled load balancing algorithm

The throttled load-balancing algorithm is based on the conceptual framework of VMs. In this algorithm, every request that is generated is forwarded to the load balancer, which looks for the ideal VM to handle the request efficiently and effectively. Various user operations are supported in this procedure, and they have to be taken into consideration during service provisioning (Fig 5).

Fig. 5: Throttled load balancing algorithm

In this algorithm, an index table is kept containing all the available VMs together with their corresponding state, i.e. available or busy. The client or server initiates the load balancer by sending a request to the data center to find an appropriate VM for service provisioning. The data center builds a list of all the requests that have arrived for VM provisioning. The load balancer scans the index table from the top until a suitable VM is found or the table has been scanned entirely. When a suitable VM is found, the data center assigns the task to it, identified by its VM id.
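The index-table scan can be sketched as follows. The state labels, the integer VM ids, and the choice to return `None` when every VM is busy are assumptions for illustration; the text above does not say what happens to a request when no VM is found.

```python
from typing import Dict, Optional

AVAILABLE, BUSY = "available", "busy"


def throttled_allocate(index_table: Dict[int, str]) -> Optional[int]:
    """Scan from the top of the table and claim the first available VM."""
    for vm_id, state in index_table.items():
        if state == AVAILABLE:
            index_table[vm_id] = BUSY  # mark the VM as taken
            return vm_id
    return None  # whole table scanned without finding a free VM


def throttled_release(index_table: Dict[int, str], vm_id: int) -> None:
    """Mark a VM as available again once its task has finished."""
    index_table[vm_id] = AVAILABLE


# Hypothetical index table: VM id -> state.
index_table = {0: BUSY, 1: AVAILABLE, 2: AVAILABLE}
first = throttled_allocate(index_table)
second = throttled_allocate(index_table)
third = throttled_allocate(index_table)
print(first, second, third)
```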

Related work

Modified throttled load balancing algorithm

Fig 6 shows the proposed model of the improved throttled algorithm. The primary focus of this model is the number of inbound requests currently allocated to a specific VM. The proposed model offers a modified version of the conventional throttled load-balancing algorithm.

Fig. 6  Proposed modified throttled algorithm

In the proposed technique, a table is maintained in real time indicating the state of each machine. The state is estimated from the probability that a given VM is available, i.e. the availability index (AI). The AI value is defined as the number of requests targeted at a specific VM in a certain period of time. The higher the AI value, the lower the chance of availability, and vice versa. Each incoming request is allocated based on this AI value.
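One possible reading of the availability index is sketched below. It assumes the AI is a count of requests sent to a VM within a sliding time window, and that each new request goes to the VM with the lowest AI. The class name, the window length, and the timestamps are hypothetical, and the proposed model may compute or use the index differently.

```python
from collections import deque
from typing import Deque, Dict, Iterable


class AvailabilityIndex:
    """Track how many requests each VM received within a sliding time window."""

    def __init__(self, vm_ids: Iterable[str], window: float) -> None:
        self.window = window
        self.hits: Dict[str, Deque[float]] = {vm: deque() for vm in vm_ids}

    def ai(self, vm_id: str, now: float) -> int:
        """Return the AI of a VM: requests targeted at it during the window."""
        timestamps = self.hits[vm_id]
        while timestamps and now - timestamps[0] > self.window:
            timestamps.popleft()  # forget requests older than the window
        return len(timestamps)

    def allocate(self, now: float) -> str:
        """Send the request to the VM with the lowest AI (most likely available)."""
        vm_id = min(self.hits, key=lambda vm: self.ai(vm, now))
        self.hits[vm_id].append(now)
        return vm_id


tracker = AvailabilityIndex(["vm-0", "vm-1"], window=10.0)
picks = [tracker.allocate(t) for t in (0.0, 1.0, 2.0)]
print(picks)
```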

Experiment simulations

The simulation was carried out using a cloud computing evaluation tool named CloudAnalyst. CloudAnalyst is a well-known graphical user interface (GUI) tool built on the CloudSim architecture. CloudSim supports modeling, simulation, and experimentation. Additionally, the tool allows repeated simulations with small changes to the parameters for thorough evaluation of a model. Based on the simulations, graphs of time and cost can be produced from the results, which are crucial in the overall assessment of an algorithm.

[Tables: average response time, average data center request servicing time, and comparison of load balancing services]


After carrying out various simulations with adjustable parameters, the results were calculated and recorded in Tables 1, 2, and 3. The predefined simulation configurations were used for each policy. The results are shown in Figs. 7, 8, and 9. Parameters such as average response time, data center service time, and total cost of the different data centers were taken into account in the analysis.

Fig. 7 Average response time results


Fig. 8 Average data center request servicing time
Fig. 9 Overall comparison of load balancing policies


Hybridization of the meta-heuristic algorithm for load balancing in the cloud computing environment

A cloud computing environment consists of a set of VMs hosted on servers. Scheduling and load balancing are carried out by allocating VMs to servers so that utilization is balanced across all servers.

Q-learning algorithm

Q-learning is a reinforcement learning algorithm from the field of machine learning. It allows an agent to learn in an environment by taking actions and receiving a reward or penalty based on the feedback from the environment.

Consider a set of states S = {s1, s2, … } in the environment, where each state has a set of actions A = {a1, a2, … }. At time t, an agent in state s_t ∈ S selects an action a_t ∈ A, moves to the next state s_{t+1} ∈ S through the transition process, and receives an immediate reward from the environment.

The main aim of finding an optimal policy in the cloud network is to select, in each state, the action that maximizes the Q-value. The Q-value function depends on how actions are selected in a particular state. Suppose the agent is in state s_t and selects action a_t, which is expected to lead to the best next state and maximize the total expected reward in the environment; the Q-value is then calculated as follows:

[Equation: Q-value calculation]
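For reference, the standard textbook form of the Q-learning update is given below. It is not necessarily the exact equation used in the hybrid approach described here, which may use different notation or extra terms. The learning rate \alpha, the discount factor \gamma, and the reward symbol r_{t+1} are introduced for this illustration.

```latex
Q(s_t, a_t) \leftarrow Q(s_t, a_t)
  + \alpha \left[ r_{t+1} + \gamma \max_{a \in A} Q(s_{t+1}, a) - Q(s_t, a_t) \right]
```

Here r_{t+1} is the immediate reward received after taking action a_t in state s_t, and the maximum is taken over the actions available in the next state s_{t+1}.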

For reward calculation:

[Equation: reward calculation]

Algorithm for Q-Learning
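A single step of standard tabular Q-learning can be sketched as follows. This is a generic illustration rather than the hybrid algorithm itself: the state labels ("skewed", "balanced"), the VM names used as actions, the reward value, and the default `alpha` and `gamma` are all hypothetical.

```python
from collections import defaultdict
from typing import DefaultDict, List, Tuple

QTable = DefaultDict[Tuple[str, str], float]


def q_update(q: QTable, state: str, action: str, reward: float,
             next_state: str, actions: List[str],
             alpha: float = 0.5, gamma: float = 0.9) -> float:
    """Apply one tabular Q-learning update and return the new Q-value."""
    best_next = max(q[(next_state, a)] for a in actions)  # greedy value of next state
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Hypothetical setup: an action is the choice of VM that receives the request.
actions = ["vm-0", "vm-1"]
q: QTable = defaultdict(float)   # unseen (state, action) pairs start at 0.0
q[("balanced", "vm-1")] = 2.0

# Sending the request to vm-1 moved the system from "skewed" to "balanced".
new_value = q_update(q, "skewed", "vm-1", reward=1.0,
                     next_state="balanced", actions=actions)
print(new_value)
```

In a full training loop, the agent would repeat this update over many requests, usually choosing actions with some exploration strategy such as epsilon-greedy.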


This work shows that load balancing is essential for the smooth operation of cloud computing. Load balancing algorithms are quite efficient, and the workload can be distributed either at compile time or at runtime. We have considered several algorithms and how other technologies, such as reinforcement learning, can increase efficiency and reduce the delay in handling requests.

-Tanuj Bhatt

