Network Load Balancer components
A load balancer serves as the single point of contact for clients. The load balancer distributes incoming traffic across multiple targets, such as Amazon EC2 instances. This increases the availability of your application. You add one or more listeners to your load balancer.
A listener checks for connection requests from clients, using the protocol and port that you configure, and forwards requests to a target group.
Each target group routes requests to one or more registered targets, such as EC2 instances, using the TCP protocol and the port number that you specify. You can register a target with multiple target groups. You can configure health checks on a per-target-group basis. Health checks are performed on all targets registered to a target group that is specified in a listener rule for your load balancer.
How does a load balancer work?
A load balancer is a reverse proxy. It presents a virtual IP address (VIP) representing the application to the client. The client connects to the VIP and the load balancer makes a determination through its algorithms to send the connection to a specific application instance on a server. The load balancer continues to manage and monitor the connection for the entire duration.
Imagine a sports agent negotiating a new contract for a star athlete. The agent takes the request from the athlete and sends it to a specific interested team. The team responds with information (an offer) which the agent then passes back to the client. This goes on for a while until a resolution is reached.
This is the primary function of the load balancer, server load balancing (SLB). The agent can provide additional functionality based on their role in the conversation. They can decide to allow and/or deny certain details (security). They may want to validate that the person they are talking to is actually the athlete in question (authentication). If the current sports league is not working out, the agent can send the discussions to a different league based on availability or location (GSLB).
3.2 Enable load balancing
To enable load balancing
On the configured DirectAccess server, click Start, and then click Remote Access Management. If the User Account Control dialog box appears, confirm that the action it displays is what you want, and then click Yes.
In the Remote Access Management console, in the left pane, click Configuration, and then in the Tasks pane, click Enable Load Balancing.
In the Enable Load Balancing Wizard, click Next.
Depending on what you chose in planning steps:
Windows NLB: On the Load Balancing Method page, click Use Windows Network Load Balancing (NLB), and then click Next.
External load balancer: On the Load Balancing Method page, click Use an external load balancer, and then click Next.
In a single network adapter deployment, on the Dedicated IP Addresses page, do the following, and then click Next:
In the IPv4 address box, enter the new IPv4 address for this Remote Access server; the current IPv4 address will be the virtual IP address (VIP) of the load-balanced cluster. In the Subnet mask box, enter the subnet mask.
If the corporate environment is native IPv6, then in the IPv6 address box, enter the new IPv6 address for this Remote Access server; the current IPv6 address will be the VIP of the load-balanced cluster. In the Subnet prefix length box, enter the subnet prefix length.
In a two network adapter deployment, on the External Dedicated IP Addresses page, do the following, and then click Next:
In the IPv4 address box, enter the new external IPv4 address for this Remote Access server; the current IPv4 address will be the virtual IP address (VIP) of the load balancing cluster. In the Subnet mask box, enter the subnet mask.
If there are currently native IPv6 addresses configured on the internet-facing network adapter of the Remote Access server, in the IPv6 address box, enter the new external IPv6 address for this Remote Access server; the current IPv6 address will be the VIP of the load balancing cluster. In the Subnet prefix length box, enter the subnet prefix length.
In a two network adapter deployment, on the Internal Dedicated IP Addresses page, do the following, and then click Next:
In the IPv4 address box, enter the new internal IPv4 address for this Remote Access server; the current IPv4 address will be the VIP of the load balancing cluster. In the Subnet mask box, enter the subnet mask.
If the corporate environment is native IPv6, then in the IPv6 address box, enter the new internal IPv6 address for this Remote Access server; the current IPv6 address will be the VIP of the load balancing cluster. In the Subnet prefix length box, enter the subnet prefix length.
On the Summary page, click Commit.
On the Enable Load Balancing dialog box, click Close.
In the Enable Load Balancing Wizard, click Close.
If external load balancing is being used, note the virtual IPs and configure them on the external load balancers.
Windows PowerShell equivalent commands
The following Windows PowerShell cmdlet or cmdlets perform the same function as the preceding procedure. Enter each cmdlet on a single line, even though they may appear word-wrapped across several lines here because of formatting constraints.
If you chose to use Windows NLB in the planning steps, then execute the following:
If you chose to use an external load balancer in the planning steps, then execute the following:
If you are using staging GPOs, do not combine load-balancer setting changes with changes to any other settings. Apply load-balancer changes first, and make other configuration changes afterwards. Also, after configuring load balancing on a new DirectAccess server, allow some time for the IP changes to be applied and replicated across the DNS servers in the enterprise before you change other DirectAccess settings related to the new cluster.
Load balancing in the cloud
Load balancing can either refer to the process of balancing cloud-based workloads or load balancers that are themselves based in the cloud. In a cloud environment, cloud balancing functions much the same as in other environments, except that it has to do with traffic related to a company’s cloud-based workloads and their distribution across multiple resources, such as server groups and networks.
Balancing cloud workloads is just as important as balancing loads in any other context. The objective ultimately is high availability and performance. The better the workloads perform as a result of even traffic distribution, the less likely the environment is to suffer an outage.
Cloud-based load balancers are usually offered in a pay-as-you-go, as-a-service model that supports high levels of elasticity and flexibility. They offer a number of functions and benefits, such as health checks and control over who can access which resources. This depends on the vendor and the environment in which you use them. Cloud load balancers may use one or more algorithms—supporting methods such as round robin, weighted round robin, and least connections—to optimize traffic distribution and resource performance.
Why load balancing is important
HTTP/2 multiplexes multiple calls on a single TCP connection. If gRPC and HTTP/2 are used with a network load balancer (NLB), the connection is forwarded to a server, and all gRPC calls are sent to that one server. The other server instances on the NLB are idle.
Network load balancers are a common solution for load balancing because they are fast and lightweight. For example, Kubernetes by default uses a network load balancer to balance connections between pod instances. However, network load balancers are not effective at distributing load when used with gRPC and HTTP/2.
Proxy or client-side load balancing?
gRPC and HTTP/2 can be effectively load balanced using either an application load balancer proxy or client-side load balancing. Both of these options allow individual gRPC calls to be distributed across available servers. Deciding between proxy and client-side load balancing is an architectural choice. There are pros and cons for each.
Proxy: gRPC calls are sent to the proxy, the proxy makes a load balancing decision, and the gRPC call is sent on to the final endpoint. The proxy is responsible for knowing about endpoints. Using a proxy:
- Adds an additional network hop to gRPC calls.
- Adds latency and consumes additional resources.
- Requires the proxy server to be set up and configured correctly.
Client-side load balancing: The gRPC client makes a load balancing decision when a gRPC call is started. The gRPC call is sent directly to the final endpoint. When using client-side load balancing:
- The client is responsible for knowing about available endpoints and making load balancing decisions.
- Additional client configuration is required.
- Performance is high because gRPC calls are sent directly to the endpoint, eliminating the extra proxy hop.
Passive health checks
Passive health checking lets us bring dead backends back once they recover, or detect failed ones independently of client requests. To do this, we ping the backends at a fixed interval and check their status.
In our implementation we try to establish a TCP connection. If it succeeds, the backend is considered alive. This logic can be swapped for a call to a specific endpoint, such as /status.
It is important to remember to close the connection to reduce the load on the server — otherwise the server will run out of resources.
We then walk through all the backends and determine their status.
We run this check periodically using a timer. The ticker fires every 20 seconds; reading from its channel lets us catch these events — the read blocks until a new element appears in the channel.
We run all of this code in a separate goroutine.
Load balancer scheme
When you create a load balancer, you must choose whether to make it an internal load
balancer or an internet-facing load balancer. Note that when you create a Classic
Load Balancer in
EC2-Classic, it must be an internet-facing load balancer.
The nodes of an internet-facing load balancer have public IP addresses. The DNS name of an internet-facing load balancer is publicly resolvable to the public IP addresses of the nodes. Therefore, internet-facing load balancers can route requests from clients over the internet.
The nodes of an internal load balancer have only private IP addresses. The DNS name of an internal load balancer is publicly resolvable to the private IP addresses of the nodes. Therefore, internal load balancers can only route requests from clients with access to the VPC for the load balancer.
Both internet-facing and internal load balancers route requests to your targets using
private IP addresses. Therefore, your targets do not need public IP addresses to receive
requests from an internal or an internet-facing load balancer.
If your application has multiple tiers, you can design an architecture that uses both
internal and internet-facing load balancers. For example, this is true if your
application uses web servers that must be connected to the internet, and application
servers that are only connected to the web servers. Create an internet-facing load
balancer and register the web servers with it. Create an internal load balancer and
register the application servers with it. The web servers receive requests from the
internet-facing load balancer and send requests for the application servers to the
internal load balancer. The application servers receive requests from the internal load balancer.
Algorithms and methods of load distribution
The key element of server load balancing is the algorithm used. We will look at the simplest and most popular ones, which are often applied in practice. These methods smooth out the load and reduce the risk of a site going down during unexpected traffic spikes.
When choosing or designing an algorithm, you should follow three principles:
- Fairness. Every request must be handled. Requests must not be left queuing behind one another. Before designing a balancing algorithm, check how the server load changes over time so you know what spikes to prepare for.
- Efficiency. All servers in the pool should be doing work — ideally always, and at full capacity. This is not always achievable, and the algorithm's job is to spread the load as evenly as possible.
- Speed. A good balancing algorithm processes requests quickly.
Round Robin
The simplest way to reduce server load is to send each request to the servers in turn. Suppose you have three VDS instances. Requests go to the first, the second, and then the third server. The next request returns to the first server, and the cycle starts again. The advantages of this approach are obvious: simplicity, low cost, and effectiveness. The servers in the pool do not even need to be connected to each other — via DNS, this algorithm can redirect requests to any machines. The main problem with the approach is the irrational distribution of resources: even if all machines have roughly the same specifications, the actual load will vary widely across the pool.
Weighted Round Robin
This algorithm is similar to the previous one, but it additionally takes server performance into account: the more powerful the machine, the more requests it handles.
It is not a perfect approach, but it is noticeably better than plain Round Robin.
Least Connections
The idea of this algorithm is simple: each new request is sent to the server with the fewest active connections. Least Connections is an elegant and effective solution that distributes the load sensibly across servers with roughly similar specifications.
Sticky Sessions
In this algorithm, requests are distributed based on the user's IP address. Sticky Sessions assumes that requests from a given client are directed to the same server rather than hopping around the pool. The client switches to another server only if the one it was using is no longer available.
What are load balancing algorithms?
Load balancing algorithms are formulas that determine which server each client connection is sent to. The algorithms can be very simple, like round robin, or they can be advanced, like agent-based adaptive. Whatever the case, the purpose of the algorithm is to send the client connection to the best-suited application server.
The most commonly recommended algorithm is least connection. This algorithm is designed to send the connection to the best-performing server based on the number of connections it is currently managing. Least connection takes into account the length of each connection by only looking at what is currently active on the server.
Types Of Load Balancing Algorithms
- Least Connection
- Round Robin
- Weighted Round Robin
- Weighted Least Connection
- Agent Based Adaptive Load Balancing
- Chained Failover (Fixed Weighted)
- Weighted Response Time
- Source IP Hash
- Software Defined Networking (SDN) Adaptive
Which methods and algorithms are best?
All load balancers (Application Delivery Controllers) use the same load balancing methods. It’s very common for people to choose a particular method because it is what they were told to do, rather than because it’s actually right for their application.
Load balancers traditionally use a combination of routing-based OSI Layer 2/3/4 techniques (generally referred to as Layer 4 load balancing). All modern load balancers also support layer 7 techniques (full application reverse proxy). However, just because the number is bigger, that doesn’t mean it’s a better solution for you! 7 blades on your razor aren’t necessarily better than 4.
- Layer 4 DR (Direct Routing): Ultra-fast local server-based load balancing. Requires handling the ARP issue on the real servers.
- Layer 4 NAT (Network Address Translation): Fast Layer 4 load balancing. The appliance becomes the default gateway for the real servers.
- Layer 4 TUN: Similar to DR but works across IP-encapsulated tunnels.
- Layer 4 LVS-SNAT: You can configure Layer 4 to act as a reverse proxy, using IPTables rules. Very useful if you need to proxy UDP traffic.
- Layer 7 SSL Termination: Usually required in order to process cookie persistence in HTTPS streams on the load balancer. Processor intensive.
- Layer 7 SNAT (HAProxy): Layer 7 allows great flexibility, including full SNAT and WAN load balancing, HTTP or RDP cookie insertion, and URL switching.
What types of load balancers are out there?
To understand the types of load balancers, one needs to understand the history.
Network Server Load Balancers
Load balancers entered the market in the mid-1990s to support the surge of traffic on the internet. They had basic functionality designed to pool server resources to meet this demand. The load balancer managed connections based on the packet header — specifically, the 5-tuple: source IP, destination IP, source port, destination port, and IP protocol. This was the advent of the network server load balancer, or Layer 4 load balancer.
Application Load Balancers
As technology evolved, so did the load balancers. They became more advanced and started providing content awareness and content switching. These load balancers looked beyond the packet header and into the content payload, examining items such as the URL and HTTP headers to make load balancing decisions. These are the application load balancers, or Layer 7 load balancers.
Global Server Load Balancing
Global server load balancing (GSLB) is actually a different technology than the traditional layer 4-7 load balancer. GSLB is based on DNS and acts as a DNS proxy to provide responses based on GSLB load balancing algorithms in real time. It is easiest to think of GSLB as a dynamic DNS technology that manages and monitors the multiple sites through configurations and health checks. Most load balancing solutions today offer GSLB as a component of their functionality.
Hardware vs Software vs Virtual Load Balancing
Load balancers originated as hardware solutions. Hardware provides a simple appliance that delivers the functionality with a focus on performance. Hardware-based load balancers are designed for installation within datacenters. They are turnkey solutions that do not have the dependencies that software-based solutions require, such as hypervisors and COTS hardware.
As network technologies evolved, software-defined, virtualization, and cloud technologies have become important. Software-based load balancing solutions offer flexibility and the ability to integrate into the virtualization orchestration solutions. Some environments such as cloud require software solutions. Software-based environments often use DevOps and/or CI/CD processes. The software load balancer is more suited for these environments with their flexibility and integration.
Elastic Load Balancers
Elastic Load Balancer (ELB) solutions are far more sophisticated and offer cloud-computing operators scalable capacity based on traffic requirements at any one time. Elastic Load Balancing scales traffic to an application as demand changes over time. It also scales load balancing instances automatically and on-demand. As elastic load balancing uses request routing algorithms to distribute incoming application traffic across multiple instances or scale them as necessary, it increases the fault tolerance of your applications.
Network MTU for your load balancer
The maximum transmission unit (MTU) of a network connection is the size, in bytes, of the largest permissible packet that can be passed over the connection. The larger the MTU of a connection, the more data that can be passed in a single packet. Ethernet packets consist of the frame, or the actual data you are sending, and the network overhead information that surrounds it. Traffic sent over an internet gateway is limited to 1500 MTU. This means that if packets are over 1500 bytes, they are fragmented, or they are dropped if the Don't Fragment (DF) flag is set in the IP header.
The MTU size on an Application Load Balancer, Network Load Balancer, or Classic Load Balancer node is not configurable. Jumbo frames (MTU 9001) are standard across all load balancer nodes. The path MTU is the maximum packet size that is supported on the path between the originating host and the receiving host. Path MTU Discovery (PMTUD) is used to determine the path MTU between two devices. Path MTU Discovery is especially important if the client or target does not support jumbo frames.
When a host sends a packet that is larger than the MTU of the receiving host or larger than the MTU of a device along the path, the receiving host or device drops the packet and then returns an ICMP message indicating that the destination was unreachable because fragmentation was needed and the DF flag was set. This instructs the transmitting host to split the payload into multiple smaller packets and then retransmit them.
If packets larger than the MTU size of the client or target interface continue to be dropped, it is likely that Path MTU Discovery (PMTUD) is not working. To avoid this, ensure that Path MTU Discovery is working end to end, and that you have enabled jumbo frames on your clients and targets. For more information about Path MTU Discovery and enabling jumbo frames, see the Amazon EC2 User Guide.
Delete Your Load Balancer
When your load balancer becomes available, you are billed for each hour that you keep it running. Once you no longer need a load balancer, you can delete it. When the load balancer is deleted, you stop incurring charges for it. Deleting a load balancer does not affect the backend servers or subnets used by the load balancer.
To delete your load balancer:
Open the navigation menu, click Networking, and then click Load Balancers.
Choose the Compartment that contains your load balancer.
Next to your load balancer, click the Actions menu, and then click Terminate.
Confirm when prompted.
If you want to delete the instances and VCN you created for this tutorial, follow the instructions in .
Source Network Address Translation (SNAT) load balancing method — layer 7
If your application requires the load balancer to handle cookie insertion then you need to use the SNAT configuration. This also has the advantages of a one-arm configuration and does not require any changes to the application servers. However, since the load balancer is acting as a full proxy it doesn’t have the same raw throughput as the routing-based methods.
The network diagram for the Layer 7 HAProxy SNAT mode is very similar to the Direct Routing example except that no re-configuration of the real servers is required. The load balancer proxies the application traffic to the servers so that the source of all traffic becomes the load balancer.
- As with other modes, a single unit does not require a Floating IP.
- SNAT is a full proxy and therefore load balanced servers do not need to be changed in any way.
Because SNAT is a full proxy, any server in the cluster can be on any accessible subnet, including across the Internet or WAN. SNAT is not transparent by default, so the real servers will see the source address of each request as the load balancer's IP address. The client's source IP address will be in the X-Forwarded-For header (see the TPROXY method).
NB. Rather than messing around with TPROXY, did you know you can load balance based on the X-Forwarded-For header? (Pretty cool, eh?)
You can install NLB by using either Server Manager or the Windows PowerShell commands for NLB.
Optionally you can install the Network Load Balancing Tools to manage a local or remote NLB cluster. The tools include Network Load Balancing Manager and the NLB Windows PowerShell commands.
Installation with Server Manager
In Server Manager, you can use the Add Roles and Features Wizard to add the Network Load Balancing feature. When you complete the wizard, NLB is installed, and you do not need to restart the computer.
Installation with Windows PowerShell
To install NLB by using Windows PowerShell, run the following command at an elevated Windows PowerShell prompt on the computer where you want to install NLB.
After installation is complete, no restart of the computer is required.
For more information, see Install-WindowsFeature.
Network Load Balancing Manager
To open Network Load Balancing Manager in Server Manager, click Tools, and then click Network Load Balancing Manager.
Load balancing and networks
As with other load balancers, when a network load balancer receives a connection request, it chooses a target to make the connection. Some types of connections, such as when browsers connect to websites, require separate sessions for text, images, video, and other types of content on the webpage. Load balancing handles these concurrent sessions to avoid any performance and availability issues.
Network load balancing also provides network redundancy and failover. If a WAN link suffers an outage, redundancy makes it possible to still access network resources through a secondary link. The servers connected to a load balancer are not necessarily all in the same location, but if properly architected, this has no bearing on the load balancer’s ability to do its work.
Network load balancing delivers business continuity and failover, ensuring that no matter how geographically dispersed your organization’s workforce is, you will be able to maintain acceptable levels of availability and performance.
For an overview of networks, see "Networking: A Complete Guide."
IBM Cloud Internet Services provides more information about global load balancing.
SSL Termination or Acceleration (SSL) with or without TPROXY
All of the layer 4 and Layer 7 load balancing methods can handle SSL traffic in pass through mode, where the backend servers do the decryption and encryption of the traffic. This is very scalable as you can just add more servers to the cluster to gain higher Transactions Per Second (TPS). However, if you want to inspect HTTPS traffic in order to read or insert cookies you will need to decode (terminate) the SSL traffic on the load balancer. You can do this by importing your secure key and signed certificate to the load balancer, giving it the authority to decrypt traffic. The load balancer uses standard apache/PEM format certificates.
You can define a Pound/Stunnel SSL virtual server with a single backend — either a Layer 4 NAT mode virtual server or more usually a Layer 7 HAProxy VIP — which can then insert cookies.
Pound/Stunnel-SSL is not transparent by default, so the backend will see the source address of each request as the load balancer's IP address. The client's source IP address will be in the X-Forwarded-For header. However, Pound/Stunnel-SSL can also be configured with TPROXY to ensure that the backend can see the source IP address of all traffic.