As a senior cloud engineer who has architected solutions across multiple cloud platforms, I’ve learned that choosing the right load balancer can make or break your application’s performance, scalability, and cost efficiency. This guide dives deep into the load balancing offerings from AWS, GCP, and Azure, explaining how they work internally and when to use each type.

Understanding Load Balancer Fundamentals

Before we dive into specific offerings, let’s understand what happens inside a load balancer. At its core, a load balancer is a reverse proxy that distributes incoming traffic across multiple backend targets based on various algorithms and health checks.
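The core selection loop can be sketched in a few lines of Python: rotate through backends, skipping any that have failed their health checks. This is a toy illustration of the concept, not how any provider implements it, and the backend addresses are invented:

```python
class RoundRobinBalancer:
    """Toy reverse-proxy target selection: round-robin over healthy backends."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.healthy = set(backends)   # updated by health checks
        self._i = 0

    def mark_down(self, backend):
        self.healthy.discard(backend)

    def mark_up(self, backend):
        self.healthy.add(backend)

    def pick(self):
        """Return the next healthy backend in rotation."""
        if not self.healthy:
            raise RuntimeError("no healthy backends")
        while True:
            backend = self.backends[self._i % len(self.backends)]
            self._i += 1
            if backend in self.healthy:
                return backend

lb = RoundRobinBalancer(["10.0.1.10", "10.0.1.11", "10.0.1.12"])
lb.mark_down("10.0.1.11")              # failed its health check
print([lb.pick() for _ in range(4)])   # ['10.0.1.10', '10.0.1.12', '10.0.1.10', '10.0.1.12']
```

Real load balancers layer algorithms (least outstanding requests, weighted, hash-based) and active health probing on top of this basic idea.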

AWS Load Balancers

AWS offers four types of load balancers, each designed for specific use cases.

1. Application Load Balancer (ALB)

OSI Layer: Layer 7 (Application)

Internal Architecture: ALB operates at the HTTP/HTTPS level, parsing request headers, paths, and host information to make intelligent routing decisions. It maintains connection pooling to backend targets and handles SSL/TLS termination efficiently.

When to Use ALB:

  - HTTP/HTTPS applications that need path-, host-, or header-based routing
  - Microservices and container workloads (ECS/EKS) with dynamic port mapping
  - Serverless backends, since Lambda functions can be registered as targets
  - Applications that need built-in authentication (OIDC, Amazon Cognito)

Key Features:

  - Content-based routing on path, host, HTTP headers, query strings, and source IP
  - SSL/TLS termination with SNI support for multiple certificates
  - Native WebSocket, HTTP/2, and gRPC support
  - AWS WAF integration for internet-facing workloads

Pricing Model: Hourly rate + LCU (Load Balancer Capacity Units) based on new connections, active connections, processed bytes, and rule evaluations.
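To make the LCU model concrete, here is a sketch of the billing math in Python. The per-LCU allowances below (25 new connections/second, 3,000 active connections, 1 GB/hour processed, 1,000 rule evaluations/second) match AWS's published dimensions at the time of writing, but verify them against current pricing before relying on this:

```python
def alb_consumed_lcus(new_conns_per_sec, active_conns, gb_per_hour, rule_evals_per_sec):
    """ALB bills on whichever LCU dimension is highest in a given hour."""
    dimensions = {
        "new connections": new_conns_per_sec / 25,
        "active connections": active_conns / 3000,
        "processed bytes": gb_per_hour / 1.0,
        "rule evaluations": rule_evals_per_sec / 1000,
    }
    driver = max(dimensions, key=dimensions.get)
    return dimensions[driver], driver

lcus, driver = alb_consumed_lcus(new_conns_per_sec=100, active_conns=9000,
                                 gb_per_hour=2.0, rule_evals_per_sec=500)
print(f"{lcus:.1f} LCUs, driven by {driver}")  # 4.0 LCUs, driven by new connections
```

Knowing which dimension drives your bill (here, new connections) tells you where optimization effort pays off, e.g. client keep-alives to reduce connection churn.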

2. Network Load Balancer (NLB)

OSI Layer: Layer 4 (Transport)

Internal Architecture: NLB operates at the TCP/UDP level and uses a flow hash algorithm (source IP, source port, destination IP, destination port, protocol) to route connections. It preserves source IP addresses and can handle millions of requests per second with ultra-low latency.

When to Use NLB:

  - Extreme performance requirements: millions of requests per second with sub-millisecond latency
  - Non-HTTP TCP/UDP workloads such as gaming servers, IoT brokers, and DNS
  - Applications that need a static IP or Elastic IP per Availability Zone
  - Workloads that must see the client's real source IP

Key Features:

  - Static and Elastic IP support per Availability Zone
  - Source IP preservation without Proxy Protocol
  - TLS termination or TLS pass-through
  - Entry point for AWS PrivateLink endpoint services

Pricing Model: Hourly rate + NLCU (Network Load Balancer Capacity Units) based on processed bytes and connections.

3. Gateway Load Balancer (GWLB)

OSI Layer: Layer 3 (Network)

Internal Architecture: GWLB operates using GENEVE protocol encapsulation on port 6081. It transparently passes traffic to third-party virtual appliances for inspection before forwarding to destinations.

When to Use GWLB:

  - Inserting third-party firewalls, IDS/IPS, or deep packet inspection appliances inline
  - Centralizing security inspection for traffic from many VPCs

Key Features:

  - Transparent "bump-in-the-wire" insertion with no changes to source or destination packets
  - GENEVE encapsulation preserves the original traffic end to end
  - Flow stickiness and health checks across the appliance fleet
  - Scales appliances horizontally behind a single route-table entry

4. Classic Load Balancer (CLB)

Status: Legacy (not recommended for new applications)

When to Use CLB: Only for existing EC2-Classic applications. Migrate to ALB or NLB for new workloads.

Google Cloud Platform Load Balancers

GCP takes a different architectural approach with its global load balancing infrastructure, offering both global and regional load balancers.

1. Global External Application Load Balancer

OSI Layer: Layer 7

Internal Architecture: Built on Google’s global network infrastructure using Andromeda (Google’s SDN stack) and Maglev (consistent hashing load balancer). Traffic enters at Google’s edge locations closest to users and is routed through their private backbone.

When to Use:

  - Serving a global audience from a single anycast IP with automatic cross-region failover
  - Web applications that benefit from TLS termination at Google's edge, close to users

Key Features:

  - Single global anycast IP, no DNS-based routing required
  - Cloud CDN and Cloud Armor (WAF/DDoS) integration
  - Backends across multiple regions with automatic failover
  - Serverless NEG support for Cloud Run and Cloud Functions

2. Global External Network Load Balancer

OSI Layer: Layer 4

Internal Architecture: Uses Google’s Maglev for consistent hashing across backend pools. Supports both Premium Tier (global) and Standard Tier (regional) networking.

When to Use:

  - Global TCP/UDP workloads that need a single anycast IP without L7 processing
  - Non-HTTP protocols that still want Premium Tier routing over Google's backbone

3. Regional External Application Load Balancer

OSI Layer: Layer 7

When to Use:

  - Regional workloads with data-residency or compliance requirements
  - Standard Tier networking, where traffic should stay within one region

4. Internal Application Load Balancer

OSI Layer: Layer 7

Internal Architecture: Implemented on Google-managed Envoy proxies, so you get Envoy-class L7 features without operating or configuring the proxies yourself. Fully distributed with no single point of failure.

When to Use:

  - L7 routing between internal microservices inside a VPC
  - Internal APIs that need path-based routing, traffic splitting, or header rewrites

Key Features:

  - Managed Envoy-based data plane with advanced traffic management
  - Weighted traffic splitting for canary and blue/green deployments
  - Reachable only from within the VPC or connected networks

5. Internal Network Load Balancer

OSI Layer: Layer 4

When to Use:

  - Internal TCP/UDP services such as databases and legacy protocols
  - High-throughput east-west traffic that doesn't need L7 features

Azure Load Balancers

Azure offers load balancing solutions integrated with its global network fabric and supports both PaaS and IaaS workloads.

1. Azure Application Gateway

OSI Layer: Layer 7

Internal Architecture: Regional service that acts as an Application Delivery Controller (ADC) with integrated Web Application Firewall. Uses round-robin by default with session affinity options.

When to Use:

  - Web applications that need an integrated WAF in front of Azure workloads
  - Path-based and multi-site routing to VMs, VM scale sets, or AKS (via the AGIC ingress controller)

Key Features:

  - Integrated WAF with OWASP core rule sets
  - SSL termination and end-to-end TLS
  - URL path-based and multi-site routing with session affinity
  - Autoscaling and zone redundancy in the v2 SKUs

SKUs: Standard_v2 and WAF_v2 (v1 is being retired)

2. Azure Load Balancer

OSI Layer: Layer 4

Internal Architecture: Zone-redundant by default in regions that support availability zones. Uses 5-tuple hash (source IP, source port, destination IP, destination port, protocol) for distribution.

When to Use:

  - TCP/UDP load balancing for VMs and VM scale sets, including RDP/SSH access
  - Internal load balancing within a virtual network

Key Features:

  - Zone-redundant frontends with the Standard SKU
  - HA Ports for load balancing all ports simultaneously (useful for network virtual appliances)
  - Outbound SNAT rules for controlled egress
  - Health probes over TCP, HTTP, and HTTPS

SKUs: Basic (being retired) and Standard

3. Azure Front Door

OSI Layer: Layer 7 (Global)

Internal Architecture: Operates at Microsoft’s global edge network with over 185 edge locations. Uses anycast routing with split TCP and HTTP acceleration.

When to Use:

  - Global web applications that want CDN, WAF, and load balancing at Microsoft's edge
  - Accelerating dynamic content via split TCP and anycast routing

Key Features:

  - Global anycast entry point with split TCP acceleration
  - Integrated CDN caching and WAF (Premium tier adds managed rules and bot protection)
  - URL-based routing, session affinity, and automatic failover between origins

Tiers: Standard and Premium (includes enhanced security features)

4. Azure Traffic Manager

OSI Layer: DNS level (not a proxy)

Internal Architecture: DNS-based traffic routing service that responds to DNS queries with the IP of the optimal endpoint based on routing method.

When to Use:

  - DNS-level failover and routing across regions, clouds, or on-premises endpoints
  - Low-cost global distribution where a proxy in the data path isn't needed

Routing Methods:

  - Priority (active/passive failover)
  - Weighted (percentage-based distribution)
  - Geographic (route by user location)
  - Performance (route to the lowest-latency endpoint)

Important Limitation: Traffic Manager only provides DNS resolution, not proxy functionality. Clients connect directly to endpoints after DNS resolution.

Comparison Matrix

Layer 7 (Application) Load Balancers

| Feature | AWS ALB | GCP Global ALB | GCP Regional ALB | GCP Internal ALB | Azure App Gateway | Azure Front Door |
| --- | --- | --- | --- | --- | --- | --- |
| Scope | Regional | Global | Regional | Regional (Internal) | Regional | Global |
| SSL Termination | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Path Routing | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Host Routing | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Integrated WAF | ✓ (AWS WAF) | ✓ (Cloud Armor) | ✓ (Cloud Armor) | ✗ | ✓ (OWASP) | ✓ (Premium tier) |
| HTTP/2 & gRPC | ✓ | ✓ | ✓ | ✓ | HTTP/2 only | HTTP/2 only |
| WebSocket | ✓ | ✓ | ✓ | ✓ | ✓ | ✗ |
| Serverless Support | ✓ (Lambda) | ✓ (Cloud Run) | ✓ (Cloud Run) | ✓ (Cloud Run) | ✓ (Functions) | ✓ (Functions) |
| Built-in CDN | ✗ | ✓ (Cloud CDN) | ✗ | ✗ | ✗ | ✓ |
| Global Anycast IP | ✗ | ✓ | ✗ | ✗ | ✗ | ✓ |
| Typical Latency | 5-10ms | 5-15ms | 5-10ms | 5-10ms | 10-20ms | 10-30ms* |
| Best For | Microservices, APIs | Global web apps | Regional web apps | Internal services | Web + WAF | Global CDN + LB |

*Varies by user proximity to edge

Disclaimer: Latency figures are illustrative order-of-magnitude estimates observed in typical internet-facing deployments; actual latency varies significantly by region, traffic path, and backend proximity.

Layer 4 (Transport) Load Balancers

| Feature | AWS NLB | GCP Global NLB | GCP Internal NLB | Azure Load Balancer |
| --- | --- | --- | --- | --- |
| Scope | Regional | Global | Regional (Internal) | Regional |
| Protocol | TCP/UDP/TLS | TCP/UDP | TCP/UDP | TCP/UDP |
| Static IP | ✓ | ✓ (Anycast) | ✓ | ✓ |
| Preserve Source IP | ✓ | ✓ | ✓ | ✓ |
| TLS Termination | ✓ | ✗ | ✗ | ✗ |
| WebSocket-compatible (TCP pass-through) | ✓ | ✓ | ✓ | ✓ |
| Connection Handling | Millions/sec | Millions/sec | High throughput | High throughput |
| Typical Latency | <1ms | 1-2ms | 1-2ms | ~1–5ms (region and topology dependent) |
| Zone Redundant | ✓ (multi-AZ) | ✓ | ✓ | ✓ (Standard SKU) |
| Best For | High performance, gaming, IoT | Global TCP/UDP apps | Internal TCP/UDP | VM load balancing, RDP/SSH |

Disclaimer: Latency figures are illustrative order-of-magnitude estimates observed in typical internet-facing deployments; actual latency varies significantly by region, traffic path, and backend proximity.

NOTE: Azure Load Balancer – TLS termination requires Application Gateway or Front Door.

Specialized Load Balancers

| Feature | AWS GWLB | Azure Traffic Manager |
| --- | --- | --- |
| Type | Layer 3 Gateway | DNS-based routing |
| Primary Use | Security appliance insertion | Global DNS routing |
| Protocol | GENEVE encapsulation (UDP 6081) | DNS resolution |
| Routing Method | Flow hash to appliances | Priority, Weighted, Geographic, Performance |
| Preserves Traffic | ✓ (Transparent) | N/A (DNS only) |
| Integrated WAF | ✗ (Routes to 3rd party) | ✗ |
| Failover | Automatic (via health checks) | Automatic (via health checks) |
| Typical Latency | 1-3ms | DNS resolution only |
| Best For | Centralized security inspection, IDS/IPS | Low-cost global failover, hybrid cloud |

Disclaimer: Latency figures are illustrative order-of-magnitude estimates observed in typical internet-facing deployments; actual latency varies significantly by region, traffic path, and backend proximity.

Quick Decision Guide

Choose Layer 7 (ALB/App Gateway/Front Door) when:

  - Traffic is HTTP/HTTPS and you need path-, host-, or header-based routing
  - You want SSL termination, WAF protection, or serverless backends

Choose Layer 4 (NLB/Load Balancer) when:

  - Traffic is non-HTTP TCP/UDP, or latency must be as low as possible
  - You need static IPs or must preserve the client source IP

Choose Specialized (GWLB/Traffic Manager) when:

  - You need to insert security appliances inline in the traffic path (GWLB)
  - You need DNS-level global failover without a proxy in the path (Traffic Manager)

Provider Strengths Summary

AWS:

  - Broadest target-type support (EC2, ECS/EKS, Lambda, IP targets) and deep ecosystem integration
  - GWLB is unique among the three for transparent security appliance insertion

GCP:

  - True global anycast load balancing built on Maglev and Google's private backbone
  - A single global IP with cross-region failover, no DNS tricks required

Azure:

  - Front Door combines CDN, WAF, and global load balancing in one service
  - Traffic Manager offers inexpensive DNS-based routing for hybrid and multi-cloud setups

Performance Characteristics

As the comparison tables show, Layer 4 load balancers typically add sub-millisecond to low-millisecond latency because they forward packets without parsing payloads, while Layer 7 load balancers add roughly 5–30 ms depending on TLS termination, WAF inspection, and rule evaluation in the request path.

Decision Framework

Use Case: E-commerce Website

Requirements: Global presence, WAF protection, path-based routing, SSL termination

Recommended Solution:

  - AWS: CloudFront + ALB with AWS WAF
  - GCP: Global External Application Load Balancer with Cloud CDN and Cloud Armor
  - Azure: Front Door Premium (WAF included) in front of regional backends

Use Case: Real-time Gaming

Requirements: Ultra-low latency, UDP support, static IPs, preserve source IP

Recommended Solution:

  - AWS: Network Load Balancer (UDP support, static Elastic IPs, source IP preserved)
  - GCP: Global External Network Load Balancer on Premium Tier
  - Azure: Standard Load Balancer with zone redundancy

Use Case: Internal Microservices

Requirements: Service mesh, advanced routing, circuit breaking, internal only

Recommended Solution:

  - GCP: Internal Application Load Balancer (Envoy-based traffic management)
  - AWS: Internal ALB, adding a service mesh (e.g., App Mesh or Istio) for circuit breaking
  - Azure: Internal Application Gateway, or a service mesh on AKS

Use Case: Hybrid Cloud Application

Requirements: On-premises and cloud, DNS-based routing, health monitoring

Recommended Solution:

  - Azure Traffic Manager: DNS-based, endpoints can live in any cloud or on-premises, with built-in health monitoring
  - Alternatives: AWS Route 53 health-checked routing policies or GCP Cloud DNS routing policies

Cost Optimization Tips

  1. Right-size your load balancer: Don’t use Application Load Balancers for simple TCP traffic; Network Load Balancers are often cheaper and more performant.
  2. Minimize cross-region data transfer: Use regional load balancers when global distribution isn’t required to avoid cross-region charges.
  3. Leverage autoscaling: Configure proper scaling policies to avoid over-provisioning during low-traffic periods.
  4. Use connection pooling: For HTTP traffic, connection pooling reduces overhead and improves efficiency.
  5. Monitor your LCUs/NLCUs: In AWS, understanding what drives your capacity unit consumption helps optimize costs.
  6. Consider Traffic Manager for Azure: For simple DNS-based routing, Traffic Manager is significantly cheaper than Front Door or Application Gateway.

Internal Working: Connection Flow Deep Dive

Let’s examine what happens internally when a request hits a Layer 7 load balancer:

  1. The client completes a TCP handshake (and TLS handshake, if terminating) with the load balancer, not the backend.
  2. The load balancer parses the HTTP request: method, host, path, and headers.
  3. Listener rules are evaluated in priority order to select a target group.
  4. A healthy backend is chosen from that group by the configured algorithm (round-robin, least outstanding requests, etc.).
  5. The request is forwarded over a new or pooled backend connection, and the response streams back to the client.
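The rule-evaluation step can be sketched as code. The hostnames and target-group names below are invented for illustration; real ALB listener rules and GCP URL maps follow the same first-match-by-priority idea:

```python
# Rules are evaluated in priority order, like ALB listener rules or a GCP URL map.
RULES = [
    {"host": "api.example.com", "path_prefix": "/v1", "target_group": "api-v1"},
    {"host": "api.example.com", "path_prefix": "/",   "target_group": "api-default"},
    {"host": None,              "path_prefix": "/",   "target_group": "web"},  # catch-all
]

def route(host, path):
    """Return the target group for a request: the first matching rule wins."""
    for rule in RULES:
        if rule["host"] in (None, host) and path.startswith(rule["path_prefix"]):
            return rule["target_group"]
    return None  # no rule matched; a real LB would return a fixed response

print(route("api.example.com", "/v1/users"))    # api-v1
print(route("www.example.com", "/index.html"))  # web
```

Because rule order matters, the most specific prefixes must come first; putting the `/` rule above `/v1` would shadow the versioned API route.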

Monitoring and Observability

Each cloud provider offers comprehensive metrics:

AWS CloudWatch Metrics:

  - RequestCount, ActiveConnectionCount, ConsumedLCUs
  - TargetResponseTime and HTTPCode_Target_5XX_Count
  - HealthyHostCount / UnHealthyHostCount

GCP Cloud Monitoring:

  - Request count and backend latency per backend service
  - Backend health state and capacity utilization

Azure Monitor:

  - Throughput, total and failed requests, backend response time
  - Unhealthy host count and health probe status

Final Recommendations

  1. Start with managed services: All three providers offer robust managed load balancing. Don’t build your own unless you have very specific requirements.
  2. Understand the OSI layer: Match your load balancer to your traffic type. Layer 7 for HTTP/HTTPS, Layer 4 for other TCP/UDP.
  3. Plan for global scale: If you’re building for a global audience, GCP’s global load balancer or Azure Front Door provides the best out-of-the-box experience.
  4. Security first: Always enable WAF protection for internet-facing applications using AWS WAF, Cloud Armor, or Azure WAF.
  5. Monitor everything: Set up comprehensive monitoring and alerting on health check failures, latency increases, and capacity limits.
  6. Test failover: Regularly test your disaster recovery and failover scenarios to ensure load balancers route traffic correctly when backends fail.

The landscape of cloud load balancers continues to evolve with new features and capabilities. As a senior engineer, staying current with these offerings and understanding their internal architecture allows you to make informed decisions that balance performance, cost, and operational complexity.

Appendix

Flow Hash Algorithm

What is Flow Hash?

Flow hash is a deterministic algorithm used by Layer 4 load balancers to distribute connections across backend servers. The key principle is that packets belonging to the same connection (or “flow”) always route to the same backend server, ensuring session persistence without requiring the load balancer to maintain state.

The 5-Tuple Hash

The most common flow hash implementation uses a 5-tuple hash, which creates a unique identifier from five components of each packet:

  - Source IP address
  - Source port
  - Destination IP address
  - Destination port
  - Protocol (TCP or UDP)

How Flow Hash Works Step-by-Step

Step 1: Extract the 5-Tuple

When a packet arrives, the load balancer extracts:

  - Source IP (e.g., 192.168.1.100)
  - Source port (e.g., 54321)
  - Destination IP (e.g., 10.0.1.50)
  - Destination port (e.g., 443)
  - Protocol (e.g., TCP)

Step 2: Compute the Hash

These five values are concatenated and passed through a hash function:

Input:         192.168.1.100:54321 -> 10.0.1.50:443 (TCP)
Hash Function: CRC32 or similar
Output:        2746089617 (0xA3AE0091 in hex)

Step 3: Select Backend Using Modulo

If we have 4 backend servers, we use modulo operation:

Backend Index = Hash Value % Number of Backends 
Backend Index = 2746089617 % 4 = 1

Therefore, this connection routes to Backend Server 1.
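The three steps above can be reproduced in a few lines of Python. CRC32 stands in for whatever hash a given provider uses internally, so the backend index will differ from the worked example, but the key property holds: the same 5-tuple always maps to the same backend.

```python
import zlib

BACKENDS = ["backend-0", "backend-1", "backend-2", "backend-3"]

def flow_hash(src_ip, src_port, dst_ip, dst_port, proto="TCP"):
    """Steps 1-2: serialize the 5-tuple and hash it (CRC32 as a stand-in)."""
    key = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}/{proto}".encode()
    return zlib.crc32(key)

def pick_backend(src_ip, src_port, dst_ip, dst_port, proto="TCP"):
    """Step 3: modulo the hash into the backend list."""
    return BACKENDS[flow_hash(src_ip, src_port, dst_ip, dst_port, proto) % len(BACKENDS)]

# Every packet of the same flow lands on the same backend:
first = pick_backend("192.168.1.100", 54321, "10.0.1.50", 443)
again = pick_backend("192.168.1.100", 54321, "10.0.1.50", 443)
print(first == again)  # True

# A different source port is a different flow and may land elsewhere:
other = pick_backend("192.168.1.100", 54322, "10.0.1.50", 443)
```

Note that the load balancer stores nothing per flow: determinism of the hash alone provides the stickiness.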

Connection Flow with Hash Consistency

Because the hash is computed purely from packet header fields, every packet of a flow (including retransmissions) produces the same hash and therefore reaches the same backend, even when multiple load balancer nodes process the traffic independently.

Advantages of Flow Hash

  1. Stateless: Does not maintain application-layer session state; tracks flows at the transport layer for consistency.
  2. Fast: Hash computation is extremely fast (nanoseconds)
  3. Scalable: Can handle millions of flows without memory overhead
  4. Session Persistence: All packets in a flow go to the same backend
  5. Symmetric: Can be used for both directions if needed

Challenges and Solutions

Problem 1: Backend Changes

When backends are added or removed, many flows get remapped:
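A quick simulation shows the scale of the disruption when one of four backends disappears and flows are reassigned with plain modulo (illustrative Python, CRC32 as the hash):

```python
import zlib

# Hash 10,000 synthetic flow identifiers
flows = [zlib.crc32(f"flow-{i}".encode()) for i in range(10_000)]

before = [h % 4 for h in flows]   # 4 backends
after  = [h % 3 for h in flows]   # one backend removed

moved = sum(a != b for a, b in zip(before, after))
print(f"{moved / len(flows):.0%} of flows changed backend")  # typically about 75%
```

Going from `% 4` to `% 3` remaps roughly three quarters of all flows, breaking far more connections than the one-quarter that actually lived on the removed backend.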

Solution: Consistent Hashing

Google’s Maglev and other modern load balancers use consistent hashing to minimize flow disruption:

With consistent hashing, only flows near the removed/added backend are affected, not all flows.
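A minimal consistent-hash ring demonstrates the difference. This is a sketch with virtual nodes, not Maglev's actual table-building algorithm, and the firewall names are invented; removing one of four nodes remaps only the flows that node owned, roughly a quarter:

```python
import bisect
import hashlib

class HashRing:
    """Minimal consistent-hash ring with virtual nodes (illustrative, not Maglev)."""

    def __init__(self, nodes, vnodes=200):
        points = [(self._h(f"{node}#{v}"), node) for node in nodes for v in range(vnodes)]
        points.sort()
        self._keys = [p[0] for p in points]
        self._nodes = [p[1] for p in points]

    @staticmethod
    def _h(value):
        return int(hashlib.md5(value.encode()).hexdigest(), 16)

    def lookup(self, flow):
        """Walk clockwise to the first virtual node at or after the flow's hash."""
        i = bisect.bisect(self._keys, self._h(flow)) % len(self._keys)
        return self._nodes[i]

flows = [f"flow-{i}" for i in range(10_000)]
full    = HashRing(["fw-a", "fw-b", "fw-c", "fw-d"])
reduced = HashRing(["fw-a", "fw-b", "fw-c"])          # fw-d removed

moved = sum(full.lookup(f) != reduced.lookup(f) for f in flows)
print(f"{moved / len(flows):.0%} of flows changed backend")  # about 25%, not 75%
```

The surviving nodes' positions on the ring are unchanged, so only flows that hashed to fw-d's segments get reassigned.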

Problem 2: Uneven Distribution

Simple modulo can create uneven distribution with certain hash functions:

Solution: Use high-quality hash functions like MurmurHash3 or xxHash that provide uniform distribution.

Real-World Example: AWS NLB

AWS Network Load Balancer uses flow hash with these characteristics:

  - For TCP, the target is selected from the protocol, source IP/port, destination IP/port, and TCP sequence number
  - For UDP, selection uses the standard 5-tuple
  - Once established, a connection stays pinned to the same target for its lifetime, even as targets are added or removed

Variations of Flow Hash

3-Tuple Hash (Less common): hashes only the source IP, destination IP, and protocol, ignoring ports.

Use case: When you want all connections from a client to go to the same backend regardless of port.

2-Tuple Hash: hashes only the source IP and destination IP.

Use case: Geographic affinity or client-based routing.


GENEVE Protocol Encapsulation

What is GENEVE?

GENEVE (Generic Network Virtualization Encapsulation) is a tunneling protocol designed for network virtualization. It’s the successor to VXLAN and NVGRE, providing a flexible framework for encapsulating network packets.

GENEVE is used by AWS Gateway Load Balancer (GWLB) to transparently route traffic through security appliances like firewalls, IDS/IPS systems, and network packet brokers.

GENEVE Packet Structure

GENEVE Header Format

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Ver|  Opt Len  |O|C|    Rsvd.  |          Protocol Type        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|        Virtual Network Identifier (VNI)       |    Reserved   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    Variable Length Options                    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Field Breakdown:

  - Ver (2 bits): GENEVE version, currently 0
  - Opt Len (6 bits): length of the options field, in 4-byte multiples
  - O (1 bit): OAM flag, marking control traffic rather than user data
  - C (1 bit): critical options present; endpoints that cannot parse them must drop the packet
  - Protocol Type (16 bits): EtherType of the inner payload (0x6558 for Transparent Ethernet Bridging)
  - Virtual Network Identifier (24 bits): separates tenants or network segments
  - Variable Length Options: TLV-encoded metadata fields

How AWS GWLB Uses GENEVE

GWLB wraps each packet it receives in a GENEVE header and sends it over UDP port 6081 to a healthy appliance, attaching metadata options that identify the flow and the Gateway Load Balancer endpoint it arrived through. The appliance inspects the inner packet and returns the GENEVE packet unchanged, so GWLB can forward the original traffic onward with source and destination addresses intact.

GENEVE Encapsulation Example

Original Packet (HTTP Request):

Ethernet: [Src MAC: Client] [Dst MAC: Gateway] 
IP:       [Src: 203.0.113.50] [Dst: 10.0.2.50] 
TCP:      [Src Port: 54321] [Dst Port: 80] 
Data:     GET /index.html HTTP/1.1...

After GENEVE Encapsulation by GWLB:

Outer Ethernet: [Src MAC: GWLB ENI] [Dst MAC: Firewall ENI] 
Outer IP:       [Src: 10.0.0.50] [Dst: 10.0.1.100] 
Outer UDP:      [Src Port: Random] [Dst Port: 6081] 
GENEVE Header:  [VNI: 12345] [Protocol: Ethernet] 
Inner Ethernet: [Src MAC: Client] [Dst MAC: Gateway] 
Inner IP:       [Src: 203.0.113.50] [Dst: 10.0.2.50] 
Inner TCP:      [Src Port: 54321] [Dst Port: 80] 
Inner Data:     GET /index.html HTTP/1.1...
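The 8-byte base header from the diagram can be unpacked with Python's struct module. This is a sketch that follows the header layout shown above; the sample bytes encode the VNI 12345 from the example:

```python
import struct

def parse_geneve_base_header(data):
    """Parse the 8-byte GENEVE base header (layout as in the diagram above)."""
    if len(data) < 8:
        raise ValueError("GENEVE base header is 8 bytes")
    b0, b1, protocol_type = struct.unpack("!BBH", data[:4])
    return {
        "version": b0 >> 6,                    # Ver: top 2 bits
        "opt_len_words": b0 & 0x3F,            # options length, in 4-byte units
        "oam": bool(b1 & 0x80),                # O flag
        "critical": bool(b1 & 0x40),           # C flag
        "protocol_type": protocol_type,        # 0x6558 = inner Ethernet frame
        "vni": int.from_bytes(data[4:7], "big"),
    }

# Build a sample header: version 0, no options, Ethernet payload, VNI 12345
sample = bytes([0x00, 0x00]) + (0x6558).to_bytes(2, "big") + (12345).to_bytes(3, "big") + b"\x00"
hdr = parse_geneve_base_header(sample)
print(hdr["vni"], hex(hdr["protocol_type"]))  # 12345 0x6558
```

A real appliance would read `opt_len_words` to skip past the options and reach the inner Ethernet frame for inspection.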

Why GENEVE for Security Appliances?

  - The original packet is carried intact, so appliances see real source and destination addresses without NAT
  - Metadata options identify flows and endpoints without modifying the inner packet
  - Any L3 payload can be carried, not just TCP/UDP
  - Appliances scale horizontally behind the load balancer with no client-side changes

GENEVE Options and Metadata

GENEVE supports variable-length TLV options for carrying metadata. AWS GWLB, for example, attaches custom options carrying a flow cookie and identifiers for the Gateway Load Balancer endpoint the traffic traversed, which appliances can use for logging and flow tracking.

GENEVE vs VXLAN vs GRE

| Feature | GENEVE | VXLAN | GRE |
| --- | --- | --- | --- |
| Encapsulation Protocol | UDP (Port 6081) | UDP (Port 4789) | IP Protocol 47 |
| Header Overhead | 8 bytes + options | 8 bytes fixed | 4+ bytes |
| Flexibility | Variable options | Fixed header | Limited options |
| VNI/VSID Size | 24 bits | 24 bits | 32 bits (key) |
| Metadata Support | Extensive via options | Limited | Very limited |
| Industry Adoption | Growing (AWS, VMware) | Widespread | Legacy |

Security Appliance Configuration

For an appliance to work with GWLB, it must:

  1. Listen on UDP port 6081 for GENEVE traffic
  2. Decapsulate GENEVE headers to access inner packet
  3. Inspect the inner packet according to security policies
  4. Re-encapsulate the packet with GENEVE
  5. Send back to GWLB at the source IP of the GENEVE packet

Example Packet Flow in Appliance:

Receive on UDP 6081 -> decapsulate GENEVE -> inspect inner packet -> re-encapsulate with the same GENEVE header -> return to the GWLB source IP

Performance Considerations

Overhead Analysis:

Original Packet: 1500 bytes 
  ├─ Ethernet: 14 bytes 
  ├─ IP: 20 bytes 
  ├─ TCP: 20 bytes 
  └─ Data: 1446 bytes

GENEVE Encapsulated: 1558 bytes 
  ├─ Outer Ethernet: 14 bytes 
  ├─ Outer IP: 20 bytes 
  ├─ Outer UDP: 8 bytes 
  ├─ GENEVE: 16 bytes (8 base + 8 options) 
  └─ Inner Original Packet: 1500 bytes

Overhead: 58 bytes (3.9%)
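The same arithmetic in code, which also shows why MTU matters: a full-size 1500-byte packet grows past the standard Ethernet MTU once encapsulated.

```python
OUTER_HEADERS = {
    "outer_ethernet": 14,
    "outer_ip": 20,
    "outer_udp": 8,
    "geneve": 16,   # 8-byte base header + 8 bytes of options, as in the example
}

original = 1500
overhead = sum(OUTER_HEADERS.values())
encapsulated = original + overhead

print(overhead, encapsulated, f"{overhead / original:.1%}")  # 58 1558 3.9%
```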

Throughput Impact:

The ~58-byte overhead costs roughly 4% of bandwidth on full-size frames. More importantly, encapsulated packets exceed a standard 1500-byte MTU, so the path to the appliances must support a larger MTU (GWLB supports jumbo frames); otherwise packets are fragmented or dropped.

Debugging GENEVE

Capture GENEVE traffic with tcpdump:

# Capture GENEVE packets
tcpdump -i eth0 udp port 6081 -vvv -X

# Decode GENEVE with specific filter
tcpdump -i eth0 'udp port 6081' -vvv -XX

Sample GENEVE packet capture:

IP 10.0.0.50.54321 > 10.0.1.100.6081: UDP, length 1524
    0x0000:  4500 060c 1234 0000 4011 xxxx 0a00 0032  E....4..@......2
    0x0010:  0a00 0164 d431 17c1 05f8 xxxx 0000 6558  ...d.1........eX
    0x0020:  0030 3900 0000 0000 ... (GENEVE header)
    0x0030:  ... (Inner Ethernet frame)

Real-World Use Case: Centralized Egress Filtering

In this architecture:

  1. All VPCs route traffic to GWLB via Transit Gateway
  2. GWLB encapsulates with GENEVE and distributes to firewall pool using flow hash
  3. Firewalls inspect and return to GWLB
  4. GWLB decapsulates and routes to NAT Gateway/Internet
  5. Scaling firewalls doesn’t disrupt existing flows (consistent hashing)

Summary

Flow Hash Algorithm:

  - Deterministically maps each connection to a backend from packet header fields, giving session persistence without per-flow state
  - Consistent hashing variants (e.g., Maglev) minimize flow disruption when backends change

GENEVE Protocol:

  - UDP-based (port 6081) encapsulation that carries the original packet intact plus extensible metadata options
  - Lets AWS GWLB insert security appliances transparently, at roughly 4% bandwidth overhead

Both technologies work together in systems like AWS GWLB: flow hash ensures consistent routing to appliances, while GENEVE enables transparent inspection without breaking application layer contexts.


Have questions about specific load balancer configurations or migration strategies? Feel free to reach out or leave a comment below.