Understanding AWS Elastic Load Balancing for Reliable DevOps Solutions

Hi there! I’m a DevOps enthusiast, certified in AWS and Terraform, passionate about crafting innovative cloud solutions. From designing scalable CI/CD pipelines to deploying microservices on cloud platforms, I’ve immersed myself in transforming ideas into impactful technologies.

Introduction

It's Monday morning. Your single web server is humming along, serving 100 requests per second. Then TechCrunch publishes an article about your product.

9:00 AM: 500 requests/second
9:15 AM: 2,000 requests/second
9:30 AM: Server CPU at 100%
9:32 AM: Server crashes
9:33 AM: Website down, error 503

Or imagine this: You have three servers behind a basic load balancer that has no health checks. One server develops a memory leak and starts responding slowly. Requests keep landing on the slow server at random, users get a poor experience, and they complain about your "unreliable" service.

Both scenarios have the same root cause: lack of intelligent traffic distribution and health-aware routing.

This is where AWS Elastic Load Balancing (ELB) comes in - automatically distributing traffic across healthy instances, automatically scaling to handle millions of requests, and providing high availability without manual intervention.

This guide covers everything a DevOps engineer needs to know about AWS load balancing, from choosing the right type to production-ready configurations.

The Problem: Single Points of Failure and Manual Failover

Let's examine what happens without proper load balancing.

Challenge 1: Single Server Bottleneck

Scenario: E-commerce application on one EC2 instance

Normal Traffic (100 req/sec):

  • CPU: 20%

  • Memory: 40%

  • Response time: 50ms

Everything works fine

Black Friday (5,000 req/sec):

  • CPU: 100%

  • Memory: 95%

  • Response time: 5,000ms

  • Server crashes after 10 minutes

Result:

  • Lost sales during peak shopping time

  • Customers frustrated

  • Reputation damage

Why one server fails:

  • CPU bottleneck (can only process so many requests)

  • Memory exhaustion (too many concurrent connections)

  • Network bandwidth limit

  • No redundancy (server crashes = total outage)

Challenge 2: Manual Failover Complexity

Scenario: Three servers, one fails

  • Server 1 (10.0.1.10) - Healthy

  • Server 2 (10.0.1.11) - CRASHED

  • Server 3 (10.0.1.12) - Healthy

Without Load Balancer:

Users connecting to Server 2:

  • Connection timeout

  • Application error

  • Bad user experience

Manual Failover Process:

  1. Detect Server 2 is down (5-15 minutes)

  2. Update DNS / Route 53 (manual)

  3. Wait for DNS propagation (5-60 minutes)

  4. Users gradually shift to healthy servers

Total recovery time: 10-75 minutes (illustrative estimate: detection plus DNS propagation)
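
The 10-75 minute figure is simply the detection range plus the DNS propagation range. A minimal sketch of that arithmetic, using the illustrative ranges from the steps above (the constant and function names are my own, not part of any AWS tooling):

```python
# Illustrative ranges in minutes, taken from the manual failover steps above.
DETECTION = (5, 15)        # time to notice that Server 2 is down
DNS_PROPAGATION = (5, 60)  # time for cached DNS answers to expire


def recovery_window(*steps):
    """Sum (best, worst) ranges to get the end-to-end recovery window."""
    best = sum(low for low, _ in steps)
    worst = sum(high for _, high in steps)
    return best, worst


best, worst = recovery_window(DETECTION, DNS_PROPAGATION)
print(f"Recovery time: {best}-{worst} minutes")
```

The manual DNS update itself is assumed to take negligible time here; in practice it only adds to the window.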

Challenge 3: Uneven Traffic Distribution

Scenario: Manual round-robin DNS

DNS Round Robin:
example.com resolves to:
├── 10.0.1.10 (33% of clients)
├── 10.0.1.11 (33% of clients)
└── 10.0.1.12 (33% of clients)

Problems:
1. Client-side caching
   └── Users stuck on one IP (no redistribution)

2. No health awareness
   └── DNS returns crashed server IPs

3. No session affinity
   └── User session lost if switched servers

4. Can't handle varying instance sizes
   └── t3.micro gets same traffic as m5.xlarge
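
To make the "no health awareness" problem concrete, here is a small simulation of plain round-robin DNS. It assumes, purely for illustration, that 10.0.1.11 is the crashed server (as in Challenge 2); the function names are hypothetical.

```python
from itertools import cycle

# IP -> is the server healthy? (10.0.1.11 is assumed to be down)
SERVERS = {"10.0.1.10": True, "10.0.1.11": False, "10.0.1.12": True}


def dns_round_robin(servers, requests):
    """Hand out IPs in rotation, ignoring health, like round-robin DNS."""
    rotation = cycle(servers)
    return [next(rotation) for _ in range(requests)]


def failed_fraction(servers, requests):
    """Fraction of lookups that resolve to an unhealthy server."""
    answers = dns_round_robin(servers, requests)
    failures = sum(1 for ip in answers if not servers[ip])
    return failures / requests


print(f"Lookups sent to a dead server: {failed_fraction(SERVERS, 300):.0%}")
```

With one of three servers down, roughly a third of lookups still point at the dead IP, and client-side caching keeps those users stuck there.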

Challenge 4: SSL/TLS Termination Overhead

Without a load balancer:

Each server handles SSL:

  • CPU overhead for TLS handshake

  • Certificate management on all servers

  • Cipher suite configuration per server

  • Certificate renewal on all servers

With 10 servers:

  • 10× certificate installations

  • 10× configurations to maintain

  • 10× points of failure

Challenge 5: Zero-Downtime Deployment Difficulty

Rolling deployment without a load balancer:

Manual Process:

  1. Remove Server 1 from pool (manually)

  2. Deploy new version to Server 1

  3. Test Server 1

  4. Add Server 1 back to pool

  5. Repeat for Server 2, 3, 4...

Issues:

  • Manual coordination required

  • Risk of human error

  • Slow deployment process

  • A deployment that fails midway leaves the fleet in an inconsistent state

What is AWS Elastic Load Balancing?

AWS Elastic Load Balancing (ELB) automatically distributes incoming traffic across multiple targets (EC2 instances, containers, IP addresses, Lambda functions) in multiple Availability Zones.

The Value Proposition

Load Balancer Types

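AWS offers three load balancer types for application traffic: Application Load Balancer (ALB), Network Load Balancer (NLB), and the legacy Classic Load Balancer (CLB). A fourth type, Gateway Load Balancer, fronts third-party virtual appliances and is not covered in this guide.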

When to use each:

Application Load Balancer (ALB):

  • Modern web applications (HTTP/HTTPS)

  • Microservices architectures

  • Container-based applications

  • Lambda functions

  • Content-based routing needed

  • Recommendation: Use this by default

Network Load Balancer (NLB):

  • Extreme performance required (millions of requests/sec)

  • Ultra-low latency needed (microseconds)

  • Static IP addresses required

  • TCP/UDP protocols

  • PrivateLink endpoints

  • Use only when ALB doesn't meet requirements

Classic Load Balancer (CLB):

  • Legacy applications (pre-2016)

  • Don't use for new applications

ELB Architecture

Health Checks:

  • Every 30 seconds (configurable)

  • Unhealthy targets are removed automatically

  • No traffic sent to unhealthy instances

Cross-Zone Load Balancing:

  • Distributes evenly across ALL instances

  • Regardless of AZ

Understanding ELB Core Concepts

1. Target Groups

Target group = Collection of targets (instances, IPs, Lambda) that receive traffic

Target Group: web-servers
├── Protocol: HTTP
├── Port: 80
├── Health check: /health
├── Targets:
│   ├── i-1234567890 (healthy)
│   ├── i-abcdef1234 (healthy)
│   └── i-xyz9876543 (unhealthy - draining)
└── Load balancing algorithm: Round robin

Target types:

Instance targets:
└── EC2 instance IDs
    Example: i-1234567890abcdef0

IP targets:
└── IP addresses (on-prem, VPC, peered VPC)
    Example: 10.0.1.50, 192.168.1.100

Lambda targets (ALB only):
└── Lambda function ARN
    Example: arn:aws:lambda:us-east-1:123456789012:function:my-function

2. Listeners and Rules

Listener = Process that checks for connection requests on a configured protocol and port

ALB Listener Configuration:

Listener: Port 443 (HTTPS)
├── SSL Certificate: ACM certificate
├── Security Policy: ELBSecurityPolicy-TLS13-1-2-2021-06
└── Default Action: Forward to default-tg

Rules (evaluated in priority order):
│
├── Rule 1 (Priority 1):
│   ├── Condition: Path is /api/*
│   └── Action: Forward to api-tg
│
├── Rule 2 (Priority 2):
│   ├── Condition: Host is admin.example.com
│   └── Action: Forward to admin-tg
│
├── Rule 3 (Priority 3):
│   ├── Condition: Header X-Custom-Header equals "special"
│   └── Action: Forward to special-tg
│
└── Default Rule:
    └── Action: Forward to default-tg
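
The "first matching rule wins" behaviour can be sketched in a few lines of Python. This is a simplified model of the rules above, not the ALB's actual implementation: it uses shell-style wildcards from the standard library, treats header names as case-sensitive, and the `route` function and `RULES` table are names I made up for the example.

```python
from fnmatch import fnmatchcase

# (priority, field, pattern, target group), mirroring the rules above.
RULES = [
    (1, "path", "/api/*", "api-tg"),
    (2, "host", "admin.example.com", "admin-tg"),
    (3, "header:X-Custom-Header", "special", "special-tg"),
]
DEFAULT_TARGET_GROUP = "default-tg"


def route(path, host, headers=None):
    """Return the target group of the lowest-priority-number rule that matches."""
    headers = headers or {}
    for _, field, pattern, target_group in sorted(RULES):
        if field == "path":
            value = path
        elif field == "host":
            value = host.lower()  # host names are compared case-insensitively
        else:
            value = headers.get(field.split(":", 1)[1], "")
        if fnmatchcase(value, pattern):
            return target_group
    return DEFAULT_TARGET_GROUP


print(route("/api/users", "admin.example.com"))  # both rules 1 and 2 match
```

Because rule 1 has the lower priority number, a request for admin.example.com/api/users goes to api-tg, not admin-tg. Rule ordering is worth reviewing whenever conditions can overlap.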

Routing conditions (ALB):

Path-based:
├── /api/* → API servers
├── /images/* → Static content servers
└── /* → Web servers

Host-based:
├── api.example.com → API servers
├── www.example.com → Web servers
└── admin.example.com → Admin servers

Header-based:
├── User-Agent contains "mobile" → Mobile servers
└── X-Custom-Header equals "value" → Special servers

Query string:
├── ?version=v2 → New version servers
└── ?version=v1 → Old version servers

Source IP:
├── 10.0.0.0/8 → Internal servers
└── Other → Public servers

3. Health Checks

Health checks = Periodic tests to determine target health

Health Check Configuration:

Protocol: HTTP
Path: /health
Port: traffic port (or custom)
Interval: 30 seconds
Timeout: 5 seconds
Healthy threshold: 2 consecutive successes
Unhealthy threshold: 3 consecutive failures
Success codes: 200
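
The two thresholds mean a target only changes state after a run of consecutive results. A minimal sketch of that state machine, assuming the values listed above (the `TargetHealth` class is a hypothetical model, not an AWS API):

```python
class TargetHealth:
    """Tracks one target's state from a stream of health check results."""

    def __init__(self, healthy_threshold=2, unhealthy_threshold=3, healthy=True):
        self.healthy_threshold = healthy_threshold
        self.unhealthy_threshold = unhealthy_threshold
        self.healthy = healthy
        self._streak = 0  # consecutive results that contradict the current state

    def record(self, success):
        """Feed in one check result and return the resulting state."""
        if success == self.healthy:
            self._streak = 0  # result agrees with current state: reset the run
            return self.healthy
        self._streak += 1
        needed = self.healthy_threshold if success else self.unhealthy_threshold
        if self._streak >= needed:
            self.healthy = success
            self._streak = 0
        return self.healthy


target = TargetHealth()
print([target.record(False) for _ in range(3)])  # flips on the third failure
```

A single failed check between successes resets the count, which is what keeps one slow response from pulling a target out of rotation.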

4. Connection Draining / Deregistration Delay

Deregistration delay = Time for in-flight requests to complete before target removal

Target Deregistration Process:

14:00:00 - Target marked for deregistration
         ├── New connections: Not accepted
         └── Existing connections: Allowed to complete

14:00:00 - 14:05:00 (300 seconds default delay)
         ├── Active requests: Continue processing
         ├── ELB waits for completion
         └── Graceful shutdown

14:05:00 - Target fully deregistered
         └── Any remaining connections forcibly closed

Configurable: 0-3600 seconds (default: 300)

When this matters:

  • Long-running requests (uploads, reports)

  • WebSocket connections

  • Prevent "connection reset" errors during deployments
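
A rough way to reason about the delay: any request that needs longer than the deregistration delay to finish gets cut off. The sketch below models that with a hypothetical `drain` helper; the request durations are made-up sample values.

```python
def drain(in_flight_seconds, deregistration_delay=300):
    """Split in-flight requests into those that finish and those cut off.

    in_flight_seconds: remaining duration of each request at the moment
    the target starts draining.
    """
    completed = [s for s in in_flight_seconds if s <= deregistration_delay]
    cut_off = [s for s in in_flight_seconds if s > deregistration_delay]
    return completed, cut_off


remaining = [2, 45, 400]  # e.g. an API call, a report, a large upload
print(drain(remaining, deregistration_delay=30))
print(drain(remaining))  # default 300 seconds
```

Picking the delay is a trade-off: longer values protect slow requests but make every deployment and scale-in wait that much longer.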

5. Sticky Sessions (Session Affinity)

Sticky sessions = Route user to the same target for the duration of the session

Without Sticky Sessions:
User Request 1 → Server A (session created)
User Request 2 → Server B (session lost, login again)
User Request 3 → Server C (session lost, login again)

With Sticky Sessions:
User Request 1 → Server A (session created)
User Request 2 → Server A (session maintained)
User Request 3 → Server A (session maintained)

Cookie types:
├── Application-based (custom cookie name)
└── Duration-based (1 second - 7 days)

Why to avoid sticky sessions where you can:

  • Shared session storage (Redis, DynamoDB) lets any target serve any request

  • Sticky sessions reduce fault tolerance and can concentrate load on a few targets

  • If Server A fails, the user loses the session anyway
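
The last point is easy to see in a toy model. The sketch below pins a client to a target through a cookie and falls back to another healthy target when the pinned one fails. `StickyRouter` is a hypothetical illustration; cookie expiry and the real ALB cookie format are not modelled.

```python
import itertools


class StickyRouter:
    """Round-robin router that honours a stickiness cookie when it can."""

    def __init__(self, targets):
        self.healthy = {t: True for t in targets}
        self._rotation = itertools.cycle(targets)

    def _next_healthy(self):
        for _ in range(len(self.healthy)):
            target = next(self._rotation)
            if self.healthy[target]:
                return target
        raise RuntimeError("no healthy targets")

    def route(self, cookie=None):
        """Return (target, cookie); the cookie simply names the pinned target."""
        if cookie is not None and self.healthy.get(cookie):
            return cookie, cookie
        target = self._next_healthy()
        return target, target


router = StickyRouter(["A", "B", "C"])
target, cookie = router.route()          # first request gets pinned
router.healthy[target] = False           # the pinned server fails
print(router.route(cookie))              # re-pinned elsewhere, session state gone
```

Stickiness keeps the user on one server only while that server is healthy, so any state held in its memory is still at risk.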

6. Cross-Zone Load Balancing

Cross-zone = Distribute traffic evenly across ALL targets, regardless of AZ

Scenario: 6 instances across 2 AZs

AZ-A: 2 instances
AZ-B: 4 instances

Without Cross-Zone:
├── AZ-A gets 50% of traffic
│   └── Each instance: 25%
├── AZ-B gets 50% of traffic
│   └── Each instance: 12.5%
└── Uneven distribution!

With Cross-Zone:
├── All 6 instances treated equally
└── Each instance: ~16.7% of traffic (1/6)
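
The percentages above follow from one rule: without cross-zone, traffic is split evenly per AZ first; with cross-zone, it is split evenly per target. A small sketch, assuming every AZ has at least one instance (the function name is my own):

```python
def per_instance_share(instances_per_az, cross_zone):
    """Return {az: fraction of total traffic each instance in that AZ gets}."""
    if cross_zone:
        total = sum(instances_per_az.values())
        return {az: 1 / total for az in instances_per_az}
    az_share = 1 / len(instances_per_az)  # each AZ gets an equal slice first
    return {az: az_share / count for az, count in instances_per_az.items()}


fleet = {"AZ-A": 2, "AZ-B": 4}
print(per_instance_share(fleet, cross_zone=False))
print(per_instance_share(fleet, cross_zone=True))
```

The imbalance grows with the difference in instance counts, which is why it matters most during scaling events or partial AZ failures.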

Pricing note:

  • ALB: Cross-zone enabled by default, no charge

  • NLB: Optional, charges for cross-zone data transfer

Top 3 Best Practices for DevOps

Best Practice 1: Use Application Load Balancer (ALB) for Modern Applications

Why ALB over NLB or CLB:

ALB Advantages for Web Applications:

1. Layer 7 Routing:
   ├── Path-based (/api/* vs /web/*)
   ├── Host-based (api.example.com vs www.example.com)
   ├── Header-based (A/B testing)
   └── Query parameter-based (versioning)

2. Native AWS Integration:
   ├── ECS/Fargate targets
   ├── Lambda function targets
   ├── Cognito authentication
   └── WAF (Web Application Firewall)

3. Advanced Features:
   ├── HTTP/2 support
   ├── WebSocket support
   ├── Request tracing (X-Amzn-Trace-Id)
   ├── Redirects (HTTP → HTTPS)
   └── Fixed responses (maintenance pages)

4. Cost-Effective:
   └── One ALB can route to multiple services
       (vs. one load balancer per service)

Microservices routing pattern:

Single ALB for entire application:

example.com/api/users → User Service (Port 8001)
example.com/api/orders → Order Service (Port 8002)
example.com/api/inventory → Inventory Service (Port 8003)
example.com/* → Frontend (Port 80)

Benefits:
• One load balancer (roughly $16/month base charge, before usage-based LCU costs) instead of four (roughly $64/month)
• Single SSL certificate
• Centralized access logs
• Simplified DNS (one domain)

Best Practice 2: Configure Comprehensive Health Checks

The problem:

Shallow health check:

  • Path: /

  • Response: 200 OK

Issues:

  • Web server running, but the database is down

  • Application running, but out of memory

  • Server responding, but requests failing

The solution: Deep health checks

Health check configuration:

# ALB Target Group with proper health checks
TargetGroup:
  Type: AWS::ElasticLoadBalancingV2::TargetGroup
  Properties:
    Name: web-tg
    VpcId: !Ref VPC
    Protocol: HTTP
    Port: 80

    HealthCheckEnabled: true
    HealthCheckProtocol: HTTP
    HealthCheckPath: /health
    HealthCheckIntervalSeconds: 30
    HealthCheckTimeoutSeconds: 5
    HealthyThresholdCount: 2
    UnhealthyThresholdCount: 3
    Matcher:
      HttpCode: '200'

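    TargetType: instance
    # Deregistration delay is set through a target group attribute
    TargetGroupAttributes:
      - Key: deregistration_delay.timeout_seconds
        Value: '30'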
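
The target group above only probes /health; what makes the check "deep" is the handler behind that path. Here is a framework-agnostic sketch of such a handler. The `health_response` function and the dependency names are assumptions for illustration; wire the returned status and body into whatever web framework you use.

```python
import json


def health_response(checks):
    """Run each dependency check and build an HTTP status code and JSON body.

    checks: mapping of name -> zero-argument callable that returns True
    when the dependency is reachable.
    """
    results = {}
    for name, check in checks.items():
        try:
            results[name] = bool(check())
        except Exception:
            results[name] = False  # a check that raises counts as a failure
    status = 200 if all(results.values()) else 503
    body = json.dumps({"status": "ok" if status == 200 else "degraded",
                       "checks": results})
    return status, body


# Hypothetical checks; real ones would ping the database, cache, and so on.
print(health_response({"database": lambda: True, "cache": lambda: False}))
```

One caution: if every target depends on the same database, a deep check will mark all targets unhealthy at once when that database fails, so keep the checks fast and limited to dependencies the target truly cannot serve without.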

Health check strategy:

Development/Staging:
├── Interval: 30 seconds (default)
├── Timeout: 5 seconds
├── Healthy: 2 checks
└── Unhealthy: 3 checks

Production (more aggressive):
├── Interval: 10 seconds (faster detection)
├── Timeout: 3 seconds
├── Healthy: 2 checks
└── Unhealthy: 2 checks (faster removal)

Critical Production:
├── Interval: 5 seconds
├── Timeout: 2 seconds
├── Healthy: 2 checks
└── Unhealthy: 2 checks
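
What these profiles really trade off is detection time, which is roughly the interval multiplied by the unhealthy threshold. A quick sketch of that calculation (it ignores the timeout and where in the interval the failure happens, so treat the numbers as approximations):

```python
PROFILES = {
    "dev/staging": {"interval": 30, "unhealthy_threshold": 3},
    "production": {"interval": 10, "unhealthy_threshold": 2},
    "critical production": {"interval": 5, "unhealthy_threshold": 2},
}


def detection_seconds(interval, unhealthy_threshold):
    """Approximate time until a failed target is taken out of rotation."""
    return interval * unhealthy_threshold


for name, profile in PROFILES.items():
    print(f"{name}: ~{detection_seconds(**profile)}s to detect a failure")
```

Faster detection costs more health check requests per target and raises the chance of removing a target over a brief hiccup.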

Best Practice 3: Enable Access Logs and Monitor Metrics

Why logging matters:

Without access logs:
• Can't debug user-reported issues
• No visibility into traffic patterns
• Can't identify attack patterns
• No performance analysis data

With access logs:
• Full request history
• Troubleshoot specific user issues
• Identify slow endpoints
• Detect DDoS patterns
• Analyze traffic sources

Enable access logs:

ApplicationLoadBalancer:
  Type: AWS::ElasticLoadBalancingV2::LoadBalancer
  Properties:
    LoadBalancerAttributes:
      - Key: access_logs.s3.enabled
        Value: 'true'
      - Key: access_logs.s3.bucket
        Value: !Ref ALBLogsBucket
      - Key: access_logs.s3.prefix
        Value: 'production-alb'
      - Key: deletion_protection.enabled
        Value: 'true'

CloudWatch metrics to monitor:

Key ALB Metrics:

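Request Count (RequestCount):
└── Total requests processed

Target Response Time (TargetResponseTime):
└── p50, p90, p99 latency

HTTP 4XX/5XX Errors (HTTPCode_Target_4XX_Count, HTTPCode_Target_5XX_Count):
└── Client errors vs server errors

Healthy/Unhealthy Host Count (HealthyHostCount, UnHealthyHostCount):
└── Track target health

Active Connection Count (ActiveConnectionCount):
└── Concurrent connections

New Connection Count (NewConnectionCount):
└── New connections established during the period

Rejected Connection Count (RejectedConnectionCount):
└── Indicates saturation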

CloudWatch alarms:

HighErrorRateAlarm:
  Type: AWS::CloudWatch::Alarm
  Properties:
    AlarmName: alb-high-error-rate
    MetricName: HTTPCode_Target_5XX_Count
    Namespace: AWS/ApplicationELB
    Statistic: Sum
    Period: 300
    EvaluationPeriods: 2
    Threshold: 100
    ComparisonOperator: GreaterThanThreshold
    Dimensions:
      - Name: LoadBalancer
        Value: !GetAtt ApplicationLoadBalancer.LoadBalancerFullName
    AlarmActions:
      - !Ref AlertTopic
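
In plain terms, this alarm fires when the 5-minute sum of target 5XX responses is above 100 for two periods in a row. A simplified sketch of that evaluation logic (it leaves out CloudWatch details such as missing-data handling; `alarm_state` is a hypothetical helper):

```python
def alarm_state(period_sums, threshold=100, evaluation_periods=2):
    """Return the alarm state for a list of per-period sums, oldest first."""
    recent = period_sums[-evaluation_periods:]
    if len(recent) < evaluation_periods:
        return "INSUFFICIENT_DATA"
    # GreaterThanThreshold is a strict comparison: exactly 100 does not breach.
    return "ALARM" if all(value > threshold for value in recent) else "OK"


print(alarm_state([20, 150, 180]))  # two breaching periods in a row
print(alarm_state([150, 80]))       # one spike, then recovery
```

Requiring two consecutive periods means the alarm takes about ten minutes to fire; lower the period or the evaluation count if you need earlier warning and can tolerate more noise.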

Top 3 DevOps Use Cases

Use Case 1: High-Availability Web Application

The scenario:

Deploy a web application across multiple availability zones with automatic failover and health-based routing.

Architecture:

Failure scenarios:

Scenario 1: Single instance failure
├── Instance 2 becomes unhealthy
├── ALB detects via health checks (about 90 seconds: 3 failed checks × 30-second interval)
├── ALB stops routing to Instance 2
├── Traffic redistributed to healthy instances
├── Auto Scaling launches replacement
└── Impact: Limited to requests sent to Instance 2 before it is marked unhealthy

Scenario 2: Entire AZ failure
├── AZ-A becomes unavailable
├── ALB routes all traffic to AZ-B and AZ-C
├── Auto Scaling launches instances in healthy AZs
└── Impact: Minimal (brief performance degradation)

Scenario 3: Traffic spike
├── Traffic increases 10x
├── Auto Scaling launches new instances
├── ALB automatically distributes to new instances
└── Impact: None (automatic scaling)

Use Case 2: Microservices Architecture with Path-Based Routing

The scenario:

Route different microservices through a single ALB based on URL paths, reducing costs and complexity.

Architecture:

Listener rules:

# User Service Rule
UserServiceRule:
  Type: AWS::ElasticLoadBalancingV2::ListenerRule
  Properties:
    ListenerArn: !Ref HTTPSListener
    Priority: 10
    Conditions:
      - Field: path-pattern
        Values: ['/api/users', '/api/users/*']
    Actions:
      - Type: forward
        TargetGroupArn: !Ref UserServiceTargetGroup

# Order Service Rule
OrderServiceRule:
  Type: AWS::ElasticLoadBalancingV2::ListenerRule
  Properties:
    ListenerArn: !Ref HTTPSListener
    Priority: 20
    Conditions:
      - Field: path-pattern
        Values: ['/api/orders', '/api/orders/*']
    Actions:
      - Type: forward
        TargetGroupArn: !Ref OrderServiceTargetGroup

# Inventory Service Rule
InventoryServiceRule:
  Type: AWS::ElasticLoadBalancingV2::ListenerRule
  Properties:
    ListenerArn: !Ref HTTPSListener
    Priority: 30
    Conditions:
      - Field: path-pattern
        Values: ['/api/inventory', '/api/inventory/*']
    Actions:
      - Type: forward
        TargetGroupArn: !Ref InventoryServiceTargetGroup

Benefits:

Cost Savings:
├── 1 ALB: ~$16/month (base charge, before usage-based LCU costs)
└── vs 3 ALBs: ~$48/month
    Savings: ~$32/month (67%)
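
The savings figure generalises to any number of services. A back-of-the-envelope sketch, assuming the roughly $16/month base charge per ALB used in this post and ignoring usage-based LCU charges (which you pay either way):

```python
def consolidation_savings(services, monthly_cost_per_alb=16):
    """Return (dollars saved per month, fraction saved) from sharing one ALB."""
    separate = services * monthly_cost_per_alb
    shared = monthly_cost_per_alb
    saved = separate - shared
    return saved, saved / separate


saved, fraction = consolidation_savings(3)
print(f"Savings: ${saved}/month ({fraction:.0%})")
```

The more services share the ALB, the larger the saving, up to the listener rule limits of a single load balancer.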

Operational Simplicity:
├── Single SSL certificate
├── One DNS record
├── Centralized access logs
└── One place to configure WAF

Service Independence:
├── Each service scales independently
├── Each service has own target group
├── Failure isolation (one service down ≠ all down)
└── Independent deployments

Use Case 3: Blue/Green Deployment with ALB

The scenario:

Deploy new application version without downtime using ALB target group switching.

Architecture:

Deployment Process:

  1. Deploy v2.0 to the Green target group

  2. Test Green (internal checks)

  3. Shift 10% traffic to Green

  4. Monitor metrics for 10 minutes

  5. If healthy: Shift 100% to Green

  6. If issues: Shift back to Blue (instant rollback)

Implementation with weighted target groups:

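# Listener with weighted target groups
HTTPSListener:
  Type: AWS::ElasticLoadBalancingV2::Listener
  Properties:
    LoadBalancerArn: !Ref ApplicationLoadBalancer
    Port: 443
    Protocol: HTTPS
    # HTTPS listeners require a certificate; CertificateArn is an assumed
    # template parameter holding an ACM certificate ARN
    Certificates:
      - CertificateArn: !Ref CertificateArn
    DefaultActions:
      - Type: forward
        ForwardConfig:
          TargetGroups:
            - TargetGroupArn: !Ref BlueTargetGroup
              Weight: 100  # all traffic stays on Blue until the canary step
            - TargetGroupArn: !Ref GreenTargetGroup
              Weight: 0    # raise to 10, then 100, as Green proves healthy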
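
To get a feel for what the weights do, here is a small simulation of weighted selection between the two target groups, stepping through the rollout described above. It is a sketch of the idea only: `pick_target_group` and `ROLLOUT_STEPS` are hypothetical names, and the ALB's own selection logic is not public.

```python
import random

# Weights for each stage of the deployment process above.
ROLLOUT_STEPS = [
    {"blue": 100, "green": 0},   # before the shift
    {"blue": 90, "green": 10},   # canary: 10% to Green
    {"blue": 0, "green": 100},   # full cutover
]


def pick_target_group(weights, rng=random):
    """Choose a target group with probability proportional to its weight."""
    groups = list(weights)
    return rng.choices(groups, weights=[weights[g] for g in groups], k=1)[0]


rng = random.Random(42)  # seeded so repeated runs give the same sample
for step in ROLLOUT_STEPS:
    picks = [pick_target_group(step, rng) for _ in range(10_000)]
    print(step, f"green share: {picks.count('green') / len(picks):.1%}")
```

Rollback is the same operation in reverse: set Blue back to 100 and Green to 0, and new requests stop reaching the new version.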

Common Pitfalls to Avoid

Pitfall 1: Shallow Health Checks

Problem: Health check on / doesn't verify database connectivity
Solution: Create /health endpoint that checks all dependencies

Pitfall 2: Not Enabling Access Logs

Problem: Can't troubleshoot user issues or analyze traffic
Solution: Enable S3 access logs from day one

Pitfall 3: Short Deregistration Delay

Problem: Long requests are terminated during deployments
Solution: Set appropriate delay (300s default, up to 3600s for long requests)

Pitfall 4: Using CLB for New Applications

Problem: Missing modern features (path routing, WebSocket, Lambda)
Solution: Use ALB for all new HTTP applications

Pitfall 5: Not Monitoring Target Health

Problem: Unaware when instances fail health checks
Solution: CloudWatch alarms on the UnHealthyHostCount metric

Pitfall 6: Single AZ Deployment

Problem: AZ failure = total outage
Solution: Deploy across a minimum of 3 AZs

Conclusion

AWS Elastic Load Balancing provides automatic traffic distribution, health-aware routing, and high availability without the operational overhead of self-managed load balancers.

Key takeaways:

  • Use ALB by default: Modern features, cost-effective, AWS-integrated

  • Comprehensive health checks: Verify all dependencies, not just the web server

  • Enable access logs: Essential for troubleshooting and analysis

  • Multi-AZ deployment: Minimum 3 AZs for HA

  • Monitor metrics: CloudWatch alarms for proactive detection

When to use ELB:

  • AWS-native applications

  • Need automatic failover

  • Want AWS-managed infrastructure

  • HTTP/HTTPS or TCP/UDP traffic

Load balancer selection:

  • ALB: The default for most use cases (web apps, microservices, containers)

  • NLB: Extreme performance, static IPs, PrivateLink

  • CLB: Don't use for new apps

If you run on Kubernetes:

On EKS, the AWS Load Balancer Controller provisions ALBs automatically from Kubernetes Ingress manifests. The underlying ALB works identically whether it is provisioned via Kubernetes, CloudFormation, or the AWS Console: the same health checks, routing rules, and high availability apply.

Master ELB, and you've mastered AWS load balancing and high availability.


Essential AWS Services For DevOps Engineer

Part 12 of 16

In this series, I share the top 15 essential AWS services that every DevOps engineer should know. I cover not only what these services are but also how and why they are used in production from a DevOps perspective.
