Understanding AWS Elastic Load Balancing for Reliable DevOps Solutions

Introduction
It's Monday morning. Your single web server is humming along, serving 100 requests per second. Then TechCrunch publishes an article about your product.
9:00 AM: 500 requests/second
9:15 AM: 2,000 requests/second
9:30 AM: Server CPU at 100%
9:32 AM: Server crashes
9:33 AM: Website down, error 503
Or imagine this: You have three servers behind a load balancer. One server develops a memory leak and starts responding slowly. Users randomly hit the slow server, get a poor experience, and complain about your "unreliable" service.
Both scenarios have the same root cause: lack of intelligent traffic distribution and health-aware routing.
This is where AWS Elastic Load Balancing (ELB) comes in: it distributes traffic across healthy instances, scales automatically to handle millions of requests, and provides high availability without manual intervention.
This guide covers everything a DevOps engineer needs to know about AWS load balancing, from choosing the right type to production-ready configurations.
The Problem: Single Points of Failure and Manual Failover
Let's examine what happens without proper load balancing.
Challenge 1: Single Server Bottleneck
Scenario: E-commerce application on one EC2 instance
Normal Traffic (100 req/sec):
CPU: 20%
Memory: 40%
Response time: 50ms
Everything works fine
Black Friday (5,000 req/sec):
CPU: 100%
Memory: 95%
Response time: 5,000ms
Server crashes after 10 minutes
Result:
Lost sales during peak shopping time
Customers frustrated
Reputation damage
Why one server fails:
CPU bottleneck (can only process so many requests)
Memory exhaustion (too many concurrent connections)
Network bandwidth limit
No redundancy (server crashes = total outage)
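To make the bottleneck concrete, here is a rough capacity estimate. It assumes CPU scales linearly with request rate (so 100 req/sec at 20% CPU implies about 500 req/sec at saturation), which real workloads rarely do exactly; the function name and the 30% headroom default are illustrative choices, not figures from the scenario.

```python
import math

def instances_needed(peak_rps: float, per_instance_rps: float, headroom: float = 0.3) -> int:
    """Instances required to serve peak_rps while keeping `headroom` spare capacity."""
    usable = per_instance_rps * (1.0 - headroom)
    return math.ceil(peak_rps / usable)

# Assumption: 100 req/sec at 20% CPU extrapolates linearly to ~500 req/sec per instance.
per_instance = 100 / 0.20

print(instances_needed(5000, per_instance, headroom=0.0))  # bare minimum, no spare capacity
print(instances_needed(5000, per_instance))                # with 30% headroom
```

Under these assumptions, Black Friday traffic needs at least ten such servers, which is only practical with something in front of them to spread the load.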
Challenge 2: Manual Failover Complexity
Scenario: Three servers, one fails
Server 1 (10.0.1.10) - Healthy
Server 2 (10.0.1.11) - CRASHED
Server 3 (10.0.1.12) - Healthy
Without Load Balancer:
Users connecting to Server 2:
Connection timeout
Application error
Bad user experience
Manual Failover Process:
Detect Server 2 is down (5-15 minutes)
Update DNS / Route53 (manual)
Wait for DNS propagation (5-60 minutes)
Users gradually shift to healthy servers
Total recovery time: 10-75 minutes (an estimate based on the ranges above)
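The recovery window is just the sum of the best and worst cases for each step. A small sketch of that arithmetic, using the step durations from the list above (the dictionary keys are labels chosen for illustration):

```python
def recovery_window(steps):
    """Sum per-step (min, max) durations in minutes into an overall (min, max) window."""
    fastest = sum(low for low, _ in steps.values())
    slowest = sum(high for _, high in steps.values())
    return fastest, slowest

manual_failover = {
    "detect failure": (5, 15),
    "DNS propagation": (5, 60),
}

print(recovery_window(manual_failover))  # (10, 75)
```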
Challenge 3: Uneven Traffic Distribution
Scenario: Manual round-robin DNS
DNS Round Robin:
example.com resolves to:
├── 10.0.1.10 (33% of clients)
├── 10.0.1.11 (33% of clients)
└── 10.0.1.12 (33% of clients)
Problems:
1. Client-side caching
└── Users stuck on one IP (no redistribution)
2. No health awareness
└── DNS returns crashed server IPs
3. No session affinity
└── User session lost if switched servers
4. Can't handle varying instance sizes
└── t3.micro gets same traffic as m5.xlarge
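The "no health awareness" problem is easy to see in a toy simulation. This is a simplified sketch, not how DNS resolvers or ELB are implemented: the server list matches the earlier example, and the `HEALTHY` map is a made-up stand-in for real health check results.

```python
from itertools import cycle

SERVERS = ["10.0.1.10", "10.0.1.11", "10.0.1.12"]
HEALTHY = {"10.0.1.10": True, "10.0.1.11": False, "10.0.1.12": True}

def dns_round_robin(n):
    """Plain rotation: every address is handed out, healthy or not."""
    rotation = cycle(SERVERS)
    return [next(rotation) for _ in range(n)]

def health_aware_round_robin(n):
    """Rotation restricted to targets that currently pass health checks."""
    rotation = cycle([s for s in SERVERS if HEALTHY[s]])
    return [next(rotation) for _ in range(n)]

failed_dns = sum(not HEALTHY[ip] for ip in dns_round_robin(9))
failed_lb = sum(not HEALTHY[ip] for ip in health_aware_round_robin(9))
print(failed_dns, failed_lb)  # 3 0
```

With plain rotation, one in three clients lands on the crashed server; filtering by health first sends nobody there.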
Challenge 4: SSL/TLS Termination Overhead
Without a load balancer:
Each server handles SSL:
CPU overhead for TLS handshake
Certificate management on all servers
Cipher suite configuration per server
Certificate renewal on all servers
With 10 servers:
10× certificate installations
10× configurations to maintain
10× points of failure
Challenge 5: Zero-Downtime Deployment Difficulty
Rolling deployment without a load balancer:
Manual Process:
Remove Server 1 from pool (manually)
Deploy new version to Server 1
Test Server 1
Add Server 1 back to pool
Repeat for Server 2, 3, 4...
Issues:
Manual coordination required
Risk of human error
Slow deployment process
If the deployment fails midway, the pool is left in an inconsistent state
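Written out as code, the manual process looks like the loop below. This is only a sketch of the steps listed above: the four callbacks (`remove`, `deploy`, `verify`, `add`) are hypothetical hooks you would have to implement and coordinate yourself when there is no load balancer to do it.

```python
from typing import Callable, List

def rolling_deploy(servers: List[str],
                   remove: Callable[[str], None],
                   deploy: Callable[[str], None],
                   verify: Callable[[str], bool],
                   add: Callable[[str], None]) -> List[str]:
    """Update servers one at a time; stop at the first failed verification.

    Returns the servers that were updated and returned to the pool.
    """
    updated = []
    for server in servers:
        remove(server)          # stop sending traffic to this server
        deploy(server)          # install the new version
        if not verify(server):  # smoke test before re-adding
            break               # the failed server stays out of the pool
        add(server)
        updated.append(server)
    return updated
```

If `verify` fails on the second server, the pool ends up with one server on the new version, one out of rotation, and the rest on the old version, which is exactly the inconsistent state described above.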
What is AWS Elastic Load Balancing?
AWS Elastic Load Balancing (ELB) automatically distributes incoming traffic across multiple targets (EC2 instances, containers, IP addresses, Lambda functions) in multiple Availability Zones.
The Value Proposition

Load Balancer Types
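This guide covers three types of AWS load balancers (AWS also offers a Gateway Load Balancer for third-party virtual appliances, which is out of scope here):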

When to use each:
Application Load Balancer (ALB):
Modern web applications (HTTP/HTTPS)
Microservices architectures
Container-based applications
Lambda functions
Content-based routing needed
Recommendation: Use this by default
Network Load Balancer (NLB):
Extreme performance required (millions of requests/sec)
Ultra-low latency needed (microseconds)
Static IP addresses required
TCP/UDP protocols
PrivateLink endpoints
Use only when ALB doesn't meet requirements
Classic Load Balancer (CLB):
Legacy applications (pre-2016)
Don't use for new applications
ELB Architecture

Health Checks:
Every 30 seconds (configurable)
Unhealthy targets are removed automatically
No traffic sent to unhealthy instances
Cross-Zone Load Balancing:
Distributes evenly across ALL instances
Regardless of AZ
Understanding ELB Core Concepts
1. Target Groups
Target group = Collection of targets (instances, IPs, Lambda) that receive traffic
Target Group: web-servers
├── Protocol: HTTP
├── Port: 80
├── Health check: /health
├── Targets:
│ ├── i-1234567890 (healthy)
│ ├── i-abcdef1234 (healthy)
│ └── i-xyz9876543 (unhealthy - draining)
└── Load balancing algorithm: Round robin
Target types:
Instance targets:
└── EC2 instance IDs
Example: i-1234567890abcdef0
IP targets:
└── IP addresses (on-prem, VPC, peered VPC)
Example: 10.0.1.50, 192.168.1.100
Lambda targets (ALB only):
└── Lambda function ARN
Example: arn:aws:lambda:us-east-1:123456789012:function:my-function
2. Listeners and Rules
Listener = Process that checks for connection requests
ALB Listener Configuration:
Listener: Port 443 (HTTPS)
├── SSL Certificate: ACM certificate
├── Security Policy: ELBSecurityPolicy-TLS13-1-2-2021-06
└── Default Action: Forward to default-tg
Rules (evaluated in priority order):
│
├── Rule 1 (Priority 1):
│ ├── Condition: Path is /api/*
│ └── Action: Forward to api-tg
│
├── Rule 2 (Priority 2):
│ ├── Condition: Host is admin.example.com
│ └── Action: Forward to admin-tg
│
├── Rule 3 (Priority 3):
│ ├── Condition: Header X-Custom-Header equals "special"
│ └── Action: Forward to special-tg
│
└── Default Rule:
└── Action: Forward to default-tg
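Priority-ordered evaluation can be sketched in a few lines. This is a simplified model of the rule set above, not ALB's actual matching engine: it uses Python's `fnmatchcase` as an approximation of path wildcards and compares hosts and header names exactly, whereas real matching is more lenient about case.

```python
from fnmatch import fnmatchcase

# (priority, condition, target group); conditions receive path, host, and headers.
RULES = [
    (1, lambda path, host, headers: fnmatchcase(path, "/api/*"), "api-tg"),
    (2, lambda path, host, headers: host == "admin.example.com", "admin-tg"),
    (3, lambda path, host, headers: headers.get("X-Custom-Header") == "special", "special-tg"),
]
DEFAULT_TARGET_GROUP = "default-tg"

def route(path, host, headers=None):
    """Return the target group of the lowest-numbered rule whose condition matches."""
    headers = headers or {}
    for _, condition, target_group in sorted(RULES, key=lambda rule: rule[0]):
        if condition(path, host, headers):
            return target_group
    return DEFAULT_TARGET_GROUP

print(route("/api/users", "admin.example.com"))  # api-tg: priority 1 wins over priority 2
print(route("/", "www.example.com"))             # default-tg: no rule matched
```

The first matching rule wins, so a request that satisfies several conditions goes wherever the lowest priority number sends it.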
Routing conditions (ALB):
Path-based:
├── /api/* → API servers
├── /images/* → Static content servers
└── /* → Web servers
Host-based:
├── api.example.com → API servers
├── www.example.com → Web servers
└── admin.example.com → Admin servers
Header-based:
├── User-Agent contains "mobile" → Mobile servers
└── X-Custom-Header equals "value" → Special servers
Query string:
├── ?version=v2 → New version servers
└── ?version=v1 → Old version servers
Source IP:
├── 10.0.0.0/8 → Internal servers
└── Other → Public servers
3. Health Checks
Health checks = Periodic tests to determine target health
Health Check Configuration:
Protocol: HTTP
Path: /health
Port: traffic port (or custom)
Interval: 30 seconds
Timeout: 5 seconds
Healthy threshold: 2 consecutive successes
Unhealthy threshold: 3 consecutive failures
Success codes: 200
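The two thresholds behave like a small state machine: a target only changes state after enough consecutive results contradict its current state. The class below is an illustrative sketch of that idea with the defaults from the configuration above; the class and method names are made up for this example.

```python
class TargetHealth:
    """Tracks one target's state from a stream of health check results."""

    def __init__(self, healthy_threshold=2, unhealthy_threshold=3, healthy=True):
        self.healthy_threshold = healthy_threshold
        self.unhealthy_threshold = unhealthy_threshold
        self.healthy = healthy
        self._streak = 0  # consecutive results that contradict the current state

    def record(self, check_passed: bool) -> bool:
        """Record one check result and return the resulting state."""
        if check_passed == self.healthy:
            self._streak = 0  # result agrees with the current state
            return self.healthy
        self._streak += 1
        needed = self.unhealthy_threshold if self.healthy else self.healthy_threshold
        if self._streak >= needed:
            self.healthy = check_passed
            self._streak = 0
        return self.healthy

target = TargetHealth()
print([target.record(ok) for ok in (False, False, False, True, True)])
# [True, True, False, False, True]
```

A single failed check does not remove a target, and a single success does not bring it back, which keeps one slow response from causing flapping.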
4. Connection Draining / Deregistration Delay
Deregistration delay = Time for in-flight requests to complete before target removal
Target Deregistration Process:
14:00:00 - Target marked for deregistration
├── New connections: Not accepted
└── Existing connections: Allowed to complete
14:00:00 - 14:05:00 (300 seconds default delay)
├── Active requests: Continue processing
├── ELB waits for completion
└── Graceful shutdown
14:05:00 - Target fully deregistered
└── Any remaining connections forcibly closed
Configurable: 0-3600 seconds (default: 300)
When this matters:
Long-running requests (uploads, reports)
WebSocket connections
Prevent "connection reset" errors during deployments
5. Sticky Sessions (Session Affinity)
Sticky sessions = Route user to the same target for the duration of the session
Without Sticky Sessions:
User Request 1 → Server A (session created)
User Request 2 → Server B (session lost, login again)
User Request 3 → Server C (session lost, login again)
With Sticky Sessions:
User Request 1 → Server A (session created)
User Request 2 → Server A (session maintained)
User Request 3 → Server A (session maintained)
Cookie types:
├── Application-based (custom cookie name)
└── Duration-based (1 second - 7 days)
Why to avoid sticky sessions where possible:
Shared session storage (Redis, DynamoDB) solves the same problem without pinning users
Sticky sessions reduce fault tolerance
If Server A fails, the user loses the session anyway
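Here is a minimal sketch of the shared-session alternative. The `SessionStore` class is an in-memory stand-in for Redis or DynamoDB, and `AppServer` is a hypothetical application server; the point is only that any server can serve any request once session state lives outside the server.

```python
import uuid

class SessionStore:
    """In-memory stand-in for a shared store such as Redis or DynamoDB."""

    def __init__(self):
        self._data = {}

    def put(self, session_id, value):
        self._data[session_id] = value

    def get(self, session_id):
        return self._data.get(session_id)

class AppServer:
    def __init__(self, name, store):
        self.name = name
        self.store = store  # every server talks to the same store

    def login(self, user):
        session_id = uuid.uuid4().hex
        self.store.put(session_id, {"user": user})
        return session_id

    def handle(self, session_id):
        session = self.store.get(session_id)
        if session is None:
            return f"{self.name}: please log in"
        return f"{self.name}: hello {session['user']}"

store = SessionStore()
server_a, server_b = AppServer("server-a", store), AppServer("server-b", store)
sid = server_a.login("alice")
print(server_b.handle(sid))  # server-b: hello alice
```

The session created on Server A is still valid on Server B, so the load balancer is free to route each request to whichever target is healthiest.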
6. Cross-Zone Load Balancing
Cross-zone = Distribute traffic evenly across ALL targets, regardless of AZ
Scenario: 6 instances across 2 AZs
AZ-A: 2 instances
AZ-B: 4 instances
Without Cross-Zone:
├── AZ-A gets 50% of traffic
│ └── Each instance: 25%
├── AZ-B gets 50% of traffic
│ └── Each instance: 12.5%
└── Uneven distribution!
With Cross-Zone:
├── All 6 instances treated equally
└── Each instance: ~16.7% of traffic (100% / 6)
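The percentages in this scenario follow from simple division. A short sketch, assuming traffic is split evenly between AZs when cross-zone is off (the function name is illustrative):

```python
def per_instance_share(instances_per_az, cross_zone):
    """Return {az: percent of total traffic each instance in that AZ receives}."""
    total_instances = sum(instances_per_az.values())
    shares = {}
    for az, count in instances_per_az.items():
        if cross_zone:
            # Every instance is treated equally, regardless of AZ.
            shares[az] = 100 / total_instances
        else:
            # Each AZ gets an equal slice, shared only among its own instances.
            shares[az] = 100 / len(instances_per_az) / count
    return shares

fleet = {"AZ-A": 2, "AZ-B": 4}
print(per_instance_share(fleet, cross_zone=False))  # {'AZ-A': 25.0, 'AZ-B': 12.5}
print(per_instance_share(fleet, cross_zone=True))
```

With cross-zone enabled, each of the six instances receives 100 / 6, roughly 16.7% of traffic.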
Pricing note:
ALB: Cross-zone enabled by default, no charge
NLB: Optional, charges for cross-zone data transfer
Top 3 Best Practices for DevOps
Best Practice 1: Use Application Load Balancer (ALB) for Modern Applications
Why ALB over NLB or CLB:
ALB Advantages for Web Applications:
1. Layer 7 Routing:
├── Path-based (/api/* vs /web/*)
├── Host-based (api.example.com vs www.example.com)
├── Header-based (A/B testing)
└── Query parameter-based (versioning)
2. Native AWS Integration:
├── ECS/Fargate targets
├── Lambda function targets
├── Cognito authentication
└── WAF (Web Application Firewall)
3. Advanced Features:
├── HTTP/2 support
├── WebSocket support
├── Request tracing (X-Amzn-Trace-Id)
├── Redirects (HTTP → HTTPS)
└── Fixed responses (maintenance pages)
4. Cost-Effective:
└── One ALB can route to multiple services
(vs. one load balancer per service)
Microservices routing pattern:
Single ALB for entire application:
example.com/api/users → User Service (Port 8001)
example.com/api/orders → Order Service (Port 8002)
example.com/api/inventory → Inventory Service (Port 8003)
example.com/* → Frontend (Port 80)
Benefits:
• One load balancer (roughly $16/month base charge) instead of four (roughly $64/month), before usage-based LCU charges
• Single SSL certificate
• Centralized access logs
• Simplified DNS (one domain)
Best Practice 2: Configure Comprehensive Health Checks
The problem:
Shallow health check:
Path: /
Response: 200 OK
Issues:
Web server running, but the database is down
Application running, but out of memory
Server responding, but requests failing
The solution: Deep health checks
Health check configuration:
# ALB target group with explicit health check settings
TargetGroup:
  Type: AWS::ElasticLoadBalancingV2::TargetGroup
  Properties:
    Name: web-tg
    VpcId: !Ref VPC
    Protocol: HTTP
    Port: 80
    TargetType: instance
    HealthCheckEnabled: true
    HealthCheckProtocol: HTTP
    HealthCheckPath: /health
    HealthCheckIntervalSeconds: 30
    HealthCheckTimeoutSeconds: 5
    HealthyThresholdCount: 2
    UnhealthyThresholdCount: 3
    Matcher:
      HttpCode: '200'
    # Deregistration delay is a target group attribute, not a top-level property
    TargetGroupAttributes:
      - Key: deregistration_delay.timeout_seconds
        Value: '30'
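On the application side, a deep health check means the /health endpoint actually exercises its dependencies. Below is one possible sketch using only the Python standard library. The `CHECKS` entries are hypothetical placeholders; in a real service they would run something like a `SELECT 1` against the database or a ping against the cache.

```python
import json
from http.server import BaseHTTPRequestHandler

def health_response(checks):
    """Run each dependency check; return (status_code, results) for /health."""
    results = {}
    for name, check in checks.items():
        try:
            results[name] = bool(check())
        except Exception:
            results[name] = False  # a check that raises counts as a failure
    status = 200 if all(results.values()) else 503
    return status, results

# Hypothetical checks; replace the lambdas with real dependency probes.
CHECKS = {"database": lambda: True, "cache": lambda: True}

class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/health":
            self.send_error(404)
            return
        status, results = health_response(CHECKS)
        body = json.dumps(results).encode()
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)
```

To try it locally, serve the handler with `http.server.HTTPServer(("", 8080), HealthHandler).serve_forever()` (the port is arbitrary). Because the target group's matcher only accepts 200, a 503 from any failed dependency marks the target unhealthy. Keep the checks cheap, since they run on every health check interval.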
Health check strategy:
Development/Staging:
├── Interval: 30 seconds (default)
├── Timeout: 5 seconds
├── Healthy: 2 checks
└── Unhealthy: 3 checks
Production (more aggressive):
├── Interval: 10 seconds (faster detection)
├── Timeout: 3 seconds
├── Healthy: 2 checks
└── Unhealthy: 2 checks (faster removal)
Critical Production:
├── Interval: 5 seconds
├── Timeout: 2 seconds
├── Healthy: 2 checks
└── Unhealthy: 2 checks
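The trade-off between these profiles is detection time, which is roughly the interval multiplied by the unhealthy threshold. The sketch below computes it for the three profiles above; it also assumes the timeout should be shorter than the interval, and the dictionary layout is just an illustration.

```python
PROFILES = {
    "development": {"interval": 30, "timeout": 5, "unhealthy": 3},
    "production": {"interval": 10, "timeout": 3, "unhealthy": 2},
    "critical": {"interval": 5, "timeout": 2, "unhealthy": 2},
}

def detection_time(profile):
    """Approximate seconds until a failing target is marked unhealthy."""
    if profile["timeout"] >= profile["interval"]:
        raise ValueError("timeout must be shorter than the interval")
    return profile["interval"] * profile["unhealthy"]

for name, profile in PROFILES.items():
    print(name, detection_time(profile))
# development 90
# production 20
# critical 10
```

Faster detection is not free: more frequent checks add load to every target, and a low unhealthy threshold makes it easier for a brief hiccup to pull a healthy target out of rotation.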
Best Practice 3: Enable Access Logs and Monitor Metrics
Why logging matters:
Without access logs:
• Can't debug user-reported issues
• No visibility into traffic patterns
• Can't identify attack patterns
• No performance analysis data
With access logs:
• Full request history
• Troubleshoot specific user issues
• Identify slow endpoints
• Detect DDoS patterns
• Analyze traffic sources
Enable access logs:
ApplicationLoadBalancer:
  Type: AWS::ElasticLoadBalancingV2::LoadBalancer
  Properties:
    LoadBalancerAttributes:
      - Key: access_logs.s3.enabled
        Value: 'true'
      - Key: access_logs.s3.bucket
        Value: !Ref ALBLogsBucket
      - Key: access_logs.s3.prefix
        Value: 'production-alb'
      - Key: deletion_protection.enabled
        Value: 'true'
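Once logs are flowing, finding slow endpoints is mostly a parsing exercise. The sketch below assumes the standard space-delimited ALB log layout, where the seventh field is target_processing_time and the thirteenth is the quoted request line. The sample lines are made up for illustration and contain only the first 13 fields; real entries have more.

```python
import shlex
from collections import defaultdict
from urllib.parse import urlsplit

def slowest_paths(log_lines):
    """Return (path, average target_processing_time) pairs, slowest first."""
    timings = defaultdict(list)
    for line in log_lines:
        fields = shlex.split(line)       # keeps the quoted request as one field
        target_time = float(fields[6])   # target_processing_time, in seconds
        if target_time < 0:              # -1: the request never reached a target
            continue
        parts = fields[12].split(" ")    # "METHOD URL PROTOCOL"
        if len(parts) != 3:
            continue                     # skip malformed request lines
        timings[urlsplit(parts[1]).path].append(target_time)
    averages = {path: sum(v) / len(v) for path, v in timings.items()}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)

def sample_line(target_time, path):
    """Build a made-up log line holding only the first 13 fields."""
    return (
        "https 2024-01-01T00:00:00.000000Z app/production-alb/0123456789abcdef "
        f"203.0.113.5:40000 10.0.1.10:80 0.000 {target_time} 0.000 200 200 100 500 "
        f'"GET https://example.com:443{path} HTTP/1.1"'
    )

SAMPLE_LINES = [
    sample_line(0.120, "/api/orders"),
    sample_line(0.080, "/api/orders"),
    sample_line(0.010, "/api/users"),
    sample_line(-1, "/api/users"),
]
print(slowest_paths(SAMPLE_LINES))
```

For anything beyond a quick look, querying the S3 logs with a tool such as Athena scales better than parsing files by hand.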
CloudWatch metrics to monitor:
Key ALB Metrics:
Request Count:
└── Total requests processed
Target Response Time:
└── p50, p90, p99 latency
HTTP 4XX/5XX Errors:
└── Client errors vs server errors
Healthy/Unhealthy Host Count:
└── Track target health
Active Connection Count:
└── Concurrent connections
New Connection Count:
└── Connections per second
Rejected Connection Count:
└── Indicates saturation
CloudWatch alarms:
HighErrorRateAlarm:
  Type: AWS::CloudWatch::Alarm
  Properties:
    AlarmName: alb-high-error-rate
    MetricName: HTTPCode_Target_5XX_Count
    Namespace: AWS/ApplicationELB
    Statistic: Sum
    Period: 300
    EvaluationPeriods: 2
    Threshold: 100
    ComparisonOperator: GreaterThanThreshold
    Dimensions:
      - Name: LoadBalancer
        Value: !GetAtt ApplicationLoadBalancer.LoadBalancerFullName
    AlarmActions:
      - !Ref AlertTopic
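To read this alarm correctly: it fires only when the 5-minute sum of target 5XX responses exceeds 100 for two periods in a row. The function below is a simplified model of that evaluation, not CloudWatch's implementation; it ignores options such as missing-data handling.

```python
def alarm_state(period_sums, threshold=100, evaluation_periods=2):
    """Return "ALARM" when the most recent periods all exceed the threshold."""
    recent = period_sums[-evaluation_periods:]
    if len(recent) < evaluation_periods:
        return "INSUFFICIENT_DATA"
    return "ALARM" if all(value > threshold for value in recent) else "OK"

print(alarm_state([20, 150, 180]))  # ALARM: last two periods both above 100
print(alarm_state([150, 90]))       # OK: the latest period recovered
print(alarm_state([100, 100]))      # OK: GreaterThanThreshold is strict
```

Requiring two consecutive periods means a single burst of errors will not page anyone, at the cost of up to ten minutes before a sustained problem raises the alarm.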
Top 3 DevOps Use Cases
Use Case 1: High-Availability Web Application
The scenario:
Deploy a web application across multiple availability zones with automatic failover and health-based routing.
Architecture:

Failure scenarios:
Scenario 1: Single instance failure
├── Instance 2 becomes unhealthy
├── ALB detects via health checks (about 90 seconds: 3 failed checks × 30-second interval)
├── ALB stops routing to Instance 2
├── Traffic redistributed to healthy instances
├── Auto Scaling launches replacement
└── Impact: Minimal (only requests sent to Instance 2 before it is marked unhealthy are affected)
Scenario 2: Entire AZ failure
├── AZ-A becomes unavailable
├── ALB routes all traffic to AZ-B and AZ-C
├── Auto Scaling launches instances in healthy AZs
└── Impact: Minimal (brief performance degradation)
Scenario 3: Traffic spike
├── Traffic increases 10x
├── Auto Scaling launches new instances
├── ALB automatically distributes to new instances
└── Impact: None (automatic scaling)
Use Case 2: Microservices Architecture with Path-Based Routing
The scenario:
Route different microservices through a single ALB based on URL paths, reducing costs and complexity.
Architecture:

Listener rules:
# User Service Rule
UserServiceRule:
  Type: AWS::ElasticLoadBalancingV2::ListenerRule
  Properties:
    ListenerArn: !Ref HTTPSListener
    Priority: 10
    Conditions:
      - Field: path-pattern
        Values: ['/api/users', '/api/users/*']
    Actions:
      - Type: forward
        TargetGroupArn: !Ref UserServiceTargetGroup

# Order Service Rule
OrderServiceRule:
  Type: AWS::ElasticLoadBalancingV2::ListenerRule
  Properties:
    ListenerArn: !Ref HTTPSListener
    Priority: 20
    Conditions:
      - Field: path-pattern
        Values: ['/api/orders', '/api/orders/*']
    Actions:
      - Type: forward
        TargetGroupArn: !Ref OrderServiceTargetGroup

# Inventory Service Rule
InventoryServiceRule:
  Type: AWS::ElasticLoadBalancingV2::ListenerRule
  Properties:
    ListenerArn: !Ref HTTPSListener
    Priority: 30
    Conditions:
      - Field: path-pattern
        Values: ['/api/inventory', '/api/inventory/*']
    Actions:
      - Type: forward
        TargetGroupArn: !Ref InventoryServiceTargetGroup
Benefits:
Cost Savings:
├── 1 ALB: roughly $16/month base charge
└── vs 3 ALBs: roughly $48/month
Savings: roughly $32/month (67%), before usage-based LCU charges
Operational Simplicity:
├── Single SSL certificate
├── One DNS record
├── Centralized access logs
└── One place to configure WAF
Service Independence:
├── Each service scales independently
├── Each service has own target group
├── Failure isolation (one service down ≠ all down)
└── Independent deployments
Use Case 3: Blue/Green Deployment with ALB
The scenario:
Deploy new application version without downtime using ALB target group switching.
Architecture:

Deployment Process:
Deploy v2.0 to the Green target group
Test Green (internal checks)
Shift 10% traffic to Green
Monitor metrics for 10 minutes
If healthy: Shift 100% to Green
If issues: Shift back to Blue (instant rollback)
Implementation with weighted target groups:
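# Listener with weighted target groups
HTTPSListener:
  Type: AWS::ElasticLoadBalancingV2::Listener
  Properties:
    LoadBalancerArn: !Ref ApplicationLoadBalancer
    Port: 443
    Protocol: HTTPS
    # HTTPS listeners need a certificate; CertificateArn is an assumed
    # template parameter holding the ACM certificate ARN
    Certificates:
      - CertificateArn: !Ref CertificateArn
    SslPolicy: ELBSecurityPolicy-TLS13-1-2-2021-06
    DefaultActions:
      - Type: forward
        ForwardConfig:
          TargetGroups:
            - TargetGroupArn: !Ref BlueTargetGroup
              Weight: 100   # all traffic stays on Blue until the shift begins
            - TargetGroupArn: !Ref GreenTargetGroup
              Weight: 0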
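The traffic-shifting decision in the deployment process can be expressed as a tiny function. This is an illustrative sketch of the 10% canary flow described above, not a deployment tool: `next_weights` is a made-up name, and `healthy` stands for whatever verdict your metrics give after the monitoring window.

```python
def next_weights(current_green, healthy, canary=10):
    """Return (blue_weight, green_weight) for the next step of the rollout."""
    if not healthy:
        return (100, 0)                # rollback: all traffic back to Blue
    if current_green == 0:
        return (100 - canary, canary)  # start the canary
    return (0, 100)                    # canary looked good: finish the cutover

print(next_weights(0, healthy=True))    # (90, 10)
print(next_weights(10, healthy=True))   # (0, 100)
print(next_weights(10, healthy=False))  # (100, 0)
```

Each returned pair maps onto the two `Weight` values in the listener's forward action, so applying a step is a listener update rather than a redeploy, which is what makes rollback fast.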
Common Pitfalls to Avoid
Pitfall 1: Shallow Health Checks
Problem: Health check on / doesn't verify database connectivity
Solution: Create /health endpoint that checks all dependencies
Pitfall 2: Not Enabling Access Logs
Problem: Can't troubleshoot user issues or analyze traffic
Solution: Enable S3 access logs from day one
Pitfall 3: Short Deregistration Delay
Problem: Long requests are terminated during deployments
Solution: Set appropriate delay (300s default, up to 3600s for long requests)
Pitfall 4: Using CLB for New Applications
Problem: Missing modern features (path routing, WebSocket, Lambda)
Solution: Use ALB for all new HTTP applications
Pitfall 5: Not Monitoring Target Health
Problem: Unaware when instances fail health checks
Solution: CloudWatch alarms on the UnhealthyHostCount metric
Pitfall 6: Single AZ Deployment
Problem: AZ failure = total outage
Solution: Deploy across a minimum of 3 AZs
Conclusion
AWS Elastic Load Balancing provides automatic traffic distribution, health-aware routing, and high availability without the operational overhead of self-managed load balancers.
Key takeaways:
Use ALB by default: Modern features, cost-effective, AWS-integrated
Comprehensive health checks: Verify all dependencies, not just the web server
Enable access logs: Essential for troubleshooting and analysis
Multi-AZ deployment: Minimum 3 AZs for HA
Monitor metrics: CloudWatch alarms for proactive detection
When to use ELB:
AWS-native applications
Need automatic failover
Want AWS-managed infrastructure
HTTP/HTTPS or TCP/UDP traffic
Load balancer selection:
ALB: The default for most use cases (web apps, microservices, containers)
NLB: Extreme performance, static IPs, PrivateLink
CLB: Don't use for new apps
A note for Kubernetes users:
On EKS, the AWS Load Balancer Controller can provision ALBs automatically from Kubernetes Ingress manifests. The underlying ALB works the same way whether it is provisioned via Kubernetes, CloudFormation, or the AWS Console: the same health checks, routing rules, and high availability apply.
Master ELB, and you've mastered AWS load balancing and high availability.
Questions or load-balancing experiences? Drop a comment!