How AWS Systems Manager Simplifies DevOps Tasks Without SSH

Introduction
You're a DevOps engineer responsible for 50 EC2 instances. It's Friday at 4 PM. A critical CVE (Common Vulnerabilities and Exposures) entry just dropped: a vulnerability in OpenSSL affecting every Linux server. Your security team wants all instances patched before Monday morning.
The manual approach:
SSH into each of 50 instances (one at a time)
Run sudo apt-get update && sudo apt-get install --only-upgrade -y openssl
Verify the patch applied
Log everything for compliance
Time: 4-6 hours minimum. Risk: You'll miss something.
The Systems Manager approach:
Create a Run Command targeting all 50 instances with the Environment=production tag
Execute the patch command fleet-wide
View results in a single dashboard
Command output sent to CloudWatch Logs (once output logging is enabled)
Time: 15 minutes. Zero SSH. Full audit trail.
This is what AWS Systems Manager enables: operational tasks at scale, without the toil of manual SSH, without bastion hosts, and with complete visibility into your entire fleet.
The Problem: Manual Operations at Scale
Challenge 1: SSH at Scale is Unsustainable
The math of manual SSH:
Fleet size: 50 EC2 instances
Per-instance task: 5 minutes
SSH overhead: 2 minutes per instance
Total time: 50 × 7 minutes = 350 minutes = ~6 hours
For weekly maintenance:
• 50 instances × 30 minutes/week = 25 hours/week
• For one engineer: unsustainable
• For a team: expensive and error-prone
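The arithmetic above scales linearly with fleet size, which is the core of the problem. A minimal sketch for estimating the manual cost of any fleet; the function names are hypothetical and the inputs are the figures quoted above:

```python
def manual_minutes(instances: int, task_min: float, ssh_overhead_min: float) -> float:
    """Total minutes when each instance is handled one at a time over SSH."""
    return instances * (task_min + ssh_overhead_min)


def weekly_hours(instances: int, minutes_per_instance: float) -> float:
    """Hours per week spent on recurring per-instance maintenance."""
    return instances * minutes_per_instance / 60


print(manual_minutes(50, 5, 2))  # 350 minutes, just under 6 hours
print(weekly_hours(50, 30))      # 25.0 hours per week
```

Doubling the fleet doubles both numbers, while a fleet-wide command takes roughly the same time regardless of size.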
SSH security risks:
SSH drawbacks:
├── Port 22 open to internet (or VPN)
│ └── Attack surface for brute force
├── SSH key management
│ ├── Multiple key pairs per team member
│ ├── Keys shared (security anti-pattern)
│ ├── Leaked keys = full server access
│ └── No central key revocation
├── No session recording
│ ├── Who ran what command?
│ └── When? What output?
└── Bastion host complexity
    ├── Extra server to maintain
    ├── Single point of failure
    └── Costs money
Challenge 2: Secrets Sprawl
Where configuration and secrets typically live:
Secrets scattered across:
├── .env files (committed to git accidentally)
├── application.properties (plain text)
├── Hardcoded in source code
├── EC2 user data scripts
├── Environment variables (visible in process list)
└── Jenkins/CI pipeline variables
Problems:
• Secrets visible to anyone with server access
• No rotation workflow
• No audit trail (who accessed what?)
• No versioning (can't roll back config)
• Different values per environment (drift)
Real-world consequences:
Developer commits .env file to GitHub
Scrapers find it within minutes
Database compromised
Incident response: 48 hours
Challenge 3: Configuration Drift
What drift looks like:
Monday: Deploy app with config version 1.2
Tuesday: Engineer SSHes in, manually tweaks nginx config
Wednesday: Another engineer SSHes in, changes memory settings
Thursday: Auto Scaling launches new instance (original config)
Friday: 3 instances with config 1.2, 1 with nginx tweak, 1 with memory tweak
Result:
• Inconsistent behavior across fleet
• "Works on some servers but not others"
• Impossible to diagnose issues
• Can't recreate state for debugging
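One way to make drift visible is to fingerprint the same config file on every instance and group instances by fingerprint. The sketch below is illustrative only: the instance IDs and config contents are made up, and in practice the file contents would have to be collected from each instance first.

```python
import hashlib
from collections import defaultdict


def fingerprint(config_text: str) -> str:
    """Short, stable hash of a config file's contents."""
    return hashlib.sha256(config_text.encode("utf-8")).hexdigest()[:12]


def group_by_config(configs: dict[str, str]) -> dict[str, list[str]]:
    """Map each distinct fingerprint to the instances that carry it."""
    groups: dict[str, list[str]] = defaultdict(list)
    for instance_id, text in sorted(configs.items()):
        groups[fingerprint(text)].append(instance_id)
    return dict(groups)


# Hypothetical fleet: three untouched instances and two hand-edited ones.
fleet = {
    "i-aaa": "worker_processes 4;",
    "i-bbb": "worker_processes 4;",
    "i-ccc": "worker_processes 4;",
    "i-ddd": "worker_processes 8;",             # nginx tweak
    "i-eee": "worker_processes 4;\n# mem tweak",  # memory tweak
}
groups = group_by_config(fleet)
print(len(groups))  # more than one group means the fleet has drifted
```

A healthy fleet produces exactly one group; every extra group is a hand-edited variant that needs explaining.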
Challenge 4: Patching Compliance
Manual patch management:
Security audit asks:
"Which servers are running OpenSSL < 3.0.8?"
Manual answer:
• SSH into each server
• Run: openssl version
• Record result in spreadsheet
• Report takes 3 days
Compliance status:
• Unknown until audit
• Patches applied inconsistently
• No enforcement mechanism
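Even once the `openssl version` output has been gathered from every server, someone still has to compare versions. A small sketch of that comparison follows; the sample outputs are illustrative, and the helper names are hypothetical. Note that distributions often backport fixes without changing the upstream version number, so treat this as a first-pass filter rather than a compliance verdict.

```python
import re


def parse_openssl_version(output: str) -> tuple[int, int, int]:
    """Extract (major, minor, patch) from `openssl version` output."""
    match = re.search(r"OpenSSL\s+(\d+)\.(\d+)\.(\d+)", output)
    if match is None:
        raise ValueError(f"unrecognized version string: {output!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch)


def needs_patch(output: str, minimum: tuple[int, int, int] = (3, 0, 8)) -> bool:
    """True when the reported version is older than the minimum."""
    return parse_openssl_version(output) < minimum


# instance ID -> output of `openssl version` (sample values)
reports = {
    "i-aaa": "OpenSSL 3.0.2 15 Mar 2022 (Library: OpenSSL 3.0.2 15 Mar 2022)",
    "i-bbb": "OpenSSL 3.0.8 7 Feb 2023 (Library: OpenSSL 3.0.8 7 Feb 2023)",
    "i-ccc": "OpenSSL 1.1.1k  FIPS 25 Mar 2021",
}
outdated = sorted(i for i, out in reports.items() if needs_patch(out))
print(outdated)  # ['i-aaa', 'i-ccc']
```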
What is AWS Systems Manager?
AWS Systems Manager (SSM) is an operations hub for AWS infrastructure. It provides a unified interface to view and control your infrastructure, automating operational tasks across AWS resources.
The Value Proposition

SSM Capabilities Overview
AWS Systems Manager
├── Operations Management
│ ├── OpsCenter (incident tracking)
│ ├── Explorer (operations dashboard)
│ └── Incident Manager
├── Application Management
│ ├── Parameter Store (config/secrets)
│ ├── AppConfig (feature flags)
│ └── Application Manager
├── Change Management
│ ├── Automation (runbooks)
│ ├── Change Manager
│ └── Maintenance Windows
├── Node Management
│ ├── Fleet Manager (EC2 management)
│ ├── Session Manager (shell access)
│ ├── Run Command (fleet commands)
│ ├── Patch Manager (OS patching)
│ └── Inventory (system data)
└── Shared Resources
    ├── Documents (runbook definitions)
    └── Parameter Store
SSM Agent
SSM Agent = lightweight software running on managed instances
Supported platforms:
├── Amazon Linux 2 (pre-installed)
├── Amazon Linux 2023 (pre-installed)
├── Ubuntu 16.04+ (pre-installed on AMIs)
├── Windows Server 2008+
├── macOS (managed nodes)
└── On-premises servers (hybrid activation)
Requirements:
├── SSM Agent installed
├── SSM IAM role attached (AmazonSSMManagedInstanceCore)
└── Outbound HTTPS (port 443) to SSM endpoints
(no inbound ports required!)
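When the three requirements above are met, the instance registers as a managed node. A hedged sketch of how one might check which nodes have stopped reporting, using boto3's `describe_instance_information` paginator; the region default and sample data are assumptions, and the boto3 call is kept in a separate function so the filtering logic runs without AWS access:

```python
def unreachable_nodes(instance_info: list[dict]) -> list[str]:
    """Return IDs of managed nodes whose agent is not reporting as Online."""
    return sorted(
        item["InstanceId"]
        for item in instance_info
        if item.get("PingStatus") != "Online"
    )


def fetch_instance_info(region: str = "us-east-1") -> list[dict]:
    """Collect all pages of DescribeInstanceInformation (needs boto3 and credentials)."""
    import boto3  # imported lazily so the helper above works without it

    paginator = boto3.client("ssm", region_name=region).get_paginator(
        "describe_instance_information"
    )
    return [
        item
        for page in paginator.paginate()
        for item in page["InstanceInformationList"]
    ]


# Sample data shaped like the API response
sample = [
    {"InstanceId": "i-aaa", "PingStatus": "Online"},
    {"InstanceId": "i-bbb", "PingStatus": "ConnectionLost"},
]
print(unreachable_nodes(sample))  # ['i-bbb']
```

Instances that never registered (missing agent, role, or network path) do not appear in this list at all, so comparing it against the full EC2 inventory is still worthwhile.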
Understanding SSM Core Capabilities
1. Session Manager
Session Manager = Secure, browser-based or CLI-based shell access to instances without SSH.
How it works:
Traditional SSH flow:
User → Internet → Bastion Host (port 22) → EC2 Instance (port 22)
Session Manager flow:
User → AWS Console/CLI → SSM Service → SSM Agent on EC2
(No ports open, no keys needed)
Architecture:
┌─────────────────────────────────────────────────────────┐
│ Session Manager Flow │
└─────────────────────────────────────────────────────────┘
Engineer SSM Service EC2 Instance
│ │ │
│ StartSession API │ │
│──────────────────────► │ │
│ │ WebSocket tunnel │
│ ◄────────────────────│───────────────────────►│
│ │ (SSM Agent polls SSM) │
│ Interactive session │ │
│◄──────────────────────►│◄──────────────────────►│
│ │ │
│ All commands logged │ │
│ ┌────▼──────┐ │
│ │CloudWatch │ │
│ │ Logs │ │
│ └───────────┘ │
2. Parameter Store
Parameter Store = Hierarchical secrets and configuration storage.
Parameter types:
String:
• Plain text values
• Use: Non-sensitive config
• Example: /prod/app/log-level → "INFO"
StringList:
• Comma-separated values
• Use: Lists of values
• Example: /prod/app/allowed-ips → "10.0.0.1,10.0.0.2"
SecureString:
• Encrypted with KMS
• Use: Secrets, credentials
• Example: /prod/db/password → "s3cr3t!" (stored encrypted)
Hierarchy:
/
├── /prod/
│ ├── /prod/database/
│ │ ├── /prod/database/host
│ │ ├── /prod/database/port
│ │ ├── /prod/database/username
│ │ └── /prod/database/password (SecureString)
│ ├── /prod/redis/
│ │ └── /prod/redis/url
│ └── /prod/api/
│ └── /prod/api/stripe-secret-key (SecureString)
├── /staging/
│ └── ... (same structure, different values)
└── /dev/
    └── ... (same structure, different values)
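A hierarchy like this maps naturally onto a nested dictionary in application code. The sketch below assumes boto3 for the fetch step (`get_parameters_by_path` with `Recursive=True`) and uses a hypothetical `nest_parameters` helper to rebuild the tree; the sample values are placeholders.

```python
def nest_parameters(params: dict[str, str], prefix: str) -> dict:
    """Turn {'/prod/database/host': 'x'} into {'database': {'host': 'x'}}."""
    root: dict = {}
    base = prefix.rstrip("/") + "/"
    for name, value in params.items():
        if not name.startswith(base):
            continue  # ignore parameters outside the requested subtree
        *parents, leaf = name[len(base):].split("/")
        node = root
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return root


def fetch_parameters(prefix: str, region: str = "us-east-1") -> dict[str, str]:
    """Read a whole subtree with GetParametersByPath (needs boto3 and credentials)."""
    import boto3

    paginator = boto3.client("ssm", region_name=region).get_paginator(
        "get_parameters_by_path"
    )
    pages = paginator.paginate(Path=prefix, Recursive=True, WithDecryption=True)
    return {p["Name"]: p["Value"] for page in pages for p in page["Parameters"]}


flat = {
    "/prod/database/host": "db.internal",
    "/prod/database/port": "5432",
    "/prod/redis/url": "redis://cache.internal:6379",
}
config = nest_parameters(flat, "/prod")
print(config["database"]["host"])  # db.internal
```

In an application, `nest_parameters(fetch_parameters("/prod"), "/prod")` would load the whole environment at startup.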
3. Run Command
Run Command = Execute commands across multiple instances simultaneously.
Document types:
AWS-Managed Documents:
├── AWS-RunShellScript (Linux commands)
├── AWS-RunPowerShellScript (Windows)
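├── AWS-InstallApplication (install software from an .msi file, Windows only)
├── AWS-UpdateSSMAgent (update SSM agent)
├── AWS-GatherSoftwareInventory (collect inventory)
└── AWS-RunPatchBaseline (scan for or install patches; supersedes the legacy Windows-only AWS-ApplyPatchBaseline)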
Custom Documents:
└── Your own YAML/JSON runbooks
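To make the document model concrete, here is a sketch of a Run Command request built for boto3's `send_command`. The builder and its defaults (25% concurrency, 1% error threshold) are illustrative assumptions; the actual API call sits in a separate function because it needs boto3 and AWS credentials.

```python
def build_send_command(tag_key: str, tag_value: str, commands: list[str],
                       max_concurrency: str = "25%", max_errors: str = "1%") -> dict:
    """Keyword arguments for ssm.send_command, targeting instances by tag."""
    return {
        "DocumentName": "AWS-RunShellScript",
        "Targets": [{"Key": f"tag:{tag_key}", "Values": [tag_value]}],
        "Parameters": {"commands": commands},
        "MaxConcurrency": max_concurrency,
        "MaxErrors": max_errors,
    }


def run(request: dict, region: str = "us-east-1") -> str:
    """Send the command and return its CommandId (needs boto3 and credentials)."""
    import boto3

    response = boto3.client("ssm", region_name=region).send_command(**request)
    return response["Command"]["CommandId"]


request = build_send_command("Environment", "production", ["uptime"])
print(request["Targets"])  # [{'Key': 'tag:Environment', 'Values': ['production']}]
```

Targeting by tag rather than by instance ID means newly launched instances are picked up automatically the next time the command runs.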
4. Patch Manager
Patch Manager = Automate OS and software patching.
Patch Process:
1. Patch Baseline:
• Define approved/rejected patches
• Auto-approve patches after N days
• Exceptions per CVE
2. Patch Group:
• Group instances by tag (PatchGroup=prod)
• Associate baseline with group
3. Maintenance Window:
• Schedule: cron(0 2 ? * TUE *) (Tuesdays 2 AM)
• Max concurrency: 25%
• Max error: 1%
• Tasks: Scan + Install
4. Results:
• Compliance report per instance
• Missing patches list
• CloudWatch metrics
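Step 3 of the process can be expressed as two API requests: one that creates the window and one that registers the patch task. The sketch below only builds the request dictionaries for boto3's `create_maintenance_window` and `register_task_with_maintenance_window`; the window name, duration, cutoff, and IDs are placeholder assumptions, and registering targets for the window is left out.

```python
def window_request(name: str, schedule: str, hours: int = 3) -> dict:
    """Keyword arguments for ssm.create_maintenance_window."""
    return {
        "Name": name,
        "Schedule": schedule,       # e.g. "cron(0 2 ? * TUE *)"
        "Duration": hours,          # window length in hours
        "Cutoff": 1,                # stop starting new tasks 1 hour before the end
        "AllowUnassociatedTargets": False,
    }


def patch_task_request(window_id: str, window_target_id: str,
                       operation: str = "Install") -> dict:
    """Keyword arguments for ssm.register_task_with_maintenance_window."""
    if operation not in ("Scan", "Install"):
        raise ValueError("operation must be 'Scan' or 'Install'")
    return {
        "WindowId": window_id,
        "Targets": [{"Key": "WindowTargetIds", "Values": [window_target_id]}],
        "TaskArn": "AWS-RunPatchBaseline",
        "TaskType": "RUN_COMMAND",
        "MaxConcurrency": "25%",
        "MaxErrors": "1%",
        "TaskInvocationParameters": {
            "RunCommand": {"Parameters": {"Operation": [operation]}}
        },
    }


window = window_request("prod-patching", "cron(0 2 ? * TUE *)")
task = patch_task_request("mw-EXAMPLE", "target-EXAMPLE")  # placeholder IDs
print(window["Schedule"], task["TaskArn"])
```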
5. Inventory
Inventory = Collect metadata from managed instances.
Collected Data:
├── Applications installed
├── AWS components (SSM Agent, CloudWatch Agent versions)
├── Network configuration (IPs, MACs)
├── Windows updates
├── Instance details (OS, CPU, memory)
├── Services running
├── Files (custom queries)
└── Registry (Windows)
Queried via:
• SSM Console
• Resource Data Sync → S3 → Athena
• AWS Config integration
Top 3 Best Practices for DevOps
Best Practice 1: Replace All SSH with Session Manager
Implementation steps:
1. Attach SSM role to EC2 instances:
# CloudFormation: EC2 with SSM role
Resources:
  EC2InstanceRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Effect: Allow
            Principal: {Service: ec2.amazonaws.com}
            Action: sts:AssumeRole
      ManagedPolicyArns:
        - arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore
  EC2InstanceProfile:
    Type: AWS::IAM::InstanceProfile
    Properties:
      Roles: [!Ref EC2InstanceRole]
  EC2Instance:
    Type: AWS::EC2::Instance
    Properties:
      # ImageId, InstanceType, and networking omitted for brevity
      IamInstanceProfile: !Ref EC2InstanceProfile
      # No key pair needed!
      # KeyName: my-key-pair ← Remove this
  # Security Group: No port 22
  EC2SecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: No SSH needed with SSM
      SecurityGroupIngress:
        # Only allow traffic from the ALB (port 80 shown; add 443 the same way)
        # ALBSecurityGroup is assumed to be defined elsewhere in the template
        - IpProtocol: tcp
          FromPort: 80
          ToPort: 80
          SourceSecurityGroupId: !Ref ALBSecurityGroup
2. Remove bastion host:
Before (Costly, Complex):
Internet (SSH) → Bastion Host → Private EC2
After (Free, Simple):
Engineer → AWS Console / CLI → Session Manager → Private EC2
3. Session logging for compliance:
# Create S3 bucket for session logs
aws s3 mb s3://my-ssm-session-logs
# Bucket policy: deny uploads that do not request SSE-KMS encryption
aws s3api put-bucket-policy \
  --bucket my-ssm-session-logs \
  --policy '{
    "Version": "2012-10-17",
    "Statement": [{
      "Effect": "Deny",
      "Principal": "*",
      "Action": "s3:PutObject",
      "Resource": "arn:aws:s3:::my-ssm-session-logs/*",
      "Condition": {
        "StringNotEquals": {
          "s3:x-amz-server-side-encryption": "aws:kms"
        }
      }
    }]
  }'
4. Restrict who can start sessions:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "ssm:StartSession",
      "Resource": [
        "arn:aws:ec2:us-east-1:123456789012:instance/*"
      ],
      "Condition": {
        "StringEquals": {
          "ssm:resourceTag/Environment": ["production"],
          "ssm:resourceTag/AllowSSM": ["true"]
        }
      }
    }
  ]
}
Audit trail:
Every session records:
• Who connected (IAM user/role)
• When session started/ended
• All commands executed
• All output
• Session stored in S3/CloudWatch
Compliance benefits:
• SOC 2: Access logging
• PCI DSS: System access records
• HIPAA: Audit controls
• ISO 27001: Access management
Best Practice 2: Hierarchical Parameter Store for All Config
Problem with flat secrets management:
Bad: Flat parameters
/db-host-prod
/db-password-prod
/db-host-staging
/db-password-staging
Issues:
• No clear ownership
• Hard to grant environment-specific access
• Hard to retrieve all params for an environment
Solution: Hierarchical namespacing with IAM paths:
Good: Hierarchical structure
/prod/database/host
/prod/database/password
/staging/database/host
/staging/database/password
Benefits:
• IAM policy by path prefix
• GetParametersByPath = all app config in one call
• Clear ownership
IAM policy using paths:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ssm:GetParameter",
        "ssm:GetParameters",
        "ssm:GetParametersByPath"
      ],
      "Resource": "arn:aws:ssm:us-east-1:123456789012:parameter/prod/*"
    },
    {
      "Effect": "Allow",
      "Action": "kms:Decrypt",
      "Resource": "arn:aws:kms:us-east-1:123456789012:key/prod-ssm-key"
    }
  ]
}
Rotate secrets:
# Update a parameter (rotation)
aws ssm put-parameter \
--name "/prod/database/password" \
--value "new-rotated-password-here" \
--type SecureString \
--key-id "alias/prod-ssm-key" \
--overwrite
# Parameter versioning is automatic
# Application fetches latest on restart
# Old version preserved (rollback capability)
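Because every overwrite creates a new version, a rollback amounts to reading the previous version and writing it back. A hedged sketch using boto3's `get_parameter_history` and `put_parameter`; the helper names and sample history are hypothetical, and only the pure selection step runs without AWS access.

```python
def previous_value(history: list[dict]) -> str:
    """Return the value of the second-newest version in a parameter's history."""
    ordered = sorted(history, key=lambda entry: entry["Version"])
    if len(ordered) < 2:
        raise ValueError("no earlier version to roll back to")
    return ordered[-2]["Value"]


def rollback(name: str, region: str = "us-east-1") -> None:
    """Write the previous value back as a new version (needs boto3 and credentials)."""
    import boto3

    ssm = boto3.client("ssm", region_name=region)
    pages = ssm.get_paginator("get_parameter_history").paginate(
        Name=name, WithDecryption=True
    )
    history = [entry for page in pages for entry in page["Parameters"]]
    latest = max(history, key=lambda entry: entry["Version"])
    request = {
        "Name": name,
        "Value": previous_value(history),
        "Type": latest["Type"],
        "Overwrite": True,
    }
    if "KeyId" in latest:  # keep SecureString parameters on the same KMS key
        request["KeyId"] = latest["KeyId"]
    ssm.put_parameter(**request)


sample_history = [
    {"Version": 1, "Value": "old-password"},
    {"Version": 2, "Value": "new-rotated-password-here"},
]
print(previous_value(sample_history))  # old-password
```

The rollback itself becomes version 3, so the history keeps a full record of what happened.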
Parameter Store vs Secrets Manager:
Parameter Store (Free tier):
✓ Config values + secrets
✓ Free for standard parameters
✓ KMS encryption
✓ IAM-controlled
✗ No auto-rotation
✗ No cross-account sharing for standard parameters (advanced-tier parameters can be shared through AWS RAM)
Use: App config, API keys, DB passwords
Secrets Manager ($0.40/secret/month):
✓ Auto-rotation built-in
✓ Cross-account access
✓ Native RDS/Redshift rotation
✓ Better for frequently rotated creds
Use: RDS master passwords, API keys needing rotation
Best Practice 3: Automate Patching with Maintenance Windows
Patching strategy:
Dev Environment:
• Patch baseline: All available patches
• Schedule: Daily
• Concurrency: 100% (patch all at once)
• No maintenance window needed
Staging:
• Patch baseline: Security patches (7-day approval delay)
• Schedule: Weekly (Mondays 2 AM)
• Concurrency: 50%
• Test before production
Production:
• Patch baseline: Security (30-day delay) + Critical (7-day)
• Schedule: Monthly (last Tuesday 2 AM)
• Concurrency: 25% (roll through fleet)
• Max error threshold: 1%
• Notification before + after
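The per-environment strategy can be kept as data and turned into patch baseline requests. The sketch below builds keyword arguments for boto3's `create_patch_baseline`; mapping "Security" to a CLASSIFICATION filter and "Critical" to a SEVERITY filter is an assumption that fits Amazon Linux 2, and the baseline names are hypothetical.

```python
# (filter key, filter values, approval delay in days) per environment
STRATEGY = {
    "dev": [("CLASSIFICATION", ["*"], 0)],
    "staging": [("CLASSIFICATION", ["Security"], 7)],
    "prod": [("CLASSIFICATION", ["Security"], 30), ("SEVERITY", ["Critical"], 7)],
}


def baseline_request(env: str, operating_system: str = "AMAZON_LINUX_2") -> dict:
    """Keyword arguments for ssm.create_patch_baseline for one environment."""
    rules = [
        {
            "PatchFilterGroup": {"PatchFilters": [{"Key": key, "Values": values}]},
            "ApproveAfterDays": days,
        }
        for key, values, days in STRATEGY[env]
    ]
    return {
        "Name": f"{env}-baseline",
        "OperatingSystem": operating_system,
        "ApprovalRules": {"PatchRules": rules},
    }


prod = baseline_request("prod")
print([rule["ApproveAfterDays"] for rule in prod["ApprovalRules"]["PatchRules"]])  # [30, 7]
```

Keeping the delays in one table makes it easy to review that staging always sees a patch before production does.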
Top 3 DevOps Use Cases
Use Case 1: Zero-Bastion Secure Access Architecture
The scenario:
Provide developers and ops engineers secure, audited access to EC2 instances without bastion hosts or open SSH ports.
Architecture:
┌─────────────────────────────────────────────────────────┐
│ Zero-Bastion Architecture with SSM │
└─────────────────────────────────────────────────────────┘
Before:
Engineer → VPN → Bastion (port 22) → Private EC2 (port 22)
↑ Port 22 open in SG
After:
Engineer → AWS CLI / Console
↓
IAM Auth check
↓
SSM Service
↓ (WSS tunnel, no inbound ports)
SSM Agent on EC2
↓
Shell session
↓
Session logged to CloudWatch
Access tiers:
Tier 1: Read-only access (developers)
├── Can start Session Manager sessions
├── Cannot sudo to root
└── Session logged and monitored
Tier 2: Application access (senior devs)
├── Can start sessions
├── Can restart application services
└── Cannot modify system config
Tier 3: Admin access (ops engineers)
├── Full shell access
├── Sudo allowed
└── All sessions logged, reviewed weekly
All tiers:
└── No SSH keys, no bastion hosts
Use Case 2: Fleet-Wide Configuration Management
The scenario:
Roll out configuration changes (install agents, update config files, restart services) across the entire EC2 fleet simultaneously.
Common fleet operations with Run Command:
# 1. Install CloudWatch agent on all production servers
aws ssm send-command \
  --document-name "AWS-RunShellScript" \
  --targets "Key=tag:Environment,Values=production" \
  --parameters '{"commands": [
    "sudo yum install -y amazon-cloudwatch-agent",
    "sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config -m ec2 -c ssm:/prod/cloudwatch-config -s",
    "echo CloudWatch agent installed and configured"
  ]}'
# 2. Rotate application config (after Parameter Store update)
aws ssm send-command \
  --document-name "AWS-RunShellScript" \
  --targets "Key=tag:App,Values=api-server" \
  --parameters '{"commands": [
    "sudo systemctl restart api-server",
    "sleep 10",
    "systemctl is-active api-server && echo SUCCESS || echo FAILED"
  ]}'
# 3. Collect diagnostic info from all servers
aws ssm send-command \
  --document-name "AWS-RunShellScript" \
  --targets "Key=tag:Environment,Values=production" \
  --parameters '{"commands": [
    "echo === System Info ===",
    "uname -a",
    "df -h",
    "free -h",
    "echo === App Status ===",
    "systemctl status api-server --no-pager",
    "echo === Recent Errors ===",
    "journalctl -u api-server -n 20 --no-pager"
  ]}' \
  --output-s3-bucket-name ssm-diagnostics-output
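After a fleet-wide command, the next question is which instances succeeded. A sketch of summarizing per-instance results from boto3's `list_command_invocations`; the helper names and sample data are assumptions, and the fetch function needs boto3 and credentials.

```python
from collections import Counter

FAILURE_STATES = ("Failed", "TimedOut", "Cancelled")


def summarize(invocations: list[dict]) -> dict[str, int]:
    """Count invocations per status, e.g. {'Success': 48, 'Failed': 2}."""
    return dict(Counter(item["Status"] for item in invocations))


def failed_instances(invocations: list[dict]) -> list[str]:
    """IDs of instances where the command did not complete successfully."""
    return sorted(
        item["InstanceId"] for item in invocations if item["Status"] in FAILURE_STATES
    )


def fetch_invocations(command_id: str, region: str = "us-east-1") -> list[dict]:
    """Collect all invocations for one command (needs boto3 and credentials)."""
    import boto3

    paginator = boto3.client("ssm", region_name=region).get_paginator(
        "list_command_invocations"
    )
    pages = paginator.paginate(CommandId=command_id)
    return [inv for page in pages for inv in page["CommandInvocations"]]


sample = [
    {"InstanceId": "i-aaa", "Status": "Success"},
    {"InstanceId": "i-bbb", "Status": "Failed"},
    {"InstanceId": "i-ccc", "Status": "Success"},
]
print(summarize(sample))         # {'Success': 2, 'Failed': 1}
print(failed_instances(sample))  # ['i-bbb']
```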
Use Case 3: Centralized Application Configuration Without Restart
The scenario:
Use Parameter Store (optionally alongside AWS AppConfig) for dynamic feature flags and configuration that update without an application restart.
Architecture:
┌─────────────────────────────────────────────────────────┐
│ Dynamic Config with Parameter Store │
└─────────────────────────────────────────────────────────┘
Deploy sequence:
1. Update parameter in SSM:
/prod/feature-flags/new-checkout → "enabled"
2. Application polls Parameter Store every 60s
(or triggered by EventBridge)
3. Application reads new value
→ Feature flag enabled without restart
Benefits:
• No deployment for config changes
• Fast rollback (set the parameter back; instances pick it up on their next poll)
• Audit trail in CloudTrail
• Per-environment values
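The polling step in the sequence above can be as small as a time-based cache around the fetch call. This is a minimal sketch, not a production client: the `PolledValue` class is hypothetical, the fetch function is injected so it can wrap a Parameter Store read, and the clock is injectable so the behavior can be demonstrated without waiting 60 seconds.

```python
import time
from typing import Callable, Optional


class PolledValue:
    """Cache a value and re-fetch it once it is older than `ttl` seconds."""

    def __init__(self, fetch: Callable[[], str], ttl: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch
        self._ttl = ttl
        self._clock = clock
        self._value: Optional[str] = None
        self._fetched_at: Optional[float] = None

    def get(self) -> str:
        now = self._clock()
        if self._fetched_at is None or now - self._fetched_at >= self._ttl:
            self._value = self._fetch()  # e.g. a Parameter Store read
            self._fetched_at = now
        return self._value


# Demonstration with a fake fetcher and a fake clock
values = iter(["disabled", "enabled"])
now = [0.0]
flag = PolledValue(lambda: next(values), ttl=60, clock=lambda: now[0])
print(flag.get())  # disabled (first fetch)
now[0] = 30
print(flag.get())  # disabled (served from cache)
now[0] = 61
print(flag.get())  # enabled (cache expired, fetched again)
```

In a real service the fetch function would read `/prod/feature-flags/new-checkout` from Parameter Store, and the TTL bounds how stale a flag can be.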
Common Pitfalls to Avoid
Pitfall 1: Forgetting SSM Agent + IAM Role
Problem: SSM doesn't work without both
Solution: Always attach AmazonSSMManagedInstanceCore policy and verify the agent is running
Pitfall 2: Flat Parameter Naming
Problem: Can't grant environment-specific access
Solution: Use hierarchical paths (/env/service/param)
Pitfall 3: Not Setting Session Idle Timeout
Problem: Long-running abandoned sessions
Solution: Set a 20-minute idle timeout in the Session Manager preferences
Pitfall 4: Using String for Secrets
Problem: Passwords visible in plain text
Solution: Always use SecureString with KMS for secrets
Pitfall 5: 100% Concurrency for Patching
Problem: All instances patching simultaneously = downtime
Solution: Use 25% concurrency with 1% error threshold
Pitfall 6: No Parameter Versioning Strategy
Problem: Can't rollback to previous config
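Solution: Parameter Store versions every change automatically and keeps up to 100 versions per parameter; decide in advance how you will reference an older version (by version number or label) when you need to roll back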
Conclusion
AWS Systems Manager transforms how DevOps teams operate EC2 infrastructure—eliminating bastion hosts, centralizing secrets management, and automating patching across fleets of any size.
Key takeaways:
Session Manager: Replace SSH and bastion hosts entirely
Parameter Store: Centralize all config and secrets
Patch Manager: Automate OS patching with compliance reporting
Run Command: Fleet-wide operations without SSH
Inventory: Full visibility into your fleet state
SSM vs alternatives:
| Task | Manual | SSM | Notes |
| --- | --- | --- | --- |
| Shell access | SSH + Bastion | Session Manager | SSM free, more secure |
| Config/secrets | .env files | Parameter Store | Free tier generous |
| OS patching | Manual SSH | Patch Manager | Free for EC2 |
| Fleet commands | Ansible | Run Command | Free, no infra needed |
Questions or SSM tips? Drop a comment!