Last updated: March 16, 2026
Implement SSH agent forwarding for small teams as a starting point, use dedicated tools like Teleport or HashiCorp Vault for enterprise-scale teams needing audit trails and access controls, or combine OIDC authentication with cloud provider-native solutions for minimal friction. The key is reducing manual key rotation while maintaining visibility into who accesses production infrastructure.
Table of Contents
- The SSH Key Management Problem
- SSH Agent Forwarding and Key Chaining
- Implementing a Centralized SSH Key Directory
- Using Ansible for SSH Key Distribution
- GitOps-Based SSH Key Management
- Short-Lived SSH Certificates
- SSH Key Rotation Strategies
- Managed SSH Key Solutions
- Monitoring and Auditing SSH Access
- Practical Recommendations
- Migrating from Password to Key-Based Authentication
- Incident Response for Compromised Keys
The SSH Key Management Problem
Remote engineering teams typically face several key management challenges. Developers need access to production servers, staging environments, and various internal services. Each developer might have multiple keys for different purposes—a personal key, a work key, and keys for specific projects. When team members leave or roles change, revoking access quickly becomes critical.
The traditional approach of manually distributing and tracking SSH keys doesn’t scale. Without centralized management, you lose visibility into who has access to what, key rotation becomes infrequent, and compromised keys create security vulnerabilities that go undetected.
SSH Agent Forwarding and Key Chaining
For smaller teams or those starting with minimal infrastructure, SSH agent forwarding provides a straightforward starting point. This method allows developers to use their local SSH agent when connecting through intermediate servers.
# Add your key to the SSH agent
ssh-add ~/.ssh/id_ed25519
# Connect with agent forwarding
ssh -A user@jump-server.example.com
The -A flag enables agent forwarding, allowing the connection to use your local keys through the jump server. While convenient, this approach has limitations for larger teams. Agent forwarding requires trust in intermediate servers, and tracking which keys have access to which systems becomes difficult.
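Because a forwarded agent is usable by anyone who controls the intermediate server, one way to limit exposure is to enable forwarding per host in the client configuration instead of passing -A on every connection. The following is a minimal sketch, assuming the jump host from the example above; print_forwarding_stanza is a hypothetical helper name, not part of OpenSSH.

```shell
# Print an ssh_config stanza that enables agent forwarding for one host
# and disables it for every other host. ssh uses the first value it finds
# for each option, so the specific Host block must come before "Host *".
# print_forwarding_stanza is a hypothetical helper name.
print_forwarding_stanza() {
  local jump_host="$1"
  cat <<EOF
Host ${jump_host}
    ForwardAgent yes

Host *
    ForwardAgent no
EOF
}

# Usage: review the output, then add it to your client config.
# print_forwarding_stanza jump-server.example.com >> ~/.ssh/config
```

With this in place, a plain ssh user@jump-server.example.com forwards the agent, while connections to any other host do not.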
Implementing a Centralized SSH Key Directory
A more structured approach involves maintaining a centralized directory of authorized keys. This works particularly well for teams with their own infrastructure.
# Directory structure for centralized key management
# /opt/ssh-keys/
# ├── users/
# │   ├── alice.pub
# │   ├── bob.pub
# │   └── charlie.pub
# ├── servers/
# │   ├── production/
# │   │   ├── app-server-1.authorized_keys
# │   │   └── app-server-2.authorized_keys
# │   ├── staging/
# │   └── development/
# └── groups/
#     ├── devops.pub
#     └── backend-team.pub
This structure separates keys by user, server environment, and team. Administrators can manage access by adding or removing keys from specific files, and version control of this directory provides an audit trail.
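To make the idea concrete, a per-server authorized_keys file can be assembled from the per-user public keys. This is a minimal sketch, assuming the directory layout above; build_authorized_keys is a hypothetical helper name and the user list is passed by hand rather than read from a groups file.

```shell
# Assemble an authorized_keys file from per-user public keys.
# Arguments: <users_dir> <output_file> <user>...
# build_authorized_keys is a hypothetical helper name.
build_authorized_keys() {
  local users_dir="$1" output="$2"
  shift 2
  local tmp user
  tmp="$(mktemp)" || return 1
  for user in "$@"; do
    if [ ! -f "${users_dir}/${user}.pub" ]; then
      echo "missing key for ${user}" >&2
      rm -f "$tmp"
      return 1
    fi
    # awk 1 copies the file and guarantees a trailing newline,
    # so keys from different files never end up on one line.
    awk 1 "${users_dir}/${user}.pub" >> "$tmp"
  done
  # Replace the target only after every requested key was found.
  mv "$tmp" "$output"
}

# Usage (paths follow the layout above):
# build_authorized_keys /opt/ssh-keys/users \
#   /opt/ssh-keys/servers/production/app-server-1.authorized_keys alice bob
```

Writing to a temporary file first means a missing key aborts the run without leaving a half-written authorized_keys file behind.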
Using Ansible for SSH Key Distribution
Ansible excels at managing SSH keys across multiple servers. This approach combines automation with infrastructure-as-code principles.
# ansible/playbooks/ssh-key-management.yml
---
- name: Manage SSH keys across servers
  hosts: all
  become: yes
  tasks:
    - name: Ensure SSH directory exists
      file:
        path: "/home/{{ ansible_user }}/.ssh"
        state: directory
        owner: "{{ ansible_user }}"
        mode: '0700'
    - name: Deploy authorized_keys file
      copy:
        # src must name a single file (no trailing slash) because dest is a file
        src: "files/authorized_keys/{{ ansible_hostname }}"
        dest: "/home/{{ ansible_user }}/.ssh/authorized_keys"
        owner: "{{ ansible_user }}"
        mode: '0600'
With this playbook, you maintain authorized_keys files in version control, and Ansible distributes them to servers. Adding a new developer involves adding their key to the appropriate file and running the playbook.
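Since the playbook looks up a source entry per hostname, a run fails for any host that has no entry in the repository. A small pre-flight check can catch that before Ansible starts. This is a sketch under assumptions: missing_key_files is a hypothetical helper, and the host names are passed explicitly rather than read from the inventory.

```shell
# List hosts that have no authorized_keys source entry in the repository.
# Arguments: <files_dir> <hostname>...
# Returns non-zero if at least one host is missing.
# missing_key_files is a hypothetical helper; files_dir mirrors the
# files/authorized_keys/ directory referenced by the playbook.
missing_key_files() {
  local files_dir="$1" host status=0
  shift
  for host in "$@"; do
    if [ ! -e "${files_dir}/${host}" ]; then
      echo "$host"
      status=1
    fi
  done
  return "$status"
}

# Usage: only start a dry run when every host has an entry.
# The inventory path is an assumption.
# missing_key_files files/authorized_keys app-server-1 app-server-2 \
#   && ansible-playbook -i inventory playbooks/ssh-key-management.yml --check --diff
```

The --check and --diff flags make Ansible report what would change without touching the servers, which is a reasonable first step after editing key files.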
GitOps-Based SSH Key Management
For teams already using GitOps workflows, storing SSH key configurations alongside infrastructure code makes sense. This approach treats SSH key management as part of your codebase.
# Example structure in your infrastructure repository
# infrastructure/
# ├── ssh/
# │   ├── keys/
# │   │   └── users/
# │   │       └── developer-keys/
# │   │           ├── alice_ed25519.pub
# │   │           └── bob_ed25519.pub
# │   ├── templates/
# │   │   └── authorized_keys.j2
# │   └── scripts/
# │       ├── add_key.sh
# │       ├── remove_key.sh
# │       └── rotate_keys.sh
A key rotation script might look like:
#!/bin/bash
# scripts/rotate_keys.sh
set -euo pipefail
KEY_DIR="keys/users"
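# Fail with a usage message when an argument is missing
TARGET_SERVER="${1:?usage: rotate_keys.sh <target-server> <new-key-name>}"
NEW_KEY_NAME="${2:?usage: rotate_keys.sh <target-server> <new-key-name>}"
# Generate new key pair (keep the private half out of version control)
ssh-keygen -t ed25519 -f "${KEY_DIR}/${NEW_KEY_NAME}" -N "" -C "${NEW_KEY_NAME}@$(hostname)"
# Add new public key to authorized_keys template
cat "${KEY_DIR}/${NEW_KEY_NAME}.pub" >> "templates/authorized_keys.j2"
# Notify team about new key
echo "New key ${NEW_KEY_NAME} generated and added to template"
echo "Run deployment against ${TARGET_SERVER} to apply changes"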
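The repository layout above also lists add_key.sh. Its contents are not shown in this article, so the following is only a sketch of what such a script could do: append a public key unless the same key material is already present, so that repeated runs do not create duplicates. The function name add_key is hypothetical.

```shell
# Append a public key to an authorized_keys file unless the same key
# material is already present. add_key is a hypothetical sketch of what
# scripts/add_key.sh could contain.
# Arguments: <public_key_file> <authorized_keys_file>
add_key() {
  local pubkey_file="$1" target="$2" blob
  # Field 2 of a .pub file is the base64 key material.
  blob="$(awk 'NR == 1 { print $2 }' "$pubkey_file")"
  if [ -z "$blob" ]; then
    echo "no key found in ${pubkey_file}" >&2
    return 1
  fi
  touch "$target"
  if grep -qF -- "$blob" "$target"; then
    echo "key already present"
    return 0
  fi
  # Copy only the first line of the .pub file.
  awk 'NR == 1' "$pubkey_file" >> "$target"
  echo "key added"
}

# Usage:
# add_key keys/users/developer-keys/alice_ed25519.pub templates/authorized_keys.j2
```

Matching on the key material rather than the whole line means a changed comment does not cause the same key to be added twice.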
Short-Lived SSH Certificates
For high-security environments, SSH certificates provide superior access control compared to traditional public keys. Certificates eliminate the need for per-server key distribution and enable time-limited access.
# Generate CA key pair
ssh-keygen -t ed25519 -f ssh_ca -C "team-ca"
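# Sign a user certificate (valid for 24 hours)
# -n lists the login names (principals) the certificate may be used for;
# sshd rejects user certificates without principals under TrustedUserCAKeys
ssh-keygen -s ssh_ca -I "developer-alice" \
    -n "alice" \
    -V "+24h" \
    -z "20240315" \
    id_alice.pub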
# The signed certificate (id_alice-cert.pub) can now authenticate
# without being added to individual server authorized_keys files
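Servers trust the CA key rather than individual user keys. When access must be revoked before a certificate expires, you add the certificate's serial number or key ID to a key revocation list (KRL) that sshd reads through the RevokedKeys directive, rather than removing keys from every server. Short validity periods limit exposure further, because a certificate that is never revoked still stops working on its own. This scales significantly better than traditional key management.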
Configure servers to trust your CA:
# On each server, add to /etc/ssh/sshd_config
TrustedUserCAKeys /etc/ssh/trusted_ca.pub
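Revoking a certificate before it expires is done with a key revocation list. The sketch below shows one way to build one with ssh-keygen, assuming the serial number 20240315 from the signing example; write_krl_spec and build_krl are hypothetical wrapper names, and the file paths are assumptions.

```shell
# Write a KRL specification line that revokes one certificate by serial.
# write_krl_spec and build_krl are hypothetical helper names.
write_krl_spec() {
  printf 'serial: %s\n' "$1"
}

# Build a binary KRL from a specification file. The CA public key is
# needed because a serial only identifies a certificate relative to the
# CA that signed it.
# Arguments: <krl_output> <ca_public_key> <spec_file>
build_krl() {
  ssh-keygen -k -f "$1" -s "$2" "$3"
}

# Usage:
# write_krl_spec 20240315 > revoked.spec
# build_krl /etc/ssh/revoked.krl ssh_ca.pub revoked.spec
# Then reference the KRL in /etc/ssh/sshd_config:
#   RevokedKeys /etc/ssh/revoked.krl
```

The KRL still has to reach every server, but it is a single file to distribute rather than an edit to each authorized_keys file.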
SSH Key Rotation Strategies
Manual key rotation is error-prone and slow at scale. Implement automated rotation that removes stale keys and issues new ones on a regular cadence.
A rotation script might follow this pattern:
#!/bin/bash
# Rotate SSH keys every 90 days
ROTATION_INTERVAL_DAYS=90
KEY_DIR="/opt/ssh-keys/users"
ARCHIVE_DIR="/opt/ssh-keys/archive"
for pubkey in "$KEY_DIR"/*.pub; do
    # Skip the unexpanded pattern when the directory has no .pub files
    [ -e "$pubkey" ] || continue
    # Days since last modification (BSD stat syntax first, GNU stat as fallback)
    modified_days=$(( ($(date +%s) - $(stat -f%m "$pubkey" 2>/dev/null || stat -c%Y "$pubkey")) / 86400 ))
    if [ "$modified_days" -gt "$ROTATION_INTERVAL_DAYS" ]; then
        key_user=$(basename "$pubkey" .pub)
        echo "Rotating key for $key_user"
        # Archive old key
        mkdir -p "$ARCHIVE_DIR/$(date +%Y-%m)"
        mv "$pubkey" "$ARCHIVE_DIR/$(date +%Y-%m)/"
        # Notify user to provide new key
        echo "Key rotation needed for $key_user" | mail -s "SSH Key Rotation Required" "$key_user@company.com"
    fi
done
Schedule this script to run weekly, creating a regular rotation cadence that keeps keys fresh.
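Before scheduling anything that moves files, it can help to see which keys a run would affect. The following is a minimal sketch of such a dry run using find; keys_due_for_rotation is a hypothetical helper name, and the age test is find's own day-based comparison, so results near the cutoff may differ slightly from the script above.

```shell
# List public keys whose modification time is older than N days,
# without moving or changing anything.
# Arguments: <key_dir> <max_age_days>
# keys_due_for_rotation is a hypothetical helper name.
keys_due_for_rotation() {
  find "$1" -maxdepth 1 -type f -name '*.pub' -mtime +"$2" | sort
}

# Usage:
# keys_due_for_rotation /opt/ssh-keys/users 90
```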
Managed SSH Key Solutions
Several commercial and open-source tools provide full-featured SSH key management without building custom infrastructure.
Teleport offers zero-trust access with SSH certificate-based authentication. It integrates with identity providers (Okta, GitHub Enterprise, others) and provides complete session recording. Every SSH connection is logged and can be audited later. For teams needing regulatory compliance, Teleport’s audit trail is invaluable.
Smallstep focuses on certificate-based SSH access with automated rotation and fine-grained access policies. It integrates with existing identity providers and automates much of the certificate lifecycle management.
HashiCorp Vault can manage SSH keys and provide dynamic SSH credentials, useful for teams already using Vault for secrets management. It supports one-time passwords for SSH access, eliminating persistent keys entirely.
For most distributed remote engineering teams, starting with a structured file-based approach using Ansible or similar tools provides a good balance of complexity and capability. As teams grow and security requirements increase, migrating to certificate-based solutions becomes worthwhile.
Monitoring and Auditing SSH Access
Visibility into who accessed what and when is critical. Implement centralized logging for all SSH activity.
# Configure sshd to send logs to syslog
# /etc/ssh/sshd_config
LogLevel VERBOSE
SyslogFacility AUTH
# Log to CloudWatch or ELK stack
# Use rsyslog to forward SSH logs to central server
# /etc/rsyslog.d/30-ssh.conf
:programname, isequal, "sshd" @logs.company.com:514
& stop
Parse these logs to track:
- Failed authentication attempts (potential intrusions)
- New public keys added to systems
- Privilege escalation through sudo or su
- Anomalous access patterns (logins at unusual times or from unusual locations)
For teams using Teleport, access logging is built-in and queryable. For DIY approaches, ELK stack or CloudWatch provide log aggregation and alerting.
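For the DIY route, the first item in the list above (failed authentication attempts) can be approximated with a few lines of awk. This is a sketch, assuming OpenSSH-style log lines such as "Failed password for invalid user admin from 203.0.113.5 port 22 ssh2"; the log file location varies by distribution, and failed_logins_by_ip is a hypothetical helper name.

```shell
# Count failed SSH authentication attempts per source address.
# Assumes OpenSSH-style messages where the address follows the word "from".
# Arguments: <log_file>...
# failed_logins_by_ip is a hypothetical helper name.
failed_logins_by_ip() {
  awk '
    /sshd/ && /Failed (password|publickey)/ {
      for (i = 1; i < NF; i++) {
        if ($i == "from") { count[$(i + 1)]++; break }
      }
    }
    END { for (ip in count) print count[ip], ip }
  ' "$@" | sort -k1,1nr -k2,2
}

# Usage (log path is an assumption; it differs between distributions):
# failed_logins_by_ip /var/log/auth.log
```

Each output line is a count followed by an address, with the most active sources first, which is usually enough to decide whether alerting thresholds are worth setting up.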
Practical Recommendations
Start with these steps regardless of which solution you choose:
- Audit existing keys: Identify all current SSH keys and their access levels. Search authorized_keys files across all servers and document findings in a spreadsheet.
- Establish a key policy: Define requirements for key types (Ed25519 preferred over RSA), key rotation frequency (90 days recommended), and access review cadence (quarterly).
- Implement access groups: Organize access by team and environment rather than individual keys. Group permissions map to infrastructure layers naturally.
- Automate provisioning: Every new developer should receive access through automation, not manual server configuration. No SSH keys added by hand.
- Plan for offboarding: Ensure clear processes for removing access when team members transition. Immediate access revocation prevents data exfiltration risks.
- Enable session recording: At minimum, log all SSH commands. Ideally, record full terminal sessions for audit purposes.
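The audit step at the top of this list can start from a simple inventory of key types and comments. The sketch below prints one tab-separated row per key, which pastes directly into a spreadsheet; audit_authorized_keys is a hypothetical helper name, and it only reports the first word of each key comment.

```shell
# Print "file<TAB>key type<TAB>comment" for every key in the given
# authorized_keys files. Options placed before the key type are skipped.
# Arguments: <authorized_keys_file>...
# audit_authorized_keys is a hypothetical helper name.
audit_authorized_keys() {
  awk '
    /^[ \t]*(#|$)/ { next }   # ignore comments and blank lines
    {
      for (i = 1; i <= NF; i++) {
        if ($i ~ /^(sk-)?(ssh-(rsa|dss|ed25519)|ecdsa-sha2-)/) {
          comment = (i + 2 <= NF) ? $(i + 2) : "(no comment)"
          printf "%s\t%s\t%s\n", FILENAME, $i, comment
          next
        }
      }
    }
  ' "$@"
}

# Usage: list keys that do not meet an Ed25519-only policy.
# audit_authorized_keys /home/*/.ssh/authorized_keys | awk -F'\t' '$2 != "ssh-ed25519"'
```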
Migrating from Password to Key-Based Authentication
If your team currently uses password authentication, plan a careful migration to SSH keys:
- Audit current access: Document all accounts and their access levels (record which accounts still rely on passwords, not the passwords themselves)
- Generate keys for all users: Provide instructions or automate key generation
- Deploy public keys to servers: Use configuration management during a maintenance window
- Test access: Verify users can authenticate with keys before disabling passwords
- Disable password authentication: Set PasswordAuthentication no in sshd_config
- Monitor for issues: Track login failures and support requests for a week post-migration
This phased approach prevents lockouts while improving security. Run both methods in parallel during transition.
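After the final step, it is worth confirming that the setting actually took effect on each server. This is a minimal sketch of such a check; password_auth_disabled is a hypothetical helper name, and it deliberately ignores Match blocks and Include files, so treat a passing result as a first signal rather than proof.

```shell
# Succeed only if the first PasswordAuthentication directive in the given
# sshd_config file is set to "no". sshd uses the first value it reads for
# a keyword, and the default when the keyword is absent is "yes".
# This sketch ignores Match blocks and Include files.
# password_auth_disabled is a hypothetical helper name.
password_auth_disabled() {
  awk '
    tolower($1) == "passwordauthentication" {
      found = 1
      ok = (tolower($2) == "no")
      exit
    }
    END { exit (found && ok) ? 0 : 1 }
  ' "$1"
}

# Usage: check the setting, then let sshd validate the whole file
# before reloading the service.
# password_auth_disabled /etc/ssh/sshd_config && sudo sshd -t
```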
Incident Response for Compromised Keys
When you suspect a key has been compromised:
- Immediately revoke access: Remove the compromised key from all authorized_keys files
- Rotate other keys: Generate new keys for affected users
- Audit access logs: Check what was accessed with the compromised key
- Review server activity: Look for suspicious commands or file access
- Notify affected parties: Alert users about the incident and remediation steps
- Post-mortem: Determine how compromise occurred and prevent recurrence
Having automated revocation mechanisms makes this response much faster. With centralized management like Ansible or Vault, you can revoke keys across all servers in minutes rather than hours.
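For the file-based setups described earlier, the first response step can be sketched as a function that strips a compromised key from an authorized_keys file. The name revoke_key is hypothetical; in practice you would run it against the files in version control and redeploy, rather than editing servers by hand.

```shell
# Remove every line containing the given key material from an
# authorized_keys file.
# Arguments: <compromised_public_key_file> <authorized_keys_file>
# revoke_key is a hypothetical helper name.
revoke_key() {
  local pubkey_file="$1" target="$2" blob tmp
  [ -f "$target" ] || { echo "no such file: ${target}" >&2; return 1; }
  # Field 2 of a .pub file is the base64 key material.
  blob="$(awk 'NR == 1 { print $2 }' "$pubkey_file")"
  [ -n "$blob" ] || { echo "no key found in ${pubkey_file}" >&2; return 1; }
  tmp="$(mktemp)" || return 1
  # grep -v exits 1 when no lines remain, which is not an error here.
  grep -vF -- "$blob" "$target" > "$tmp" || true
  # Overwrite in place so the file keeps its owner and mode.
  cat "$tmp" > "$target"
  rm -f "$tmp"
}

# Usage:
# revoke_key keys/users/developer-keys/alice_ed25519.pub templates/authorized_keys.j2
```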
The right solution depends on your team size, infrastructure maturity, and security requirements. Small teams benefit from simple Ansible-based approaches, while larger organizations should invest in certificate-based systems or managed solutions that provide audit trails and automatic rotation. Regardless of the solution, implement it with clear documentation so every team member understands the process and can respond correctly when incidents occur.