The Ansible Playbook I Run on Every New Server
DEV Community


I've written before about the checklist I run through on every new server: a non-root user, key-only SSH, a default-deny firewall, Fail2ban, unattended upgrades. Doing that by hand takes under an hour and isn't hard. But "under an hour, by hand, for every server" adds up fast once you're managing more than two or three of them, and manual steps are exactly where small inconsistencies creep in: one server gets MaxAuthTries 3 and another doesn't, because whoever set it up that day was in a hurry.

The fix is turning the checklist into a playbook. Same steps, same order, every time, and idempotent enough that running it again on a server that's already configured changes nothing.

The playbook

    ---
    - name: Baseline hardening for a new server
      hosts: new_servers
      become: true

      vars:
        admin_user: deploy
        ssh_public_key: "{{ lookup('file', '~/.ssh/id_ed25519.pub') }}"

      tasks:
        - name: Create admin user
          user:
            name: "{{ admin_user }}"
            groups: sudo
            shell: /bin/bash
            create_home: true

        - name: Add SSH key for admin user
          authorized_key:
            user: "{{ admin_user }}"
            key: "{{ ssh_public_key }}"

        - name: Harden SSH config (overrides cloud-init defaults)
          copy:
            dest: /etc/ssh/sshd_config.d/99-hardening.conf
            content: |
              PermitRootLogin no
              PasswordAuthentication no
              MaxAuthTries 3
            mode: '0644'
          notify: restart sshd

        - name: Install UFW and Fail2ban
          apt:
            name: [ufw, fail2ban]
            state: present
            update_cache: true

        - name: Configure UFW defaults
          ufw:
            direction: "{{ item.direction }}"
            policy: "{{ item.policy }}"
          loop:
            - { direction: incoming, policy: deny }
            - { direction: outgoing, policy: allow }

        - name: Allow required ports
          ufw:
            rule: allow
            port: "{{ item }}"
            proto: tcp
          loop: ['22', '80', '443']

        - name: Enable UFW
          ufw:
            state: enabled

        - name: Enable Fail2ban
          systemd:
            name: fail2ban
            enabled: true
            state: started

        - name: Install unattended-upgrades
          apt:
            name: unattended-upgrades
            state: present

      handlers:
        - name: restart sshd
          systemd:
            name: ssh
            state: restarted

Running it

A fresh server gets added to the inventory and the playbook runs against just that host:

    ansible-playbook -i inventory.ini baseline.yml -l new-server-01 -u root

The first run must connect as root, since it's the only user available on a fresh server. Once the playbook completes, root login is disabled and deploy is your entry point for every run after:

    ansible-playbook -i inventory.ini baseline.yml -l new-server-01 -u deploy

The first run does all the work: it creates the user, locks down SSH, and sets up the firewall. The handler only restarts sshd if the config actually changed, so the very first run is the only one where that happens; every run after is a no-op confirmation that nothing has drifted.

Why idempotency is the actual point

The real value here isn't the time saved on day one; typing the commands manually isn't slow. It's that six months later, when I'm not sure whether a particular server got the full treatment or was set up in a rush during an incident, I can run the playbook again and find out. If everything's already in place, Ansible reports zero changes and I move on. If something's missing, it gets fixed on the spot, with no need to remember which of the four or five manual steps was skipped.

This playbook is intentionally small. It doesn't install application stacks or configure anything project-specific; it's the floor every server stands on before anything else gets layered on top. Keeping it separate from application playbooks means it stays stable, and a baseline that doesn't change often is one you can trust without re-reading it every time.

Originally published at irfanmiral.com

Need help with your infrastructure? See my services or get in touch.
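
A caveat on the "Harden SSH config" task in the playbook: sshd keeps the first value it obtains for each keyword, and on Debian and Ubuntu the drop-ins under /etc/ssh/sshd_config.d/ are typically included in lexical order. If an image ships a 50-cloud-init.conf that sets PasswordAuthentication, a file named 99-hardening.conf would be read after it and lose. The fragment below is a hedged variant of that task, not the original: the 00- file name and the validate line are assumptions added here, and validate assumes sshd lives at /usr/sbin/sshd.

```yaml
# Sketch of the SSH hardening task with two assumed changes:
#   1. a 00- prefix so the file sorts before any cloud-init drop-in
#   2. validate, so a config sshd rejects is never put in place
- name: Harden SSH config (read before other drop-ins)
  copy:
    dest: /etc/ssh/sshd_config.d/00-hardening.conf  # hypothetical name
    content: |
      PermitRootLogin no
      PasswordAuthentication no
      MaxAuthTries 3
    owner: root
    group: root
    mode: '0644'
    # %s is replaced with the temporary file before it is moved to dest
    validate: /usr/sbin/sshd -t -f %s
  notify: restart sshd
```

After a run, `sshd -T | grep -i passwordauthentication` on the server (as root) prints the value sshd will actually use, which is a quick way to confirm which drop-in won.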

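The "Install unattended-upgrades" task in the playbook only installs the package. Depending on the distribution and image, that may or may not leave the periodic runs switched on, so one option is to write the APT periodic settings explicitly and make the state visible in the playbook. A minimal sketch, assuming the standard Debian/Ubuntu path /etc/apt/apt.conf.d/20auto-upgrades; this task is an addition for illustration, not part of the original playbook.

```yaml
# Assumed Debian/Ubuntu layout; not in the original playbook.
- name: Turn on periodic unattended upgrades
  copy:
    dest: /etc/apt/apt.conf.d/20auto-upgrades
    content: |
      APT::Periodic::Update-Package-Lists "1";
      APT::Periodic::Unattended-Upgrade "1";
    owner: root
    group: root
    mode: '0644'
```

Because copy compares the file's content with what's already on disk, a second run reports no change, which keeps the baseline idempotent.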
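
One gap to watch for when switching to `-u deploy`: the play sets `become: true`, but the user task creates deploy without a password, so sudo has nothing to accept unless the image or another playbook already grants passwordless sudo. A minimal sketch of one way to close that gap, assuming passwordless sudo for the admin user is acceptable in your environment; the task and the 90- file name are hypothetical additions, not part of the original playbook.

```yaml
# Assumption: passwordless sudo for the admin user is acceptable.
# Not in the original playbook; place it after the "Create admin user" task.
- name: Allow admin user to use sudo without a password
  copy:
    # sudo skips files in sudoers.d whose names contain a dot or end in ~
    dest: "/etc/sudoers.d/90-{{ admin_user }}"
    content: "{{ admin_user }} ALL=(ALL) NOPASSWD:ALL\n"
    owner: root
    group: root
    mode: '0440'
    # visudo checks the temporary file, so a typo can't break sudo
    validate: /usr/sbin/visudo -cf %s
```

If you'd rather keep a sudo password, the alternative is to set one for deploy and pass `-K` (`--ask-become-pass`) to ansible-playbook on later runs.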
