Cloud-Init: A Visual Thinking Guide

Cloud-Init: A Visual Blueprint

“The first 60 seconds of an EC2 instance’s life, choreographed.”


🌅 1. The Big Picture (Mental Metaphor)

   Imagine a new hire on Day 1.

   HR gives them:                  cloud-init gives the VM:
   ─────────────                   ─────────────────────
   • ID badge          ───►        • hostname / SSH keys
   • Laptop setup      ───►        • packages installed
   • Onboarding doc    ───►        • user-data script
   • Email/Slack       ───►        • network + DNS
   • Welcome lunch     ───►        • final "ready" signal

   After lunch → they're productive.
   After cloud-init → the VM is production-ready.

Cloud-init is the universal “onboarding officer” that does most of its work once, when a cloud VM first boots, turning a generic image into your server.


🗺️ 2. Where Cloud-Init Lives (System Map)

┌──────────────────────────────────────────────────────────────┐
│                     CLOUD PROVIDER                           │
│  ┌─────────────┐    ┌───────────────┐    ┌────────────────┐  │
│  │  AMI/Image  │    │  Metadata     │    │  User-Data     │  │
│  │ (Ubuntu 22) │    │  Service      │    │  (your script) │  │
│  └─────────────┘    │  169.254.169  │    └────────────────┘  │
│         │           │    .254       │            │           │
│         │           └───────┬───────┘            │           │
│         ▼                   │                    │           │
│  ┌──────────────────────────▼────────────────────▼─────────┐ │
│  │                      VM BOOTS                           │ │
│  │  ┌────────────────────────────────────────────────────┐ │ │
│  │  │              cloud-init (5 stages)                 │ │ │
│  │  └────────────────────────────────────────────────────┘ │ │
│  └─────────────────────────────────────────────────────────┘ │
└──────────────────────────────────────────────────────────────┘

Key insight: cloud-init = the glue between
the immutable image ↔ the per-VM metadata ↔ your runtime config.


🔄 3. The 5 Boot Stages (Flowchart)

   Power-On
      │
      ▼
┌──────────────────┐
│ ① generator      │  Decide: "Do I run at all?"
│   (systemd)      │  → reads kernel cmdline
└────────┬─────────┘
         ▼
┌──────────────────┐   Network: ❌ not yet
│ ② local          │   Tasks: detect local datasource,
│   cloud-init     │           apply network config
└────────┬─────────┘
         ▼
┌──────────────────┐   Network: ✅ up
│ ③ network        │   Tasks: fetch metadata + user-data, hostname,
│   cloud-init     │           fs resize, mounts, users, SSH keys, files
└────────┬─────────┘
         ▼
┌──────────────────┐   Tasks: NTP, timezone, package sources,
│ ④ config         │           prepare the runcmd script
│   cloud-config   │
└────────┬─────────┘
         ▼
┌──────────────────┐   Tasks: install packages,
│ ⑤ final          │           run user scripts (runcmd),
│   cloud-final    │           final_message, boot-finished marker
└────────┬─────────┘
         ▼
   System ready ✅   (~30-90 s after power-on)

Mnemonic: G-L-N-C-F → “Generator, Local, Network, Config, Final”


🧱 4. Anatomy of Inputs (Layered Framework)

                     WHO TELLS THE VM WHAT TO DO?
                     ────────────────────────────

Layer 4: USER-DATA              ←  You write this
         (cloud-config YAML,        (most customization happens here)
          shell script,
          MIME multi-part)

Layer 3: VENDOR-DATA            ←  Cloud provider injects
         (defaults from AWS,        (e.g. AWS SSM Agent setup)
          Azure, GCP)

Layer 2: METADATA               ←  Read from 169.254.169.254
         (instance-id, hostname,    (immutable per-instance facts)
          AZ, IAM role, tags)

Layer 1: DATASOURCE             ←  Auto-detected
         (Ec2, Azure, GCE,          (decides where to find layers 2-4)
          NoCloud, OpenStack)

Layer 0: /etc/cloud/cloud.cfg   ←  Image author baked this in
         (which modules run,        (the "rulebook")
          in what order)

Causal chain:

cloud.cfg  →  picks datasource  →  pulls metadata + user-data
                                          │
                                          ▼
                                    modules execute
                                    in defined order
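
To make Layer 2 concrete, here is a minimal sketch of how a client could talk to the metadata service at 169.254.169.254. It assumes AWS's token-based IMDSv2 flow (a PUT for a session token, then GETs that carry it); the helper names `token_request`, `data_request` and `fetch` are illustrative and are not part of cloud-init.

```python
import urllib.request

IMDS = "http://169.254.169.254"


def token_request(ttl_seconds: int = 60) -> urllib.request.Request:
    """Build the IMDSv2 session-token request (an HTTP PUT)."""
    return urllib.request.Request(
        f"{IMDS}/latest/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": str(ttl_seconds)},
    )


def data_request(path: str, token: str) -> urllib.request.Request:
    """Build a GET for a metadata or user-data path, carrying the token."""
    return urllib.request.Request(
        f"{IMDS}/latest/{path.lstrip('/')}",
        headers={"X-aws-ec2-metadata-token": token},
    )


def fetch(path: str, timeout: float = 2.0) -> str:
    """Fetch one path from the metadata service.

    Only works from inside an EC2 instance; anywhere else it times out.
    """
    with urllib.request.urlopen(token_request(), timeout=timeout) as resp:
        token = resp.read().decode()
    with urllib.request.urlopen(data_request(path, token), timeout=timeout) as resp:
        return resp.read().decode()


# On an instance: fetch("meta-data/instance-id") or fetch("user-data")
```

Splitting request construction from the network call keeps the sketch easy to inspect off-instance: the two builder functions can be examined without any connectivity.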

📝 5. The Star of the Show: #cloud-config (Cheat Sheet)

#cloud-config
# ↑ THIS MAGIC HEADER is mandatory and must be the first line
# ─────────────────────────────────────────────────
hostname: web-01           # Identity
fqdn: web-01.example.com

users:                     # Who can log in
  - name: deploy
    sudo: ALL=(ALL) NOPASSWD:ALL
    ssh_authorized_keys:
      - ssh-ed25519 AAAA...

package_update: true       # What software
package_upgrade: true
packages:
  - nginx
  - jq

write_files:               # What config
  - path: /etc/nginx/conf.d/app.conf
    content: |
      server { listen 80; ... }

runcmd:                    # What to execute
  - systemctl enable --now nginx
  - curl https://example.com/register

final_message: "Booted in $UPTIME s"

Visual pattern: Identity → Users → Packages → Files → Commands → Signal done
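
When the cheat sheet is generated by a tool rather than typed by hand, indentation mistakes are easy to avoid by emitting JSON under the header: JSON is valid YAML for plain data like this. The sketch below assumes that approach; `render_cloud_config` is a hypothetical helper name, and the keys are a subset of the ones shown above.

```python
import json


def render_cloud_config(config: dict) -> str:
    """Serialize a dict as cloud-config user-data.

    The body is JSON, which a YAML parser accepts for plain
    mappings, lists, strings, numbers and booleans.
    """
    return "#cloud-config\n" + json.dumps(config, indent=2) + "\n"


config = {
    "hostname": "web-01",
    "packages": ["nginx", "jq"],
    "runcmd": ["systemctl enable --now nginx"],
}
user_data = render_cloud_config(config)
print(user_data)
```

The header stays on the very first line, and everything after it round-trips through a JSON parser, so a malformed structure fails on your workstation instead of on the VM.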


⚖️ 6. user-data Formats (Comparison)

┌───────────────────┬─────────────────────┬──────────────────────┐
│  Format           │  Header             │  When to use         │
├───────────────────┼─────────────────────┼──────────────────────┤
│  cloud-config     │  #cloud-config      │  Declarative,        │
│  (YAML)           │                     │  idempotent ✅       │
├───────────────────┼─────────────────────┼──────────────────────┤
│  Shell script     │  #!/bin/bash        │  Quick & dirty,      │
│                   │                     │  imperative          │
├───────────────────┼─────────────────────┼──────────────────────┤
│  MIME multipart   │  Content-Type: ...  │  Combine both        │
├───────────────────┼─────────────────────┼──────────────────────┤
│  Jinja template   │  ## template: jinja │  Render with         │
│                   │  #cloud-config      │  metadata vars       │
├───────────────────┼─────────────────────┼──────────────────────┤
│  Gzip / Base64    │  (binary)           │  Fit under EC2's     │
│                   │                     │  16 KB limit (gzip)  │
└───────────────────┴─────────────────────┴──────────────────────┘

Rule of thumb:

Declarative (cloud-config) for config, imperative (script) for one-shot actions.
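
The MIME multipart and Gzip rows can be sketched with Python's standard library. This is one possible way to package user-data, not the only one: `build_multipart` and `part_types` are hypothetical helper names, the file names are placeholders, and the 16 KB figure is the EC2 limit from the table.

```python
import gzip
from email import message_from_bytes
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

EC2_USER_DATA_LIMIT = 16 * 1024  # bytes of raw user-data


def build_multipart(parts: list[tuple[str, str, str]]) -> bytes:
    """Combine (content, subtype, filename) parts into one MIME archive."""
    archive = MIMEMultipart()
    for content, subtype, filename in parts:
        part = MIMEText(content, subtype)  # e.g. text/cloud-config
        part.add_header("Content-Disposition", "attachment", filename=filename)
        archive.attach(part)
    return archive.as_bytes()


def part_types(archive: bytes) -> list[str]:
    """List each part's content type, to sanity-check an archive."""
    return [p.get_content_type() for p in message_from_bytes(archive).get_payload()]


cloud_config = "#cloud-config\npackages:\n  - nginx\n"
script = "#!/bin/bash\necho hello > /tmp/hello.txt\n"

raw = build_multipart([
    (cloud_config, "cloud-config", "config.yaml"),
    (script, "x-shellscript", "bootstrap.sh"),
])
packed = gzip.compress(raw)  # cloud-init decompresses gzipped user-data itself

print(part_types(raw))
print(len(raw), len(packed), len(packed) <= EC2_USER_DATA_LIMIT)
```

Gzip only pays off on larger payloads; for a tiny archive like this one the compressed form can even be slightly bigger, which is why the sketch prints both sizes instead of assuming a saving.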


🔁 7. Idempotency & Re-Runs (Mental Model)

First boot:                  Subsequent boots:
─────────────                ──────────────────
   ▲                            ▲
   │ all modules                │ only "always" modules
   │ execute                    │ (e.g. bootcmd, fs resize)
   │                            │ per-instance modules SKIP
   │ /var/lib/cloud/            │  via stamp files in
   │   instance/sem/            │   /var/lib/cloud/instance/sem/
   │   config_*   ◄─────────────┘
                  ┌──────────────────┐
   debug rerun ───► cloud-init clean ├───► next reboot
                  │ (wipes stamps)   │     re-runs everything
                  └──────────────────┘

Mental hook: Stamp file present? Skip. Missing? Run.
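
The mental hook can be expressed as a toy model. This is not cloud-init's implementation, only a sketch of the stamp-file idea; `run_once` and `clean` are hypothetical names, and a temporary directory stands in for the real stamp directory.

```python
import tempfile
from pathlib import Path


def run_once(sem_dir: Path, name: str, action) -> bool:
    """Run action() unless a stamp for `name` exists; return True if it ran."""
    stamp = sem_dir / f"config_{name}"
    if stamp.exists():
        return False      # stamp present: skip
    action()
    stamp.touch()         # record that the module ran, so later boots skip it
    return True


def clean(sem_dir: Path) -> None:
    """Remove all stamps, loosely mimicking the effect of `cloud-init clean`."""
    for stamp in sem_dir.glob("config_*"):
        stamp.unlink()


log = []
with tempfile.TemporaryDirectory() as tmp:
    sem = Path(tmp)
    first = run_once(sem, "runcmd", lambda: log.append("ran"))   # first boot
    second = run_once(sem, "runcmd", lambda: log.append("ran"))  # reboot
    clean(sem)                                                   # debug rerun
    third = run_once(sem, "runcmd", lambda: log.append("ran"))   # next reboot

print(first, second, third, log)  # True False True ['ran', 'ran']
```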


🔍 8. Forensics: Where Things Live (Map)

/etc/cloud/
├── cloud.cfg               ← master config (modules + order)
├── cloud.cfg.d/            ← drop-ins (vendor + user overrides)
└── templates/              ← hostname, hosts, etc.

/var/lib/cloud/
├── instance/               ← symlink → current instance
│   ├── user-data.txt       ← raw user-data as received
│   ├── cloud-config.txt    ← rendered cloud-config
│   ├── scripts/            ← runcmd & user-data scripts
│   └── sem/                ← "I already ran" stamps
└── instances/<iid>/        ← history per instance

/var/log/
├── cloud-init.log          ← every module call (verbose)
└── cloud-init-output.log   ← stdout/stderr of scripts

Debug recipe:

cloud-init status --long       # current state
cloud-init query --all         # all known vars (metadata, ds, etc.)
sudo cloud-init schema --system   # validate config
sudo tail -f /var/log/cloud-init-output.log
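
For scripted checks, the same information is also available in machine-readable form. The sketch below assumes the result file that cloud-init writes to /run/cloud-init/result.json when it finishes (a location not shown in the map above) and reads its `v1.errors` list; `boot_errors` is a hypothetical helper name and the sample document is made up for illustration.

```python
import json


def boot_errors(result_json: str) -> list[str]:
    """Return the error list recorded in a cloud-init result.json document."""
    return json.loads(result_json).get("v1", {}).get("errors", [])


# Illustrative sample; on a real VM, read the text from
# /run/cloud-init/result.json instead.
sample = '{"v1": {"datasource": "DataSourceEc2Local", "errors": []}}'
print(boot_errors(sample))  # []
```

An empty list means no module reported a failure; anything else is a cue to open /var/log/cloud-init.log.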

⏱️ 9. Timeline: A Real Boot

0 s   ┃ Hypervisor starts VM
2 s   ┃ Kernel loaded
4 s   ┃ systemd starts the cloud-init units
5 s   ┃ ② local stage   - datasource found, network config applied
8 s   ┃ DHCP gets IP
10 s  ┃ ③ network stage - fetch user-data from
      ┃                   http://169.254.169.254/latest/user-data,
      ┃                   set hostname
15 s  ┃ ④ config stage  - NTP, package sources
16 s  ┃ ⑤ final stage   - apt update, install nginx, runcmd
45 s  ┃                   final_message written
46 s  ┃ cloud-init reports "done" ✅
        Note: EC2 shows "running" and Terraform continues well before
        this point; neither waits for cloud-init to finish.
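
Tools that only watch the instance state cannot tell when the final stage has ended. The built-in answer is `cloud-init status --wait`; the sketch below shows the same idea by polling the marker file /var/lib/cloud/instance/boot-finished, which cloud-init creates at the end of the final stage. `wait_for_cloud_init` is a hypothetical helper, and the demo uses a temporary file in place of the real marker.

```python
import tempfile
import time
from pathlib import Path

BOOT_FINISHED = Path("/var/lib/cloud/instance/boot-finished")


def wait_for_cloud_init(marker: Path = BOOT_FINISHED,
                        timeout: float = 300.0,
                        interval: float = 2.0) -> bool:
    """Poll until the marker file exists; return False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        if marker.exists():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Demo with a stand-in marker file instead of the real path.
with tempfile.TemporaryDirectory() as tmp:
    marker = Path(tmp) / "boot-finished"
    missing = wait_for_cloud_init(marker, timeout=0.1, interval=0.02)
    marker.touch()
    present = wait_for_cloud_init(marker, timeout=0.1, interval=0.02)

print(missing, present)  # False True
```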

🧠 10. Cause-and-Effect: Why Things Break

Symptom                              Root cause
─────────────────────────────────    ─────────────────────────────
Hostname stays "ip-10-…"      ◄───   user-data missing #cloud-config
                                     header → ignored, not parsed as config

SSH keys not added            ◄───   no key pair chosen at launch, or the
                                     metadata service is disabled or
                                     unreachable (it is link-local: no IAM
                                     role or VPC endpoint is involved)

"runcmd" never executes       ◄───   YAML indentation error
                                     (silent skip: check schema!)

Re-run does nothing           ◄───   Stamp files exist;
                                     need: cloud-init clean --logs

Boot takes 5 min              ◄───   `package_upgrade: true` over
                                     slow NAT → many MB to download
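
The first symptom in the table can be caught before launch with a pre-flight check on the first line of the user-data. The function below is a deliberately simplified model covering only the headers from section 6, not cloud-init's full detection logic; `classify_user_data` is a hypothetical name.

```python
import re


def classify_user_data(text: str) -> str:
    """Guess how user-data would be treated, from its first line only."""
    first_line = text.split("\n", 1)[0].rstrip()
    if re.match(r"##\s*template:\s*jinja", first_line):
        return "jinja template"
    if first_line == "#cloud-config":
        return "cloud-config"
    if first_line.startswith("#!"):
        return "shell script"
    if first_line.lower().startswith(("content-type: multipart", "mime-version:")):
        return "MIME multipart"
    return "unrecognized"


print(classify_user_data("#cloud-config\nhostname: web-01\n"))  # cloud-config
print(classify_user_data("hostname: web-01\n"))                 # unrecognized
```

Anything that comes back as "unrecognized" is a candidate for the "hostname stays ip-10-…" symptom, since the YAML below a missing header is never applied.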

🧬 11. Cloud-Init in Your Terraform Context

[Terraform admin-host module]
        │
        ▼
   creates aws_instance
        │
        │  user_data = <<-EOF
        │    #cloud-config
        │    hostname: adminhost
        │    write_files: …
        │    runcmd: [bootstrap-ansible.sh]
        │  EOF
        ▼
   EC2 boots from AMI:
        │
        ▼
   cloud-init takes over:
   ① reads metadata (IAM role, AZ, tags)
   ② applies hostname, SSH keys
   ③ runs bootstrap (joins consul, registers DNS)
        │
        ▼
   Instance is "an adminhost" ✅

Insight: That’s why baking a minimal AMI and delegating to cloud-init
is more flexible than baking everything into the image: the same image
can take on different roles across environments.


🎯 12. Mental Snapshot (One Picture to Remember Everything)

        ┌─────────────────────────────────────────┐
        │            CLOUD-INIT IN ONE FRAME      │
        ├─────────────────────────────────────────┤
        │                                         │
        │   Generic Image  +  Per-VM Recipe       │
        │        │                │               │
        │        └──────┬─────────┘               │
        │               ▼                         │
        │      ┌────────────────┐                 │
        │      │  cloud-init    │ ← runs ONCE     │
        │      │  G→L→N→C→F     │   (stamps it)   │
        │      └───────┬────────┘                 │
        │              ▼                          │
        │       Unique, ready VM                  │
        │                                         │
        │   Inputs:  metadata + user-data         │
        │   Where:   /etc/cloud + /var/lib/cloud  │
        │   Logs:    /var/log/cloud-init*.log     │
        │   Debug:   `cloud-init status --long`   │
        │                                         │
        └─────────────────────────────────────────┘

One-line mantra:

“Image is the body, cloud-init is the soul that arrives on first breath.”