Terraform vs Ansible: Infrastructure as Code Compared 2026
Comparing Terraform vs Ansible for infrastructure as code in 2026. Idempotency, state management, provisioning vs configuration management, and enterprise workflows.
The Terraform vs Ansible comparison clarifies the fundamental distinction between two categories of infrastructure automation that are complementary rather than competitive: infrastructure provisioning versus configuration management. Terraform (by HashiCorp) excels at declaring and managing the existence of infrastructure resources — creating, modifying, and destroying cloud resources like EC2 instances, VPCs, RDS databases, and Kubernetes clusters with deterministic, idempotent state management. Ansible (by Red Hat) excels at configuring what runs inside those infrastructure resources — installing software, deploying applications, managing file contents, and orchestrating multi-server workflows via agentless SSH-based automation.
Understanding this distinction is essential before evaluating either tool in isolation.
Core Philosophy Difference
| Dimension | Terraform | Ansible |
|---|---|---|
| Primary Purpose | Infrastructure provisioning (what exists) | Configuration management (what runs) |
| Execution Model | Declarative (desired state) | Procedural + Declarative (ordered tasks) |
| State Management | Yes — maintains state file (tfstate) | No state file (mostly stateless) |
| Language | HCL (HashiCorp Configuration Language) | YAML Playbooks |
| Agent Required | No | No (agentless — uses SSH/WinRM) |
| Idempotency | Strict (plan then apply) | Generally (depends on module design) |
| Cloud Coverage | 3,000+ providers | 750+ modules |
| Windows Support | Yes (resource creation) | Yes (WinRM) |
| Secret Management | HashiCorp Vault integration | Ansible Vault |
Code Examples: Infrastructure Provisioning
Terraform: Provision AWS Infrastructure
# main.tf: core AWS environment provisioning
# (input variables, the DB subnet group, and the RDS security group are referenced here but assumed to be defined in other files)
terraform {
  required_version = ">= 1.6.0"
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
  backend "s3" {
    bucket         = "my-terraform-state"
    key            = "prod/network/terraform.tfstate"
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "terraform-state-lock"
  }
}
# VPC with public/private subnet architecture
resource "aws_vpc" "main" {
  cidr_block           = var.vpc_cidr
  enable_dns_hostnames = true
  tags = {
    Name        = "${var.environment}-vpc"
    Environment = var.environment
    ManagedBy   = "Terraform"
  }
}
# One private subnet per availability zone
resource "aws_subnet" "private" {
  count             = length(var.availability_zones)
  vpc_id            = aws_vpc.main.id
  cidr_block        = cidrsubnet(var.vpc_cidr, 8, count.index)
  availability_zone = var.availability_zones[count.index]
  tags              = { Name = "${var.environment}-private-${count.index}" }
}
# RDS PostgreSQL (Multi-AZ for production)
resource "aws_db_instance" "postgres" {
  identifier              = "${var.environment}-postgres"
  engine                  = "postgres"
  engine_version          = "16.3"
  instance_class          = var.db_instance_class
  allocated_storage       = 100
  storage_encrypted       = true
  db_name                 = var.db_name
  username                = var.db_username
  password                = var.db_password # Use aws_secretsmanager_secret in production!
  multi_az                = var.environment == "production"
  db_subnet_group_name    = aws_db_subnet_group.main.name
  vpc_security_group_ids  = [aws_security_group.rds.id]
  backup_retention_period = 30
  deletion_protection     = var.environment == "production"
  skip_final_snapshot     = var.environment != "production"
  tags                    = { Environment = var.environment }
}
# Outputs
output "rds_endpoint" {
  value     = aws_db_instance.postgres.endpoint
  sensitive = false
}
Terraform's power: This declares the desired infrastructure state. Run terraform plan to preview all changes before applying. Run terraform apply to provision. Run terraform destroy to cleanly tear down everything.
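The `cidrsubnet(var.vpc_cidr, 8, count.index)` call in the subnet resource above is easy to misread. As a rough illustration, the Python sketch below reproduces the arithmetic for the common case. It approximates the built-in's behavior rather than mirroring Terraform's implementation, and the `10.0.0.0/16` range is an assumed example value, since the source does not define `var.vpc_cidr`.

```python
import ipaddress


def cidrsubnet(prefix: str, newbits: int, netnum: int) -> str:
    """Approximate Terraform's cidrsubnet(): carve subnet number `netnum`
    out of `prefix` after extending the prefix length by `newbits`."""
    network = ipaddress.ip_network(prefix)
    new_prefixlen = network.prefixlen + newbits
    if new_prefixlen > network.max_prefixlen:
        raise ValueError("not enough address bits left for newbits")
    if not 0 <= netnum < 2 ** newbits:
        raise ValueError("netnum does not fit in newbits")
    # Shift the subnet number into the bits between the old and new prefix.
    offset = netnum << (network.max_prefixlen - new_prefixlen)
    subnet_address = int(network.network_address) + offset
    # Build a network of the same IP version as the input.
    return str(type(network)((subnet_address, new_prefixlen)))


if __name__ == "__main__":
    vpc_cidr = "10.0.0.0/16"  # assumed example value for var.vpc_cidr
    for index in range(3):    # stands in for count.index
        print(index, cidrsubnet(vpc_cidr, 8, index))
```

With a /16 VPC and 8 new bits, each availability zone index maps to its own /24, which is why the resource can use `count` without hard-coding subnet ranges.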
Ansible: Configure Servers Post-Provisioning
# playbook.yml: configure application on provisioned servers
---
- name: Configure Application Server
  hosts: app_servers
  become: yes # Run as sudo
  vars:
    app_user: deploy
    app_port: 3000
    node_version: "20.x"
  tasks:
    - name: Update apt package cache
      apt:
        update_cache: yes
        cache_valid_time: 3600
    - name: Install Node.js 20
      shell: |
        curl -fsSL https://deb.nodesource.com/setup_{{ node_version }} | bash -
        apt-get install -y nodejs
      args:
        creates: /usr/bin/node # Skip if node already installed (idempotency)
    - name: Create application user
      user:
        name: "{{ app_user }}"
        shell: /bin/bash
        create_home: yes
        state: present
    - name: Clone application repository
      git:
        repo: "https://github.com/company/app.git"
        dest: "/home/{{ app_user }}/app"
        version: "{{ git_branch | default('main') }}"
      become_user: "{{ app_user }}"
    - name: Install npm dependencies
      npm:
        path: "/home/{{ app_user }}/app"
        state: present
      become_user: "{{ app_user }}"
    - name: Deploy systemd service file
      template:
        src: templates/app.service.j2
        dest: /etc/systemd/system/app.service
        mode: '0644'
      notify: Restart app service
    - name: Ensure app service is running
      systemd:
        name: app
        enabled: yes
        state: started
        daemon_reload: yes
  handlers:
    - name: Restart app service
      systemd:
        name: app
        state: restarted
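Ansible's power: it connects over SSH to every host in the app_servers inventory group and executes the tasks in order. Most built-in modules are idempotent on their own (the user task does nothing if the deploy user already exists), while the shell-based Node.js install relies on its `creates:` guard to skip the step when /usr/bin/node is already present.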
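The check-before-change pattern behind idempotent tasks can be sketched in a few lines of Python. This is a simplified illustration, not Ansible's module code; `ensure_line` and the file name are hypothetical. The function inspects the current state, changes it only when needed, and reports whether it changed anything, much as a task reports "changed" or "ok".

```python
import tempfile
from pathlib import Path


def ensure_line(path: Path, line: str) -> bool:
    """Make sure `line` is present in the file; return True only if the file changed."""
    existing = path.read_text().splitlines() if path.exists() else []
    if line in existing:
        return False  # already in the desired state, so do nothing
    existing.append(line)
    path.write_text("\n".join(existing) + "\n")
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        config = Path(tmp) / "app.conf"          # hypothetical file
        print(ensure_line(config, "PORT=3000"))  # first run modifies the file
        print(ensure_line(config, "PORT=3000"))  # second run is a no-op
```

Running the same call twice leaves the file identical after the second run, which is the property a raw shell command lacks unless it is guarded.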
The Optimal Combined Workflow
┌─────────────────────────────────────────────────────────┐
│ TYPICAL DEVOPS WORKFLOW │
├─────────────────────────────────────────────────────────┤
│ Phase 1: PROVISION with Terraform │
│ → Create VPC, subnets, security groups │
│ → Create EC2 instances / EKS cluster / RDS │
│ → Create load balancers, IAM roles, S3 buckets │
│ → Output: IP addresses, endpoints, resource IDs │
├─────────────────────────────────────────────────────────┤
│ Phase 2: CONFIGURE with Ansible │
│ → Install software on new EC2 instances │
│ → Deploy application code │
│ → Configure nginx, systemd services │
│ → Bootstrap monitoring agents (Datadog, Prometheus) │
├─────────────────────────────────────────────────────────┤
│ Phase 3: DEPLOY with CI/CD Pipeline │
│ → GitHub Actions triggers Ansible on new release │
│ → Zero-downtime rolling deployment across all servers │
└─────────────────────────────────────────────────────────┘
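The hand-off from Phase 1 to Phase 2 usually means turning Terraform outputs into an Ansible inventory. One possible glue step is sketched below, assuming a Terraform output named `app_server_ips` that holds a list of addresses. That output is hypothetical; the Terraform example in this article does not define it. The sketch relies only on the documented shape of `terraform output -json`, where each output is wrapped in an object with a `value` key.

```python
import json


def inventory_from_outputs(outputs_json: str, output_name: str, group: str) -> str:
    """Render an INI-style Ansible inventory group from `terraform output -json` text."""
    outputs = json.loads(outputs_json)
    hosts = outputs[output_name]["value"]  # each output is wrapped as {"value": ...}
    lines = [f"[{group}]"] + [str(host) for host in hosts]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Sample shaped like `terraform output -json`; the output name and IPs are made up.
    sample = json.dumps({
        "app_server_ips": {
            "sensitive": False,
            "type": ["list", "string"],
            "value": ["10.0.1.10", "10.0.2.10"],
        }
    })
    print(inventory_from_outputs(sample, "app_server_ips", "app_servers"), end="")
```

In a pipeline, the input could come from `terraform output -json` and the rendered file could be passed to `ansible-playbook -i`. A dynamic inventory plugin is a common alternative when hosts change frequently.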
Common Use Cases
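- 1. Multi-Cloud Resource Provisioning (Terraform): Manage staging environments in AWS and DR environments in GCP with the same Terraform workflow, language, and state model. Resource definitions remain provider-specific (an aws_instance is not interchangeable with a google_compute_instance), so each cloud needs its own resource blocks or modules rather than just a different provider block.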
- 2. Application Deployment Pipelines (Ansible): GitLab CI/CD triggers Ansible playbooks on every merge to main, deploying the new application version with zero-downtime rolling restarts across 20 servers.
- 3. Patch Management (Ansible): A monthly security patching playbook runs `apt upgrade -y` across all servers in parallel, with custom pre/post check tasks to verify application health after patching.
- 4. Kubernetes Cluster Management (Terraform + Helm): Terraform creates the EKS cluster; Helm (called from Terraform or Ansible) deploys application workloads via Kubernetes manifests.
- 5. Database Schema Migrations (Ansible): Ansible orchestrates database migration scripts, running them in the correct order across multiple database replicas with pre-migration backups and rollback handling.
- 6. Compliance Hardening (Ansible): CIS Benchmark hardening playbooks run on every new server, enforcing SSH configuration, file permissions, audit log settings, and sysctl kernel parameters.
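The rolling deployment in use case 2 comes down to updating hosts in small batches and stopping as soon as a batch looks unhealthy. Ansible expresses the batch size with the play-level `serial` keyword; the Python sketch below only models the control flow. The host names and the `deploy` and `healthy` callables are hypothetical placeholders.

```python
from typing import Callable, Iterator, List


def batches(hosts: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive groups of at most `size` hosts."""
    if size < 1:
        raise ValueError("batch size must be at least 1")
    for start in range(0, len(hosts), size):
        yield hosts[start:start + size]


def rolling_deploy(
    hosts: List[str],
    size: int,
    deploy: Callable[[str], None],
    healthy: Callable[[str], bool],
) -> List[str]:
    """Deploy batch by batch and return the hosts in batches that passed the health check."""
    updated: List[str] = []
    for batch in batches(hosts, size):
        for host in batch:
            deploy(host)
        if not all(healthy(host) for host in batch):
            break  # stop the rollout; remaining hosts keep the old version
        updated.extend(batch)
    return updated


if __name__ == "__main__":
    fleet = [f"web-{n:02d}" for n in range(1, 21)]  # hypothetical host names
    done = rolling_deploy(fleet, 5, deploy=lambda h: None, healthy=lambda h: True)
    print(f"{len(done)} of {len(fleet)} hosts updated")
```

Because only one batch is out of service at a time, the remaining servers keep handling traffic, and a bad release is contained to the batch where the health check first fails.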
Tips and Best Practices
- Store Terraform State Remotely (Not Locally): Never use local tfstate files for team environments. Use S3 + DynamoDB (AWS), GCS (GCP), or Terraform Cloud for shared state with locking. Local state files cause catastrophic conflicts in team environments where multiple engineers run Terraform simultaneously.
- Use Terraform Modules for Reusability: Extract common infrastructure patterns (VPC+subnet combos, ECS service definitions) into Terraform modules. Reference them with different variable inputs for dev/staging/production environments.
- Write Idempotent Ansible Tasks: Always use Ansible's built-in modules (`apt`, `user`, `git`, `template`, `systemd`) rather than raw `shell` commands where possible. Built-in modules handle idempotency automatically; shell commands require manual `creates:` or `when:` guards.
- Use Ansible Vault for Secrets: Never store passwords, API keys, or tokens in plaintext Ansible variable files. Use `ansible-vault encrypt_string 'my_password'` to store encrypted secrets that are decrypted at runtime.
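To see why the remote-state tip pairs shared storage with locking, consider a toy model of a lock table. It is an in-memory stand-in, not how the S3 and DynamoDB backend is implemented, and the class and method names are hypothetical. The idea is a conditional write: a lock entry can be created only if none exists for that state key, so a second engineer's run is refused instead of overwriting the state.

```python
import threading
from typing import Dict


class LockTable:
    """In-memory stand-in for a lock table that supports conditional writes."""

    def __init__(self) -> None:
        self._locks: Dict[str, str] = {}
        self._mutex = threading.Lock()

    def acquire(self, state_key: str, owner: str) -> bool:
        """Create the lock entry only if it does not exist yet."""
        with self._mutex:
            if state_key in self._locks:
                return False  # someone else is already applying changes
            self._locks[state_key] = owner
            return True

    def release(self, state_key: str, owner: str) -> bool:
        """Remove the lock entry, but only for the owner that holds it."""
        with self._mutex:
            if self._locks.get(state_key) != owner:
                return False
            del self._locks[state_key]
            return True


if __name__ == "__main__":
    table = LockTable()
    key = "prod/network/terraform.tfstate"
    print(table.acquire(key, "alice"))  # alice starts an apply
    print(table.acquire(key, "bob"))    # bob is refused while alice holds the lock
    print(table.release(key, "alice"))  # alice finishes
    print(table.acquire(key, "bob"))    # bob can now proceed
```

A local state file has no equivalent of the refused second `acquire`, which is how two simultaneous runs end up writing conflicting state.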
Troubleshooting
Problem: Terraform Plan Shows Unexpected Resource Replacement
Issue: `terraform plan` shows a resource marked for destruction and recreation when you expected an in-place update.
Cause: Some resource arguments are "ForceNew": they cannot be updated in place, so changing them requires replacing the resource. Changing an EC2 instance's AMI or subnet assignment, for example, triggers recreation. The plan output marks the responsible argument with `# forces replacement`.
Solution: Review the planned changes carefully before applying. Add `lifecycle { prevent_destroy = true }` to critical resources so that any plan that would destroy them fails with an error. Use `terraform state rm` plus `terraform import` for resources that should be adopted rather than re-created.
Problem: Ansible Task Fails on One Host but Succeeds on Others
Issue: An Ansible playbook fails on server3 but successfully runs on servers 1, 2, and 4.
Cause: A configuration difference on server3 (different OS version, missing dependency, disk full) causes that specific task to error.
Solution: Run with the `-v` flag (or `-vvv` for more detail) to see the exact error. After fixing the root cause, use `--limit server3` to re-run only the failing host. Reserve `ignore_errors: yes` for non-critical tasks whose failure should not stop the play on that host.
Related Tools
Pulumi
An alternative to Terraform that uses standard programming languages (Python, TypeScript, Go, .NET) instead of HCL for infrastructure definition. Ideal for teams that prefer imperative programming logic and type-safety for infrastructure code.
Chef and Puppet
Alternative configuration management tools to Ansible. Chef uses a Ruby DSL (steeper learning curve); Puppet uses its own declarative DSL. Both typically run an agent on managed nodes, unlike Ansible's agentless SSH model, and both are widely seen as losing ground to Ansible.
Frequently Asked Questions
Should I use Terraform or Ansible to create servers?
Use Terraform and only Terraform to create, modify, and destroy infrastructure resources (servers, networks, databases). Use Ansible exclusively for configuring what runs inside those servers after Terraform provisions them. Mixing the two for resource provisioning creates state management conflicts.
Does Ansible have state like Terraform?
No. Ansible is largely stateless — it connects, executes the playbook, and exits. It has no persistent record of what it previously configured on a host. This means Ansible relies on module idempotency checks (e.g., "is this package already installed?") rather than a central state file to avoid duplicate work.
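To make the contrast concrete, here is a toy model of what a recorded state makes possible. It is a deliberately simplified sketch, not Terraform's planning algorithm, and the attribute values are invented for illustration. With both the desired configuration and the last recorded state in hand, a tool can tell that a resource was removed from the code and should be destroyed. A stateless run sees only the desired side and cannot make that inference.

```python
from typing import Dict, List


def plan(desired: Dict[str, dict], state: Dict[str, dict]) -> Dict[str, List[str]]:
    """Compare desired configuration with recorded state and list the actions needed."""
    return {
        "create": sorted(name for name in desired if name not in state),
        "update": sorted(
            name for name in desired if name in state and desired[name] != state[name]
        ),
        "destroy": sorted(name for name in state if name not in desired),
    }


if __name__ == "__main__":
    # Recorded state from the previous run (illustrative values only).
    state = {
        "aws_vpc.main": {"cidr_block": "10.0.0.0/16"},
        "aws_db_instance.postgres": {"instance_class": "db.t3.medium"},
    }
    # Current configuration: the database block was deleted, a subnet was added.
    desired = {
        "aws_vpc.main": {"cidr_block": "10.0.0.0/16"},
        "aws_subnet.private": {"cidr_block": "10.0.0.0/24"},
    }
    print(plan(desired, state))
```

The `destroy` entry exists only because the state remembers the database; this is the bookkeeping that Ansible replaces with per-task idempotency checks.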
Can Ansible create cloud resources like Terraform?
Yes. Ansible has cloud modules (for example amazon.aws.ec2_instance, azure.azcollection.azure_rm_virtualmachine, and google.cloud.gcp_compute_instance) that can create cloud resources. However, Ansible's cloud resource management lacks Terraform's robust state tracking, drift detection, and dependency graph resolution. For cloud provisioning, Terraform is the superior tool.
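Dependency graph resolution is the least visible of those features, so a small sketch may help. It uses Python's standard `graphlib` module (Python 3.9 or later) to order the resources from the Terraform example in this article. The edges for the security group and the DB subnet group are assumptions, since those resources are referenced but not shown, and the real tool builds its graph from references in the configuration rather than from a hand-written table.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set

# Map each resource to the resources it depends on.
# The security group and DB subnet group edges are assumed for illustration.
DEPENDENCIES: Dict[str, Set[str]] = {
    "aws_vpc.main": set(),
    "aws_subnet.private": {"aws_vpc.main"},
    "aws_security_group.rds": {"aws_vpc.main"},
    "aws_db_subnet_group.main": {"aws_subnet.private"},
    "aws_db_instance.postgres": {"aws_db_subnet_group.main", "aws_security_group.rds"},
}


def creation_order(dependencies: Dict[str, Set[str]]) -> List[str]:
    """Return an order in which every resource comes after its dependencies."""
    return list(TopologicalSorter(dependencies).static_order())


if __name__ == "__main__":
    for step, resource in enumerate(creation_order(DEPENDENCIES), start=1):
        print(step, resource)
```

An Ansible playbook that creates the same resources runs tasks in the order they are written, so getting this ordering right is left to the author.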
What is HCL (Terraform's language)?
HashiCorp Configuration Language (HCL) is a JSON-compatible declarative DSL designed for human-readable infrastructure definitions. Key characteristics: blocks define resource types, arguments set properties, and expressions reference other resource outputs. The `terraform plan` dry-run preview, often cited as Terraform's most powerful feature, belongs to the Terraform CLI rather than to HCL itself.
Is Terraform free?
Mostly, yes. The Terraform CLI remains free for most enterprise users provisioning their own infrastructure. In 2023, however, HashiCorp controversially moved Terraform from the open-source MPL 2.0 to the Business Source License (BSL), which restricts use by vendors offering competing SaaS products. The fork OpenTofu, a Linux Foundation project, provides a fully open-source, Terraform-compatible alternative that uses the same HCL syntax.
Quick Reference Card
| Task | Use Terraform | Use Ansible |
|---|---|---|
| Create AWS VPC/subnets | ✅ | ❌ |
| Install Nginx on servers | ❌ | ✅ |
| Create RDS database | ✅ | ❌ |
| Deploy app code | ❌ | ✅ |
| Create S3 bucket | ✅ | ❌ |
| Patch OS packages monthly | ❌ | ✅ |
| Stand up and configure an EKS cluster | ✅ (create the cluster) | ✅ (configure it afterwards) |
Summary
The Terraform vs Ansible choice is not a competition — it is a specialization decision. Use Terraform to express what infrastructure should exist as versioned, auditable code with deterministic state management and plan-before-apply safety. Use Ansible to express how that infrastructure should be configured — installing software, deploying applications, enforcing security policies — via agentless, readable YAML playbooks executable across any SSH-accessible fleet. The majority of mature DevOps pipelines use both in complementary roles: Terraform establishes the foundation, Ansible builds what runs on top of it. Treating them as alternatives misunderstands both their architectural strengths and the full problem space of modern infrastructure automation.