We have a "shared" VPC and subnet that were created by another team or process. Our task is to launch a new EC2 instance into this existing network infrastructure without managing the VPC or subnet with our Terraform configuration.

Pre-existing Infrastructure

The following resources are assumed to exist in your AWS account:

VPC: with the tag Name = shared-network-vpc
Subnet: with the tag Name = shared-primary-subnet
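
For context, here is a minimal sketch of what the other team's setup configuration might look like. It is an assumption for illustration only: the resource labels (`shared`) are hypothetical, and the CIDR ranges are taken from the architecture diagram later in this post. This code lives in a separate configuration and is not part of our `main.tf`.

```hcl
# Hypothetical setup configuration, owned by another team.
# Our configuration never manages these resources; it only looks them up by tag.
resource "aws_vpc" "shared" {
  cidr_block = "10.0.0.0/16"

  tags = {
    Name = "shared-network-vpc"
  }
}

resource "aws_subnet" "shared" {
  vpc_id     = aws_vpc.shared.id
  cidr_block = "10.0.1.0/24"

  tags = {
    Name = "shared-primary-subnet"
  }
}
```

The only contract between the two configurations is the Name tag on each resource, which is why the tag values must match exactly.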

Terraform Configuration (`main.tf`)

Our Terraform code will:

Define Data Sources:
- `data "aws_vpc" "shared"`: This block tells Terraform to find a VPC with the tag Name set to shared-network-vpc.
- `data "aws_subnet" "shared"`: This block finds a subnet with the tag Name set to shared-primary-subnet within the VPC found by the previous data source.
- `data "aws_ami" "amazon_linux_2"`: This block finds the latest Amazon Linux 2 AMI to use for our EC2 instance.

Use Data Source Outputs:
- The `aws_instance` resource uses `data.aws_subnet.shared.id` to launch into the existing subnet.
- The `aws_instance` resource also uses `data.aws_ami.amazon_linux_2.id` for the AMI.
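
Put together, the steps above could be sketched as follows. The data source names and tag values come from the description above; the AMI name pattern, the instance type, the resource label `main`, and the instance's Name tag are assumptions chosen for illustration, so adjust them to your environment.

```hcl
# Look up the existing VPC by its Name tag.
data "aws_vpc" "shared" {
  filter {
    name   = "tag:Name"
    values = ["shared-network-vpc"]
  }
}

# Look up the existing subnet by its Name tag, restricted to the VPC found above.
data "aws_subnet" "shared" {
  vpc_id = data.aws_vpc.shared.id

  filter {
    name   = "tag:Name"
    values = ["shared-primary-subnet"]
  }
}

# Find the most recent Amazon Linux 2 AMI published by Amazon.
# The name pattern below is an assumption; verify it for your region.
data "aws_ami" "amazon_linux_2" {
  most_recent = true
  owners      = ["amazon"]

  filter {
    name   = "name"
    values = ["amzn2-ami-hvm-*-x86_64-gp2"]
  }

  filter {
    name   = "virtualization-type"
    values = ["hvm"]
  }
}

# The only resource this configuration creates and manages.
resource "aws_instance" "main" {
  ami           = data.aws_ami.amazon_linux_2.id
  instance_type = "t2.micro" # assumed instance type
  subnet_id     = data.aws_subnet.shared.id

  tags = {
    Name = "day13-instance" # hypothetical tag value
  }
}
```

Running `terraform destroy` against this configuration removes only the instance. The VPC and subnet are untouched because they appear only as data sources.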

Architecture Overview

┌─────────────────────────────────────────┐
│  Pre-existing Infrastructure (Setup)   │
│  ┌─────────────────────────────────┐   │
│  │ VPC: shared-network-vpc         │   │
│  │ CIDR: 10.0.0.0/16               │   │
│  │  ┌──────────────────────────┐   │   │
│  │  │ Subnet: shared-primary-  │   │   │
│  │  │ subnet                    │   │   │
│  │  │ CIDR: 10.0.1.0/24        │   │   │
│  │  └──────────────────────────┘   │   │
│  └─────────────────────────────────┘   │
└─────────────────────────────────────────┘
                    ↓
         ┌──────────────────────┐
         │  Data Sources Query  │
         │  - aws_vpc           │
         │  - aws_subnet        │
         │  - aws_ami           │
         └──────────────────────┘
                    ↓
         ┌──────────────────────┐
         │  New EC2 Instance    │
         │  - Uses existing VPC │
         │  - Uses existing     │
         │    subnet            │
         │  - Latest AMI        │
         └──────────────────────┘
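
To see what the data source queries in the diagram actually resolved to, one option is to expose their results as outputs. This is a small sketch that assumes the data source names described earlier; the output names are hypothetical.

```hcl
# Surface the IDs that the data sources resolved to.
output "shared_vpc_id" {
  value = data.aws_vpc.shared.id
}

output "shared_subnet_id" {
  value = data.aws_subnet.shared.id
}

# The subnet's CIDR block is read from AWS, not declared in our code.
output "shared_subnet_cidr" {
  value = data.aws_subnet.shared.cidr_block
}

output "amazon_linux_2_ami_id" {
  value = data.aws_ami.amazon_linux_2.id
}
```

After `terraform apply`, `terraform output` prints these values, which is a quick way to confirm the lookups matched the VPC and subnet you expected.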

Key Difference: Data vs. Resource

Terraform Resources:

Purpose:

Resources are the core building blocks used to define, create, manage, and delete infrastructure components within your cloud provider (e.g., AWS, Azure, GCP).

Lifecycle Management:

Terraform fully manages the lifecycle of resources. When you define a resource block, Terraform interacts with the cloud provider's API to provision, update, or destroy the corresponding infrastructure.

Example:

Creating an AWS EC2 instance, an Azure Virtual Network, or a Google Cloud Storage bucket.

Code

resource "aws_instance" "my_server" {
  ami           = "ami-0abcdef1234567890"
  instance_type = "t2.micro"
  tags = {
    Name = "MyWebServer"
  }
}

Terraform Data Sources:

Purpose:

Data sources are used to retrieve information about existing infrastructure components or external data that is not managed by the current Terraform configuration. They provide a read-only view of this information.

No Lifecycle Management:

Data sources do not create, modify, or destroy any infrastructure. They solely fetch data for use within your Terraform configuration.

Use Cases:
- Referencing an existing resource created manually or by another Terraform configuration (e.g., an existing VPC or subnet).
- Retrieving dynamic information, such as the latest AMI ID for a specific operating system.
- Fetching data from external systems or APIs.

Example:

Retrieving the ID of an existing AWS VPC to deploy resources into it.

Code

data "aws_vpc" "existing_vpc" {
  filter {
    name   = "tag:Name"
    values = ["MyExistingVPC"]
  }
}
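
A data source is only useful once something references it. As a hedged sketch of the "deploy resources into it" part, the security group below is created inside the VPC found by the `existing_vpc` data source above; the group name and the HTTPS rule are hypothetical and chosen only for illustration.

```hcl
# A managed resource that consumes attributes of the read-only data source.
resource "aws_security_group" "web" {
  name   = "web-sg" # hypothetical name
  vpc_id = data.aws_vpc.existing_vpc.id

  # Allow HTTPS only from addresses inside the existing VPC.
  ingress {
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = [data.aws_vpc.existing_vpc.cidr_block]
  }
}
```

Terraform manages the security group's lifecycle but never changes the VPC itself; it only reads the VPC's `id` and `cidr_block`.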

Key Differences Summarized:

Action:

Resources create and manage infrastructure; data sources retrieve information about existing infrastructure or external data.

Lifecycle:

Resources are fully managed by Terraform; data sources are read-only and have no lifecycle management.

Impact:

Resources directly provision and modify cloud services; data sources provide input for resource configurations or other parts of your Terraform code.

⚠ IMPORTANT — Ensure subnet tag exists

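Before you run terraform plan, confirm that the target AWS account and region contain a subnet with this tag:

Name = shared-primary-subnet

The same applies to the VPC tagged Name = shared-network-vpc. The aws_subnet data source must match exactly one subnet. If it matches none, or more than one, Terraform stops with an error during planning.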
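
If you would rather fail with a message of your own, one possible approach is to query with the plural `aws_subnets` data source and attach a postcondition. This is a sketch under two assumptions: a Terraform version that supports `postcondition` blocks (1.2 or later) and an AWS provider version in which `aws_subnets` returns an empty list instead of an error when nothing matches. The label `candidates` is hypothetical.

```hcl
# List every subnet carrying the expected Name tag.
data "aws_subnets" "candidates" {
  filter {
    name   = "tag:Name"
    values = ["shared-primary-subnet"]
  }

  lifecycle {
    # Fail early, with a readable message, unless exactly one subnet matched.
    postcondition {
      condition     = length(self.ids) == 1
      error_message = "Expected exactly one subnet tagged Name = shared-primary-subnet."
    }
  }
}
```

When the check passes, `data.aws_subnets.candidates.ids[0]` holds the subnet ID and can be used wherever a `subnet_id` is expected.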

🎥 Day 09 Video ( link )

🔚 My Day 13 Takeaways

1. How to use data blocks to query existing AWS resources
2. Filtering resources using tags and other attributes
3. Referencing data source outputs in resource configurations
4. Best practices for working with shared infrastructure
5. The difference between managing resources vs. referencing them

Terraform data sources are extremely powerful for production-grade deployments.

🎉 End

Day 13 was one of the most advanced and useful lessons so far.
Excited for Day 14! 🚀

#30DaysOfAWSTerraform #Terraform #AWS #DevOps

Day 13 Data Sources with AWS
