Day 13 Data Sources with AWS
As part of the #30DaysOfAWSTerraform challenge, we will provision an EC2 instance into a pre-existing VPC and subnet.
We have a "shared" VPC and subnet that were created by another team or process. Our task is to launch a new EC2 instance into this existing network infrastructure without managing the VPC or subnet with our Terraform configuration.
Pre-existing Infrastructure
The following resources are assumed to exist in your AWS account:
VPC: with the tag Name=shared-network-vpc
Subnet: with the tag Name=shared-primary-subnet
Terraform Configuration (main.tf)
Our Terraform code will:
Define Data Sources:
data "aws_vpc" "shared": This block tells Terraform to find a VPC with the tag Name set to shared-network-vpc.
data "aws_subnet" "shared": This block finds a subnet with the tag Name set to shared-primary-subnet within the VPC found by the previous data source.
data "aws_ami" "amazon_linux_2": This block finds the latest Amazon Linux 2 AMI to use for our EC2 instance.
Use Data Source Outputs:
The aws_instance resource uses data.aws_subnet.shared.id to launch into the existing subnet.
The aws_instance resource also uses data.aws_ami.amazon_linux_2.id for the AMI.
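Put together, the configuration described above might look like the sketch below. The three data source names and the two references on the instance come from the description above. The AMI name pattern, the owners value, the instance type, the resource name main, and the instance's Name tag are assumptions for illustration, and the region and credentials are assumed to come from the environment.

```hcl
# Look up the existing VPC by its Name tag (read-only, not managed here).
data "aws_vpc" "shared" {
  filter {
    name   = "tag:Name"
    values = ["shared-network-vpc"]
  }
}

# Look up the existing subnet by its Name tag, restricted to the VPC found above.
data "aws_subnet" "shared" {
  vpc_id = data.aws_vpc.shared.id

  filter {
    name   = "tag:Name"
    values = ["shared-primary-subnet"]
  }
}

# Find the most recent Amazon Linux 2 AMI published by Amazon.
# The name pattern is an assumption; adjust it to the image family you need.
data "aws_ami" "amazon_linux_2" {
  most_recent = true
  owners      = ["amazon"]

  filter {
    name   = "name"
    values = ["amzn2-ami-hvm-*-x86_64-gp2"]
  }

  filter {
    name   = "virtualization-type"
    values = ["hvm"]
  }
}

# The only resource this configuration creates and manages.
resource "aws_instance" "main" {
  ami           = data.aws_ami.amazon_linux_2.id
  instance_type = "t2.micro" # assumed instance type
  subnet_id     = data.aws_subnet.shared.id

  tags = {
    Name = "day13-instance" # hypothetical tag value
  }
}
```

Only the instance appears as a planned change here. The VPC, subnet, and AMI are read during the plan and never modified.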
Architecture Overview
┌─────────────────────────────────────────┐
│ Pre-existing Infrastructure (Setup) │
│ ┌─────────────────────────────────┐ │
│ │ VPC: shared-network-vpc │ │
│ │ CIDR: 10.0.0.0/16 │ │
│ │ ┌──────────────────────────┐ │ │
│ │ │ Subnet: shared-primary- │ │ │
│ │ │ subnet │ │ │
│ │ │ CIDR: 10.0.1.0/24 │ │ │
│ │ └──────────────────────────┘ │ │
│ └─────────────────────────────────┘ │
└─────────────────────────────────────────┘
↓
┌──────────────────────┐
│ Data Sources Query │
│ - aws_vpc │
│ - aws_subnet │
│ - aws_ami │
└──────────────────────┘
↓
┌──────────────────────┐
│ New EC2 Instance │
│ - Uses existing VPC │
│ - Uses existing │
│ subnet │
│ - Latest AMI │
└──────────────────────┘
Key Difference: Data vs. Resource
Terraform Resources:
Purpose:
Resources are the core building blocks used to define, create, manage, and delete infrastructure components within your cloud provider (e.g., AWS, Azure, GCP).
Lifecycle Management:
Terraform fully manages the lifecycle of resources. When you define a resource block, Terraform interacts with the cloud provider's API to provision, update, or destroy the corresponding infrastructure.
Example:
Creating an AWS EC2 instance, an Azure Virtual Network, or a Google Cloud Storage bucket.
Code
resource "aws_instance" "my_server" {
  ami           = "ami-0abcdef1234567890"
  instance_type = "t2.micro"

  tags = {
    Name = "MyWebServer"
  }
}
Terraform Data Sources:
Purpose:
Data sources are used to retrieve information about existing infrastructure components or external data that is not managed by the current Terraform configuration. They provide a read-only view of this information.
No Lifecycle Management:
Data sources do not create, modify, or destroy any infrastructure. They solely fetch data for use within your Terraform configuration.
Use Cases:
Referencing an existing resource created manually or by another Terraform configuration (e.g., an existing VPC or subnet).
Retrieving dynamic information, such as the latest AMI ID for a specific operating system.
Fetching data from external systems or APIs.
Example:
Retrieving the ID of an existing AWS VPC to deploy resources into it.
Code
data "aws_vpc" "existing_vpc" {
  filter {
    name   = "tag:Name"
    values = ["MyExistingVPC"]
  }
}
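To show how a data source feeds a resource, here is a minimal sketch that reuses the existing_vpc lookup above. The security group, its name, and the output name are hypothetical and only there to illustrate the reference syntax.

```hcl
# Assumes the data "aws_vpc" "existing_vpc" block above is in the same configuration.

# A new, Terraform-managed security group placed inside the VPC we only looked up.
resource "aws_security_group" "app" {
  name        = "app-sg" # hypothetical name
  description = "Example security group in the existing VPC"
  vpc_id      = data.aws_vpc.existing_vpc.id
}

# Expose a read-only attribute of the VPC for inspection.
output "existing_vpc_cidr" {
  value = data.aws_vpc.existing_vpc.cidr_block
}
```

Running terraform destroy against this sketch would remove the security group but leave the VPC untouched, because the VPC is only read and never managed.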
Key Differences Summarized:
Action:
Resources create and manage infrastructure; data sources retrieve information about existing infrastructure or external data.
Lifecycle:
Resources are fully managed by Terraform; data sources are read-only and have no lifecycle management.
Impact:
Resources directly provision and modify cloud services; data sources provide input for resource configurations or other parts of your Terraform code.
⚠ IMPORTANT: Ensure the subnet tag exists
Your AWS account must contain a subnet tagged:
Name = shared-primary-subnet
If no subnet matches, or if more than one does, the aws_subnet data source fails with an error during terraform plan.
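One way to check the tag up front is the plural aws_subnets data source. As far as I know, in recent AWS provider versions it returns an empty list instead of failing when nothing matches, so treat that behavior as an assumption to verify against your provider version. The names candidates and matching_subnet_ids are hypothetical.

```hcl
# List every subnet carrying the expected Name tag.
# Unlike the singular aws_subnet, this is assumed not to fail on zero matches.
data "aws_subnets" "candidates" {
  filter {
    name   = "tag:Name"
    values = ["shared-primary-subnet"]
  }
}

# Show the matching subnet IDs so the result can be inspected.
output "matching_subnet_ids" {
  value = data.aws_subnets.candidates.ids
}
```

An empty list means the tag is missing. More than one ID means the singular aws_subnet lookup would be ambiguous and needs a narrower filter.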

🔚 My Day 13 Takeaways
How to use data blocks to query existing AWS resources
Filtering resources using tags and other attributes
Referencing data source outputs in resource configurations
Best practices for working with shared infrastructure
The difference between managing resources vs. referencing them
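Terraform data sources are extremely powerful for production-grade deployments, because they let a configuration build on shared infrastructure without taking over its management.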
🎉 End
Day 13 was one of the most advanced and useful lessons so far.
Excited for Day 14! 🚀
#30DaysOfAWSTerraform #Terraform #AWS #DevOps




