Day 13 Data Sources with AWS

As part of the 30DaysOfAWSTerraform challenge, we will provision an EC2 instance into a pre-existing VPC and subnet.


We have a "shared" VPC and subnet that were created by another team or process. Our task is to launch a new EC2 instance into this existing network infrastructure without managing the VPC or subnet with our Terraform configuration.

Pre-existing Infrastructure

The following resources are assumed to exist in your AWS account:

  • VPC: with the tag Name = shared-network-vpc

  • Subnet: with the tag Name = shared-primary-subnet

Terraform Configuration (main.tf)

Our Terraform code will:

  1. Define Data Sources:

    • data "aws_vpc" "shared": This block tells Terraform to find a VPC with the tag Name set to shared-network-vpc.

    • data "aws_subnet" "shared": This block finds a subnet with the tag Name set to shared-primary-subnet within the VPC found by the previous data source.

    • data "aws_ami" "amazon_linux_2": This block finds the latest Amazon Linux 2 AMI to use for our EC2 instance.

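  2. Use Data Source Outputs:

    • resource "aws_instance": The EC2 instance takes its AMI ID from the aws_ami data source and its subnet ID from the aws_subnet data source, so no IDs are hard-coded.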
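Here is a minimal sketch of what main.tf could look like for the steps above. The AMI name pattern, the instance type, the resource label main, and the instance's Name tag are my assumptions for illustration, so adjust them to your environment.

```hcl
# Look up the shared VPC by its Name tag.
data "aws_vpc" "shared" {
  filter {
    name   = "tag:Name"
    values = ["shared-network-vpc"]
  }
}

# Look up the shared subnet by its Name tag, limited to the VPC found above.
data "aws_subnet" "shared" {
  vpc_id = data.aws_vpc.shared.id

  filter {
    name   = "tag:Name"
    values = ["shared-primary-subnet"]
  }
}

# Find the most recent Amazon Linux 2 AMI published by Amazon.
# The name pattern below is an assumption; verify it for your region.
data "aws_ami" "amazon_linux_2" {
  most_recent = true
  owners      = ["amazon"]

  filter {
    name   = "name"
    values = ["amzn2-ami-hvm-*-x86_64-gp2"]
  }

  filter {
    name   = "virtualization-type"
    values = ["hvm"]
  }
}

# The only resource Terraform manages here: the new EC2 instance.
resource "aws_instance" "main" {
  ami           = data.aws_ami.amazon_linux_2.id
  instance_type = "t2.micro" # assumed size
  subnet_id     = data.aws_subnet.shared.id

  tags = {
    Name = "day13-instance" # hypothetical name
  }
}
```

Because the VPC and subnet appear only as data blocks, running terraform destroy on this configuration should remove the instance and leave the shared network untouched.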

Architecture Overview

┌─────────────────────────────────────────┐
│  Pre-existing Infrastructure (Setup)   │
│  ┌─────────────────────────────────┐   │
│  │ VPC: shared-network-vpc         │   │
│  │ CIDR: 10.0.0.0/16               │   │
│  │  ┌──────────────────────────┐   │   │
│  │  │ Subnet: shared-primary-  │   │   │
│  │  │ subnet                   │   │   │
│  │  │ CIDR: 10.0.1.0/24        │   │   │
│  │  └──────────────────────────┘   │   │
│  └─────────────────────────────────┘   │
└─────────────────────────────────────────┘
                    ↓
         ┌──────────────────────┐
         │  Data Sources Query  │
         │  - aws_vpc           │
         │  - aws_subnet        │
         │  - aws_ami           │
         └──────────────────────┘
                    ↓
         ┌──────────────────────┐
         │  New EC2 Instance    │
         │  - Uses existing VPC │
         │  - Uses existing     │
         │    subnet            │
         │  - Latest AMI        │
         └──────────────────────┘

Key Difference: Data vs. Resource

Terraform Resources:

  • Purpose:

    Resources are the core building blocks used to define, create, manage, and delete infrastructure components within your cloud provider (e.g., AWS, Azure, GCP).

  • Lifecycle Management:

    Terraform fully manages the lifecycle of resources. When you define a resource block, Terraform interacts with the cloud provider's API to provision, update, or destroy the corresponding infrastructure.

  • Example:

    Creating an AWS EC2 instance, an Azure Virtual Network, or a Google Cloud Storage bucket.

Code

resource "aws_instance" "my_server" {
  ami           = "ami-0abcdef1234567890"
  instance_type = "t2.micro"
  tags = {
    Name = "MyWebServer"
  }
}

Terraform Data Sources:

  • Purpose:

    Data sources are used to retrieve information about existing infrastructure components or external data that is not managed by the current Terraform configuration. They provide a read-only view of this information.

  • No Lifecycle Management:

    Data sources do not create, modify, or destroy any infrastructure. They solely fetch data for use within your Terraform configuration.

  • Use Cases:

    • Referencing an existing resource created manually or by another Terraform configuration (e.g., an existing VPC or subnet).

    • Retrieving dynamic information, such as the latest AMI ID for a specific operating system.

    • Fetching data from external systems or APIs.

  • Example:

    Retrieving the ID of an existing AWS VPC to deploy resources into it.

Code

data "aws_vpc" "existing_vpc" {
  filter {
    name   = "tag:Name"
    values = ["MyExistingVPC"]
  }
}
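Once the lookup resolves, its attributes can be referenced like any other value. A small sketch, assuming the existing_vpc data block above is in the same module; the output names and the security group are hypothetical and only there to show the reference syntax.

```hcl
# Expose the looked-up VPC ID and CIDR block (output names are illustrative).
output "existing_vpc_id" {
  value = data.aws_vpc.existing_vpc.id
}

output "existing_vpc_cidr" {
  value = data.aws_vpc.existing_vpc.cidr_block
}

# Hypothetical security group created inside the existing VPC.
# Terraform manages the security group, not the VPC it lives in.
resource "aws_security_group" "web" {
  name   = "web-sg"
  vpc_id = data.aws_vpc.existing_vpc.id
}
```

The pattern is always data.<TYPE>.<NAME>.<ATTRIBUTE>, which mirrors how resource attributes are referenced, just with the data. prefix.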

Key Differences Summarized:

  • Action:

    Resources create and manage infrastructure; data sources retrieve information about existing infrastructure or external data.

  • Lifecycle:

    Resources are fully managed by Terraform; data sources are read-only and have no lifecycle management.

  • Impact:

    Resources directly provision and modify cloud services; data sources provide input for resource configurations or other parts of your Terraform code.

⚠ IMPORTANT — Ensure subnet tag exists

AWS must have a subnet with tag:

Name = shared-primary-subnet

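If not, the aws_subnet data source fails at plan time, because it must match exactly one subnet. It also fails if more than one subnet carries this tag.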
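One way to check the tag from Terraform itself is the plural aws_subnets data source, which returns a list of IDs rather than requiring a single match. This is a sketch, not part of the original configuration; the label candidates and the output name are hypothetical, and I'm assuming a recent AWS provider where a lookup with no matches yields an empty list instead of an error.

```hcl
# List every subnet whose Name tag matches (may be zero, one, or many).
data "aws_subnets" "candidates" {
  filter {
    name   = "tag:Name"
    values = ["shared-primary-subnet"]
  }
}

# Print the matching IDs so the count can be checked before using
# the singular aws_subnet data source.
output "matching_subnet_ids" {
  value = data.aws_subnets.candidates.ids
}
```

If the output shows exactly one ID, the singular data "aws_subnet" lookup should succeed.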

🎥 Day 13 Video ( link )

🔚 My Day 13 Takeaways

  1. How to use data blocks to query existing AWS resources

  2. Filtering resources using tags and other attributes

  3. Referencing data source outputs in resource configurations

  4. Best practices for working with shared infrastructure

  5. The difference between managing resources vs. referencing them

  • Terraform data sources are especially useful in production-grade deployments, where the network is often owned by another team.
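As a quick illustration of filtering by tags and other attributes, here is a sketch that uses the tags shorthand and the cidr_block argument instead of filter blocks. The CIDR values come from the architecture overview; the labels by_tag and by_cidr are hypothetical and chosen so they don't clash with the other data blocks in this post.

```hcl
# Match the VPC with the tags shorthand instead of a filter block.
data "aws_vpc" "by_tag" {
  tags = {
    Name = "shared-network-vpc"
  }
}

# Match the subnet by its CIDR block within that VPC.
data "aws_subnet" "by_cidr" {
  vpc_id     = data.aws_vpc.by_tag.id
  cidr_block = "10.0.1.0/24"
}
```

Whichever arguments are used, the combination still has to narrow the search down to a single VPC and a single subnet.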


🎉 End

Day 13 was one of the most useful lessons so far.
Excited for Day 14! 🚀

#30DaysOfAWSTerraform #Terraform #AWS #DevOps