Part 2: Networking and Load Balancer Configuration (scaling an app using AWS's ECS - with tf)

Feb 1

Optimised highways to enable good traffic for cars.

For better context, read Part 1 here if you haven't already.

This post continues the series on how to containerize and scale an app using AWS's ECR and ECS, and focuses on the networking side of the app being deployed. For example, in many cases we have staging and production environments. Behind the scenes, each staging or production URL resolves to actual IP addresses. This post will attempt to demystify where these IPs come from and how to define the AWS resources that make the app instances running in production and/or staging accessible.

For a given ecs_service, we will focus on understanding the network_configuration and load_balancer arguments, as defined in the example below:

resource "aws_ecs_service" "demo-ecs-service" {
  name = "demo-ecs-service"
  cluster = aws_ecs_cluster.demo-ecs-cluster.id
  task_definition = aws_ecs_task_definition.demo-task-definition.arn
  launch_type = "FARGATE"
  desired_count = 2

  # where the tasks run and which firewall rules apply to them
  network_configuration {
    subnets = ["subnet-004456cbfe1000201u1"]
    assign_public_ip = true
    security_groups = [aws_security_group.demo-sg-two.id]
  }
  # which target group the containers register with, and on which port
  load_balancer {
    target_group_arn = aws_alb_target_group.demo-tg.arn
    container_name = "name-of-container"
    container_port = 3000
  }
}

Network Configuration

Starting with the network_configuration argument: it specifies which subnet(s) this particular service runs in. A subnet is defined within a Virtual Private Cloud, aka VPC.

VPC

The VPC is the overall encompassing resource: an isolated virtual network whose underlying infrastructure is managed by AWS. A VPC is defined within an AWS region and spans that region's availability zones, and each subnet lives in exactly one availability zone. Below is a Terraform definition for a VPC resource:

resource "aws_vpc" "demo-vpc" {
  cidr_block = "10.0.0.0/16"
  enable_dns_hostnames = true
}

The cidr_block argument sets the IPv4 address range of the VPC in CIDR notation; 10.0.0.0/16 covers 65,536 addresses, from 10.0.0.0 to 10.0.255.255. The enable_dns_hostnames argument has a default value of false; we set it to true so that resources in the VPC with public IPs also get DNS hostnames, which allows Route 53 to do its magic.
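To make the CIDR arithmetic concrete, here is a minimal sketch using Terraform's built-in cidrsubnet function, which carves smaller blocks out of a larger one. The local names (vpc_cidr, subnet_a, subnet_b) are hypothetical and only there for illustration; they are not part of the original configuration.

```hcl
# Hypothetical locals, for illustration only.
locals {
  vpc_cidr = "10.0.0.0/16"

  # cidrsubnet(prefix, newbits, netnum): adding 8 bits to a /16 yields /24 blocks,
  # and netnum picks which of the 256 possible /24 blocks to return.
  subnet_a = cidrsubnet(local.vpc_cidr, 8, 0) # "10.0.0.0/24"
  subnet_b = cidrsubnet(local.vpc_cidr, 8, 1) # "10.0.1.0/24"
}
```

Deriving subnet ranges this way is one option for keeping them inside the VPC range without overlapping; hard-coding them, as this post does, works just as well for a single subnet.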

Since a VPC is private (duh :-)), we need a way for it to communicate with the world wide web, aka the internet, and this is where the Internet Gateway resource comes into play. According to AWS docs:

An Internet Gateway is a horizontally scaled, redundant, and highly available VPC component that allows communication between our VPC and the internet.

Below is the resource definition needed, using Terraform:

resource "aws_internet_gateway" "demo-gw" {
  vpc_id = aws_vpc.demo-vpc.id
}

The vpc_id argument ties this particular gateway to the demo-vpc defined earlier. Now that the VPC, its CIDR block, and the Internet Gateway are defined, we can go ahead and create subnets. As mentioned earlier, each subnet lives within one of the availability zones of the region this VPC is in. Below is a subnet resource in AWS using Terraform:

resource "aws_subnet" "demo-subnet" {
  vpc_id = aws_vpc.demo-vpc.id
  count = 1
  # must be an availability zone (for example "us-east-1a"), not a region
  availability_zone = "us-east-1a"
  cidr_block = "10.0.0.0/24"
}
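One piece the walkthrough does not show: a subnet only sends traffic through the internet gateway if its route table has a route pointing there. A minimal sketch of that wiring could look like the following, assuming the demo-vpc, demo-gw and demo-subnet resources from this post. The names demo-public-rt and demo-public-rta are hypothetical.

```hcl
# Hypothetical route table: sends all non-local traffic to the internet gateway.
resource "aws_route_table" "demo-public-rt" {
  vpc_id = aws_vpc.demo-vpc.id

  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.demo-gw.id
  }
}

# Associates the route table with the subnet; demo-subnet uses count,
# so its instances are addressed by index.
resource "aws_route_table_association" "demo-public-rta" {
  count          = 1
  subnet_id      = aws_subnet.demo-subnet[count.index].id
  route_table_id = aws_route_table.demo-public-rt.id
}
```

Without an association like this, the subnet falls back to the VPC's main route table, which by default only contains the local route.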

Looking back at the network_configuration argument, we can now deduce where the subnets value comes from: it is the ID of a subnet like the one above. Setting assign_public_ip to true (as below) gives each task a public IP so that our ecs_service can be reached via the internet, hence our internet gateway definition earlier.

network_configuration {
  subnets = ["subnet-004456cbfe1000201u1"]
  assign_public_ip = true
  security_groups = [aws_security_group.demo-sg-two.id]
}

Security Group(s)

Our next quest is to deduce where the security_groups value comes from. According to AWS docs:

A security group acts as a virtual firewall for your instance to control inbound and outbound traffic. ... Security groups act at the instance level, not the subnet level. Therefore, each instance in a subnet in your VPC can be assigned to a different set of security groups.

In the above definition, substitute instance with ECS service (more precisely, its tasks), since we are dealing with scaling applications that run as containerized services within AWS; the word instance applies when using EC2. Below is a definition for a security group resource:

resource "aws_security_group" "demo-sg" {
  name = "demo-sg"
  description = "controls access to the application load balancer"
  vpc_id = aws_vpc.demo-vpc.id

  ingress {
    from_port = 80
    protocol = "tcp"
    to_port = 80
    cidr_blocks = ["0.0.0.0/0"]
  }

  ingress {
    from_port = 443
    protocol = "tcp"
    to_port = 443
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    from_port = 0
    protocol = "-1"
    to_port = 0
    cidr_blocks = ["0.0.0.0/0"]

  }
}

Since a security group acts as a virtual firewall within a VPC, it makes sense for it to have traffic rules defined. These are inbound and outbound rules, shown above as the ingress and egress arguments respectively in the Terraform definition. Each rule defines the protocol to use, the port range (from_port to to_port), and the cidr_blocks the rule applies to. The example above shows a security group that allows inbound traffic only over TCP on ports 80 and 443, while outbound traffic is allowed over any protocol on any port (a protocol of "-1" means all protocols). There are cases where another security group, rather than a CIDR block, is the source of an inbound rule. Here's an example definition of a security group with an inbound rule that depends on another security group:

resource "aws_security_group" "demo-sg-two" {
  name = "demo-sg-two"
  description = "adds description here"
  vpc_id = aws_vpc.demo-vpc.id
  ingress {
    from_port = 3000
    protocol = "tcp"
    to_port = 3000
    cidr_blocks = ["0.0.0.0/0"]
  }
  # allow TCP traffic on any port, but only from members of demo-sg
  # (with "tcp", a 0 to 0 range would match port 0 only)
  ingress {
    from_port = 0
    protocol = "tcp"
    to_port = 65535
    security_groups = [aws_security_group.demo-sg.id]
  }
  egress {
    from_port = 0
    protocol = "-1"
    to_port = 0
    cidr_blocks = ["0.0.0.0/0"]
  }
}
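The group above still accepts port 3000 from anywhere. If the containers should only be reachable through the load balancer, one possible tightening is sketched below. This is an assumption about the desired setup, not the configuration used in this post, and the name demo-task-sg is hypothetical.

```hcl
# Hypothetical stricter variant: the container port is open only to
# traffic that originates from the load balancer's security group.
resource "aws_security_group" "demo-task-sg" {
  name        = "demo-task-sg"
  description = "allows traffic to the containers only from the load balancer"
  vpc_id      = aws_vpc.demo-vpc.id

  ingress {
    from_port       = 3000
    to_port         = 3000
    protocol        = "tcp"
    security_groups = [aws_security_group.demo-sg.id]
  }

  # outbound traffic stays open so tasks can pull images and call other services
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}
```

With a group like this, a request sent straight to a task's public IP on port 3000 is dropped, while requests forwarded by the load balancer still get through.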

Load Balancer

The next argument in the ecs_service resource to deduce is load_balancer, but first, a formal definition:

A load balancer serves as the single point of contact for clients. The load balancer distributes incoming application traffic across multiple targets, such as EC2 instances, in multiple Availability Zones. This increases the availability of your application.

There are a number of different types of load balancers; our focus is the application load balancer. Below is a clear description of what an application load balancer is, according to AWS docs:

An Application Load Balancer functions at the application layer, the seventh layer of the Open Systems Interconnection (OSI) model. After the load balancer receives a request, it evaluates the listener rules in priority order to determine which rule to apply, and then selects a target from the target group for the rule action.

Since the application load balancer (ALB) routes traffic accordingly, we need to specify which subnets it lives in (an ALB needs subnets in at least two availability zones), along with any necessary security groups (virtual firewalls), as below:

resource "aws_alb" "demo-alb" {
  name = "demo-ecs-alb"
  subnets = [
    "subnet-009043409fr3344c7",
    "subnet-034rf73d4f88w880i"]
  # demo-sg is the group that opens ports 80 and 443 to the internet
  security_groups = [
    aws_security_group.demo-sg.id]
}
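The subnet IDs above are hard-coded. As a rough sketch of an alternative, the subnets can be created in Terraform, one per availability zone, and handed to the load balancer by reference. The names demo-public and demo-alb-sketch, and the choice of /24 blocks starting at 10.0.10.0, are assumptions made for illustration.

```hcl
# Looks up the availability zones usable in the configured region.
data "aws_availability_zones" "available" {
  state = "available"
}

# Hypothetical subnets: two of them, each in a different availability zone.
resource "aws_subnet" "demo-public" {
  count             = 2
  vpc_id            = aws_vpc.demo-vpc.id
  availability_zone = data.aws_availability_zones.available.names[count.index]
  # yields 10.0.10.0/24 and 10.0.11.0/24, clear of the 10.0.0.0/24 subnet
  cidr_block        = cidrsubnet(aws_vpc.demo-vpc.cidr_block, 8, count.index + 10)
}

# Hypothetical load balancer that references the subnets instead of literal IDs.
resource "aws_alb" "demo-alb-sketch" {
  name            = "demo-ecs-alb-sketch"
  subnets         = aws_subnet.demo-public[*].id
  security_groups = [aws_security_group.demo-sg.id]
}
```

Referencing the subnets this way lets Terraform work out the creation order on its own, and the configuration no longer depends on IDs copied from the console.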

ALB Listener

For the above alb to accomplish its routing purpose, it needs to have listeners that contain the rules for routing the traffic appropriately. According to AWS Docs:

A listener checks for connection requests from clients, using the protocol and port that you configure. The rules that you define for a listener determine how the load balancer routes requests to its registered targets. Each rule consists of a priority, one or more actions, and one or more conditions. When the conditions for a rule are met, then its actions are performed. You must define a default rule for each listener, and you can optionally define additional rules.

For the aws_alb resource we defined above, here's its listener:

resource "aws_alb_listener" "demo-alb-listener"   {
  load_balancer_arn = aws_alb.demo-alb.arn
  port = 80
  protocol = "HTTP"
  default_action {
    type = "redirect"
    redirect {
      status_code = "HTTP_301"
      protocol = "HTTPS"
      port = "443"
    }
  }
}

This particular listener listens on port 80 and redirects traffic to a secure HTTPS connection on port 443. The load_balancer_arn ties this listener to the demo-alb defined earlier. We can now define the secure listener that the traffic above is redirected to:

resource "aws_alb_listener" "demo-alb-secure-listener" {
  load_balancer_arn = aws_alb.demo-alb.arn
  port = 443
  protocol = "HTTPS"
  certificate_arn = aws_acm_certificate.example-cert.arn
  ssl_policy = "ELBSecurityPolicy-2016-08"
  default_action {
    type = "forward"
    target_group_arn = aws_alb_target_group.demo-tg.arn
  }
}

This secure listener directs its traffic to a specific target group: the default_action argument contains a target_group_arn, which points to our target group.
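Both listeners so far only use a default action. The AWS description quoted earlier also mentions additional rules made of a priority, conditions and actions. A minimal sketch of one such rule is below; the rule name demo-api-rule, the priority of 100 and the "/api/*" path are hypothetical. The resource type is written as aws_lb_listener_rule, the canonical name in the AWS provider, and it attaches to the aws_alb_listener defined above.

```hcl
# Hypothetical extra rule on the secure listener: requests whose path
# matches /api/* are forwarded to the target group.
resource "aws_lb_listener_rule" "demo-api-rule" {
  listener_arn = aws_alb_listener.demo-alb-secure-listener.arn
  priority     = 100 # rules are evaluated from the lowest number up

  action {
    type             = "forward"
    target_group_arn = aws_alb_target_group.demo-tg.arn
  }

  condition {
    path_pattern {
      values = ["/api/*"]
    }
  }
}
```

With a single target group this rule behaves the same as the default action; it starts to matter once different paths or hosts should reach different target groups.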

Target Group

A target group contains targets, which are the actual instances or applications running. Each target group must have its target_type set so that all targets in that group are of the same type. Since we are running a containerized application as an ecs_service on Fargate, each task gets its own IP address, hence our target type will be "ip". Below is a resource definition for a target group:

resource "aws_alb_target_group" "demo-tg" {
  name = "demo-tg"
  port = 3000
  protocol = "HTTP"
  vpc_id = aws_vpc.demo-vpc.id
  target_type = "ip"
  health_check {
    path = "/healthz"
  }
  lifecycle {
    create_before_destroy = true
  }
}

If you look closely, the port argument for the target group has a value of 3000, the exact same value as the container_port argument in the load_balancer block of the ecs_service resource (as shown below).

load_balancer {
     container_name = "name-of-container"
     container_port = 3000
     target_group_arn = aws_alb_target_group.demo-tg.arn
}

Basically, within our ecs_service, the load_balancer block is configured with the port our container listens on and the target group it belongs to.

In a nutshell, we defined an ecs_service resource with its network_configuration and load_balancer arguments pointing at the right security groups and target group. We also looked into aws_vpc, subnets, cidr_block, and internet_gateway, and how all these components come together to make scaling a containerized application within the AWS ecosystem possible.

Below is a single tf file with all the above resources in one place:

# VPC
resource "aws_vpc" "demo-vpc" {
  cidr_block = "10.0.0.0/16"
  enable_dns_hostnames = true
}

# Internet Gateway
resource "aws_internet_gateway" "demo-gw" {
  vpc_id = aws_vpc.demo-vpc.id
}

# Single Subnet
resource "aws_subnet" "demo-subnet" {
  vpc_id = aws_vpc.demo-vpc.id
  count = 1
  # must be an availability zone (for example "us-east-1a"), not a region
  availability_zone = "us-east-1a"
  cidr_block = "10.0.0.0/24"
}

# 1st Security Group
resource "aws_security_group" "demo-sg" {
  name = "demo-sg"
  description = "controls access to the application load balancer"
  vpc_id = aws_vpc.demo-vpc.id

  ingress {
    from_port = 80
    protocol = "tcp"
    to_port = 80
    cidr_blocks = ["0.0.0.0/0"]
  }

  ingress {
    from_port = 443
    protocol = "tcp"
    to_port = 443
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    from_port = 0
    protocol = "-1"
    to_port = 0
    cidr_blocks = ["0.0.0.0/0"]

  }
}

# 2nd Security Group
resource "aws_security_group" "demo-sg-two" {
  name = "demo-sg-two"
  description = "adds description here"
  vpc_id = aws_vpc.demo-vpc.id
  ingress {
    from_port = 3000
    protocol = "tcp"
    to_port = 3000
    cidr_blocks = ["0.0.0.0/0"]
  }
  # allow TCP traffic on any port, but only from members of demo-sg
  ingress {
    from_port = 0
    protocol = "tcp"
    to_port = 65535
    security_groups = [aws_security_group.demo-sg.id]
  }
  egress {
    from_port = 0
    protocol = "-1"
    to_port = 0
    cidr_blocks = ["0.0.0.0/0"]

  }
}

# Application Load Balancer
resource "aws_alb" "demo-alb" {
  name = "demo-ecs-alb"
  subnets = [
    "subnet-009043409fr3344c7",
    "subnet-034rf73d4f88w880i"]
  security_groups = [
    aws_security_group.demo-sg.id]
}

# 1st ALB Listener
resource "aws_alb_listener" "demo-alb-listener"   {
  load_balancer_arn = aws_alb.demo-alb.arn
  port = 80
  protocol = "HTTP"
  default_action {
    type = "redirect"
    redirect {
      status_code = "HTTP_301"
      protocol = "HTTPS"
      port = "443"
    }
  }
}

# Secure ALB Listener
resource "aws_alb_listener" "demo-alb-secure-listener" {
  load_balancer_arn = aws_alb.demo-alb.arn
  port = 443
  protocol = "HTTPS"
  certificate_arn = aws_acm_certificate.example-cert.arn
  ssl_policy = "ELBSecurityPolicy-2016-08"
  default_action {
    type = "forward"
    target_group_arn = aws_alb_target_group.demo-tg.arn
  }
}

# Target Group
resource "aws_alb_target_group" "demo-tg" {
  name = "demo-tg"
  port = 3000
  protocol = "HTTP"
  vpc_id = aws_vpc.demo-vpc.id
  target_type = "ip"
  health_check {
    path = "/healthz"
  }
  lifecycle {
    create_before_destroy = true
  }
}

# ECS Service
resource "aws_ecs_service" "demo-ecs-service" {
  name = "demo-ecs-service"
  cluster = aws_ecs_cluster.demo-ecs-cluster.id
  task_definition = aws_ecs_task_definition.demo-task-definition.arn
  launch_type = "FARGATE"
  desired_count = 2

  network_configuration {
    subnets = ["subnet-004456cbfe1000201u1"]
    assign_public_ip = true
    security_groups = [aws_security_group.demo-sg-two.id]
  }
  load_balancer {
    target_group_arn = aws_alb_target_group.demo-tg.arn
    container_name = "name-of-container"
    container_port = 3000
  }
}

Mihigo Rugamba