A DNS migration: from F5 to R53

By Aquiles Calvo, Cloud Engineer at EDRANS.

Intro

One of our most recent challenges was the migration of DNS zones: specifically, moving 70 business-critical domains (and their subdomains) hosted in F5 to the AWS Route 53 service, without any service disruption.

F5 provides several solutions, such as BIG-IP DNS: essentially a DNS hosting service where you can host your domains and subdomains and configure load balancing, failover, escalations, and so on for them.

As part of best practices, the cloud infrastructure had to be created using Terraform, to allow a better change management process than manual configuration via the console.

Terraform, as its website states, is an open-source infrastructure as code software tool that enables you to safely and predictably create, modify, and improve infrastructure.

Image credit: business vector created by fullvector

Goal

So, at the beginning of the project we only had this:

  • 70 BIND files, one for each domain to be migrated
  • F5 configurations, including weight, failover, and custom settings
  • A 3-week timeframe

While evaluating the situation, it quickly became obvious that writing the Terraform config by hand, replicating the BIND files, and manually applying it was not viable for us: each file averaged 50 lines of subdomains and configurations. Basic maths: 70 files × 50 lines means about 3,500 lines of code written by hand, with no mistakes. What’s more, we had to add the White Label Records to each domain (we will get to that soon).

Challenges

Tfz53, a tool that converts BIND files directly into Terraform files ready to be deployed, was everything we needed. But not everything was as beautiful as we wanted it to be, as we had 3 problems with this approach:

  1. Tfz53 runs on individual files, so you have to specify the parameters -domain (the domain name) and -zone-file (the path to the BIND file) and redirect the output to a Terraform file, meaning we’d need to run this command for each domain.
  2. We had to add the delegation set to use and the White Label Records to every file.
  3. Some subdomains had a failover or weighted configuration, which means that, for example, www.edrans.com would go to 10.0.0.1 but also to 10.0.0.2 using a weighted system. Tfz53 would create a single record with 2 IPs that balances equally between them, therefore ignoring the custom weights.

How did we solve that?
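
For the first issue, one way to avoid typing the command 70 times is to generate it for every BIND file. The following is only a minimal sketch of that idea, not necessarily the script used in this project. It assumes that each BIND file is named "<domain>.zone" (a hypothetical naming convention), that tfz53 accepts the -domain and -zone-file flags and prints the Terraform code to standard output, and that the function names are made up for illustration.

```python
import subprocess
from pathlib import Path


def build_tfz53_jobs(zone_files, out_dir):
    """Return one (argv, output_path) pair per BIND file.

    Assumption: every file is named "<domain>.zone", so the domain name
    can be derived from the file name.
    """
    jobs = []
    for zone_file in sorted(zone_files):
        domain = Path(zone_file).stem  # "edranslab.com.zone" -> "edranslab.com"
        argv = ["tfz53", "-domain", domain, "-zone-file", zone_file]
        jobs.append((argv, Path(out_dir) / f"{domain}.tf"))
    return jobs


def run_jobs(jobs):
    """Run every tfz53 command and save its standard output as a .tf file."""
    for argv, tf_path in jobs:
        with open(tf_path, "w") as tf_file:
            subprocess.run(argv, stdout=tf_file, check=True)


if __name__ == "__main__":
    # Only prints the planned commands; call run_jobs(jobs) to execute them.
    jobs = build_tfz53_jobs(["zones/edranslab.com.zone"], "terraform")
    for argv, tf_path in jobs:
        print(" ".join(argv), ">", tf_path)
```

Keeping the command construction separate from its execution makes it easy to review the 70 planned commands before anything is written to disk.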

b) For the second issue, let’s talk first about some technical details we need to address. 👇

Delegation Set and White Label Records

When you create a hosted zone in AWS Route53, you will be given 4 random Name Servers (NS from now on) provided by AWS. When you hit a website with a dig or any other DNS lookup tool, you will see these 4 NS popping up.

For example, if you run dig NS abc.com +short you will get:

[Image: an example of running dig, showing the AWS DNS name servers]

The thing is, for each zone or domain you’ll have another set of 4 different NS, making the configuration for multiple domains harder to track. That’s why we use a Reusable Delegation Set, which allows you to use the same set of NS for every domain.

But a zone created with that Reusable Delegation Set (which you configure when you create the zone) still shows the AWS NS, which is not ideal if you don’t want everybody to know at a glance where your DNS is hosted; NS named after the company or the domain look nicer.

White Label NS solve this issue by creating 4 new type A and 4 type AAAA records that point to the IP addresses of the original NS provided by Amazon, masking that destination.

First, we created the reusable delegation set in the AWS account where the domains would be migrated. We didn’t start the migration or deploy any Terraform file at this point; we were just preparing for it. Once the delegation set was created, we modified the Terraform files to include the 8 White Label Records. Remember, this has to be done in each file, for every domain.
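
Since the same 8 records have to be repeated in every file, they lend themselves to being generated. The sketch below illustrates the idea under stated assumptions: the function name, the resource naming scheme, the TTL, and the IP addresses (taken from the documentation ranges 192.0.2.0/24 and 2001:db8::/32) are all hypothetical placeholders, and the real addresses would be those of the name servers in your reusable delegation set.

```python
def render_white_label_records(zone_resource, domain, name_servers, ttl=60):
    """Render Terraform code for the A and AAAA White Label Records of a zone.

    name_servers maps a label such as "ns1" to an (ipv4, ipv6) pair with the
    addresses of one name server of the reusable delegation set.
    """
    blocks = []
    for label, (ipv4, ipv6) in sorted(name_servers.items()):
        for record_type, address in (("A", ipv4), ("AAAA", ipv6)):
            blocks.append(
                f'resource "aws_route53_record" "{label}-{zone_resource}-{record_type}" {{\n'
                f'  zone_id = aws_route53_zone.{zone_resource}.zone_id\n'
                f'  name    = "{label}.{domain}."\n'
                f'  type    = "{record_type}"\n'
                f'  ttl     = "{ttl}"\n'
                f'  records = ["{address}"]\n'
                f'}}\n'
            )
    return "\n".join(blocks)


# Placeholder addresses only: replace them with the real name server IPs.
example_name_servers = {
    "ns1": ("192.0.2.1", "2001:db8::1"),
    "ns2": ("192.0.2.2", "2001:db8::2"),
    "ns3": ("192.0.2.3", "2001:db8::3"),
    "ns4": ("192.0.2.4", "2001:db8::4"),
}

if __name__ == "__main__":
    print(render_white_label_records("domain-com", "domain.com", example_name_servers))
```

The output can be appended to the file that tfz53 generated for the domain. The zone resource in that file also needs to reference the reusable delegation set, which the aws_route53_zone resource supports through its delegation_set_id argument.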

After deploying this Terraform file, you have to manually replace the values of the NS record in the hosted zone (provided by AWS) with the White Label NS. Basically, you change each NS provided by AWS (for example, ns-2048.awsdns-64.com) to the matching White Label NS (for example, ns1.domain.com.).

So, now that we know about some DNS nitty-gritty details, we can see an example of this on our website: if you run dig NS edranslab.com +short you will see this:

That’s nice, now nobody knows (directly, at least) where our own DNS is hosted.

c) The last issue was pretty simple to fix, but a bit tedious. We had to search every file for the records with a failover or weighted configuration and change a bit of code. Luckily, there were only a few of them.

This is the file created by tfz53 before our modification:

resource "aws_route53_record" "www-edranslab-com-A" {
  zone_id = aws_route53_zone.domain-com.zone_id
  name    = "www.edranslab.com."
  type    = "A"
  ttl     = "60"
  records = ["10.0.0.1", "10.0.0.2"]
}

And this is how it should look after some modifications:

resource "aws_route53_record" "www-edranslab-com-A-blue" {
  zone_id = aws_route53_zone.domain-com.zone_id
  name    = "www.edranslab.com."
  type    = "A"
  ttl     = "60"

  weighted_routing_policy {
    weight = 242
  }
  set_identifier = "blue"
  records        = ["10.0.0.1"]
}

resource "aws_route53_record" "www-edranslab-com-A-green" {
  zone_id = aws_route53_zone.domain-com.zone_id
  name    = "www.edranslab.com."
  type    = "A"
  ttl     = "60"

  weighted_routing_policy {
    weight = 13
  }
  set_identifier = "green"
  records        = ["10.0.0.2"]
}

We also added the weight configuration to balance the traffic between those IP addresses.

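Route53 weights are integers from 0 to 255, and each record receives a share of the traffic equal to its weight divided by the sum of the weights for that name. Here the weights add up to 255, so about 95% goes to 10.0.0.1 (242/255) and 5% to 10.0.0.2 (13/255). Also, you will notice the resource names end with “-blue” and “-green”: this is because Terraform doesn’t allow two resources of the same type with the same name, so, following the blue-green methodology, we used those names, but it can be anything you want to identify the different resources.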
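
To double-check a pair of weights before applying them, the expected split can be computed in a few lines. This is a small illustrative sketch: the function name is made up, and it does not model the special case where every weight is zero.

```python
def traffic_shares(weights):
    """Map each set_identifier to its expected fraction of DNS responses."""
    total = sum(weights.values())
    if total == 0:
        # The all-zero case is not modelled in this sketch.
        raise ValueError("at least one weight must be greater than zero")
    return {name: weight / total for name, weight in weights.items()}


shares = traffic_shares({"blue": 242, "green": 13})
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
# blue: 94.9%
# green: 5.1%
```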

Great! We are ready to migrate!

Pre-migration

At this point, we had every domain created in AWS, with each subdomain and configuration ready to go live, identical to how it was in F5.

But wait, how do we make sure that our configuration is correct? Did we really nail it while creating the Terraform files or did we miss anything? Did we configure the whitelabel records correctly? Did the client change anything while we were creating the Terraform files?

As you can imagine, manually checking each of the 70 domains and their subdomains, one by one between Route53 and F5 was not very friendly, so we created a Linux Bash script for that.

How does it work? Easy: we ask both providers and compare their answers for each domain and subdomain. But as Route53 was not in production yet, we had to query the NS of each zone directly, looking them up with the AWS CLI.

We checked 2 main things:

  1. Whether every White Label Record was correctly configured, meaning that the NS of domain.com should look like ns1.domain.com., ns2.domain.com., and so on.
  2. Whether every record in F5 was in fact in Route53 with the same value, and whether any record removed from F5 was still in Route53.
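
The core of both checks is a plain set comparison. Our script was written in Bash, so the following is not that script: it is a minimal sketch of the comparison logic only, with hypothetical function names. It assumes the answers from each provider have already been collected (for example, by querying each name server directly with dig) into dictionaries keyed by record name and type.

```python
def compare_records(f5_records, route53_records):
    """Compare two {(name, type): values} mappings.

    Returns three sorted lists of (name, type) keys: records missing from
    Route53, records present only in Route53, and records whose values differ.
    """
    f5_keys, r53_keys = set(f5_records), set(route53_records)
    missing = sorted(f5_keys - r53_keys)
    extra = sorted(r53_keys - f5_keys)
    different = sorted(
        key
        for key in f5_keys & r53_keys
        if set(f5_records[key]) != set(route53_records[key])
    )
    return missing, extra, different


def white_label_ok(domain, ns_answers, count=4):
    """Check that the NS answers are exactly ns1.<domain>. to ns<count>.<domain>."""
    expected = {f"ns{i}.{domain}." for i in range(1, count + 1)}
    return {ns.lower() for ns in ns_answers} == expected


# Made-up sample data, only to show the shape of the inputs.
f5 = {
    ("www.domain.com.", "A"): {"10.0.0.1", "10.0.0.2"},
    ("mail.domain.com.", "A"): {"10.0.0.3"},
    ("new.domain.com.", "A"): {"10.0.0.9"},
}
route53 = {
    ("www.domain.com.", "A"): {"10.0.0.2", "10.0.0.1"},
    ("mail.domain.com.", "A"): {"10.0.0.4"},
    ("old.domain.com.", "A"): {"10.0.0.5"},
}

missing, extra, different = compare_records(f5, route53)
print("Missing in Route53:", missing)
print("Only in Route53:", extra)
print("Different values:", different)
```

Comparing values as sets means the order in which a provider returns multiple IPs does not produce false alarms.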

That’s how we noticed that the client had made some manual changes and our BIND files (and the Terraform config as well) were already outdated. The obvious solution to this problem: download the updated BIND file, use tfz53 to create the new Terraform file, apply the changes to Route53 with Terraform, and re-run the script to confirm that everything is now up to date.

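Migration time!

First, we replaced 2 of the F5 NS in the registrar with 2 of the new White Label NS, leaving the configuration like this: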

  1. ns1.old-ns.com. -> OLD, F5
  2. ns2.old-ns.com. -> OLD, F5
  3. ns1.domain.com. -> NEW, AWS
  4. ns2.domain.com. -> NEW, AWS

Thanks to the White Label Records, ns1.domain.com. pointed to the AWS reusable delegation set we configured, resolving to the correct IP where the zone is hosted.

It depends on the registrar, but in our case, a few minutes after the modification the domains were already responding with the new NS, showing us the 4 of them: 2 from F5 and 2 from Route53. Keep in mind that some domain providers can take more time to update the configuration.

We left this configuration for a week, just in case something failed but it didn’t. After this week, we replaced the remaining 2 F5 NS in the registrar with the remaining Route53 NS. And again, 0 downtime, in a few minutes it was already responding with the 4 NS records from Route53.

Conclusion

Do you want to know more about AWS, DNS, and Terraform?

Drop us an email at opportunities@edrans.com or check our Career page.

We are an AWS Premier Consulting Partner company. Since 2009 we’ve been delivering business outcomes and we want to share our experience with you. Enjoy!