Part 2: Account Structure and Access Control on V-Platform, the Infrastructure Platform at VTS

Pavel Susloparov
Published in Building VTS
Jun 14, 2022 · 7 min read


Since its inception, VTS had used Heroku as its cloud provider and hosted all of its applications there. Heroku was a great choice at that stage, as it let us rapidly run our applications in a cloud environment with minimal infrastructure management overhead. However, with rapid growth in customers, product lines, and geographical locations, we reached a point where we needed additional capabilities to accelerate product development at scale. In 2021, we boldly chose to embark on a cloud infrastructure transformation journey for growth. This journey had two phases:

  1. We built a state-of-the-art infrastructure platform on AWS and named it the V-Platform.
  2. We worked with several teams to onboard and migrate their applications from Heroku to AWS (The V-Platform).

We set an ambitious goal to complete our journey in less than a year without impacting feature development or client experience during the process.

The previous article in this series describes why we decided to migrate from Heroku to AWS. This article describes the technical details of the centralized tooling (V-Platform) we developed in-house, leveraging AWS Control Tower setup and account customization. We aimed to implement Infrastructure as Code (IaC) and set up processes to enable a fully automated, repeatable, secure, and self-serve method of provisioning new AWS accounts and associated infrastructure. We also had the vision to allow distributed access control so as to not burden the IT team with managing AWS access for our growing engineering team. Read on to learn how we achieved this!

Our Design Considerations

When we started planning for future growth and scaling on AWS, we focused on:

  • developer autonomy vs. centralized control
  • application architecture standardization vs. customization
  • centralized governance over security
  • observability
  • costs

We also had to consider business growth needs, like ease of integration for the companies we were thinking of acquiring. We had several questions that needed answering, such as:

  • How do we want to manage multiple AWS accounts across different product lines?
  • How do we centralize organizational infrastructure like DNS/Networking and give autonomy to teams to modify the infrastructure within their control?
  • How do we enable standard paths for application architectures and their infrastructure?
  • How do we increase developer velocity by providing easy-to-use deployment tooling and visibility into runtimes (like EKS) and their access?

I hope the above points gave you insight into our requirements. We implemented several POCs, and through that process, one of the critical decisions we made was to choose AWS Control Tower to manage our AWS accounts and their guardrails.

AWS Control Tower

AWS Control Tower is AWS's offering of choice for account provisioning. The tool provides the easiest way to set up and govern a secure, multi-account AWS environment. AWS Control Tower also secures every account with guardrails and applies best practices while provisioning accounts within the AWS organization.

We created three AWS organizations to support an SDLC for our automation development:

  • Control Tower Dev
  • Control Tower Stage
  • Control Tower Production

This model lets us test the AWS Control Tower account-provisioning automation: we can make and verify changes in the lower environments (dev/stage) and then promote them to the production organization.

During the AWS organization design, we decided to have two categories of accounts — base, and workload.

  • Base accounts are a set of AWS accounts that AWS Control Tower provisions by default — audit, log-archive, and our custom accounts — IAM, networking, and shared services.
  • Workload accounts are AWS accounts in which the applications and services run.

The primary purpose of this separation is to delineate ownership responsibility for these accounts:

  • The central infrastructure team fully manages base accounts.
  • Workload accounts have shared ownership between application engineers and the central infrastructure team.

One of the goals of the AWS migration project was to equip application engineers with the tools and access to manage their own infrastructure and applications. Engineers should have the knowledge, access, and self-serve tools to bring value to the business. This approach serves the vision of infrastructure ownership as a distributed model across multiple teams, rather than a single team being solely responsible for managing the company's infrastructure.

This account structure helps teams work with their dedicated resources and provides a process to manage them. The following section describes the technical approach to achieving this.

IaC (Infrastructure as Code) Approach

We use open-source Terraform to manage AWS infrastructure. Terraform has the concept of state, which represents the AWS resources as they exist at the current moment.

We store the Terraform state as files in an S3 bucket and use a set of records in a dedicated DynamoDB table for state locking, which keeps track of changes over time and prevents concurrent modifications.
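As a concrete illustration, a Terraform backend for this arrangement might look like the following sketch. The bucket, table, and key names are hypothetical, not VTS's actual values.

```hcl
# Sketch of a remote-state backend: state files live in S3,
# and a DynamoDB table provides state locking.
terraform {
  backend "s3" {
    bucket         = "example-terraform-state"           # illustrative bucket name
    key            = "workload-account/terraform.tfstate" # per-account state file
    region         = "us-east-1"
    dynamodb_table = "example-terraform-locks"           # lock table for concurrent runs
    encrypt        = true
  }
}
```

With this backend, two engineers running `terraform apply` against the same account at once cannot corrupt the state: the DynamoDB lock forces the second run to wait or fail fast.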

IaC is an excellent approach as it gives us visibility into the infrastructure changes. It leverages GitHub history, enables knowledge and pattern sharing, and allows for accountability.

Access Model

Our access model removes the need for IT support to manage access to AWS accounts. This is a huge deal if you have ever tried to have one person or one team control access for every engineer in the company.

VTS uses automation and distributed models instead. The automation leverages Google Groups as the interface for Engineering Directors and Managers to grant and revoke access to AWS accounts for engineers on their teams.

We achieved access segregation and distributed responsibility by setting up proper access to Google Groups.

Once members are added or removed from Google Groups, they are automatically synced with Okta groups, as this is the central tool for SSO management at VTS. Okta groups are then automatically synced with AWS SSO groups. The AWS SSO groups are associated with AWS permission sets and have access policies for the related accounts within the organizations.
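The last link in that chain, mapping an AWS SSO group to a permission set on an account, can itself be expressed in Terraform. The following is a hedged sketch; the group ID, account ID, and permission-set name are placeholders, not our real values.

```hcl
# Illustrative mapping of a (synced) AWS SSO group to a workload account.
data "aws_ssoadmin_instances" "this" {}

resource "aws_ssoadmin_permission_set" "developer" {
  name         = "DeveloperAccess"   # hypothetical permission set
  instance_arn = tolist(data.aws_ssoadmin_instances.this.arns)[0]
}

resource "aws_ssoadmin_account_assignment" "team_dev" {
  instance_arn       = tolist(data.aws_ssoadmin_instances.this.arns)[0]
  permission_set_arn = aws_ssoadmin_permission_set.developer.arn
  principal_id       = "example-sso-group-id" # ID of the group synced from Okta
  principal_type     = "GROUP"
  target_id          = "123456789012"          # hypothetical workload account ID
  target_type        = "AWS_ACCOUNT"
}
```

Because the assignment targets a group rather than individual users, adding an engineer to the Google Group is all it takes to grant them access.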

Account customizations

When AWS Control Tower provisions a new account, the account has guardrails set up to limit actions within the AWS account, even if the access policy has administrator privileges.

In addition, the V-Platform provides a set of AWS resource customizations that are provisioned automatically by default for each account.

The customizations include:

  • Networking access to other accounts within the organization across multiple regions. It includes AWS VPC, AWS NAT Gateway, AWS Transit Gateway, AWS Transit Gateway attachments, and AWS Routes.
  • Public and private AWS Route53 hosted zones to enable public DNS for web applications and TLS communication between applications without exposing traffic to the open internet.
  • ACM to provide public and private certificates for the public and private Route53 hosted zones.
  • AWS KMS keys to encrypt and decrypt secrets, snapshots, and logs.
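To make the DNS/TLS customization concrete, here is a minimal sketch of a public hosted zone paired with a DNS-validated ACM certificate. The domain name is illustrative, and the real customization covers more resources than shown here.

```hcl
# Hypothetical per-account DNS/TLS baseline: a public hosted zone
# plus a wildcard ACM certificate validated over DNS.
resource "aws_route53_zone" "public" {
  name = "team.example.com" # illustrative per-team domain
}

resource "aws_acm_certificate" "public" {
  domain_name       = "*.team.example.com"
  validation_method = "DNS" # validation records land in the zone above

  lifecycle {
    create_before_destroy = true # avoid downtime on certificate renewal
  }
}
```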

AWS Service Catalog

Account and resource provisioning is fully automated in the V-Platform. An engineer can fill out a form in AWS Service Catalog and push a button. This provisions a new AWS account with baseline resources, so application engineers can start developing infrastructure components within minutes!

This automation creates a GitHub repository for the new account, allowing engineers to contribute infrastructure changes for AWS solutions by leveraging Terraform.

This automation was a vital component of a culture shift within VTS from having a central team handle all infrastructure changes, to a collaborative working model that empowers engineers to share knowledge and have end-to-end ownership of their applications.

Account customizations implementation details

AWS Control Tower uses a CloudFormation stack to deploy resources to an account. We use open-source Terraform to trigger that CloudFormation stack.

The CloudFormation stack uses AWS CodeBuild/CodeCommit to trigger multiple Lambda functions that deploy Terraform updates to all accounts in our AWS organizations.
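Triggering a CloudFormation stack from Terraform can be sketched as follows. The stack name, template URL, and parameter are placeholders standing in for the actual V-Platform customization stack, whose details are not shown in this article.

```hcl
# Minimal sketch: Terraform drives a CloudFormation stack deployment.
resource "aws_cloudformation_stack" "account_customization" {
  name         = "account-customization" # hypothetical stack name
  template_url = "https://example-bucket.s3.amazonaws.com/customization.yaml"

  parameters = {
    TargetAccountId = "123456789012" # hypothetical parameter for the target account
  }

  # Required when the template creates named IAM resources.
  capabilities = ["CAPABILITY_NAMED_IAM"]
}
```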

Account customization is a custom AWS Service Catalog product. The product triggers custom JavaScript, which combines the CloudFormation input parameters with a Jinja template during Lambda function execution.

The result of applying parameters to the template is an automatic Pull Request to a central GitHub repository responsible for managing all the AWS accounts in code.

The GitHub repository has a GitHub Actions workflow and a separate Jinja template to create Pull Requests in 30+ account-customization repositories that use open-source Terraform.

With this approach, we can manage all AWS accounts consistently, automatically, and repeatably. We have saved hundreds of engineering hours by providing this level of automation and autonomy to engineers across the organization.

I hope you’ve enjoyed our journey to building a world-class infrastructure platform. We will have a follow-up article describing the process and mindset changes we adopted to drive our migration project to completion. Stay tuned to learn more!
