A template for a platform

August 22, 2019

“Kubernetes is a platform for building other platforms. It’s a better place to start; not the endgame” – Kelsey Hightower

In a previous blog post I used the above quote from Kelsey about Kubernetes being a starting place. Now I want to show a concrete example of a platform: a design that can be used as a product and expanded upon. To make it easier to follow, I will first show a quick architectural drawing of the platform and then explain the reasoning behind it.

[Figure: yashiro_design]

In the picture above you can see that the design is rather straightforward. We have a control plane where the supporting tools run and where some of the logging flows to. The environments connected to it are there to run the workloads. All of these components are Kubernetes clusters, so instead of running everything on one cluster we run a configuration of 1 + X: one control plane plus X environments.

It is a simple design, but it only sets the stage for the full use of the platform. Kubernetes is a great product, but it cannot provide all of the backing services we need. So in this design there should also be a cloud provider connected to supply databases, virtual networking, container registries and much more.

Product

The most important thing is that we treat the platform as a product. If we don’t, the code can turn into a big ball of mud. For example, a developer needs a database spun up for their service. If we put this in the platform code (IAAC), there is logic in there that is not part of the product. So it matters how we structure the code and how we think about the platform and what lands on it. Because of this I’m going to introduce a level system.

The main driver behind this design is to deploy it with a minimum amount of effort, so everything should be able to run with one click. To get there we need to maintain a separation of concerns and keep the platform code clean. Then you can treat the platform on its own and deploy it as many times as you need. As we will see later, there are multiple uses for this. Treat the whole platform as cattle, not as a pet.

Platform Engineering - SRE

Level 0: The Control plane
Level 1: Environments

App Development - DevOps

Level 2: Developer workflows

Separation of concerns

We need a level system to keep a good separation of concerns. It is also important to determine who is responsible for which part. If you are going to treat the platform as a product, a separate team will build and maintain that platform. The product that the developers are building is the actual business of the company, and the platform is there to support them. That doesn’t mean that the platform team becomes the new operations team. The developers should keep a DevOps mentality of "you build it, you run it".

The control plane is level 0 because it deploys most of the elements of the platform. It spins up its own Kubernetes cluster and the automation tool (in this case Concourse CI). The automation tool then deploys the rest of the environments.

In short: Level 0 -> control plane (with CI) -> CI -> Level 1
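The ordering above can be sketched as a small model. This is only an illustration of the dependency chain, not real provisioning code: the class names, the cluster names and the log format are all assumptions made for the example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Cluster:
    name: str            # hypothetical cluster name
    level: int           # 0 = control plane, 1 = environment
    deployed: bool = False


@dataclass
class Platform:
    control_plane: Cluster
    environments: list[Cluster] = field(default_factory=list)
    ci_running: bool = False
    log: list[str] = field(default_factory=list)

    def bootstrap(self) -> None:
        """Level 0: bring up the control plane cluster, then the CI tool on it."""
        self.control_plane.deployed = True
        self.log.append(f"level 0: cluster {self.control_plane.name}")
        self.ci_running = True
        self.log.append("level 0: ci tool")

    def deploy_environments(self) -> None:
        """Level 1: environments are only ever deployed by the CI tool."""
        if not self.ci_running:
            raise RuntimeError("the CI tool must be running before level 1 is deployed")
        for env in self.environments:
            env.deployed = True
            self.log.append(f"level 1: cluster {env.name}")


def build_platform(env_names: list[str]) -> Platform:
    """Create the 1 + X layout: one control plane plus X environments."""
    return Platform(
        control_plane=Cluster("control-plane", level=0),
        environments=[Cluster(name, level=1) for name in env_names],
    )


if __name__ == "__main__":
    platform = build_platform(["dev", "prod"])
    platform.bootstrap()
    platform.deploy_environments()
    print("\n".join(platform.log))
```

Calling deploy_environments() before bootstrap() raises an error, which is the point of the sketch: level 1 cannot exist without level 0 and its CI tool.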

Level 2

Level 2 is also deployed with the CI tool, but it is separated into a different RBAC group inside the CI tool. Level 2 is to be considered a DevOps tool that the developers can use. So it is not part of the product, and its code should live in multiple separate repositories.

Level 2 is a special case because it cannot be defined in one codebase. There are several ways to organize it, because Level 2 IAAC can have several targets. If you want some infrastructure to land on the control plane that is contextual to the user, it should not go into Level 0 but into a Level 2 repo. To separate this concern we use a Level 2.0 repo. If a piece of infrastructure has to land on the environments globally, so not specific to a workflow but in a more general fashion, it lands in a Level 2.1 repository. A good example of this is an API gateway like Kong. The routes and services, however, are specific to a microservice and have to be contained in a Level 2.2 repo.

Level 2.0 - Platform user targeting the control plane (Level 0)
Level 2.1 - Platform user targeting the global range of the environment (Level 1)
Level 2.2 - Platform user deploying microservice-specific resources, one repo for all microservices
Level 2.2.{team-name} - Specific targeting repo for multiple microservices grouped by team
Level 2.2.{microservice} - Specific targeting repo per microservice
Level 2.2.*.bridge - Any of the above 2.2 repos, but containing only the multi-region services
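The routing rule behind the list above can be written down as a tiny function. This is a minimal sketch under my own assumptions: the Target enum and the function name are invented for the example, and it treats everything a platform user puts on the control plane as Level 2.0.

```python
from enum import Enum


class Target(Enum):
    """Where a piece of Level 2 infrastructure lands (hypothetical names)."""
    CONTROL_PLANE = "control-plane"   # Level 0 cluster
    ENVIRONMENT = "environment"       # Level 1 clusters


def level2_repo(target: Target, microservice_specific: bool) -> str:
    """Return the Level 2 sub-level a resource belongs to."""
    if target is Target.CONTROL_PLANE:
        # User-contextual infrastructure on the control plane.
        return "2.0"
    if microservice_specific:
        # Resources tied to one microservice, e.g. gateway routes.
        return "2.2"
    # Global infrastructure on the environments, e.g. the gateway itself.
    return "2.1"


if __name__ == "__main__":
    # The Kong example: the gateway is global, its routes are not.
    print(level2_repo(Target.ENVIRONMENT, microservice_specific=False))
    print(level2_repo(Target.ENVIRONMENT, microservice_specific=True))
```

With the Kong example, the gateway itself resolves to 2.1 and its routes and services resolve to 2.2.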

The Level 2.2 repos mentioned above are not all mandatory. It is better to choose one strategy for these repos. For example, you can do the terraforming per team and skip the other two. The bridge repository is an exception to this. If you are not deploying the platform in a multi-region fashion you can leave it out. If you are, only place resources in it that are inherently multi-region. For example, Azure Cosmos DB can be deployed in a multi-region fashion.
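To make the choice of strategy concrete, here is a rough sketch that lists the Level 2.2 repositories you would end up with. The "level-2.2" naming scheme, the strategy keywords and the team names are assumptions for illustration only. In particular, giving every 2.2 repo its own ".bridge" companion is just one possible reading of the 2.2.*.bridge pattern.

```python
from __future__ import annotations

from typing import Sequence


def level22_repo_names(
    strategy: str,
    names: Sequence[str] = (),
    multi_region: bool = False,
) -> list[str]:
    """List the Level 2.2 repos for one chosen strategy.

    strategy: "all" (one repo), "team" or "microservice" (one repo per name).
    names: team or microservice names, ignored for the "all" strategy.
    multi_region: when True, add a bridge repo next to every 2.2 repo.
    """
    if strategy == "all":
        repos = ["level-2.2"]
    elif strategy in ("team", "microservice"):
        if not names:
            raise ValueError(f"strategy {strategy!r} needs at least one name")
        repos = [f"level-2.2.{name}" for name in names]
    else:
        raise ValueError(f"unknown strategy: {strategy!r}")

    if multi_region:
        # Bridge repos hold only resources that are multi-region by nature.
        repos += [f"{repo}.bridge" for repo in repos]
    return repos


if __name__ == "__main__":
    for repo in level22_repo_names("team", ["payments", "search"], multi_region=True):
        print(repo)
```

Picking a single strategy keeps the list short: a per-team setup without multi-region needs only one repo per team.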