I have been using Kubernetes in production for a while now, and I would like to record my own mental model and understanding of it in this article, and maybe talk a little bit about my experience with it.
Learning Kubernetes is kind of like learning how to ride a bike. At first, it’s extremely complicated, but after you get the hang of it, it becomes extremely powerful and can get you pretty far along the DevOps path. Not sure if that analogy made any sense, but I’ll just stick with it for now.
I believe the first hurdle to learning Kubernetes is understanding the theory behind it. There are many books and courses out there that go into much greater detail, but my goal for this article is to distill the information into a concise overview of the purpose and high-level architecture of Kubernetes, enough that hopefully even a non-dev would understand. I have on occasion received questions about what Kubernetes is from project managers and other business people, and I recall struggling to provide a clear answer. So hopefully, writing this will help clear up my own understanding of this increasingly popular technology.
As a disclaimer, I do not claim to be a pro on this topic, but I do have experience setting up and managing Kubernetes clusters on GCP/GKE for real clients for over two years. Previously, my experience consisted of deploying websites to standard web hosting services like DreamHost and PaaS like Heroku. With that said, let’s start with “What is Kubernetes”.
Kubernetes is “Container Orchestration”. What does that mean exactly? Well, Kubernetes is a tool that allows you to describe the desired state of your infrastructure in configuration files (aka manifest files). With these files, you can then tell Kubernetes to go and make it happen. I write the configuration files, which specify the containers I want to run and how many copies of each. I send them off to Kubernetes and it magically schedules the containers to start running on my cluster, so long as I have enough resources (CPU and memory) to run them.
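To make that concrete, here is a minimal sketch of what such a manifest file might look like. The image name, labels, and resource numbers are placeholders I made up for illustration, not from any real project:

```yaml
# A hypothetical Deployment manifest: "run three copies of this container".
apiVersion: apps/v1
kind: Deployment
metadata:
  name: backend-service
spec:
  replicas: 3                # how many pods (copies of the container) to run
  selector:
    matchLabels:
      app: backend-service
  template:
    metadata:
      labels:
        app: backend-service
    spec:
      containers:
        - name: backend-service
          image: gcr.io/my-project/backend-service:1.0.0  # placeholder image
          resources:
            requests:
              cpu: 100m      # ask the scheduler for a tenth of a CPU core
              memory: 128Mi
```

You would hand this to the cluster with something like `kubectl apply -f deployment.yaml`, and Kubernetes takes it from there.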
How does the magic happen? Luckily for us, most of this has been abstracted away by the creators of Kubernetes, and it’s not something that we really need to know to get started, although it is probably good to know. In a nutshell, the Kubernetes master server talks to all of the connected nodes, each of which runs a kubelet process. That, to be honest, is about as much as I know. I have never needed to dig into the internals, and have gotten this far.
I forgot to explain the term “cluster” earlier. This word gets thrown around a lot when speaking about Kubernetes, so it’s worth defining here. A cluster is simply a group of nodes that are connected to the same Kubernetes network. A node is simply a compute instance, and a compute instance is simply a unit of compute resources (CPU and memory) that you use to run apps, services, or whatever else you might want to run in the cloud.
So when we send our configuration to Kubernetes and tell it “Hey, Kubernetes master, I want this backend-service container running!”, essentially what happens is the Kubernetes master sends a message to a node with available resources to spin up a “Pod” to run your “backend-service” container. The pod pulls the image from a container registry and spins up a Docker container. Assuming everything goes well, the pod reports back to the master with a status update saying everything is good to go. Then, you as the developer can see the result of the deployment by issuing a `kubectl get pod` command. If it says “Running”, then you know the deployment succeeded. The status will always be up to date, given Kubernetes’ health check mechanism, so you’ll always know whether a pod is alive or dead.
Deployments, Pods, and Services
Deployment is another word that gets thrown around a lot, along with service and pod. Fortunately, these are the three main Kubernetes objects that you’ll need to get familiar with. You can think of a deployment as a set of instructions for Kubernetes to deploy a specific container to your cluster. A pod is the unit of deployment, which typically runs a single container. Sometimes there can be more than one container, as in the case of an Istio sidecar container, but that is outside the scope of this article. And a service is what you use to expose a set of pods to your network under a single stable address.
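A service manifest is usually quite short. Here is a hedged sketch of one that would sit in front of the pods created by the deployment above; the names, labels, and port numbers are assumptions for illustration:

```yaml
# A hypothetical Service: one stable name and port in front of the pods.
apiVersion: v1
kind: Service
metadata:
  name: backend-service
spec:
  selector:
    app: backend-service   # matches the labels on the deployment's pods
  ports:
    - port: 80             # port the service exposes inside the cluster
      targetPort: 8080     # port the container itself listens on (assumed)
```

Other pods in the cluster can then reach the container at `http://backend-service` without caring which node the pods landed on.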
As long as a deployment for a container exists, Kubernetes will work to make sure that container is always up and running. If the pod with the container ever fails, Kubernetes will restart it. This brings us to Kubernetes’ self-healing capability: if you delete a pod on purpose, it will automatically respawn as long as the deployment for it exists. The analogy I use here is that a deployment is like a magical scroll (the manifest file) that summons minions (the pods) to help you achieve your mission. The number of minions (pods) that get summoned is defined by the replicas field in the file.
Ingress and Istio
So once we’ve exposed the service to our private network, how do we expose it to the outside world? For this, we introduce the ingress. The ingress allows traffic from the outside world to flow through to reach our services, routing each request to the right service based on its host and path. As far as I know, the best practice is to use a third-party solution for the ingress functionality. Initially, I started with ingress-nginx, but have since transitioned to Istio, a full service mesh solution with powerful traffic management, monitoring, and tracing features. Istio also seems to be growing in popularity alongside Kubernetes. It is even included as an option during the GKE installation process, which is evidence of its adoption in the community.
Hopefully, this text diagram will suffice in demonstrating the request flow:
Traffic -> Gateway -> Virtual Service -> Service -> Pod
It is also worth noting that when Istio is installed on your Kubernetes cluster on GKE, the gateway is automatically bound to a GCP load balancer and assigned an IP address. You can also assign a static IP, which I may write about in a future post. The Gateway and Virtual Service are Istio-specific resources, but for all intents and purposes, they are what provide the ingress functionality in this example. The analogy I like to use for the ingress is that it is very similar to an application router, like the React Router or the Ruby on Rails router. Essentially, the router is responsible for accepting a request, usually represented by a URL, routing it to the proper place, whether that be a controller action or a page, and returning the response.
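To tie the request-flow diagram above back to actual configuration, here is a sketch of a Gateway and Virtual Service pair. The hostname, gateway name, and path prefix are placeholders I invented; the destination assumes a Kubernetes service named `backend-service` listening on port 80:

```yaml
# Hypothetical Istio Gateway: accept HTTP traffic for one host.
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: my-gateway
spec:
  selector:
    istio: ingressgateway   # bind to Istio's default ingress gateway
  servers:
    - port:
        number: 80
        name: http
        protocol: HTTP
      hosts:
        - "api.example.com"
---
# Hypothetical Virtual Service: route matching paths to our service.
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: backend-service
spec:
  hosts:
    - "api.example.com"
  gateways:
    - my-gateway
  http:
    - match:
        - uri:
            prefix: /api    # host + path routing, like an app router
      route:
        - destination:
            host: backend-service   # the Kubernetes service name
            port:
              number: 80
```

Reading top to bottom, you can trace the same flow as the diagram: traffic enters the gateway, the virtual service matches on host and path, and the request lands on the service and ultimately a pod.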
So why do we have to go through all this in order to get a simple app running? How is this simpler than running `git push heroku master`? I certainly asked myself this when I started using K8s, and I believe it just comes down to your project requirements. Do you need a microservice architecture where containers can speak to each other within your private network? Do you want granular control over the resource allocation for each service running in your infrastructure? Does your project require the orchestration of multiple containers running simultaneously in your cluster? Do you need to add more compute resources to your cluster on demand, or autoscale? Or perhaps you need to “scale your infrastructure without scaling your DevOps team”? If you answered yes to any of these, then Kubernetes is probably a good choice. In general, Kubernetes is more suitable for big, complex projects. For hobby projects, I’d stick with Heroku or DreamHost.
I hope that helps to clarify some of the concepts around K8s and Istio, and didn’t add to the confusion. Please let me know if there is anything incorrect or if there’s a better way to think about these concepts. I am still learning about this topic and would love to hear what you think.