
An Introduction to Kubernetes

September 21, 2023

An introduction to the most widely used open-source container orchestrator, its achievements and significance through case studies of OpenAI and Spotify, and a repository for creating your first K8s cluster.

Fig 1. Image Source: Storage on Kubernetes

Table of Contents

  • A Brief History of Deployments
  • Kubernetes
  • Features of Kubernetes
  • Major Components of Kubernetes
  • Control Plane
  • Worker Nodes
  • Basic K8s Objects in a Kubernetes Cluster
  • Kind: Deployment
  • Kind: Service
  • Kind: HorizontalPodAutoscaler
  • Kind: ConfigMaps
  • Case Study: OpenAI
  • Case Study: Spotify
  • Repository: Creating and Deploying to a Local K8s Cluster
  • Conclusion
  • References
  • Additional Reading (Optional)

Prerequisites

You should have a decent knowledge of container images and containers, especially Docker containers and deployments.

If you are already familiar with these, skip ahead to the section A Brief History of Deployments.

A Brief History of Deployments

Fig 2. Source: Kubernetes Documentation

Containers are a game-changer in modern software development and deployment. They provide a lightweight, portable, and scalable platform, allowing for agile application creation and deployment, continuous development and integration, environmental consistency, and efficient resource utilization and isolation. With the benefits of containers, organizations can improve their software development and deployment practices, enabling faster time to market, higher application performance, and better resource utilization.

Imagine managing hundreds of containers deployed in production. Working with multiple components, from setting them up and monitoring them to fixing issues manually if and when they fail, is an immense task. A few examples are:

  • Restarting a (Docker) container when it fails.
  • Balancing load (traffic) across several containers that provide the same service.
  • Performing rollouts and rollbacks.
  • Allocating resources per service.

Kubernetes

Kubernetes is a powerful tool for managing containerized applications in production environments. It is a portable, extensible, open-source platform for managing containerized workloads and services.

The beauty of Kubernetes is that an organization can conveniently create and use clusters either in the cloud (e.g. GCP, AWS, or Azure) or on-premises. This flexibility is what makes the tool worth learning and using.

With Kubernetes, you can describe the desired state for your deployed containers while it handles scaling and failover and ensures that your deployments are stable.

Features of Kubernetes

Kubernetes helps manage containers in a production environment by providing features such as:

  • Service discovery and load balancing: Containers deployed in a cluster can be accessed through DNS or their IP address; furthermore, Kubernetes can automatically load-balance and distribute network traffic.
  • Storage orchestration: Kubernetes allows you to automatically mount a storage system of your choice, such as a local volume, a public cloud provider’s storage, and more.
  • Automated rollouts and rollbacks: Consider rolling out changes to a deployed application. Through Kubernetes, one can set a desired state (the version of the deployment the user wants), and Kubernetes transforms the current state to the desired one at a controlled, user-defined rate while ensuring zero downtime. Rolling back to a previous state works the same way.
  • Automatic bin packing: One provides Kubernetes with a cluster of nodes that it can use to run containerized tasks, and specifies how much CPU and memory (RAM) each container needs. This prevents over-allocating resources to any one container.
  • Self-healing: Kubernetes restarts containers that fail, replaces them, and kills containers that don’t respond to a user-defined health check (see the sketch after this list). This means that the deployment will always try to achieve the user-defined state.
  • Secret and configuration management: Sensitive information, like passwords, OAuth tokens, and configurations, is typically baked into deployed containers. With Kubernetes, you can conveniently deploy and update these separately, saving you the hassle of rebuilding entire Docker images whenever the configuration changes.
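
To make the self-healing feature concrete, the following is a minimal sketch of a user-defined health check: a liveness probe. The Pod name and the /healthz endpoint are hypothetical; the image and port are borrowed from this article’s later examples.

apiVersion: v1
kind: Pod
metadata:
  name: liveness-demo            # hypothetical name
spec:
  containers:
  - name: app
    image: k8s_demo:v1           # illustrative image
    livenessProbe:
      httpGet:
        path: /healthz           # assumed health endpoint exposed by the app
        port: 4000
      initialDelaySeconds: 5     # wait 5 seconds before the first check
      periodSeconds: 10          # probe every 10 seconds
    # If the probe keeps failing, the kubelet kills and restarts this container.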

Major Components of Kubernetes

The components shown in Fig 3 are the basics and foundation of every Kubernetes cluster. The following section showcases each element and its purpose, followed by a summary of how the overall flow looks.

The cluster is divided into two major parts:

  • Control Plane
  • Worker Nodes

Control Plane

The control plane is the orchestration layer through which one

Defines, deploys and manages the lifecycle of containers.

The control plane in Kubernetes acts as the brain of the cluster, making high-level decisions and managing resources.

The Worker nodes in Kubernetes serve as the hands-on execution units, running containers and handling application workloads.

From the figure above, it can be observed that multiple (stacked) control plane nodes can be created along with their supporting components, except for the etcd. This is what setting up a “Highly Available Kubernetes Cluster” looks like. However, for most applications, a single node within the control plane is sufficient, eliminating the need for additional nodes.

The control plane consists of the following components:

  • API Server.
  • Controller Manager.
  • Scheduler.
  • etcd.
  • Cloud Controller Manager (Optional).

API server

The API server is the front end for the Kubernetes control plane and thus serves as an entry point to the Kubernetes cluster.

All interaction with the cluster goes through this component.

This component is accessible through:

  • UI (the Kubernetes Dashboard).
  • APIs (through scripts or automation tooling). Note: I’ve never used this personally; I saw it mentioned in a resource I’ll share at the end.
  • CLI (kubectl). This is the most powerful and most commonly used option.

Fig 5. Using kubectl through the CLI (terminal in Ubuntu).

Controller Manager

Several controllers run in the control plane. To reduce complexity, these are compiled into a single binary and run in a single process.

The main task of the controller manager is to

Watch what’s happening in the cluster and respond accordingly.

For example, when a container dies within a node, the controller manager instructs the appropriate controller to restart it.

A few examples of the controllers are:

  • Node controller -> Monitors nodes within the cluster.
  • Job controller -> Watches for job/one-off tasks and creates pods (more detail on pods in the following section) to execute them.

Scheduler

Newly created Pods are not assigned to a worker node. This component:

Automates the task of selecting a worker node for a Pod and assigning it.

The scheduler examines each worker node, considering its overall capacity and utilization. From this set of nodes, it selects the one with the optimal availability as the target for placing the incoming Pod.

The selection criteria, however, also include individual and collective resource requirements, hardware/software/policy constraints, affinity and anti-affinity specifications, data locality, inter-workload interference, and deadlines.

Personally, that’s a huge workload off my plate. Instead of spending time on these, I can sip tea while Kubernetes does this.
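
As a concrete (and hypothetical) illustration of such constraints, the sketch below shows a Pod that the scheduler may only place on nodes labeled disktype=ssd and that have enough spare CPU and memory. The Pod name and node label are made up for this example.

apiVersion: v1
kind: Pod
metadata:
  name: scheduling-demo          # hypothetical name
spec:
  nodeSelector:
    disktype: ssd                # hard constraint: only nodes labeled disktype=ssd qualify
  containers:
  - name: app
    image: k8s_demo:v1           # illustrative image
    resources:
      requests:
        cpu: 50m                 # the scheduler only considers nodes with this much spare CPU
        memory: 320M             # ...and this much spare memory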

etcd

etcd is a consistent and highly available key-value store that

Holds the records of the cluster at any given point in time.

It serves as the single source of truth for all our cluster data, with a catch: it exists within the cluster. In case of a disaster, such as losing all control plane nodes, the etcd data would be irrecoverable. Therefore, one ought to create regular backups of this data.

The following figure shows how to save this data and what a ‘snapshot’ of the etcd looks like.

Fig 6. Snapshot of the etcd.

Cloud Controller Manager

This manager follows the same architecture as the Controller Manager, the difference being the set of controllers it manages. By default, Kubernetes clusters are designed to run on-premises, which is why the Controller Manager we saw earlier is a core component.

This links our cluster with the cloud provider’s API and decouples cluster components from the cloud components.

This component is optional unless one uses a cloud service provider (e.g. Amazon, Google, Azure).

For example, if one uses cloud services, the worker nodes will be created in the cloud. Here, the Node Controller (different from the one in the core Controller Manager) checks with the cloud provider to determine whether a node that has stopped responding has been deleted in the cloud. There are other controllers as well:

  • Route Controller is used for setting up routes in the underlying cloud infrastructure.
  • Service Controller is meant for creating, updating and deleting cloud provider load balancers.

The following figure describes the AWS Cloud Controller Manager.

Fig 7. Section of README of Kubernetes / cloud-provider-aws on GitHub.

Worker Nodes

These nodes host our containers and the necessary components to ensure seamless operation. There are three main components:

  • Kubelet.
  • Kube-proxy.
  • Pods.

Kubelet

This agent runs on each worker node; its job is to ensure the following:

Containers described in the PodSpecs are running and healthy.

This component only manages the Pods created by Kubernetes (we’ll see what Pods and PodSpecs look like in the example in the Pods section below).

Kube-proxy

An important concept unique to Kubernetes is “Service”. Service is defined as:

A way to expose an application running on a set of Pods as a network service.

The only way to interact with containers running in Pods from outside Kubernetes is to define an “external service”: a Service that connects one of its ports to the port of our container.

This component maintains network rules on the node and thus allows communication between our Pods and network sessions that exist either inside or outside our cluster.

Pods

According to Oxford Languages,

A small herd or school of marine animals, especially whales.

The following is an extract beautifully put in the official documentation:

Pods are the smallest deployable units of computing that you can create and manage in Kubernetes.

An important point here is that a Pod is not a container, but rather a wrapper inside which one or more containers exist. K8s manages the Pods rather than the containers.

Pods can be used in 2 ways:

  • One Container Per Pod: This is the most common approach. Each Pod runs a single container that encapsulates a specific process or application. This approach offers simplicity and isolation, making it ideal for running simple applications that don’t require complex interactions between different services.
  • Multiple Containers Per Pod: This approach is ideal for situations where several processes or services work together to achieve a common goal. With this, one can run multiple containers within the same Pod; they share the same network namespace, allowing them to communicate with each other easily.

Important Notes:

  • Grouping multiple co-located and co-managed containers in a single Pod is a relatively advanced use case. It is preferred to use this pattern only in instances where the containers are tightly coupled.
  • Whether you opt for one container per Pod or multiple containers per Pod, it’s essential to choose the right approach that suits the specific needs of your application.
  • Pods are designed as disposable units. Pods are assigned to a Node in the cluster upon creation (either directly or indirectly by a controller). They remain there until they finish execution, are deleted, are evicted due to lack of resources, or the node fails.
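
To ground this (and the PodSpec mentioned in the Kubelet section), here is a minimal sketch of a single-container Pod. The Pod name is hypothetical; the image and port are borrowed from the Deployment example later in this article.

apiVersion: v1
kind: Pod
metadata:
  name: k8s-demo-pod             # hypothetical name
  labels:
    app: k8s-demo
spec:                            # this "spec" section is the PodSpec the kubelet acts on
  containers:
  - name: k8s-demo
    image: k8s_demo:v1           # illustrative image, reused from the Deployment example below
    ports:
    - containerPort: 4000

In practice, you rarely create bare Pods like this; a controller such as a Deployment (described below) creates and manages them for you.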

Basic K8s Objects in a Kubernetes Cluster

In this article, only four types/kinds of Kubernetes Objects will be shown, elucidated and used in the demonstration.

  • Deployment.
  • Service.
  • HorizontalPodAutoscaler.
  • ConfigMaps (excluded from the demonstration/example below; a brief hypothetical sketch is shown in its section).

Most of the details for each type of K8s object have been extracted from the official documentation.

Kind: Deployment

Deployment provides declarative updates for Pods and ReplicaSets.

You describe a desired state in a Deployment, and the Deployment Controller changes the current state of the cluster to the desired state at a controlled rate.

Deployments are where you define the following specifications for a container (the other fields within specifications have default values):

  • Name: The name by which this container is referred to.
  • Image: The container image that will be pulled. The image can be any of the following:
    1) A local Docker image.
    2) One hosted in a public repository (e.g. Docker Hub).
    3) One stored within a dedicated registry (e.g. GCR or Amazon ECR).
  • Ports: The port of the Docker container that will be exposed.
  • Resources: This defines the resources allocated to a single container. The two most important fields are:
    1) Requests -> The minimum amount of computing resources (e.g. CPU, RAM) required.
    2) Limits -> The maximum amount of computing resources the container is allowed to use.

An example is

apiVersion: apps/v1
kind: Deployment
metadata:
  name: k8s-demo-deployment
spec:
  replicas: 1 
  selector:
    matchLabels:
      app: k8s-demo
  template:
    metadata:
      labels:
        app: k8s-demo
    spec:
      containers:
      - name: k8s-demo
        image: k8s_demo:v1
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 4000
        resources:
          requests:
            memory: 320M
            cpu: 50m
          limits:
            memory: 640M
            cpu: 100m

Kind: Service

Service is a method for exposing a network application that is running as one or more Pods in your cluster.

A key goal of Kubernetes Services is to eliminate the need to modify your existing application to adopt a new and unfamiliar service discovery mechanism. Whether one runs cloud-native code or a containerized legacy application, Kubernetes allows one to execute code within Pods. A Service can expose a group of Pods on the network, enabling clients to interact with them.

The Service API, part of Kubernetes, is an abstraction to help you expose groups of Pods over a network. Each Service object defines a logical set of endpoints (usually, these are Pods) along with a policy about how to make those pods accessible.

By default, Pods cannot be used as an external service and are accessible only within the cluster.

An example is

apiVersion: v1
kind: Service
metadata:
  name: kubernetes-demo-service
  namespace: default
  labels:
    app: k8s-demo
spec:
  type: NodePort  # By default this is ClusterIP, meant to be used as an internal service.
  ports:
  - port: 4000
    targetPort: 4000 # The port of the pod (Must match the container port defined for pod).
                     #  Note: Keeping both port and targetPort same is a good practice for simplicity.
    nodePort: 30001  # Exposes the Service on each Node's IP at the static port specified here.
                     # Note: NodePort has a dedicated range (30000-32767 by default) and cannot be set arbitrarily.
    protocol: TCP
    name: k8s-demo-http
  selector:
    app: k8s-demo

Kind: HorizontalPodAutoscaler

HorizontalPodAutoscaler automatically updates a workload resource (such as a Deployment), with the aim of automatically scaling the workload to match demand.

Horizontal scaling means deploying more Pods to respond to increased load. This is different from vertical scaling, which for Kubernetes would mean assigning more resources (for example, memory or CPU) to the Pods that are already running for the workload.

An example is

apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  labels:
    app: k8s-demo
  name: k8s-demo-hpa
  namespace: default
spec:
  maxReplicas: 5
  minReplicas: 1
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: k8s-demo-deployment
  targetCPUUtilizationPercentage: 20
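  # Scales out when average CPU utilization across the Pods exceeds 20% of their requested CPU.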

Kind: ConfigMaps

ConfigMaps are a K8s object where you store non-confidential data in key-value pairs.

ConfigMaps are different from Secrets (another K8s object, where you store confidential information). Examples of the data commonly stored in ConfigMaps are environment variables, command-line arguments, and configuration files mounted into a volume.
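
Since ConfigMaps are excluded from the demonstration below, here is a minimal, hypothetical sketch; the ConfigMap name and keys are illustrative, not part of this article’s example application.

apiVersion: v1
kind: ConfigMap
metadata:
  name: k8s-demo-config          # hypothetical name
data:
  LOG_LEVEL: "info"              # illustrative non-confidential key-value pairs
  BATCH_SIZE: "32"

A container in a Pod or Deployment could then consume these keys as environment variables:

    # Inside the container spec of a Pod/Deployment:
    envFrom:
    - configMapRef:
        name: k8s-demo-config    # every key above becomes an environment variable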

Case Study: OpenAI

OpenAI is a research organization dedicated to developing advanced artificial intelligence (AI) technologies and making them accessible to everyone. One of its most significant achievements to date is the development of ChatGPT, one of the largest and most sophisticated language models ever created.

A significant obstacle this company had to overcome was the need for an infrastructure for deep learning that would allow experiments to be run either in the cloud or its data center and one which would scale smoothly. The following are extracts from a case study published on Kubernetes.io.

“We use Kubernetes mainly as a batch scheduling system and rely on our autoscaler to dynamically scale up and down our cluster … This lets us significantly reduce costs for idle nodes, while still providing low latency and rapid iteration.”

“One of our researchers who is working on a new distributed training system has been able to get his experiment running in two or three days. In a week or two he scaled it out to hundreds of GPUs. Previously, that would have easily been a couple of months of work.”

Christopher Berner, Head of Compute at OpenAI.

Here’s an extract from a blog published on OpenAI.

“We’ve found Kubernetes to be an exceptionally flexible platform for our research needs. It has the ability to scale up to meet the most demanding workloads we’ve put on it.”

Benjamin Chess and Eric Sigler, Applied Engineering Infrastructure at OpenAI.

An autoscaler for clusters automatically:

  • Procures and utilizes additional computing resources, specifically on cloud service platforms, during periods of high demand, for optimal operational performance.
  • Relinquishes resources when demand is low (to save on costs if one is using cloud services).

Case Study: Spotify

Spotify is a music streaming platform. With over 422 million users in Q1 2022, Spotify is among the top music platforms.

Spotify is built on a cloud-based infrastructure for efficient data storage and retrieval. The platform can handle massive user data, including music preferences, playlists, and more.

An early adopter of microservices and Docker, Spotify had containerized microservices running across its fleet of VMs using its homegrown container orchestration system, Helios (The repository was archived on September 29, 2021).

Spotify started transitioning to Kubernetes in late 2017 after realizing

“Having a small team working on the features was just not as efficient as adopting something that was supported by a much bigger community.”

Jai Chakrabarti, Director of Engineering, Infrastructure and Operations

The team spent much of 2018 addressing the core technology issues required for a migration, which started late that year and was a big focus for 2019. Here are some statements made by the team, according to the case study:

“A small percentage of our fleet has been migrated to Kubernetes, and some of the things that we’ve heard from our internal teams are that they have less of a need to focus on manual capacity provisioning and more time to focus on delivering features for Spotify,”

Jai Chakrabarti, Director of Engineering, Infrastructure and Operations

“The biggest service currently running on Kubernetes takes about 10 million requests per second as an aggregate service and benefits greatly from autoscaling.”

“Before, teams would have to wait for an hour to create a new service and get an operational host to run it in production, but with Kubernetes, they can do that on the order of seconds and minutes.” In addition, with Kubernetes’s bin-packing and multi-tenancy capabilities, CPU utilization has improved on average two- to threefold.

James Wen, Site Reliability Engineer.

In short, the advantages of moving from Helios (an in-house container orchestrator) to K8s (open-source) were:

  • Less time on manual capacity provisioning -> More time for developing, testing and shipping features.
  • Autoscaling, a feature of K8s, allowed their infrastructure to handle up to 10 million requests per second -> Fewer dissatisfied users.
  • Time spent shipping new services and features went down from hours to minutes and seconds.
  • Better resource utilization.

Remember, these numbers are all at the scale of Spotify’s 422 million users.

Creating and Deploying to a Local K8s Cluster

Here is a repository made by yours truly. Everything up to the Docker image was built previously (my first project); it has now been extended to deploy that Docker image to a local K8s cluster and serve requests using batch prediction.

This repository will walk you through the following:

  • Analyzing data and creating a pipeline to extract and clean textual data.
  • Training the following models:
    1) Sentiment Analysis through Word Embeddings.
    2) A GRU implementation of Twitter Sentiment Analysis.
  • Creating an API in Flask to serve requests using batch prediction.
  • Creating a Docker image of your application.
  • Creating a Kubernetes cluster.
  • Deploying your Docker image to your Kubernetes cluster for serving (with a Horizontal Pod Autoscaler to take care of your application’s throughput).

https://github.com/Muhammad-Ibrahim-Khan/Twitter_Sentiment_Analysis_with_Kubernetes

Conclusion

By providing a scalable and efficient infrastructure, Kubernetes ensures that applications can be easily managed, deployed, and scaled to meet the growing demands of users. With Kubernetes, applications can seamlessly scale and adapt to handle varying workloads and traffic.

Kubernetes’ automated rollout feature simplifies various deployment strategies. Whether blue-green or canary, Kubernetes automates the process, ensuring smooth transitions, minimal downtime, and improved release management. Developers can confidently release updates, perform A/B testing, and maintain application availability, streamlining the deployment process for faster time-to-market and an enhanced user experience.

As businesses evolve and technology advances, so does the need for scalable applications. This makes Kubernetes a crucial component in the development and success of modern software systems.

References

Additional Reading (Optional)

Muhammad Ibrahim Khan is a machine learning engineer at Red Buffer. You can contribute to his Twitter Sentiment Analysis project on GitHub via the link above.