Kubernetes stateful sets

UNDERSTANDING STATEFULSETS IN KUBERNETES AND WHEN TO USE THEM

Introduction

How to deploy stateful applications is one of the major choices you will have to make as a DevOps engineer. DevOps engineers constantly face decisions such as whether to deploy a stateful application inside or outside a Kubernetes cluster and, if inside, how to deploy it. While a Kubernetes cluster may not be the right home for every application, some stateful apps are good candidates for it.

By the end of this article, you will understand what kinds of applications are called stateful and what the options are for deploying them in Kubernetes. You will also learn what StatefulSets are, why they are used instead of Deployments, and whether Kubernetes is a good fit for your stateful application.

Prerequisites

It helps to understand what pods, clusters, file systems, horizontal scaling, objects, and Deployments mean in Kubernetes.

Kubernetes as an Orchestration Tool

Kubernetes is a container orchestration tool used to deploy containerized applications. Applications deployed in a Kubernetes cluster can be either stateful or stateless.

Developers disagree on whether an app can ever be truly stateless. Some are of the firm opinion that all software carries some degree of state. In some applications the state is not obvious because it does not really affect users, and those are usually called stateless applications. Other applications, like databases, message queues, and caches, have a much higher degree of state because they rely on storage.

A stateless application is a system in which transactions do not refer to past transactions. Each transaction is treated as if it were made from scratch.

Consider a real-world example. Some applications don’t store your prior transactions or usage history, so each time you come back you are treated as a brand-new user. For example, on a news website that doesn’t require you to sign up or log in, you simply browse the pages you need, then leave. When you return, you browse the material like a new user again.

A stateful application, on the other hand, stores previous transactions, relies on that storage, and lets users pick up from where they stopped. Stateful applications are necessary for creating a unique experience for each user. Take banking apps, for example: five people may use the same banking app, but they may not have the same balance, so each person has a different experience of what they can or cannot buy through the app. To ensure a smooth experience for users, storage must be persisted and managed properly.
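
The difference between the two kinds of application can be sketched in a few lines of Python. This is only an illustration: the function and class names are invented for the sketch and do not come from any framework.

```python
# Hypothetical illustration: all names here are invented for this sketch.

def stateless_convert(amount_usd, rate):
    """Stateless: the result depends only on the inputs of this call."""
    return round(amount_usd * rate, 2)


class StatefulWallet:
    """Stateful: each call depends on what earlier calls stored."""

    def __init__(self):
        self.balances = {}  # the state that would have to be persisted

    def deposit(self, user, amount):
        self.balances[user] = self.balances.get(user, 0.0) + amount
        return self.balances[user]


wallet = StatefulWallet()
wallet.deposit("ada", 50.0)
print(wallet.deposit("ada", 25.0))    # 75.0, because the first deposit was remembered
print(stateless_convert(10.0, 1.5))   # 15.0, no matter what was called before
```

If the wallet object is lost, every balance is lost with it, which is why the storage behind a stateful application has to outlive the process that serves it.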

Stateful Applications in Kubernetes

As noted earlier, for a stateful application to function well, storage must be persisted. In its early days Kubernetes was not optimized for stateful applications, and developers faced challenges deploying them, one of which was keeping data consistent.

Stateful applications usually rely on authentication and session cookies to respond to requests. When a user makes a request to your server for the first time, the authentication system is triggered and a session cookie is stored in the user’s browser for later requests.

With the recent widespread adoption of microservices architecture, applications are typically deployed on several servers with a load balancer in front. The load balancer subsequently routes users to servers based on the load on each server. This implies that each time a user wants to interact with your servers, a different server might serve the user.

If the load balancer in front of your application sends the user’s next request to a different server that does not hold the user’s session, the request will be denied. This can cause a loss of revenue for businesses.
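
The session problem described above can be simulated with a small sketch. It assumes a simple round-robin balancer rather than load-based routing, and servers that keep sessions only in their own memory. The class names, server names, and token are made up for the example.

```python
import itertools


class Server:
    """A hypothetical app server that keeps sessions in its own memory."""

    def __init__(self, name):
        self.name = name
        self.sessions = set()  # session tokens known only to this server

    def login(self, token):
        self.sessions.add(token)

    def handle(self, token):
        return "ok" if token in self.sessions else "denied"


class RoundRobinBalancer:
    """Sends each request to the next server in turn (a simplification)."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def next_server(self):
        return next(self._cycle)


servers = [Server("app-0"), Server("app-1")]
balancer = RoundRobinBalancer(servers)

first = balancer.next_server()
first.login("token-123")            # the session is created on app-0 only
second = balancer.next_server()     # the next request lands on app-1
print(second.name, second.handle("token-123"))  # app-1 denied
print(first.name, first.handle("token-123"))    # app-0 ok
```

The request fails on the second server only because the session lives inside one server's memory. Keeping that state somewhere every replica can reach removes the problem.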

Another problem developers faced was data persistence. A Kubernetes cluster does not inherently keep data. When a container runs in a cluster, it gets its own file system, and data is written into that file system. Containers are ephemeral by nature, which means they can die at any time. When a container in a pod dies, its file system dies with it, and a replacement is started almost instantly with a new file system.

This creates difficulties for business-critical workloads, as the data on the file system is lost whenever the pod dies. To solve this problem, Kubernetes uses volumes. A volume is storage that sits outside the container, on the host machine or on external storage, and is mounted into the container’s file system. Because that storage is not native to the container, any data written to it persists even when the container is destroyed.
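
To make the idea of mounting a volume concrete, here is a minimal sketch of a Pod manifest built as a Python dict and printed as JSON, which Kubernetes accepts alongside YAML. The Pod name, image, mount path, and claim name are placeholders chosen for the example, and the sketch assumes a PersistentVolumeClaim called db-data already exists.

```python
import json

# Placeholder names: "db", "db-data", the image and the mount path
# are assumptions made for this sketch.
pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "db"},
    "spec": {
        "containers": [
            {
                "name": "db",
                "image": "example.com/db:1.0",
                # Mount the volume into the container's file system.
                "volumeMounts": [{"name": "data", "mountPath": "/var/lib/data"}],
            }
        ],
        # The volume is backed by a PersistentVolumeClaim, so the data
        # is not tied to the life of the container.
        "volumes": [
            {"name": "data", "persistentVolumeClaim": {"claimName": "db-data"}}
        ],
    },
}

print(json.dumps(pod, indent=2))
```

The printed JSON could be saved to a file and applied with kubectl apply -f. Anything the container writes under the mount path then lands on the claim's storage rather than in the container's own file system.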

Deployment of Stateful Applications to Kubernetes

To deal with some of the problems highlighted above, developers usually deployed their stateless applications in Kubernetes clusters and connected them to stateful applications, like databases, running elsewhere.

With StatefulSets, applications with state can be deployed directly into Kubernetes clusters. A StatefulSet is the Kubernetes controller used to run stateful applications as containers (Pods) in the cluster. Instead of giving each replica Pod a random ID, a StatefulSet gives each Pod a persistent identity: an ordinal number beginning with zero. This gives running applications unique, stable network identities that are retained through rescheduling.
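
The naming scheme behind these identities can be sketched as two small helper functions. The helpers themselves are hypothetical and written only for this article. The patterns they produce follow the Kubernetes convention of naming Pods after the StatefulSet plus an ordinal and resolving them through the StatefulSet's headless Service. The sketch assumes the default cluster domain, cluster.local, and a StatefulSet and Service both named mysql.

```python
def pod_names(statefulset, replicas):
    """Pod names follow <statefulset-name>-<ordinal>, with ordinals starting at 0."""
    return [f"{statefulset}-{i}" for i in range(replicas)]


def pod_dns(pod, service, namespace="default"):
    """Stable DNS name of a Pod behind the StatefulSet's headless Service.

    Assumes the default cluster domain, cluster.local.
    """
    return f"{pod}.{service}.{namespace}.svc.cluster.local"


names = pod_names("mysql", 3)
print(names)                       # ['mysql-0', 'mysql-1', 'mysql-2']
print(pod_dns(names[0], "mysql"))  # mysql-0.mysql.default.svc.cluster.local
```

Because the name is derived from the ordinal and not generated at random, a Pod that is rescheduled comes back as mysql-0 again, and other Pods can keep addressing it by the same DNS name.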

StatefulSets scale pods up and down in an orderly way and ensure that data volumes are not destroyed. When containers are shut down for any reason, the data on their volumes is still protected, and when the containers are restarted, they are simply reattached to the existing volumes.
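
The ordering and volume behaviour can be illustrated with a toy model. This is not the real controller, only a simulation of its default behaviour as described above: Pods are created from the lowest ordinal upward, removed from the highest ordinal downward, and their volume claims are left in place. The class name and the claim template name "data" are assumptions for the sketch.

```python
class StatefulSetSim:
    """Toy model of StatefulSet scaling; not the real controller."""

    def __init__(self, name, claim_template="data"):
        self.name = name
        self.claim_template = claim_template
        self.pods = []       # running Pods, in ordinal order
        self.claims = set()  # PersistentVolumeClaims that exist

    def scale(self, replicas):
        events = []
        # Scale up: create Pods in order 0, 1, 2, ...
        while len(self.pods) < replicas:
            pod = f"{self.name}-{len(self.pods)}"
            # Claim names follow <template-name>-<pod-name>. Adding to a set
            # means an existing claim is reused rather than recreated.
            self.claims.add(f"{self.claim_template}-{pod}")
            self.pods.append(pod)
            events.append(f"create {pod}")
        # Scale down: remove the highest ordinal first and keep its claim.
        while len(self.pods) > replicas:
            events.append(f"delete {self.pods.pop()}")
        return events


web = StatefulSetSim("web")
print(web.scale(3))        # ['create web-0', 'create web-1', 'create web-2']
print(web.scale(1))        # ['delete web-2', 'delete web-1']
print(sorted(web.claims))  # ['data-web-0', 'data-web-1', 'data-web-2']
```

After scaling down to one replica, all three claims still exist, so scaling back up would attach web-1 and web-2 to the data they had before.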

Should You Deploy Stateful Apps into Kubernetes?

Whether your stateful application should be deployed to Kubernetes depends on your needs and requirements as an organization. Some developers are of the opinion that the StatefulSet workload API provided by Kubernetes is too simplistic to run critical stateful workloads like databases. Databases require several operational tasks, including backup creation, data replication, cluster expansion, standby cluster creation, disaster recovery, monitoring, and more.

A combination of Operators and Custom Resource Definitions (CRDs) has been advocated as the proper way to deploy database workloads in Kubernetes. With Operators and CRDs, Kubernetes objects are created to represent database clusters, which can then be managed like any other resource, while the Operator watches the Kubernetes API for changes and takes action. Ultimately, there is no hard and fast rule on how to deploy stateful workloads in Kubernetes. It all depends on the needs of the organization.
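
The "watch and take action" behaviour of an Operator is often described as a reconcile loop: compare what the custom resource asks for with what is actually running, then act on the difference. The sketch below shows that idea in plain Python with no Kubernetes client involved. The field names replicas and backup_enabled are hypothetical stand-ins for whatever a database custom resource might expose.

```python
def reconcile(desired, observed):
    """Compare desired state with observed state and list the actions needed.

    Both dicts are hypothetical: 'replicas' and 'backup_enabled' stand in
    for fields a database custom resource might expose.
    """
    actions = []
    diff = desired["replicas"] - observed["replicas"]
    if diff > 0:
        actions.append(f"add {diff} replica(s)")
    elif diff < 0:
        actions.append(f"remove {-diff} replica(s)")
    if desired["backup_enabled"] and not observed["backup_enabled"]:
        actions.append("schedule backups")
    return actions


desired = {"replicas": 3, "backup_enabled": True}    # what the custom resource asks for
observed = {"replicas": 1, "backup_enabled": False}  # what is running right now
print(reconcile(desired, observed))  # ['add 2 replica(s)', 'schedule backups']
print(reconcile(desired, desired))   # []
```

A real Operator would run this kind of comparison every time the API reports a change and would carry out the actions itself. That is how database tasks like replication and backups end up handled in code instead of by hand.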

In summary, while Kubernetes appears to have become the de facto container orchestration tool, deploying stateful workloads in Kubernetes is still a bit of a challenge for developers. Different developers customize their approaches based on their own challenges and business demands. However, a combination of Operators and CRDs shows some promise as a solution.

Olusegun The Lawyer turn coder
