Portworx Tutorial : Demonstrate HA Cassandra Stateful Application

Portworx is a popular Kubernetes persistent storage and Docker storage solution. It’s a clustered block storage solution and provides a Cloud-Native layer from which containerized stateful applications programmatically consume block, file, and object storage services directly through the scheduler.

With Portworx, you can manage any database or stateful service on any infrastructure using any container scheduler. You get a single data management layer for all of your stateful services, no matter where they run.

In this post, we will learn how to deploy Cassandra to Kubernetes and use Portworx Volumes to provide HA capability:

Install and configure Portworx
Use the Portworx Storage Class to create a PVC with 2 replicas of the data
Use a simple YAML file to deploy Cassandra using this storage class
Validate data persistence by deleting the Cassandra pod

First, we will deploy Cassandra in a StatefulSet with a single node (replicas=1) to show the basics of node failover. We will create sample data, force Cassandra to flush the data to disk, and then fail over the Cassandra pod and show how it comes back up with its data intact. Then, we will scale the cluster to 3 nodes and dynamically create a volume for each.

Quick Snapshot

Step #1: Validate Kubernetes
Step #2: Install Portworx
Step #3: Create StorageClass
Step #4: Deploy Cassandra
Step #5: Create a Cassandra Database
Step #6: Delete Cassandra Instance
Step #7: Verify data is still available
Step #8: Scale the cluster

Step #1: Validate Kubernetes

Use kubectl get nodes to check if the Kubernetes nodes are ready.

Image – Kubernetes nodes are ready
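As a rough sketch of the same check in script form, the helper below reads the output of kubectl get nodes --no-headers and succeeds only when every node reports Ready. The function name all_nodes_ready is hypothetical and not part of the original walkthrough.

```shell
# Hypothetical helper (not from the original post).
# Reads `kubectl get nodes --no-headers` output on stdin and succeeds only
# if at least one node is listed and every STATUS column is exactly "Ready".
# A cordoned node ("Ready,SchedulingDisabled") is deliberately treated as not ready.
all_nodes_ready() {
  awk '$2 != "Ready" { bad = 1 } END { exit (bad || NR == 0) }'
}

# Usage against a live cluster:
#   kubectl get nodes --no-headers | all_nodes_ready && echo "all nodes are Ready"
```

Parsing the plain-text columns keeps the sketch dependency-free; a stricter check could query the node conditions instead.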

Step #2: Install Portworx

Portworx needs dedicated storage on at least two or three nodes in the cluster. It then carves virtual volumes out of these storage pools. In this example, we use a 20GB block device that exists on each node.

Image – Choose the device to install portworx

Image – Install Portworx

In the install command above, note the following parameters:

c=px-demo specifies the cluster name
b=true specifies to use internal etcd
kbVer=${VER} specifies the Kubernetes version
s=/dev/vdb specifies the block device to use
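Since the install command itself is only shown as an image, here is a minimal sketch of how the four parameters above could be assembled into a spec URL. The base URL (install.portworx.com) and the placeholder Kubernetes version are assumptions; check them against the Portworx spec generator before applying anything.

```shell
# Placeholder values for illustration only.
VER="1.14.0"             # assumed Kubernetes version; normally read from the cluster
CLUSTER_NAME="px-demo"   # c=   cluster name
DEVICE="/dev/vdb"        # s=   block device Portworx should use

# Assumption: the spec generator is served from install.portworx.com and
# accepts the query parameters listed above.
SPEC_URL="https://install.portworx.com/?c=${CLUSTER_NAME}&b=true&kbVer=${VER}&s=${DEVICE}"
echo "$SPEC_URL"

# The generated spec would then be applied with:
#   kubectl apply -f "$SPEC_URL"
```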

Use kubectl get pods -n kube-system -l name=portworx -o wide to check that the Portworx pods are ready and in the Running state.

Image – Portworx pods are ready

You can also check the cluster status with the pxctl command.
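One way to do this, sketched under assumptions: pxctl is run inside one of the Portworx pods via kubectl exec. The helper name px_status is hypothetical, and the binary path /opt/pwx/bin/pxctl is assumed to match your Portworx image.

```shell
# Hypothetical helper: run `pxctl status` inside one of the Portworx pods.
# Assumes the pods carry the label name=portworx in kube-system and that
# pxctl lives at /opt/pwx/bin/pxctl inside the pod.
px_status() {
  px_pod="$(kubectl get pods -n kube-system -l name=portworx \
    -o jsonpath='{.items[0].metadata.name}')"
  kubectl exec -n kube-system "$px_pod" -- /opt/pwx/bin/pxctl status
}

# Usage: px_status
```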

Now that the Portworx cluster is ready, we can proceed to the next step.

Step #3: Create StorageClass

StorageClass provides a way to describe the “classes” of storage. Various classes might map to quality-of-service levels, or to backup policies, or to arbitrary policies determined by the cluster administrators.

Storage classes differ according to the needs of the application. For our scenario, we define the storage class below with a replication factor of 2 to speed up Cassandra node recovery, and with a group name for Cassandra so that we can take 3DSnapshots.

Image – Cassandra StorageClass

Refer to the Portworx documentation for the full list of supported volume parameters.
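The StorageClass in the screenshot is not reproduced here, so the following is a minimal sketch of what it might look like, written to a file from the shell. The name px-storageclass is the one referenced by the StatefulSet in Step #4 and repl: "2" follows the text; the provisioner name and the group value cassandra_vg are assumptions for illustration.

```shell
# Write a StorageClass manifest matching the description above.
# Assumptions: the in-tree Portworx provisioner name and the group value
# "cassandra_vg" are illustrative, not taken from the original post.
cat > px-storageclass.yaml <<'EOF'
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: px-storageclass
provisioner: kubernetes.io/portworx-volume
parameters:
  repl: "2"               # replication factor of 2
  group: "cassandra_vg"   # hypothetical volume group name, used for group snapshots
EOF

# The manifest would then be created with:
#   kubectl create -f px-storageclass.yaml
```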

Create the storage class using the kubectl create command.

Image – Create the storage class

In production environments, you should also add the "fg=true" parameter to your StorageClass so that Portworx places each Cassandra volume and its replicas on separate nodes. That way, after a node failure, a Cassandra pod never fails over to a node where another Cassandra pod is already running. Enabling this feature with a group of 3 volumes and 2 replicas each requires a minimum of 6 worker nodes.
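The node count follows directly from the placement rule: if every replica of every volume must sit on its own node, the minimum is volumes times replicas. A small sketch of that arithmetic, with variable names chosen here for illustration:

```shell
VOLUMES=3    # one volume per Cassandra node once the cluster is scaled to 3
REPLICAS=2   # repl value from the StorageClass
MIN_WORKERS=$((VOLUMES * REPLICAS))
echo "minimum worker nodes: ${MIN_WORKERS}"
```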

With the StorageClass ready, let’s deploy Cassandra on the cluster.

Step #4: Deploy Cassandra

In this step, we deploy Cassandra using a StatefulSet, starting with a single node (replicas: 1) that we scale to 3 nodes in Step #8. A StatefulSet manages stateful applications by giving each of its Pods a sticky identity: Kubernetes keeps a persistent identifier for each Pod and preserves it across any rescheduling.

Create the following Cassandra StatefulSet, whose volume claim template uses the Portworx StorageClass created in the previous step.

apiVersion: v1
kind: Service
metadata:
  labels:
    app: cassandra
  name: cassandra
spec:
  clusterIP: None
  ports:
  - port: 9042
  selector:
    app: cassandra
---
# apps/v1beta1 (used when this post was written) is no longer served by
# Kubernetes 1.16+; apps/v1 additionally requires an explicit selector.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: cassandra
spec:
  serviceName: cassandra
  replicas: 1
  selector:
    matchLabels:
      app: cassandra
  template:
    metadata:
      labels:
        app: cassandra
    spec:
      # Use the stork scheduler to enable more efficient placement of the pods
      schedulerName: stork
      containers:
      - name: cassandra
        image: gcr.io/google-samples/cassandra:v14
        imagePullPolicy: Always
        ports:
        - containerPort: 7000
          name: intra-node
        - containerPort: 7001
          name: tls-intra-node
        - containerPort: 7199
          name: jmx
        - containerPort: 9042
          name: cql
        resources:
          limits:
            cpu: "500m"
            memory: 1Gi
          requests:
            cpu: "500m"
            memory: 1Gi
        securityContext:
          capabilities:
            add:
            - IPC_LOCK
        lifecycle:
          preStop:
            exec:
              # Stop the Cassandra JVM and wait until the process has exited
              command: ["/bin/sh", "-c", "PID=$(pidof java) && kill $PID && while ps -p $PID > /dev/null; do sleep 1; done"]
        env:
        - name: MAX_HEAP_SIZE
          value: 512M
        - name: HEAP_NEWSIZE
          value: 100M
        - name: CASSANDRA_SEEDS
          value: "cassandra-0.cassandra.default.svc.cluster.local"
        - name: CASSANDRA_CLUSTER_NAME
          value: "K8Demo"
        - name: CASSANDRA_DC
          value: "DC1-K8Demo"
        - name: CASSANDRA_RACK
          value: "Rack1-K8Demo"
        - name: CASSANDRA_AUTO_BOOTSTRAP
          value: "false"
        - name: POD_IP
          valueFrom:
            fieldRef:
              fieldPath: status.podIP
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        readinessProbe:
          exec:
            command:
            - /bin/bash
            - -c
            - /ready-probe.sh
          initialDelaySeconds: 15
          timeoutSeconds: 5
        # These volume mounts are persistent. They are like inline claims,
        # but not exactly because the names need to match exactly one of
        # the stateful pod volumes.
        volumeMounts:
        - name: cassandra-data
          mountPath: /cassandra_data
  # These are converted to volume claims by the controller
  # and mounted at the paths mentioned above.
  volumeClaimTemplates:
  - metadata:
      name: cassandra-data
      annotations:
        volume.beta.kubernetes.io/storage-class: px-storageclass
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 1Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: cqlsh
spec:
  containers:
  - name: cqlsh
    image: mikewright/cqlsh
    command:
    - sh
    - -c
    - "exec tail -f /dev/null"

Create the StatefulSet using the kubectl create command.

Image – Create a Cassandra StatefulSet

Use the kubectl get pods command to validate that the pod is READY.

Image – Validate if pods are ready

As an optional step, you can use the pxctl command line to inspect the volume underlying the Cassandra pod that we have created.

Image – Inspect volume using pxctl

From the output, note the following:

State indicates that the volume is attached and shows the node to which it is attached. This is the node where the Kubernetes pod is running.
HA shows the number of configured replicas for this volume.
Labels shows the name of the PVC for this volume.
Replica sets on nodes shows the Portworx nodes on which the volume is replicated.
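A possible way to produce that output, sketched under assumptions: look up the volume bound to the PVC, then run pxctl volume inspect inside a Portworx pod. The helper name inspect_pvc_volume is hypothetical, and the pxctl path is assumed to be /opt/pwx/bin/pxctl.

```shell
# Hypothetical helper: inspect the Portworx volume behind a PVC.
# Assumes StatefulSet PVC naming (<claim template>-<pod name>), e.g.
# cassandra-data-cassandra-0, and pxctl at /opt/pwx/bin/pxctl in the Portworx pod.
inspect_pvc_volume() {
  pvc="$1"
  vol="$(kubectl get pvc "$pvc" -o jsonpath='{.spec.volumeName}')"
  px_pod="$(kubectl get pods -n kube-system -l name=portworx \
    -o jsonpath='{.items[0].metadata.name}')"
  kubectl exec -n kube-system "$px_pod" -- /opt/pwx/bin/pxctl volume inspect "$vol"
}

# Usage: inspect_pvc_volume cassandra-data-cassandra-0
```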

Now that we have Cassandra ready, we can create a sample database and populate some data.

Step #5: Create a Cassandra Database

Initialize a sample database on our Cassandra instance using CQL commands.

Image – Connect to CQL Shell session

The next step is to create a keyspace with a replication factor of 3 and insert some sample data:

Image – Create a keyspace and insert sample data

Once the data is inserted, query it to confirm that the rows were created.

Image – Select rows from the keyspace

Now that the records are created, we can test whether failover works. Before that, we have to flush the in-memory data to disk with the nodetool flush command, so that when Cassandra starts on another node it has access to the data that was just written. By default, Cassandra keeps recently written data in memory and may not flush it to disk for several minutes.

Image – Flush data to disk
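The whole of Step #5 can be sketched as a small script. The keyspace, table, and rows below are made up for illustration (only the replication factor of 3 comes from the text), and load_and_flush is a hypothetical helper that pipes the statements to cqlsh in the cqlsh pod and then flushes cassandra-0.

```shell
# Write sample CQL to a file. Keyspace, table, and rows are illustrative only.
cat > demo.cql <<'EOF'
CREATE KEYSPACE IF NOT EXISTS demodb
  WITH REPLICATION = {'class': 'SimpleStrategy', 'replication_factor': 3};
CREATE TABLE IF NOT EXISTS demodb.emp (
  emp_id int PRIMARY KEY,
  emp_name text,
  emp_city text
);
INSERT INTO demodb.emp (emp_id, emp_name, emp_city) VALUES (1, 'Alice', 'Chennai');
INSERT INTO demodb.emp (emp_id, emp_name, emp_city) VALUES (2, 'Bob', 'Pune');
SELECT * FROM demodb.emp;
EOF

# Hypothetical helper: feed the file to cqlsh in the cqlsh pod, then flush
# memtables to disk on cassandra-0. Depending on the image versions, cqlsh
# may additionally need a --cqlversion=... argument.
load_and_flush() {
  kubectl exec -i cqlsh -- cqlsh cassandra-0.cassandra.default.svc.cluster.local < demo.cql &&
  kubectl exec cassandra-0 -- nodetool flush
}

# Usage: load_and_flush
```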

Step #6: Delete Cassandra Instance

Let us simulate a failure by cordoning the node where Cassandra is running and then deleting the Cassandra pod. The pod is then rescheduled onto one of the other nodes that hold a replica of the data.

Image – Delete Cassandra instance

Once the Cassandra pod is deleted, Kubernetes starts creating a new Cassandra pod on another node. Use kubectl get pods to verify that, once the pod comes back up, it is in the Running state and READY (1/1).

Image – Verify replacement pod starts running

Also, we have to uncordon the node before the next step.

Image – Uncordon node
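Put together, Step #6 might look like the sketch below: find the node hosting the pod, cordon it, delete the pod, poll until the replacement is ready, and uncordon. The helper name simulate_failover and the 5-second polling interval are choices made here, not taken from the original post.

```shell
# Hypothetical helper: simulate a node failure for one StatefulSet pod.
simulate_failover() {
  pod="$1"
  # Node currently hosting the pod
  node="$(kubectl get pod "$pod" -o jsonpath='{.spec.nodeName}')"
  # Keep the scheduler from placing the replacement on the same node
  kubectl cordon "$node" &&
  kubectl delete pod "$pod" || return 1
  # Poll until the replacement pod (same name) reports its container ready
  until [ "$(kubectl get pod "$pod" \
      -o jsonpath='{.status.containerStatuses[0].ready}' 2>/dev/null)" = "true" ]; do
    sleep 5
  done
  # Make the node schedulable again before the next step
  kubectl uncordon "$node"
}

# Usage: simulate_failover cassandra-0
```

Polling by pod name works here because a StatefulSet recreates the pod under the same name; kubectl get pods -o wide shows which node it landed on.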

With the new Cassandra pod running, let’s check whether the database we previously created is still intact.

Step #7: Verify data is still available

Let’s start a CQL Shell session and validate if the data is available.

Image – Verify if data is still available
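A non-interactive alternative to opening a shell session, as a sketch: pass a single statement to cqlsh with its -e option. The helper name run_cql is hypothetical, and the keyspace and table in the usage line are placeholders for whatever you created in Step #5.

```shell
# Hypothetical helper: run one CQL statement from the cqlsh pod.
run_cql() {
  kubectl exec cqlsh -- cqlsh cassandra-0.cassandra.default.svc.cluster.local -e "$1"
}

# Usage (substitute the keyspace and table you created in Step #5):
#   run_cql "SELECT * FROM my_keyspace.my_table;"
```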

Congrats! Our data survived the node failure.

Step #8: Scale the cluster

We will scale our Cassandra StatefulSet to 3 replicas using the kubectl scale command.

Image – Scale the cluster

You can watch the pods getting added:

Image – Cluster scaled

It will take a minute or two for all three Cassandra nodes to come online and discover each other.
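The scaling step can be sketched as follows. The helper name scale_cassandra is hypothetical; the StatefulSet name cassandra and the label app=cassandra come from the manifest in Step #4.

```shell
# Hypothetical helper: scale the Cassandra StatefulSet and list its pods.
scale_cassandra() {
  replicas="$1"
  kubectl scale statefulset cassandra --replicas="$replicas" &&
  kubectl get pods -l app=cassandra -o wide
}

# Usage: scale_cassandra 3
# Once all pods are Ready, ring membership can be checked with:
#   kubectl exec cassandra-0 -- nodetool status
```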

Author: Karthik

Publisher: Upnxtblog


Published by Karthik, 5 years ago

Tags: cassandra, kubernetes, portworx
