How to aggregate and analyze logs with the EFK stack?

Today we are going to learn how to aggregate logs and analyze them centrally using the EFK stack. The EFK stack comprises Elasticsearch, Fluent Bit, and Kibana.

Elasticsearch is a highly scalable open-source full-text search and analytics engine based on the Lucene library. It provides a distributed, multitenant-capable search engine with an HTTP web interface and schema-free JSON documents. Fluent Bit is an open-source, specialized data collector. It provides built-in metrics and general-purpose output interfaces for centralized collectors such as Fluentd. Kibana is a window into the Elastic Stack. It enables visual exploration and real-time analysis of your data in Elasticsearch, for anything from tracking query load to understanding the way requests flow through your apps.

As your container volume increases, it becomes difficult to manage the containers and their logs. You need a centralized solution that takes care of log aggregation, monitoring, and analysis. Luckily, the EFK stack already covers this: it processes, forwards, and indexes the logs.

In this article, we are going to use Elasticsearch to store and index the logs, Fluent Bit to process the logs into a consistent format and forward them, and Kibana to visualize them. The EFK stack is a popular choice for containerized environments like Kubernetes.

Step #1: Validate the Kubernetes environment

Make sure the cluster is up and running. Use kubectl get nodes to check that the Kubernetes nodes are ready. Also validate the Kubernetes version: it should be 1.18 or later.
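The version check above can be scripted. The helper below is a minimal sketch, not part of kubectl; the function name k8s_version_ok is made up for this example. It only parses a version string, so it can be tried without a cluster.

```shell
# Return success if a Kubernetes version string such as "v1.21.4" is 1.18 or later.
# k8s_version_ok is a hypothetical helper written for this article.
k8s_version_ok() {
  ver="${1#v}"               # drop the leading "v"
  major="${ver%%.*}"         # text before the first dot
  rest="${ver#*.}"           # text after the first dot
  minor="${rest%%.*}"        # text before the next dot
  minor="${minor%%[!0-9]*}"  # strip suffixes such as "+"
  [ "$major" -gt 1 ] || { [ "$major" -eq 1 ] && [ "$minor" -ge 18 ]; }
}

# With a live cluster you could feed it the kubelet version of the first node:
#   kubectl get nodes
#   k8s_version_ok "$(kubectl get nodes -o jsonpath='{.items[0].status.nodeInfo.kubeletVersion}')" \
#     && echo "Kubernetes version is OK"
```

Checking only the first node is a simplification; in a mixed-version cluster you would loop over all nodes.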

Step #2: Configure Elasticsearch

Create a new namespace named “logs” for this installation. Kubernetes namespace names must be lowercase, so “Logs” would be rejected.

Add the Elastic chart repository so that the Helm chart can be installed.

Deploy Elasticsearch.
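The three steps above might look like the following sketch. It assumes Elastic's public Helm repository at https://helm.elastic.co, the elastic/elasticsearch chart, a release named elasticsearch, and a lowercase namespace called logs. The original commands are not shown in this article, so treat these names as assumptions and adjust them to your setup.

```shell
# Sketch only: the namespace, repository alias, and release name are assumptions.
install_elasticsearch() {
  # Namespace names must be lowercase, hence "logs".
  kubectl create namespace logs &&
    # Register Elastic's Helm repository and refresh the local chart index.
    helm repo add elastic https://helm.elastic.co &&
    helm repo update &&
    # Install the chart as a release named "elasticsearch" in the logs namespace.
    helm install elasticsearch elastic/elasticsearch --namespace logs
}
```

Paste the function into your shell and run install_elasticsearch. Wrapping the commands in a function simply makes the sequence stop at the first failing step.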

Verify that the Elasticsearch deployment has succeeded using the 'kubectl get pods' command. Check that the pods are ready and that their status is Running.
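Instead of re-running the command until the pods come up, you can let kubectl wait for them. This is a small sketch, assuming the namespace is called logs; wait_for_pods is a made-up helper name, and the five-minute timeout is an arbitrary choice.

```shell
# Block until every pod in a namespace reports Ready, then list the pods.
# The namespace argument defaults to "logs" (an assumed name).
wait_for_pods() {
  ns="${1:-logs}"
  kubectl wait --for=condition=Ready pods --all --namespace "$ns" --timeout=300s &&
    kubectl get pods --namespace "$ns"
}
```

Run wait_for_pods (or wait_for_pods some-other-namespace) after each install. The same helper works for the Fluent Bit and Kibana pods in the later steps.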

Now that Elasticsearch is ready, the next step is to configure Fluent Bit for processing and forwarding the logs.

Step #3: Configure Fluent Bit

Install Fluent Bit and pass the Elasticsearch service endpoint as a chart parameter. This chart installs a DaemonSet that starts a Fluent Bit pod on each node. Each Fluent Bit pod then collects the logs from its node and streams them to Elasticsearch.

Both Fluentd and Fluent Bit can work as aggregators or forwarders. They can complement each other or be used as standalone solutions; which one to choose depends on the use case.

Add the Fluent Bit chart repository for the Helm chart to be installed.

Image – Add the chart repository for the Helm chart

Install the Fluent Bit chart.
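A possible version of these two steps is sketched below. It assumes the community chart repository at https://fluent.github.io/helm-charts, a release named fluent-bit, and the logs namespace. It also relies on the assumption that the chart's default Elasticsearch output targets a host named elasticsearch-master, which is the service name the Elastic chart creates by default. If your Elasticsearch service is named differently, you would need to override the chart's output configuration instead of relying on the default.

```shell
# Sketch only: the repository alias, release name, and namespace are assumptions.
install_fluent_bit() {
  # Register the Fluent Helm repository and refresh the local chart index.
  helm repo add fluent https://fluent.github.io/helm-charts &&
    helm repo update &&
    # Install the DaemonSet-based chart into the same namespace as Elasticsearch.
    helm install fluent-bit fluent/fluent-bit --namespace logs
}
```

Run install_fluent_bit once Elasticsearch is up, so that the first log batches have somewhere to go.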

Verify that the Fluent Bit deployment has succeeded using the 'kubectl get pods' command. Check that the pods are ready and that their status is Running.

Image – Verify if the Fluent Bit deployment is ready

Now that the Fluent Bit configuration is ready, the next step is to configure Kibana for visualization of the logs.

Step #4: Configure Kibana

Deploy Kibana. Once the deployment is complete, the service will be exposed on a NodePort at 31000.
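One way to get Kibana onto NodePort 31000 is sketched below. It assumes the elastic/kibana chart from the Elastic Helm repository (added under the alias elastic), a release named kibana, and the logs namespace. The service.type and service.nodePort settings are passed as chart values.

```shell
# Sketch only: the release name and namespace are assumptions.
install_kibana() {
  # Expose the Kibana service on a fixed NodePort so it is reachable from outside the cluster.
  helm install kibana elastic/kibana --namespace logs \
    --set service.type=NodePort \
    --set service.nodePort=31000
}
```

After running install_kibana, Kibana should be reachable on port 31000 of any node's address once its pod is ready.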

Now that the Kibana installation is ready, validate that the Elasticsearch, Fluent Bit, and Kibana deployments are all ready using the 'kubectl get pods,deployment,services' command. Check that all of them are ready and that the pods' status is Running.

Image – Verify if deployments are running

Now we have a fully functional EFK stack configured and running. The next step is to launch Kibana and start defining the index pattern.

Step #5: Define Index Pattern

Now that the startup logs have been loaded into Elasticsearch, we need to create an index pattern. An index is a collection of documents that have similar characteristics. An index is identified by a name, and this name is used to refer to the index when performing indexing, search, update, and delete operations against the documents in it. Indexing is similar to the creation and update process of CRUD operations.

Launch Kibana on port 31000. On the Welcome page, click Explore on my own. From the left-hand drop-down menu, select the Discover item. Click the Create index pattern button at the top. Enter the index pattern, for example logstash-*.

Kibana will then ask for a field containing a timestamp, which it uses for visualizing time-series data. In our case, this is the “@timestamp” field.

Creating the index pattern can take a few minutes to complete.
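To see why a pattern such as logstash-* picks up all the log indices, recall that Fluent Bit's Logstash_Format option writes one index per day, named logstash-YYYY.MM.DD. The sketch below uses shell globbing as a stand-in for Kibana's * wildcard. It is only an illustration, not how Kibana implements matching, and the index names are made up.

```shell
# Print the index names (one per line on stdin) that a wildcard pattern would match.
# match_indices is a hypothetical helper; shell globbing stands in for Kibana's '*'.
match_indices() {
  pattern="$1"
  while IFS= read -r index; do
    case "$index" in
      $pattern) printf '%s\n' "$index" ;;
    esac
  done
}

# Hypothetical index names: two daily log indices and one unrelated index.
printf '%s\n' logstash-2021.03.01 logstash-2021.03.02 .kibana_1 |
  match_indices 'logstash-*'
```

Only the two logstash- indices are printed, which is why a single index pattern keeps working as new daily indices appear.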

From the left-hand drop-down menu, select the Discover item.

Image – Check the log data under Discover tab

Step #6: Filter & view the data

The Discover page lists all the log events along with the fields available for filtering. You can either add a new filter or use KQL to filter the events on the kubernetes.pod_name field, for example kubernetes.pod_name : "my-app" (a hypothetical pod name).

The log list is now filtered to show only the events from that specific pod. You can expand each event to reveal further details.

When the Fluent Bit log processor runs, it reads, parses, and filters the logs of every pod on the Kubernetes cluster and enriches each entry with the following information:

Pod Name
Pod ID
Container Name
Container ID
Labels
Annotations

Congrats! We have learned how to aggregate all Kubernetes container logs and analyze them centrally using the EFK stack.

As always there is more to what was covered here! If you have questions, please post them in the comments section.
