How to aggregate Docker container logs and analyze them with the ELK stack?

Today we are going to learn how to aggregate Docker container logs and analyze them centrally using the ELK stack. The ELK stack comprises Elasticsearch, Logstash, and Kibana. Elasticsearch is a highly scalable open-source full-text search and analytics engine.

It allows you to store, search, and analyze large volumes of data quickly and in near real time. Kibana is like a window into the Elastic Stack. It enables visual exploration and real-time analysis of your data in Elasticsearch. Logstash is the central dataflow engine in the Elastic Stack for gathering, enriching, and unifying all of your data regardless of format or schema. If you want to learn more about the key concepts of the ELK stack, please check out the earlier posts here.

As the number of containers grows, it becomes difficult to manage them and their logs. A centralized solution is needed to take care of log aggregation, monitoring, and analysis. Luckily, the ELK stack already handles log aggregation well, but the Docker container logs first need to be routed to Logstash.

To route logs from each container, we are going to use Logspout, a utility that attaches to all containers on a host and routes their logs wherever we want. In our case, we are going to push them to Logstash and let it handle shipping, transformation, and so on. In this article, we are going to use Elasticsearch to store and index the logs, Logstash to ship and transform them into a consistent format, and Kibana to visualize them.

This quickstart assumes a basic understanding of Docker concepts. Please refer to the earlier posts to understand Docker and how to install it and containerize applications.

With this context, let’s check out how to aggregate Docker container logs and analyze them.

Quick Snapshot

Log Aggregation Architecture
Step #1. Prepare Docker Compose scripts for ELK stack and Logspout configuration
Step #2. Launch Docker Containers
Step #3. Define Index Pattern
Step #4. Visualize the data
Useful Resources

Log Aggregation Architecture

Before we head to the tutorial, here is what we want to achieve. All logs from the Docker containers will be routed to Logstash by Logspout over the UDP protocol. Logstash then serves as the data collection engine and pushes the logs to Elasticsearch for indexing, which makes them available for searching. After that, we can use Kibana to analyze the logs and create whatever visualizations we want.

Image – Docker log aggregation using Logspout and the ELK stack

Next, we head over to the implementation. Here is an overview of the steps involved:

Prepare Docker Compose scripts for ELK stack and Logspout configuration
Launch Docker Containers
Define an index pattern
Visualize the data

Step #1. Prepare Docker Compose scripts for ELK stack and Logspout configuration

Elasticsearch Docker Configuration: We are going to use the official image and expose the two ports it requires (9200 for the HTTP API and 9300 for node-to-node transport). In production environments, make sure these ports are accessible only from your internal network and are not open to the public.

elasticsearch:
  image: elasticsearch:1.5.2
  ports:
    - '9200:9200'
    - '9300:9300'
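
As a minimal sketch of the production advice above, the ports can be published on the host's loopback interface only. This relies on the `HOST_IP:HOST_PORT:CONTAINER_PORT` form of the Compose `ports` entry. It assumes that Kibana and Logstash reach Elasticsearch over the Compose network by service name, so publishing to the host is only needed for local debugging. Whether loopback binding is enough for your environment is an assumption to check against your own network setup.

```yaml
elasticsearch:
  image: elasticsearch:1.5.2
  ports:
    # Prefixing 127.0.0.1 publishes each port on the host's loopback
    # interface only, so it is not reachable from other machines.
    - '127.0.0.1:9200:9200'
    - '127.0.0.1:9300:9300'
```

Other containers in the same Compose project can still connect to `elasticsearch:9200`, because container-to-container traffic does not go through the published host ports.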

Kibana Docker Configuration: Kibana needs to connect to an Elasticsearch instance so that visualizations can be made. Add the ELASTICSEARCH_URL environment variable to specify the Elasticsearch instance to connect to. The default port, 5601, needs to be exposed.

kibana:
  image: kibana:4.1.2
  links:
    - elasticsearch
  environment:
    - ELASTICSEARCH_URL=http://elasticsearch:9200
  ports:
    - '5601:5601'
  depends_on:
    - elasticsearch

Logstash Docker Configuration: Logstash can process data from any source and normalize it for storage. In the command section, note that Logstash receives input over UDP on port 5000 and pushes the data to the Elasticsearch instance.

Also note that the configuration below is for demonstration purposes only. Logstash can dynamically unify data from various sources and normalize it into the destinations of your choice. You can also cleanse your data for diverse downstream analytics and visualization use cases.

logstash:
  image: logstash:2.1.1
  environment:
    - STDOUT=true
  links:
    - elasticsearch
  depends_on:
    - elasticsearch
    - kibana
  command: 'logstash -e "input { udp { port => 5000 } } output { elasticsearch { hosts => elasticsearch } }"'
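
Once the pipeline grows beyond a one-liner, the inline `-e` string becomes hard to read. As a hedged sketch, the same pipeline could live in a file that is mounted into the container and loaded with Logstash's `-f` option. The host directory `./logstash-conf`, the mount point `/config-dir`, and the file name `logstash.conf` are all hypothetical names chosen for illustration. They are not part of the original setup.

```yaml
logstash:
  image: logstash:2.1.1
  environment:
    - STDOUT=true
  links:
    - elasticsearch
  depends_on:
    - elasticsearch
    - kibana
  volumes:
    # Hypothetical host directory holding logstash.conf, mounted at a
    # hypothetical path inside the container.
    - './logstash-conf:/config-dir'
  # -f loads the pipeline from a file instead of the inline -e string.
  # The file would contain the same pipeline as the inline version:
  #   input  { udp { port => 5000 } }
  #   output { elasticsearch { hosts => elasticsearch } }
  command: 'logstash -f /config-dir/logstash.conf'
```

The rest of this tutorial keeps the inline version, which is enough for the demonstration.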

We now have the ELK stack configuration ready. Next, we’ll explore how to push logs into the system using Logspout.

Logspout Docker Configuration: Logspout monitors Docker events. When a new container is launched, it automatically starts collecting that container’s logs. Every log line is pushed to Logstash over UDP. Below is the Docker configuration. We are using Logspout v3, but newer versions are available.

logspout:
  image: gliderlabs/logspout:v3
  command: 'udp://logstash:5000'
  links:
    - logstash
  volumes:
    - '/var/run/docker.sock:/tmp/docker.sock'
  depends_on:
    - elasticsearch
    - logstash
    - kibana

Finally, our Docker Compose configuration will look like the one below:

version: '3.3'
services:
  logspout:
    image: gliderlabs/logspout:v3
    command: 'udp://logstash:5000'
    links:
      - logstash
    volumes:
      - '/var/run/docker.sock:/tmp/docker.sock'
    depends_on:
      - elasticsearch
      - logstash
      - kibana
  logstash:
    image: logstash:2.1.1
    environment:
      - STDOUT=true
    links:
      - elasticsearch
    depends_on:
      - elasticsearch
      - kibana
    command: 'logstash -e "input { udp { port => 5000 } } output { elasticsearch { hosts => elasticsearch } }"'
  kibana:
    image: kibana:4.1.2
    links:
      - elasticsearch
    environment:
      - ELASTICSEARCH_URL=http://elasticsearch:9200
    ports:
      - '5601:5601'
    depends_on:
      - elasticsearch
  elasticsearch:
    image: elasticsearch:1.5.2
    ports:
      - '9200:9200'
      - '9300:9300'

Step #2. Launch Docker Containers

Now that the Docker Compose script is ready, launch the containers using the docker-compose up command.

Image – Start ELK stack

Elasticsearch and Kibana can take a few minutes to start. When Logstash launches, it starts generating indexes in Elasticsearch. As you may have noticed, we have not created any application that generates logs. Here we are going to use the startup logs generated by Elasticsearch, Kibana, and Logstash themselves.

If you want to ignore the logs of a specific container, you can add LOGSPOUT=ignore as an environment variable for that container in the Docker Compose script. For more information on other Logspout environment variables, please check here.
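
As a small sketch of the ignore setting described above, the variable goes in the `environment` section of the service whose logs should be skipped. The service name `noisy-app` and its image are hypothetical placeholders, not part of this tutorial's stack.

```yaml
services:
  noisy-app:                          # hypothetical service name
    image: example/noisy-app:latest   # hypothetical image
    environment:
      # Logspout skips containers that carry this variable.
      - LOGSPOUT=ignore
```

The variable is set on the container to be ignored, not on the Logspout service itself.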

Once all the containers are up, the next step is to launch Kibana and start defining the index pattern.

Step #3. Define Index Pattern

Now that the startup logs have been loaded into Elasticsearch, we need to create an index pattern. An index is a collection of documents that have similar characteristics. An index is identified by a name, and this name is used to refer to the index when performing indexing, search, update, and delete operations against the documents in it. Indexing is similar to the creation and update process of CRUD operations.

Launch Kibana on port 5601. Under the ‘Indices’ tab (‘Management’ on the latest versions), you can find the option to create an index pattern. Enter the name of the index, for example logstash-*. Kibana will then ask for a field containing a timestamp, which it should use for visualizing time-series data. In our case, this is the “@timestamp” field.

Image – Define Index Pattern

Creating the index pattern may take a few minutes to complete. The next step is to create visualizations. Before that, we can check the data in the ‘Discover’ tab.

Image – Check the log data under Discover tab

We have enough data to visualize, so we are ready to create visualizations.

Step #4. Visualize the data

For demonstration purposes, I have created the below visualizations and attached them to the Dashboard.

Metric chart to display the count of log events
Area chart to show the count of log events over time

Image – Log events Dashboard with visualizations

Congrats! We have learned how to aggregate all Docker container logs and analyze them centrally using the ELK stack.

As always, there is more to this than what was covered here! If you have questions, please post them in the comments section.