How to aggregate Docker container logs and analyse them with the ELK stack
Today we are going to learn how to aggregate Docker container logs and analyze them centrally using the ELK stack. The ELK stack comprises Elasticsearch, Logstash, and Kibana. Elasticsearch is a highly scalable open-source full-text search and analytics engine that lets you store, search, and analyze large volumes of data quickly and in near real time. Kibana is a window into the Elastic Stack: it enables visual exploration and real-time analysis of your data in Elasticsearch. Logstash is the central dataflow engine in the Elastic Stack for gathering, enriching, and unifying all of your data regardless of format or schema. If you want to learn more about the key concepts of the ELK stack, please check out the earlier posts here.
As the number of containers grows, it becomes difficult to manage them and their logs, so a centralized solution is needed for log aggregation, monitoring, and analysis. Luckily, the ELK stack already handles log aggregation well; we just need to route the Docker container logs to Logstash.
For log routing, we are going to use Logspout, a utility that attaches to all containers on a host and routes their logs wherever we want. In our case, it will push them to Logstash, which handles shipping and transformation. In this article, Elasticsearch will store and index the logs, Logstash will transform them into a consistent format, and Kibana will visualize them.
This quickstart assumes a basic understanding of Docker concepts; please refer to the earlier posts to understand Docker, how to install it, and how to containerize applications.
With that context, let’s check out how to aggregate Docker container logs and analyze them.
Log Aggregation Architecture
Before we head to the tutorial, below is what we want to achieve. All logs from the Docker containers will be routed to Logstash using Logspout over the UDP protocol. Logstash will then serve as the data collection engine and push the logs to Elasticsearch for indexing, making them available for search. After that, we can use Kibana to analyze the logs and create whatever visualizations we want.
Next, we head over to the implementation. Here is an overview of the steps involved:
- Prepare Docker Compose scripts for ELK stack and Logspout configuration
- Launch Docker Containers
- Define an index pattern
- Visualize the data
Step #1. Prepare Docker Compose scripts for ELK stack and Logspout configuration
ElasticSearch Docker Configuration: We are going to use the official image and expose the two required ports (9200/9300). In production environments, make sure that these ports are accessible only from your internal network and restricted from public access.
```yaml
elasticsearch:
  image: elasticsearch:1.5.2
  ports:
    - '9200:9200'
    - '9300:9300'
```
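One way to restrict access (an illustrative variation, not part of the tutorial's compose file) is to bind the published ports to the loopback interface, so they are reachable from the Docker host but not from the outside network:

```yaml
# Illustrative only: binding to 127.0.0.1 keeps Elasticsearch
# off the public interface while remaining reachable locally.
elasticsearch:
  image: elasticsearch:1.5.2
  ports:
    - '127.0.0.1:9200:9200'
    - '127.0.0.1:9300:9300'
```

In a real deployment you would typically combine this with firewall rules or a reverse proxy in front of the stack.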
Kibana Docker Configuration: Kibana needs to connect to an instance of ElasticSearch so that visualizations can be made. Add the ELASTICSEARCH_URL environment variable to specify the ElasticSearch instance to connect to. The default port 5601 needs to be exposed.
```yaml
kibana:
  image: kibana:4.1.2
  links:
    - elasticsearch
  environment:
    - ELASTICSEARCH_URL=http://elasticsearch:9200
  ports:
    - '5601:5601'
  depends_on:
    - elasticsearch
```
Logstash Docker Configuration: Logstash can process data from any source and normalize it for storage. In the command section, note that Logstash receives input over the UDP protocol on port 5000 and pushes the data to the ElasticSearch instance.
Also note that the configuration below is for demonstration purposes only; in practice, Logstash can dynamically unify data from various sources and normalize it into any destination. You can also cleanse your data for diverse advanced downstream analytics and visualization use cases.
```yaml
logstash:
  image: logstash:2.1.1
  environment:
    - STDOUT=true
  links:
    - elasticsearch
  depends_on:
    - elasticsearch
    - kibana
  command: 'logstash -e "input { udp { port => 5000 } } output { elasticsearch { hosts => elasticsearch } }"'
```
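To sanity-check the UDP input, you can push a test line to port 5000 yourself. The helper below is a hypothetical sketch (not part of the tutorial) that sends one log line over UDP from the Docker host; since UDP is connectionless, `sendto` succeeds even before Logstash is listening, so this only exercises the sending side.

```python
import socket

def send_test_log(message: str, host: str = "127.0.0.1", port: int = 5000) -> int:
    """Send one log line to Logstash's UDP input; returns the bytes handed to the socket."""
    payload = message.encode("utf-8")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        return sock.sendto(payload, (host, port))

sent = send_test_log("hello from the ELK tutorial")
print(sent)  # number of bytes handed to the UDP socket
```

If Logstash is up, the line should show up in its stdout (we set STDOUT=true) and, shortly after, in Elasticsearch.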
We now have the ELK stack configuration ready. Next, we’ll explore how to push logs into the system using Logspout.
Logspout Docker Configuration: Logspout monitors Docker events. When a new container is launched, it automatically starts collecting its logs, and every log line is pushed to Logstash using the UDP protocol. Below is the Docker configuration; we are using Logspout v3, but newer versions are available.
```yaml
logspout:
  image: gliderlabs/logspout:v3
  command: 'udp://logstash:5000'
  links:
    - logstash
  volumes:
    - '/var/run/docker.sock:/tmp/docker.sock'
  depends_on:
    - elasticsearch
    - logstash
    - kibana
```
Finally, our Docker Compose configuration will look like the one below:
```yaml
version: '3.3'
services:
  logspout:
    image: gliderlabs/logspout:v3
    command: 'udp://logstash:5000'
    links:
      - logstash
    volumes:
      - '/var/run/docker.sock:/tmp/docker.sock'
    depends_on:
      - elasticsearch
      - logstash
      - kibana
  logstash:
    image: logstash:2.1.1
    environment:
      - STDOUT=true
    links:
      - elasticsearch
    depends_on:
      - elasticsearch
      - kibana
    command: 'logstash -e "input { udp { port => 5000 } } output { elasticsearch { hosts => elasticsearch } }"'
  kibana:
    image: kibana:4.1.2
    links:
      - elasticsearch
    environment:
      - ELASTICSEARCH_URL=http://elasticsearch:9200
    ports:
      - '5601:5601'
    depends_on:
      - elasticsearch
  elasticsearch:
    image: elasticsearch:1.5.2
    ports:
      - '9200:9200'
      - '9300:9300'
```
Step #2. Launch Docker Containers
Now that the Docker Compose script is ready, launch the containers using the `docker-compose up` command.
ElasticSearch and Kibana can take a few minutes to start. When Logstash launches, it starts generating indexes in Elasticsearch. If you have noticed, we have not created any application that generates logs; here we are going to use the startup logs generated by ElasticSearch, Kibana, and Logstash themselves.
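As a quick check (a hypothetical snippet, assuming Elasticsearch is published on localhost:9200 as in our compose file), you can ask Elasticsearch's `_cat/indices` API which indices exist; once Logstash has shipped some events, `logstash-*` indices should appear in the listing:

```python
from typing import Optional
import urllib.request
import urllib.error

def list_indices(base_url: str = "http://127.0.0.1:9200") -> Optional[str]:
    """Return the plain-text output of Elasticsearch's _cat/indices API,
    or None if the cluster is not reachable yet."""
    try:
        with urllib.request.urlopen(f"{base_url}/_cat/indices", timeout=5) as resp:
            return resp.read().decode("utf-8")
    except (urllib.error.URLError, OSError):
        return None

indices = list_indices()
print(indices if indices is not None else "Elasticsearch is not reachable yet")
```

Returning None instead of raising makes the check safe to run while the containers are still starting up.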
If you want to ignore logs from a specific container, add `LOGSPOUT=ignore` as an environment variable for that container in the Docker Compose script. For more information on other Logspout environment variables, please check here.
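For example, to keep Logspout from shipping the logs of a hypothetical `myapp` service (the service name and image here are illustrative), the compose entry would look like this:

```yaml
# Illustrative service: LOGSPOUT=ignore tells Logspout to skip this container's logs.
myapp:
  image: myapp:latest
  environment:
    - LOGSPOUT=ignore
```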
Once all the containers are up, the next step is to launch Kibana and start defining the index pattern.
Step #3. Define Index Pattern
Now that the startup logs have been loaded into Elasticsearch, we need to create an index pattern. An index is a collection of documents that have similar characteristics. An index is identified by a name, and this name is used to refer to the index when performing indexing, search, update, and delete operations against the documents in it. Indexing is similar to the create and update steps of CRUD operations.
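By default, Logstash's Elasticsearch output writes each day's events into a new index named logstash-YYYY.MM.DD, which is why the wildcard pattern logstash-* matches all of them. A small sketch of that naming scheme (the helper function is illustrative, not a Logstash API):

```python
from datetime import datetime, timezone
from fnmatch import fnmatch

def logstash_index_name(when: datetime) -> str:
    """Daily index name in the default Logstash style: logstash-YYYY.MM.DD."""
    return f"logstash-{when:%Y.%m.%d}"

today = logstash_index_name(datetime.now(timezone.utc))
print(today)                         # e.g. logstash-2015.06.01
print(fnmatch(today, "logstash-*"))  # the index pattern matches every daily index
```

Because a fresh index is created each day, the pattern (rather than a single index name) is what Kibana needs in order to query the whole series.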
Launch Kibana on port 5601. Under the ‘Indices’ tab (‘Management’ on later versions), you can find the option to create an index pattern. Enter the name of the index, e.g. logstash-*. Kibana will then ask for a field containing a time/timestamp which it should use for visualizing time-series data; in our case, this is the “@timestamp” field.
Creating the index pattern can take a few minutes to complete. The next step is to create visualizations; before that, we can check the data from the ‘Discover’ tab. Once we have enough data to visualize, we are ready to create visualizations.
Step #4. Visualize the data
For demonstration purposes, I have created the below visualizations and attached them to the Dashboard.
- A metrics chart to display the count of log events
- An area chart to show the count of log events over time
Congrats! We have learned how to aggregate Docker container logs and analyze them centrally using the ELK stack.
As always, there is more to what was covered here! If you have questions, please post them in the comments section.
Like this post? Don’t forget to share it!
Useful Resources
- Elasticsearch reference
- Kibana reference
- Logstash reference
- Logspout Github
- Tutorial : Visualize historical data with ELK stack
- Docker tutorial – Build Docker image for your Angular 6 application
- Kubernetes Tutorial : Distributed tracing with Jaeger
- Monitoring Docker containers using Prometheus + cAdvisor + Grafana