Today we are going to learn about the ELK stack, it consists of 3 powerful open-source tools Elasticsearch, Logstash, and Kibana. Elasticsearch is a highly scalable open-source full-text search and analytics engine. It allows you to store, search, and analyze big volumes of data quickly and in near real-time. Kibana is like a window into the Elastic Stack. It enables visual exploration and real-time analysis of your data in Elasticsearch. Logstash is the central dataflow engine in the Elastic Stack for gathering, enriching, and unifying all of your data regardless of format or schema.
Quick Snapshot
Like mentioned before, Elasticsearch is a highly scalable search engine that runs on top of a Java-based Lucene engine. It is kind of a NoSQL database, it stores data in an unstructured format. Data would be inside the documents instead of tables and schemas.
Click here to learn more.
Kibana is an open-source analytics and visualization platform designed to work with Elasticsearch. You can search, view, and interact with data stored in Elasticsearch indices. Also, you can easily perform advanced data analysis and visualize your data in a variety of charts, tables, and maps.
Kibana has a browser-based interface that enables you to quickly create and share dynamic dashboards that display changes to Elasticsearch queries in real-time.
Logstash is an open-source data collection engine with real-time pipelining capabilities. With Logstash you can dynamically unify data from various sources and normalize the data into any of the destinations. You can also cleanse your data for diverse advanced downstream analytics and visualization use cases.
Click here to learn more.
In this tutorial, we are going to look at how to create an index, analyze & visualize historical data using ELK stack. Here is an overview of the steps involved
Head over to the Downloads section and download all 3 of them, for this tutorial I’m using Windows 10 Platform. For other platforms, please use the respective packages.
Unzip the contents of the zip file.
Under the Elasticsearch\bin folder, execute elasticsearch.bat
to start the Elasticsearch engine.
Wait for the Elasticsearch server to start, look for started
keyword on the console.
To check whether the server has started properly or not, go to the Elasticsearch application with the default port# (9200
).
If you get the above output, this indicates your Elasticsearch server has started successfully.
Under Kibana*\bin folder, execute kibana.bat
to start the Kibana server
Wait for the Kibana server to start, look for server running
keyword on the console.
To check whether the server has started properly or not, go to the Kibana application with the default port# (5601
).
We have configured both Elasticsearch & Kibana, our next step is to load data using Logstash.
For the sample data set, I have downloaded it from Yahoo Finance Historical data. Below is a sample of the raw data.
The next step is to load the data from CSV files to Logstash. For this purpose, we are going to use the CSV plugin to get the data from the file source. Logstash can also read the number of other input sources too. To load the sample, we would need to create a simple config file. We are going to use the file input in this case. The input section of our configuration file looks like this:
input {
file {
path => "C:/Karthik/Blog/logstash-6.3.0/bin/data.csv"
type => "core2"
start_position => "beginning"
}
}
Now our input section is ready, next is to identify the data from the file, and optionally we can also cleanse/manipulate or any other operations with the source. For this we are going to use CSV filter plugin, if you’re looking for other sources like JSON etc.,, check out here. Also here we are going to convert all fields from the CSV file to a numeric data type (float) so that we can be able to visualize the data.
filter {
csv {
separator => ","
columns => ["Date","Open","High","Low","Close","Volume","Adj Close"]
}
mutate {convert => ["High", "float"]}
mutate {convert => ["Open", "float"]}
mutate {convert => ["Low", "float"]}
mutate {convert => ["Close", "float"]}
mutate {convert => ["Volume", "float"]}
}
The next step is to output data directly to Elasticsearch, we are using the Elasticsearch output. There is also an option for multiple output adapters for streaming to different outputs. In this case, we have added the stdout output for seeing the output in the console. Also, note we have to specify an index name for Elasticsearch. This index will be used later for configuring Kibana to visualize the dataset. Below, you can see the output section of our logstash.conf file.
output {
elasticsearch {
action => "index"
index => "stock"
hosts => "localhost"
workers => 1
}
stdout {}
}
Like we have discussed before,post loading data into Elasticsearch we would need to create an index pattern. An index is a collection of documents that have similar characteristics ex. stock data. An index is identified by a name and this name is used to refer to the index when performing indexing, search, update, and delete operations against the documents in it. Indexing is similar to the creation and update process of CRUD operations.
Make sure Kibana is running and log in to the console, under the ‘Management’ tab you can find the option to create an Index pattern under Kibana. Enter the name of the index that was specified before when inserting the data with Logstash (“stock”). Kibana will then ask for a field containing a timestamp which it should use for visualizing time-series data. for our case, this is the “Date” field.
Now that we have created the index pattern,next step is to create visualizations.
We are going to create three different types of visualizations and assemble them into one Dashboard. Choose “Visualize” from the top menu to create new visualization and choose the search source as ‘stock’.
We are going to create the following 3 visualizations
For the first visualization, choose visualization type as Area chart
and then for the data to use for the x- and y-axis. For the y-axis, we want to have the max value of the “High” field in our dataset. The x-axis is configured to be a date histogram showing the “Date” field in a daily interval.
Save the visualization so that we can use it later while we create a dashboard.
For the next one, choose visualization type as Vertical Bar chart
and then for the data to use for the x- and y-axis. For the y-axis, we want to have the max value of the “Volume” field in our dataset. The x-axis is configured to be a date histogram showing the “Date” field in a daily interval.
For Metrics visualization, choose visualization type as Metrics
and then to add 3 metrics Count, Max High & Volume High.
Now we have all 3 visualizations ready,next step is to create a dashboard.
To create a new dashboard, navigate to the “Dashboard” section in Kibana. You can now add the visualizations to the dashboard using the “+” icon in the upper right corner.
You can also drag and resize the widgets as you like to customize the dashboard. Also, if you notice you can filter data by zooming the charts or selecting a different time range.
Congrats! we have learned how to visualize historical data with ELK stack
Like this post? Don’t forget to share it!
As we wrap up 2024, it’s time to reflect on the incredible journey we’ve had…
Operating a business often entails balancing tight schedules, evolving market dynamics, and shifting consumer requirements.…
Of course, every site has different needs. In the end, however, there is one aspect…
In today's digital-first world, businesses must adopt effective strategies to stay competitive. Social media marketing…
62% of UX designers now use AI to enhance their workflows. Artificial intelligence (AI) rapidly…
The integration of artificial intelligence into graphic design through tools like Adobe Photoshop can save…
This website uses cookies.