Using Elasticsearch, In-House Built Logstash, and Kibana to Build a Log Management Platform
A unified log management platform built on a CSS Elasticsearch cluster manages logs centrally and in real time, enabling log-driven O&M and improving service management efficiency.
Scenarios
- Log Management: Centrally manage application and system logs to quickly identify faults.
- Security Monitoring: Detect and respond to security threats, detect intrusions, and analyze abnormal behaviors.
- Service Analysis: Analyze user behaviors to optimize products and services.
- Performance Monitoring: Monitor system and application performance in real time to detect bottlenecks.
Overview
- Elasticsearch is an open-source, distributed search and analytics engine used to store, search, and analyze large volumes of data.
- Logstash is a server-side data pipeline that collects, parses, and enriches data before sending it to Elasticsearch.
- Kibana provides an open-source data analysis and visualization platform for Elasticsearch, enabling users to search, view, and interact with the data stored in Elasticsearch.
- Beats, such as Filebeat and Metricbeat, are lightweight data collectors installed on servers to collect and forward data to Logstash.
Figure 1 shows the architecture of the log management platform using Elasticsearch and Logstash.
- Collection
- As a data collector, Beats gathers data from various sources and sends it to Logstash.
- Logstash can independently collect data or receive it from Beats to filter, transform, and enhance the data.
- Data processing
Before sending data to Elasticsearch, Logstash performs necessary processing, such as parsing structured logs and filtering out irrelevant information.
- Data storage
As a core storage component, Elasticsearch indexes and stores data from Logstash, providing quick search and data retrieval functions.
- Data analysis and visualization
Kibana analyzes and visualizes the data stored in Elasticsearch, allowing users to create dashboards and reports.
For details about the version compatibility of ELKB components, see https://www.elastic.co/support/matrix#matrix_compatibility.
Advantages
- Real-time: Collects and analyzes data in real time.
- Flexibility: Supports various data sources and flexible data processing flows.
- Ease of use: Provides a user-friendly interface that simplifies data operations and visualization.
- Scalability: Scales horizontally to process petabyte-level data.
Prerequisites
- You have created an Elasticsearch cluster in non-security mode.
- You have purchased an ECS and installed a Java environment on it. For details about how to purchase an ECS, see Purchasing and Logging In to a Linux ECS.
Procedure
- Log in to the ECS, and then deploy and configure Filebeat.
- Download Filebeat. The recommended version is 7.6.2. Download it at https://www.elastic.co/downloads/past-releases#filebeat-oss.
- Configure the Filebeat configuration file filebeat.yml. For example, to collect all files in the /root/ directory whose names end with .log, configure filebeat.yml as follows:
filebeat.inputs:
- type: log
  enabled: true
  # Path of the collected log file
  paths:
    - /root/*.log
filebeat.config.modules:
  path: ${path.config}/modules.d/*.yml
  reload.enabled: false
# Logstash hosts information
output.logstash:
  hosts: ["192.168.0.126:5044"]
processors:
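The paths entry in filebeat.yml is a glob pattern. The following minimal sketch uses Python's pathlib, not Filebeat itself, to illustrate which files a pattern such as /root/*.log would pick up. The candidate file names are hypothetical, and pathlib matching is only an approximation of Filebeat's own glob handling.

```python
from pathlib import PurePosixPath

PATTERN = "/root/*.log"  # same pattern as in the filebeat.yml example


def is_collected(path):
    """Return True if the path matches PATTERN; '*' does not cross '/'."""
    return PurePosixPath(path).match(PATTERN)


# Hypothetical file names, used only for illustration.
candidates = [
    "/root/tmp.log",
    "/root/app.log",
    "/root/app.txt",
    "/root/sub/app.log",
]
for candidate in candidates:
    print(candidate, is_collected(candidate))
```

In this approximation, files in subdirectories of /root/ are not matched, so a separate pattern would be needed to collect them.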
- Deploy and configure your in-house built Logstash.
To achieve better performance, you are advised to set the JVM heap size of the in-house built Logstash to half of the memory of the ECS or Docker container.
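As a rough illustration of the JVM sizing advice above, the sketch below assumes that the JVM parameter in question is the heap size, which Logstash reads as the -Xms and -Xmx options in its config/jvm.options file. The helper name and the 8192 MB memory figure are hypothetical.

```python
def heap_options(total_memory_mb):
    """Return -Xms/-Xmx options sized to half of the given memory (in MB)."""
    heap_mb = max(total_memory_mb // 2, 1)
    # Setting the minimum and maximum heap to the same value avoids heap resizing.
    return ["-Xms{}m".format(heap_mb), "-Xmx{}m".format(heap_mb)]


# Hypothetical ECS with 8 GB of memory.
for option in heap_options(8192):
    print(option)
```

The printed lines can be compared with the -Xms and -Xmx entries already present in config/jvm.options.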
- Download Logstash. The recommended version is 7.6.2. Download it at https://www.elastic.co/downloads/past-releases#logstash-oss.
- Ensure that Logstash and the CSS cluster are connected. Run the curl http://{ip}:{port} command on the VM to test the connectivity between the VM and the Elasticsearch cluster. If status code 200 is returned, they are connected.
- Configure the Logstash configuration file logstash-sample.conf.
The content of the logstash-sample.conf file is as follows:
input {
  beats {
    port => 5044
  }
}
# Split data.
filter {
  grok {
    match => {
      "message" => '\[%{GREEDYDATA:timemaybe}\] \[%{WORD:level}\] %{GREEDYDATA:content}'
    }
  }
  mutate {
    remove_field => ["@version","tags","source","input","prospector","beat"]
  }
}
# CSS cluster information
output {
  elasticsearch {
    # Destination cluster node addresses. No need to include the protocol.
    hosts => ["xxx.xxx.xxx.xxx:9200", "xxx.xxx.xxx.xxx:9200"]
    # Target index configuration
    index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"
    # Mandatory fields for a security-mode cluster. (Delete them for a cluster with the security mode disabled.)
    # user => "xxx" # Username for accessing the cluster.
    # password => "xxx" # Password corresponding to the username.
    # If SSL is enabled for the destination cluster, additionally configure the following information:
    # ssl => true
    # cacert => "/opt/logstash/extend/certs" # Path of the CA certificate used to verify the destination cluster.
    # ssl_certificate_verification => false # Whether to enable SSL certificate verification for the destination cluster.
  }
}

Table 1 Configuration items (output section)

Item: hosts
Description: Destination cluster node addresses. You can configure multiple IP addresses.
Value format: ["<Node IP address 1>:<Port number>", "<Node IP address 2>:<Port number>"]

Item: index
Description: Name of the index to which events are written.
- Single index: Enter the index name, for example, my_index.
- Multiple indexes: Use dynamic naming (based on event fields) or configure multiple conditional output blocks to route events to different indexes.

Item: user
Description: Username for accessing the destination cluster.
Mandatory for a security-mode cluster.

Item: password
Description: Password for accessing the destination cluster.
Mandatory for a security-mode cluster.

Item: ssl
Description: Whether SSL is enabled for the destination cluster.
The value can be:
- true: Uses HTTPS to transmit data.
- false: Uses HTTP to transmit data.

Item: cacert
Description: Path of the CA certificate used to verify the destination cluster.
Value format: <certificate path><certificate name>, for example, /opt/logstash/extend/certs.
- If the destination is a CSS Elasticsearch or OpenSearch cluster, the certificate name and certificate path of the default CA certificate will be obtained. For details, see Viewing Default Certificates.
- If the destination is a self-managed or third-party Elasticsearch or OpenSearch cluster, upload the destination cluster's security certificate to Logstash and obtain the certificate name and certificate path. For details, see Uploading a Custom Certificate.

Item: ssl_certificate_verification
Description: Whether SSL certificate verification is enabled for the destination cluster.
The value can be:
- true (default): Uses an SSL certificate to verify the destination cluster.
- false: Ignores SSL certificate verification.
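To make the dynamic index name in the output section more concrete, the following sketch approximates how index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}" might resolve for a single event. It assumes that [@metadata][beat] is filebeat for events shipped by Filebeat and that the date part is taken from the event timestamp in UTC. The function name and the sample timestamp are hypothetical.

```python
from datetime import datetime, timezone
from fnmatch import fnmatchcase


def index_name(beat, timestamp):
    """Approximate "%{[@metadata][beat]}-%{+YYYY.MM.dd}" for one event."""
    utc_time = timestamp.astimezone(timezone.utc)
    return "{}-{}".format(beat, utc_time.strftime("%Y.%m.%d"))


# Hypothetical event timestamp.
event_time = datetime(2020, 2, 13, 6, 1, 16, tzinfo=timezone.utc)
name = index_name("filebeat", event_time)
print(name)
# Check the name against the filebeat* pattern used by the index template.
print(fnmatchcase(name, "filebeat*"))
```

With this naming scheme, one index is created per day, and each of them falls under the filebeat* pattern of the index template configured in this procedure.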
- Configure an index template in an Elasticsearch cluster.
- Log in to the CSS management console.
- In the navigation pane on the left, choose Clusters > Elasticsearch.
- In the displayed cluster list, find the target cluster, and click Access Kibana in the Operation column to log in to the Kibana console.
- In the navigation pane on the left, choose Dev Tools.
- Create an index template.
For example, create an index template that gives each matching index three shards (number_of_shards) and no replicas (number_of_replicas), and uses mappings to define the fields @timestamp, content, host.name, level, log.file.path, message, and timemaybe.
PUT _template/filebeat
{
  "index_patterns": ["filebeat*"],
  "settings": {
    "number_of_shards": 3,
    "number_of_replicas": 0,
    "refresh_interval": "5s"
  },
  "mappings": {
    "properties": {
      "@timestamp": {
        "type": "date"
      },
      "content": {
        "type": "text"
      },
      "host": {
        "properties": {
          "name": {
            "type": "text"
          }
        }
      },
      "level": {
        "type": "keyword"
      },
      "log": {
        "properties": {
          "file": {
            "properties": {
              "path": {
                "type": "text"
              }
            }
          }
        }
      },
      "message": {
        "type": "text"
      },
      "timemaybe": {
        "type": "date",
        "format": "yyyy-MM-dd HH:mm:ss||strict_date_optional_time||epoch_millis||EEE MMM dd HH:mm:ss zzz yyyy"
      }
    }
  }
}
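The same template could also be submitted over the Elasticsearch REST API instead of Dev Tools. The following sketch only builds the HTTP request and does not send it. The base URL is a hypothetical placeholder, the function name is made up for illustration, and the mappings are shortened to a single field for brevity.

```python
import json
import urllib.request


def build_template_request(base_url, name, body):
    """Build (but do not send) a PUT request for the _template endpoint."""
    return urllib.request.Request(
        url="{}/_template/{}".format(base_url, name),
        data=json.dumps(body).encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


template = {
    "index_patterns": ["filebeat*"],
    "settings": {
        "number_of_shards": 3,
        "number_of_replicas": 0,
        "refresh_interval": "5s",
    },
    # Shortened for illustration; the full template defines more fields.
    "mappings": {"properties": {"level": {"type": "keyword"}}},
}

# Hypothetical cluster address; replace it with a node address of your cluster.
request = build_template_request("http://127.0.0.1:9200", "filebeat", template)
print(request.get_method(), request.full_url)
# urllib.request.urlopen(request) would send the request to the cluster.
```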
- Prepare test data on the ECS.
Run the following command to generate test data and write the data to /root/tmp.log:
bash -c 'while true; do echo "[$(date)] [info] this is the test message"; sleep 1; done;' >> /root/tmp.log &
The following is an example of the generated test data:
[Thu Feb 13 14:01:16 CST 2020] [info] this is the test message
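To see how the grok filter in logstash-sample.conf is expected to split such a line, the following sketch re-creates the pattern as a Python regular expression. It assumes the standard grok definitions GREEDYDATA = .* and WORD = \b\w+\b, and it is an approximation written for illustration, not Logstash's own implementation. The function name is hypothetical.

```python
import re

# Python counterpart of:
# \[%{GREEDYDATA:timemaybe}\] \[%{WORD:level}\] %{GREEDYDATA:content}
LINE_PATTERN = re.compile(
    r"\[(?P<timemaybe>.*)\] \[(?P<level>\b\w+\b)\] (?P<content>.*)"
)


def split_line(message):
    """Return the extracted fields as a dict, or None if the line does not match."""
    match = LINE_PATTERN.search(message)
    return match.groupdict() if match else None


sample = "[Thu Feb 13 14:01:16 CST 2020] [info] this is the test message"
fields = split_line(sample)
print(fields)
```

The three extracted fields correspond to the timemaybe, level, and content fields defined in the index template.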
- Run the following command to start Logstash. Replace the path in the example with the actual location of your logstash-sample.conf file:
nohup ./bin/logstash -f /opt/pht/logstash-6.8.6/logstash-sample.conf &
- Run the following command to start Filebeat:
./filebeat
- Use Kibana to query data and create reports.
- Go to the Kibana console of the Elasticsearch cluster.
- Click Discover in the navigation tree on the left, as shown in Figure 2.
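Besides Discover, the collected logs can also be queried through the Elasticsearch _search API. The following sketch builds, but does not send, a request that filters on the level field and returns the newest events first. The base URL and the function name are hypothetical, and the query is only one possible way to inspect the data.

```python
import json
import urllib.request


def build_level_search(base_url, level, size=10):
    """Build (but do not send) a search request for events with the given level."""
    body = {
        "size": size,
        "sort": [{"@timestamp": {"order": "desc"}}],
        # level is mapped as keyword in the index template, so a term query fits.
        "query": {"term": {"level": level}},
    }
    return urllib.request.Request(
        url="{}/filebeat*/_search".format(base_url),
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical cluster address; replace it with a node address of your cluster.
search = build_level_search("http://127.0.0.1:9200", "info")
print(search.get_method(), search.full_url)
```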