Interconnecting Logstash with Kafka
Scenario
Logstash is a free and open server-side data processing pipeline that ingests data from multiple sources, transforms it, and then sends it to the specified storage. Kafka is a high-throughput distributed publish/subscribe messaging system that Logstash can use as both an input source and an output source. The following describes how to interconnect Logstash with a Kafka instance.
Solution Architecture
- The following figure shows Kafka as an input source of Logstash.
Figure 1 Kafka as an input source of Logstash
The log collection client sends data to the Kafka instance, and Logstash pulls data from the Kafka instance at a rate that matches its own processing capacity. Using a Kafka instance as the Logstash input source buffers burst traffic before it reaches Logstash and decouples the log collection client from Logstash, which improves system stability.
- The following figure shows Kafka as an output source of Logstash.
Figure 2 Kafka as an output source of Logstash
Logstash collects data from the database and sends it to the Kafka instance for storage. Because Kafka offers high throughput, a Kafka instance used as the Logstash output source can receive and store large volumes of data.
Restrictions
Logstash 7.5 and later versions support the Kafka Integration Plugin, which includes the Kafka input plugin and the Kafka output plugin. The Kafka input plugin reads data from topics of Kafka instances, and the Kafka output plugin writes data to topics of Kafka instances. Table 1 lists the version mapping between Logstash, Kafka Integration Plugin, and Kafka clients. Ensure that the Kafka client version is later than or equal to the Kafka instance version.
Prerequisites
Make the following preparations before you begin.
- Download Logstash.
- Prepare a Windows host, install JDK v1.8.111 or later and Git Bash on the host, and configure related environment variables.
- Create a Kafka instance and a topic, and obtain the instance information.
If both public access and SASL authentication are disabled for the Kafka instance, obtain the information listed in Table 2.
Table 2 Kafka instance information (public access and SASL authentication disabled)
Parameter | How to Obtain
Instance address (private network) | View it in the Connection area on the instance details page.
Topic name | On the Kafka console, click your instance. In the left navigation pane, choose Topics to view the topic name. The following uses topic-logstash as an example.
If public access is disabled and SASL authentication is enabled for the Kafka instance, obtain the information listed in Table 3.
Table 3 Kafka instance information (public access disabled and SASL authentication enabled)
Parameter | How to Obtain
Instance address (private network) | View it in the Connection area on the instance details page.
SASL mechanism | View it in the Connection area on the instance details page.
Security protocol | View it in the Connection area on the instance details page.
Certificate | Click Download next to SSL Certificate in the Connection area on the instance details page. Download and decompress the package to obtain the client certificate file client.truststore.jks.
SASL username and password | On the Kafka console, click your instance. In the left navigation pane, choose Users to view the username. If you have forgotten the password, click Reset Password.
Topic name | On the Kafka console, click your instance. In the left navigation pane, choose Topics to view the topic name. The following uses topic-logstash as an example.
If public access is enabled and SASL authentication is disabled for the Kafka instance, obtain the information listed in Table 4.
Table 4 Kafka instance information (public access enabled and SASL authentication disabled)
Parameter | How to Obtain
Instance address (public network) | View it in the Connection area on the instance details page.
Topic name | On the Kafka console, click your instance. In the left navigation pane, choose Topics to view the topic name. The following uses topic-logstash as an example.
If both public access and SASL authentication are enabled for the Kafka instance, obtain the information listed in Table 5.
Table 5 Kafka instance information (public access and SASL authentication enabled)
Parameter | How to Obtain
Instance address (public network) | View it in the Connection area on the instance details page.
SASL mechanism | View it in the Connection area on the instance details page.
Security protocol | View it in the Connection area on the instance details page.
Certificate | Click Download next to SSL Certificate in the Connection area on the instance details page. Download and decompress the package to obtain the client certificate file client.truststore.jks.
SASL username and password | On the Kafka console, click your instance. In the left navigation pane, choose Users to view the username. If you have forgotten the password, click Reset Password.
Topic name | On the Kafka console, click your instance. In the left navigation pane, choose Topics to view the topic name. The following uses topic-logstash as an example.
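Before moving on, it can help to confirm that the tools named in the prerequisites are reachable from Git Bash. The following is a minimal sketch, not part of the official procedure; check_tool is a hypothetical helper name introduced here for illustration.

```shell
# Report whether a command is reachable on PATH in the current shell.
# check_tool is a hypothetical helper, not a Logstash or Kafka command.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found at $(command -v "$1")"
  else
    echo "$1: not found on PATH"
  fi
}

check_tool java
check_tool git

# If Java is installed, print its version so it can be compared
# with the version required in the prerequisites.
if command -v java >/dev/null 2>&1; then
  java -version
fi
```

If either tool is reported as not found, revisit the environment variable configuration on the host before starting Logstash.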
Procedure (Kafka Instance as the Logstash Output Source)
- On the Windows host, decompress the Logstash package, go to the config folder, and create the output.conf configuration file.
Figure 3 Creating the output.conf configuration file
- Add the following content to the output.conf file:
input {
  stdin {}
}
output {
  kafka {
    bootstrap_servers => "ip1:port1,ip2:port2,ip3:port3"
    topic_id => "topic-logstash"

    # If SASL authentication is disabled, comment out all of the following options.
    # If SASL authentication is enabled, keep only the SASL mechanism and the
    # security protocol that match your instance, and comment out the alternatives.

    # If the SASL mechanism is PLAIN, configure as follows:
    sasl_mechanism => "PLAIN"
    sasl_jaas_config => "org.apache.kafka.common.security.plain.PlainLoginModule required username='username' password='password';"

    # If the SASL mechanism is SCRAM-SHA-512, configure as follows:
    sasl_mechanism => "SCRAM-SHA-512"
    sasl_jaas_config => "org.apache.kafka.common.security.scram.ScramLoginModule required username='username' password='password';"

    # If the security protocol is SASL_SSL, configure as follows:
    security_protocol => "SASL_SSL"
    ssl_truststore_location => "C:\\Users\\Desktop\\logstash-8.8.1\\config\\client.jks"
    ssl_truststore_password => "dms@kafka"
    ssl_endpoint_identification_algorithm => ""

    # If the security protocol is SASL_PLAINTEXT, configure as follows:
    security_protocol => "SASL_PLAINTEXT"
  }
}
Description:
- bootstrap_servers: private network connection address or public network connection address of the Kafka instance.
- topic_id: topic name.
- sasl_mechanism: SASL authentication mechanism.
- sasl_jaas_config: SASL JAAS configuration, specified inline as a string. Change the SASL username and password as required.
- security_protocol: security protocol used by the Kafka instance.
- ssl_truststore_location: location where the SSL certificate is stored.
- ssl_truststore_password: server certificate password, which must be set to dms@kafka and cannot be changed.
- ssl_endpoint_identification_algorithm: Indicates whether to verify the certificate domain name. If this option is left blank, the certificate domain name is not verified. In this example, leave it blank.
For more information about Kafka output plugin options, see Kafka output plugin.
- Open Git Bash in the root directory of the Logstash folder and run the following command to start Logstash:
./bin/logstash -f ./config/output.conf
If the message "Successfully started Logstash API endpoint" is displayed, Logstash has been started.
Figure 4 Starting Logstash
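- In the Git Bash window where Logstash is running, type messages and press Enter to produce them, as shown in the following figure. Logstash reads each line from standard input and sends it to the Kafka instance.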
Figure 5 Producing messages
- Go to the Kafka console and click your instance.
- In the left navigation pane, choose Message Query.
- Select topic-logstash from the Topic Name drop-down list box and click Search to query messages.
As shown in Figure 6, the Kafka output plugin of Logstash has written data to topic-logstash of the Kafka instance.
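The output.conf template in this procedure lists every SASL option side by side, so it may help to see how it resolves for one concrete case. The following sketch writes an output.conf for an instance that uses the SCRAM-SHA-512 mechanism over SASL_SSL. It assumes the commands are run in Git Bash from the Logstash root directory. The broker addresses, username, password, and truststore path are the same placeholders used in the template and must be replaced with your own values.

```shell
# Sketch only: run from the Logstash root directory in Git Bash.
# The config folder already exists in a Logstash package; mkdir -p is then a no-op.
mkdir -p config

# The quoted 'EOF' keeps backslashes and quotes in the file exactly as written.
cat > config/output.conf <<'EOF'
input {
  stdin {}
}
output {
  kafka {
    # Placeholder addresses, credentials, and path: replace with your own.
    bootstrap_servers => "ip1:port1,ip2:port2,ip3:port3"
    topic_id => "topic-logstash"
    sasl_mechanism => "SCRAM-SHA-512"
    sasl_jaas_config => "org.apache.kafka.common.security.scram.ScramLoginModule required username='username' password='password';"
    security_protocol => "SASL_SSL"
    ssl_truststore_location => "C:\\Users\\Desktop\\logstash-8.8.1\\config\\client.jks"
    ssl_truststore_password => "dms@kafka"
    ssl_endpoint_identification_algorithm => ""
  }
}
EOF

# Optional: check the file for syntax errors without starting the pipeline.
# ./bin/logstash -f ./config/output.conf --config.test_and_exit
```

Only one sasl_mechanism and one security_protocol remain in the resulting file. For an instance with SASL authentication disabled, all six SASL and SSL options would be omitted instead.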
Procedure (Kafka Instance as the Logstash Input Source)
- On the Windows host, decompress the Logstash package, go to the config folder, and create the input.conf configuration file.
Figure 7 Creating the input.conf configuration file
- Add the following content to the input.conf file to connect to the Kafka instance:
input {
  kafka {
    bootstrap_servers => "ip1:port1,ip2:port2,ip3:port3"
    group_id => "logstash_group"
    topics => ["topic-logstash"]
    auto_offset_reset => "earliest"

    # If SASL authentication is disabled, comment out all of the following options.
    # If SASL authentication is enabled, keep only the SASL mechanism and the
    # security protocol that match your instance, and comment out the alternatives.

    # If the SASL mechanism is PLAIN, configure as follows:
    sasl_mechanism => "PLAIN"
    sasl_jaas_config => "org.apache.kafka.common.security.plain.PlainLoginModule required username='username' password='password';"

    # If the SASL mechanism is SCRAM-SHA-512, configure as follows:
    sasl_mechanism => "SCRAM-SHA-512"
    sasl_jaas_config => "org.apache.kafka.common.security.scram.ScramLoginModule required username='username' password='password';"

    # If the security protocol is SASL_SSL, configure as follows:
    security_protocol => "SASL_SSL"
    ssl_truststore_location => "C:\\Users\\Desktop\\logstash-8.8.1\\config\\client.jks"
    ssl_truststore_password => "dms@kafka"
    ssl_endpoint_identification_algorithm => ""

    # If the security protocol is SASL_PLAINTEXT, configure as follows:
    security_protocol => "SASL_PLAINTEXT"
  }
}
output {
  stdout { codec => rubydebug }
}
Description:
- bootstrap_servers: private network connection address or public network connection address of the Kafka instance.
- group_id: consumer group name.
- topics: topic name.
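- auto_offset_reset: offset reset policy, which determines where consumption starts when the consumer group has no committed offset. This example uses earliest, which starts from the earliest available message.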
- sasl_mechanism: SASL authentication mechanism.
- sasl_jaas_config: SASL JAAS configuration, specified inline as a string. Change the SASL username and password as required.
- security_protocol: security protocol used by the Kafka instance.
- ssl_truststore_location: location where the SSL certificate is stored.
- ssl_truststore_password: server certificate password, which must be set to dms@kafka and cannot be changed.
- ssl_endpoint_identification_algorithm: Indicates whether to verify the certificate domain name. If this option is left blank, the certificate domain name is not verified. In this example, leave it blank.
For more information about Kafka input plugin options, see Kafka input plugin.
- Open Git Bash in the root directory of the Logstash folder and run the following command to start Logstash:
./bin/logstash -f ./config/input.conf
After Logstash is started successfully, the Kafka input plugin automatically reads data from topic-logstash of the Kafka instance, as shown in the following figure.
Figure 8 Logstash reading data from topic-logstash
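For comparison, the simplest case is an instance with SASL authentication disabled, where none of the SASL or SSL options are needed. The following sketch writes such an input.conf. It assumes the commands are run in Git Bash from the Logstash root directory, and the broker addresses are placeholders to replace with the instance address obtained in the prerequisites. The topic is given through the input plugin's topics option, which takes an array.

```shell
# Sketch only: run from the Logstash root directory in Git Bash.
# The config folder already exists in a Logstash package; mkdir -p is then a no-op.
mkdir -p config

cat > config/input.conf <<'EOF'
input {
  kafka {
    # Placeholder addresses: replace with your instance address.
    bootstrap_servers => "ip1:port1,ip2:port2,ip3:port3"
    group_id => "logstash_group"
    topics => ["topic-logstash"]
    auto_offset_reset => "earliest"
  }
}
output {
  # Print each consumed event to the console in a readable form.
  stdout { codec => rubydebug }
}
EOF
```

Starting Logstash with this file uses the same command as in the procedure: ./bin/logstash -f ./config/input.conf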