Updated on 2023-09-15 GMT+08:00

Interconnecting Logstash with Kafka

Scenario

Logstash is a free and open server-side data processing pipeline that ingests data from multiple sources, transforms it, and sends it to a specified destination. Kafka is a high-throughput distributed publish/subscribe messaging system that Logstash can both read from and write to. The following describes how to interconnect Logstash with a Kafka instance.

Solution Architecture

  • The following figure shows Kafka as an input source of Logstash.
    Figure 1 Kafka as an input source of Logstash

    The log collection client sends data to the Kafka instance, and Logstash pulls data from the Kafka instance at a rate that matches its own processing capacity. Using a Kafka instance as the Logstash input source buffers burst traffic before it reaches Logstash and decouples the log collection client from Logstash, which helps keep the system stable.

  • The following figure shows Kafka as an output source of Logstash.
    Figure 2 Kafka as an output source of Logstash

    Logstash collects data from the database and sends it to the Kafka instance for storage. Because Kafka provides high throughput, using a Kafka instance as the Logstash output source lets you store a large amount of data.

Restrictions

Logstash 7.5 and later versions support the Kafka integration plugin, which includes the Kafka input plugin and the Kafka output plugin. The Kafka input plugin reads data from topics of Kafka instances, and the Kafka output plugin writes data to topics of Kafka instances. Table 1 lists the version mapping between Logstash, the Kafka integration plugin, and Kafka clients. Ensure that the Kafka client version is later than or equal to the Kafka instance version.

Table 1 Version mapping

Logstash Version    Kafka Integration Plugin Version    Kafka Client Version
8.3–8.8             10.12.0                             2.8.1
8.0–8.2             10.9.0–10.10.0                      2.5.1
7.12–7.17           10.7.4–10.9.0                       2.5.1
7.8–7.11            10.2.0–10.7.1                       2.4
7.6–7.7             10.0.1                              2.3.0
7.5                 10.0.0                              2.1.0
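The lookup in Table 1 can be sketched as a small shell function. This is only an illustration of the table above: kafka_client_version is a hypothetical helper name, not a Logstash command, and it covers only the versions listed in Table 1.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map a Logstash version to the Kafka client version
# listed in Table 1. Versions outside the table are reported as unknown.
kafka_client_version() {
  case "$1" in
    8.[3-8]|8.[3-8].*)                 echo "2.8.1" ;;
    8.[0-2]|8.[0-2].*)                 echo "2.5.1" ;;
    7.1[2-7]|7.1[2-7].*)               echo "2.5.1" ;;
    7.[89]|7.[89].*|7.1[01]|7.1[01].*) echo "2.4" ;;
    7.[67]|7.[67].*)                   echo "2.3.0" ;;
    7.5|7.5.*)                         echo "2.1.0" ;;
    *) echo "unknown"; return 1 ;;
  esac
}

# Example: Logstash 8.8.1 falls in the 8.3-8.8 row of Table 1.
kafka_client_version 8.8.1
```

To find the version to pass in, you can run ./bin/logstash --version in the Logstash root directory. Running ./bin/logstash-plugin list --verbose should also show the installed plugin versions.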

Prerequisites

Complete the following preparations before you start.

  • Download Logstash.
  • Prepare a Windows host, install JDK 1.8.0_111 or later and Git Bash on the host, and configure the related environment variables.
  • Create a Kafka instance and a topic, and obtain the instance information.

    If both public access and SASL authentication are disabled for the Kafka instance, obtain the information listed in Table 2.

    Table 2 Kafka instance information (public access and SASL authentication disabled)

    Parameter                             How to Obtain
    Instance address (private network)    View it in the Connection area on the instance details page.
    Topic name                            On the Kafka console, click your instance. In the left navigation pane, choose Topics to view the topic name. The following uses topic-logstash as an example.

    If public access is disabled and SASL authentication is enabled for the Kafka instance, obtain the information listed in Table 3.

    Table 3 Kafka instance information (public access disabled and SASL authentication enabled)

    Parameter                             How to Obtain
    Instance address (private network)    View it in the Connection area on the instance details page.
    SASL mechanism                        View it in the Connection area on the instance details page.
    Security protocol                     View it in the Connection area on the instance details page.
    Certificate                           Click Download next to SSL Certificate in the Connection area on the instance details page. Download and decompress the package to obtain the client certificate file client.truststore.jks.
    SASL username and password            On the Kafka console, click your instance. In the left navigation pane, choose Users to view the username. If you have forgotten the password, click Reset Password.
    Topic name                            On the Kafka console, click your instance. In the left navigation pane, choose Topics to view the topic name. The following uses topic-logstash as an example.

    If public access is enabled and SASL authentication is disabled for the Kafka instance, obtain the information listed in Table 4.

    Table 4 Kafka instance information (public access enabled and SASL authentication disabled)

    Parameter                             How to Obtain
    Instance address (public network)     View it in the Connection area on the instance details page.
    Topic name                            On the Kafka console, click your instance. In the left navigation pane, choose Topics to view the topic name. The following uses topic-logstash as an example.

    If both public access and SASL authentication are enabled for the Kafka instance, obtain the information listed in Table 5.

    Table 5 Kafka instance information (public access and SASL authentication enabled)

    Parameter                             How to Obtain
    Instance address (public network)     View it in the Connection area on the instance details page.
    SASL mechanism                        View it in the Connection area on the instance details page.
    Security protocol                     View it in the Connection area on the instance details page.
    Certificate                           Click Download next to SSL Certificate in the Connection area on the instance details page. Download and decompress the package to obtain the client certificate file client.truststore.jks.
    SASL username and password            On the Kafka console, click your instance. In the left navigation pane, choose Users to view the username. If you have forgotten the password, click Reset Password.
    Topic name                            On the Kafka console, click your instance. In the left navigation pane, choose Topics to view the topic name. The following uses topic-logstash as an example.
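Once you have the instance address, it can help to confirm that the Windows host can reach every broker before configuring Logstash. The following is a minimal sketch under stated assumptions: split_bootstrap and check_brokers are hypothetical helper names, the IP addresses are placeholders, and the check relies on bash's /dev/tcp redirection and the timeout command both being available in your Git Bash installation.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration; they are not part of Logstash or Kafka.

# Print one "host port" pair per line from a comma-separated address list.
split_bootstrap() {
  local list="$1" entry
  local IFS=','
  for entry in $list; do
    echo "${entry%:*} ${entry##*:}"
  done
}

# Try to open a TCP connection to each broker, giving up after 3 seconds.
# Assumes bash supports /dev/tcp redirection and that timeout is installed.
check_brokers() {
  split_bootstrap "$1" | while read -r host port; do
    if timeout 3 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null; then
      echo "$host:$port reachable"
    else
      echo "$host:$port NOT reachable"
    fi
  done
}

# Placeholder addresses; replace them with your instance address.
split_bootstrap "192.168.0.10:9092,192.168.0.11:9092"
```

Calling check_brokers with the same address list reports each broker as reachable or not. A successful TCP connection only shows that the port is open. It does not verify SASL credentials or the certificate.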

Procedure (Kafka Instance as the Logstash Output Source)

  1. On the Windows host, decompress the Logstash package, go to the config folder, and create the output.conf configuration file.

    Figure 3 Creating the output.conf configuration file

  2. Add the following content to the output.conf file:

    input {
      stdin {}
    }
    output {
      kafka {
        bootstrap_servers => "ip1:port1,ip2:port2,ip3:port3"
        topic_id => "topic-logstash"

        # If SASL authentication is disabled, comment out all of the following options.

        # SASL mechanism: keep only the pair that matches the instance.
        # If the SASL mechanism is PLAIN, configure as follows:
        sasl_mechanism => "PLAIN"
        sasl_jaas_config => "org.apache.kafka.common.security.plain.PlainLoginModule required username='username' password='password';"

        # If the SASL mechanism is SCRAM-SHA-512, configure as follows:
        sasl_mechanism => "SCRAM-SHA-512"
        sasl_jaas_config => "org.apache.kafka.common.security.scram.ScramLoginModule required username='username' password='password';"

        # Security protocol: keep only the settings that match the instance.
        # If the security protocol is SASL_SSL, configure as follows:
        security_protocol => "SASL_SSL"
        ssl_truststore_location => "C:\\Users\\Desktop\\logstash-8.8.1\\config\\client.jks"
        ssl_truststore_password => "dms@kafka"
        ssl_endpoint_identification_algorithm => ""

        # If the security protocol is SASL_PLAINTEXT, configure as follows:
        security_protocol => "SASL_PLAINTEXT"
      }
    }

    Description:

    • bootstrap_servers: private network connection address or public network connection address of the Kafka instance.
    • topic_id: name of the topic that messages are written to.
    • sasl_mechanism: SASL authentication mechanism.
    • sasl_jaas_config: inline JAAS configuration used for SASL authentication. Change the SASL username and password as required.
    • security_protocol: security protocol used by the Kafka instance.
    • ssl_truststore_location: path to the client certificate file obtained in Prerequisites.
    • ssl_truststore_password: password of the certificate file, which must be set to dms@kafka and cannot be changed.
    • ssl_endpoint_identification_algorithm: whether to verify the certificate domain name. If this option is left blank, the certificate domain name is not verified. In this example, leave it blank.

    For more information about Kafka output plugin options, see Kafka output plugin.

  3. Open Git Bash in the root directory of the Logstash folder and run the following command to start Logstash:

    ./bin/logstash -f ./config/output.conf

    If the message "Successfully started Logstash API endpoint" is displayed, Logstash has been started.

    Figure 4 Starting Logstash

  4. In the terminal where Logstash is running, enter messages and press Enter to produce them, as shown in the following figure.

    Figure 5 Producing messages

  5. Go to the Kafka console and click your instance.
  6. In the left navigation pane, choose Message Query.
  7. Select topic-logstash from the Topic Name drop-down list box and click Search to query messages.

    Figure 6 Querying messages

    As shown in Figure 6, the Kafka output plugin of Logstash has written data to topic-logstash of the Kafka instance.
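For an instance with SASL authentication disabled, the output procedure above can be sketched as a short script. This is an illustration rather than a required step: write_output_conf is a hypothetical helper name, the address string is the placeholder used in this document, and the script only does something when it is run from the Logstash root directory.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write a minimal stdin -> Kafka pipeline without SASL options.
write_output_conf() {
  local file="$1" servers="$2" topic="$3"
  cat > "$file" <<EOF
input {
  stdin {}
}
output {
  kafka {
    bootstrap_servers => "$servers"
    topic_id => "$topic"
  }
}
EOF
}

# Run the remaining steps only where ./bin/logstash exists (the Logstash root).
if [ -x ./bin/logstash ]; then
  # Replace the placeholder addresses with your instance address.
  write_output_conf ./config/output.conf "ip1:port1,ip2:port2,ip3:port3" "topic-logstash"
  # Check the configuration syntax first, then feed two test messages through stdin.
  ./bin/logstash -f ./config/output.conf --config.test_and_exit &&
    printf 'test message 1\ntest message 2\n' | ./bin/logstash -f ./config/output.conf
fi
```

The --config.test_and_exit option makes Logstash parse the configuration and exit without starting the pipeline, which catches syntax errors before any message is sent. After the script finishes, query topic-logstash on the Kafka console as described above to confirm that the messages arrived.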

Procedure (Kafka Instance as the Logstash Input Source)

  1. On the Windows host, decompress the Logstash package, go to the config folder, and create the input.conf configuration file.

    Figure 7 Creating the input.conf configuration file

  2. Add the following content to the input.conf file to connect to the Kafka instance:

    input {
      kafka {
        bootstrap_servers => "ip1:port1,ip2:port2,ip3:port3"
        group_id => "logstash_group"
        topics => ["topic-logstash"]
        auto_offset_reset => "earliest"

        # If SASL authentication is disabled, comment out all of the following options.

        # SASL mechanism: keep only the pair that matches the instance.
        # If the SASL mechanism is PLAIN, configure as follows:
        sasl_mechanism => "PLAIN"
        sasl_jaas_config => "org.apache.kafka.common.security.plain.PlainLoginModule required username='username' password='password';"

        # If the SASL mechanism is SCRAM-SHA-512, configure as follows:
        sasl_mechanism => "SCRAM-SHA-512"
        sasl_jaas_config => "org.apache.kafka.common.security.scram.ScramLoginModule required username='username' password='password';"

        # Security protocol: keep only the settings that match the instance.
        # If the security protocol is SASL_SSL, configure as follows:
        security_protocol => "SASL_SSL"
        ssl_truststore_location => "C:\\Users\\Desktop\\logstash-8.8.1\\config\\client.jks"
        ssl_truststore_password => "dms@kafka"
        ssl_endpoint_identification_algorithm => ""

        # If the security protocol is SASL_PLAINTEXT, configure as follows:
        security_protocol => "SASL_PLAINTEXT"
      }
    }
    output {
      stdout { codec => rubydebug }
    }

    Description:

    • bootstrap_servers: private network connection address or public network connection address of the Kafka instance.
    • group_id: consumer group name.
    • topics: names of the topics to read from.
    • auto_offset_reset: where consumption starts when the consumer group has no committed offset. This example uses earliest, which starts from the earliest available message.
    • sasl_mechanism: SASL authentication mechanism.
    • sasl_jaas_config: inline JAAS configuration used for SASL authentication. Change the SASL username and password as required.
    • security_protocol: security protocol used by the Kafka instance.
    • ssl_truststore_location: path to the client certificate file obtained in Prerequisites.
    • ssl_truststore_password: password of the certificate file, which must be set to dms@kafka and cannot be changed.
    • ssl_endpoint_identification_algorithm: whether to verify the certificate domain name. If this option is left blank, the certificate domain name is not verified. In this example, leave it blank.

    For more information about Kafka input plugin options, see Kafka input plugin.

  3. Open Git Bash in the root directory of the Logstash folder and run the following command to start Logstash:

    ./bin/logstash -f ./config/input.conf

    After Logstash is started successfully, the Kafka input plugin automatically reads data from topic-logstash of the Kafka instance, as shown in the following figure.

    Figure 8 Logstash reading data from topic-logstash
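When SASL authentication is enabled, the input configuration should end up with exactly one SASL mechanism and one security protocol. The following sketch generates such a file for SCRAM-SHA-512 over SASL_SSL. It is an illustration under stated assumptions: write_input_conf is a hypothetical helper name, every argument value is a placeholder, and the Kafka input plugin is assumed to take its topic list through the topics option.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write a Kafka -> stdout pipeline that uses
# SCRAM-SHA-512 over SASL_SSL. All argument values are placeholders.
write_input_conf() {
  local file="$1" servers="$2" topic="$3" user="$4" password="$5" truststore="$6"
  cat > "$file" <<EOF
input {
  kafka {
    bootstrap_servers => "$servers"
    group_id => "logstash_group"
    topics => ["$topic"]
    auto_offset_reset => "earliest"
    sasl_mechanism => "SCRAM-SHA-512"
    sasl_jaas_config => "org.apache.kafka.common.security.scram.ScramLoginModule required username='$user' password='$password';"
    security_protocol => "SASL_SSL"
    ssl_truststore_location => "$truststore"
    ssl_truststore_password => "dms@kafka"
    ssl_endpoint_identification_algorithm => ""
  }
}
output {
  stdout { codec => rubydebug }
}
EOF
}

# Example call from the Logstash root directory (placeholder values):
# write_input_conf ./config/input.conf "ip1:port1,ip2:port2,ip3:port3" \
#   "topic-logstash" "username" "password" "./config/client.jks"
```

The generated file contains the SASL password in plain text, as in the procedure above, so restrict access to it accordingly.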