Connecting Elasticsearch to Flume ESSink
Scenario
The Solr and Elasticsearch components of MRS depend on different versions of Lucene. Therefore, Flume can connect to only one of them, Solr or Elasticsearch, at a time. For compatibility, Flume connects to Solr by default. To connect Flume to Elasticsearch, you need to replace the Lucene JAR packages in Flume with those used by Elasticsearch.
This section describes how to adjust the Lucene JAR package.
Procedure
The Flume service can run either on a server or on a client. Perform the following steps on the server or the client, depending on where ESSink runs:
- Go to the lib directory in the Flume installation directory, for example, ${BIGDATA_HOME}/FusionInsight_Porter_8.1.0.1/install/FusionInsight-Flume-Flume component version/flume/lib/, record the permissions and owner groups of all lucene-* files, and back up the files. Then, delete all lucene-* files.
If Flume is running on a client, perform this step in the lib directory of the Flume client installation directory.
- Go to the lib directory on the Elasticsearch server, for example, ${BIGDATA_HOME}/FusionInsight_Elasticsearch_8.1.0.1/install/FusionInsight-Elasticsearch-7.10.2/elasticsearch/lib, and collect all the Lucene packages on which Elasticsearch depends. The package names start with lucene-.
- Copy the JAR files collected in the previous step to the lib directory in the Flume installation directory, for example, ${BIGDATA_HOME}/FusionInsight_Porter_8.1.0.1/install/FusionInsight-Flume-Flume component version/flume/lib/, and change the permissions and owner group of the new JAR files to match those recorded for the original files.
If Flume is running on a client, perform this step in the lib directory of the Flume client installation directory.
- Restart the corresponding Flume instance processes. If Flume is running on the client, restart the Flume agent on the client.
Log in to FusionInsight Manager and choose Cluster > Services > Flume. On the page that is displayed, click the Instance tab. In the instance list, select the instance to be restarted and choose More > Restart Instance. In the displayed dialog box, enter the password and click OK. Wait until the instance is restarted.
To deploy ESSink on multiple hosts, perform the preceding steps on each host.
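The file operations in the first three steps can be sketched in Python as follows. This is a minimal illustration rather than a supported tool: the function name swap_lucene_jars and the backup directory argument are hypothetical, and the sketch assumes that all original lucene-* files share the same permissions and owner group, so the first recorded entry is applied to every new JAR file. Restarting the Flume instance or agent is still done as described above.

```python
import os
import shutil
import stat
from pathlib import Path


def swap_lucene_jars(flume_lib, es_lib, backup_dir):
    """Replace Flume's lucene-* files with the lucene-* files from Elasticsearch."""
    flume_lib, es_lib, backup_dir = Path(flume_lib), Path(es_lib), Path(backup_dir)

    # Check both sides first so nothing is deleted if a directory is wrong.
    old_jars = sorted(flume_lib.glob("lucene-*"))
    new_jars = sorted(es_lib.glob("lucene-*"))
    if not old_jars:
        raise FileNotFoundError(f"no lucene-* files found in {flume_lib}")
    if not new_jars:
        raise FileNotFoundError(f"no lucene-* files found in {es_lib}")

    # Step 1: record permissions and owner group, back up, then delete.
    backup_dir.mkdir(parents=True, exist_ok=True)
    recorded = {}
    for jar in old_jars:
        st = jar.stat()
        recorded[jar.name] = (stat.S_IMODE(st.st_mode), st.st_uid, st.st_gid)
        shutil.copy2(jar, backup_dir / jar.name)
        jar.unlink()

    # Assumption: every original file had the same mode, owner, and group.
    mode, uid, gid = next(iter(recorded.values()))

    # Steps 2 and 3: copy the Elasticsearch Lucene files and restore metadata.
    copied = []
    for jar in new_jars:
        target = flume_lib / jar.name
        shutil.copy(jar, target)
        os.chmod(target, mode)
        os.chown(target, uid, gid)  # may require elevated privileges on a real host
        copied.append(target.name)
    return recorded, copied
```

The function returns the recorded metadata and the names of the copied files, so the result can be reviewed before the Flume instance is restarted. Run it once per host if ESSink is deployed on multiple hosts.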
Table 1 ESSink configuration

| Parameter | Default Value | Description |
| --- | --- | --- |
| type | com.*.flume.sinks.elasticsearch.ESSink | The default value is a type name. |
| servers | - | EsNode list. Values are in the IP address:port format. The port is the TRANSPORT_TCP_PORT or SERVER_PORT of EsNode. NOTE: All IP addresses and ports of the EsNodes need to be configured for Flume fault migration. |
| client | transport | Elasticsearch client connection type. The value can be transport or rest. |
| securityEnable | false | Whether to enable the security mode for the Elasticsearch cluster. |
| clusterName | elasticsearch_cluster | Name of the Elasticsearch cluster. NOTE: If multiple Elasticsearch services are installed in the cluster, this parameter must correspond to each service and cannot be only set to elasticsearch_cluster. For example, configure this parameter as follows: elasticsearch_cluster for the Elasticsearch service; elasticsearch-1_cluster for the Elasticsearch-1 service. |
| batchSize | 1000 | Number of events written to the Channel in batches. |
| indexName | - | Index name. |
| indexType | - | Index type. |
| rest.callbackConnectTimeout | 5000 | Timeout interval for connecting to the RequestConfigCallback of the REST client. |
| rest.callbackSocketTimeout | 60000 | Timeout interval for session with the RequestConfigCallback of the REST client. |
| rest.builderMaxTimeout | 60000 | Maximum retry timeout interval of the REST client RestClientBuilder. |
| serializer | - | Serializer. Two options are provided: com.*.flume.sinks.elasticsearch.ElasticSearchLogStashEventSerializer and com.*.flume.sinks.elasticsearch.ElasticSearchDynamicSerializer. Default value: com.*.flume.sinks.elasticsearch.ElasticSearchLogStashEventSerializer |
| indexNameBuilder | - | Index name builder. Two options are provided: com.*.flume.sinks.elasticsearch.TimeBasedIndexNameBuilder and com.*.flume.sinks.elasticsearch.SimpleIndexNameBuilder. Default value: com.*.flume.sinks.elasticsearch.TimeBasedIndexNameBuilder |
| channel | - | Channel connected to ESSink. |
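To show how the parameters in Table 1 fit together, the following Python sketch checks two constraints stated in the table (the IP address:port format of servers and the allowed values of client) and renders the parameters in the usual Flume properties layout, agent.sinks.sink.parameter = value. The agent, sink, channel, and index names, the addresses and port, and the use of commas to separate multiple EsNodes are assumptions made for illustration. The * in the type value is left elided as in the table and must be replaced with the real package name.

```python
import re

# "IP address:port" as required for the servers parameter (IPv4 only in this sketch).
_SERVER = re.compile(r"^\d{1,3}(\.\d{1,3}){3}:\d{1,5}$")


def validate_essink(params):
    """Return a list of problems found in a dict of ESSink parameters."""
    problems = []
    # Assumption: multiple EsNodes are separated by commas.
    entries = [s.strip() for s in params.get("servers", "").split(",") if s.strip()]
    if not entries:
        problems.append("servers is empty")
    for entry in entries:
        if not _SERVER.match(entry):
            problems.append(f"servers entry not in IP:port form: {entry}")
    if params.get("client", "transport") not in ("transport", "rest"):
        problems.append("client must be transport or rest")
    return problems


def render_essink(agent, sink, params):
    """Render parameters as Flume properties lines."""
    return "\n".join(
        f"{agent}.sinks.{sink}.{key} = {value}" for key, value in params.items()
    )


example = {
    "type": "com.*.flume.sinks.elasticsearch.ESSink",  # "*" is elided in Table 1
    "servers": "192.0.2.11:24100,192.0.2.12:24100",    # placeholder addresses and port
    "client": "transport",
    "securityEnable": "false",
    "clusterName": "elasticsearch_cluster",
    "batchSize": "1000",
    "indexName": "flume_index",                         # hypothetical index name
    "channel": "ch1",                                   # hypothetical channel name
}

if not validate_essink(example):
    print(render_essink("agent1", "es_sink", example))
```

Listing every EsNode in servers, as the note in the table requires, is what lets Flume fail over when one node becomes unavailable.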
When Flume connects to an Elasticsearch cluster in security mode, use WinSCP to upload the configured jaas.conf, krb5.conf, and user keytab files to the corresponding directory:
- If Flume runs on the server, upload the files to the etc directory on the server, for example, ${BIGDATA_HOME}/FusionInsight_Porter_8.1.0.1/1_11_Flume/etc/.
- If Flume runs on the client, upload the files to the conf directory on the client, for example, /opt/flumeClient/fusioninsight-flume-Flume component version/conf.
For details about the jaas.conf configuration, see Common Issues About Flume. The permissions of the uploaded files must be the same as those of the existing files in the corresponding directory, and the jaas.conf configuration file must start with EsClient.
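After the upload, a quick check such as the following Python sketch can confirm that the three files are present and that jaas.conf starts with EsClient. The function name and the default keytab file name user.keytab are assumptions; pass the actual keytab file name used in your environment. The sketch does not compare file permissions, which still need to be checked against the existing files in the directory.

```python
from pathlib import Path


def check_security_files(conf_dir, keytab_name="user.keytab"):
    """Return a list of problems with the uploaded security files."""
    conf_dir = Path(conf_dir)
    problems = []

    # All three files must exist in the etc (server) or conf (client) directory.
    for name in ("jaas.conf", "krb5.conf", keytab_name):
        if not (conf_dir / name).is_file():
            problems.append(f"missing file: {name}")

    # The jaas.conf content must start with EsClient (leading whitespace ignored).
    jaas = conf_dir / "jaas.conf"
    if jaas.is_file() and not jaas.read_text().lstrip().startswith("EsClient"):
        problems.append("jaas.conf does not start with EsClient")
    return problems
```

An empty list means that the checks in this sketch passed; any returned message names the file that needs attention.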