Updated on 2026-04-30 GMT+08:00

Data Ingestion Methods

When building an enterprise-grade search and analytics platform, you may need to ingest large volumes of heterogeneous data distributed across relational databases (such as MySQL and Oracle), messaging systems (such as Kafka), object storage services (such as OBS), and business applications into an Elasticsearch cluster. To meet varying requirements for data latency, data volume, and development costs, CSS Elasticsearch clusters provide multiple data ingestion methods. This topic briefly introduces and compares these methods, including their applicable scenarios and characteristics, to help you select an optimal option.

Before ingesting large amounts of data, consider tuning the destination Elasticsearch cluster to increase ingestion throughput. For details, see Enhancing Data Ingestion Performance.

Table 1 Comparing different data ingestion methods for Elasticsearch clusters

| Method | When to Use | Supported Source/Format | Details |
| --- | --- | --- | --- |
| CSS Logstash | You want to perform complex data cleaning (filtering) without building and managing your own Logstash server. For example, you may need to clean Nginx logs from Kafka before ingesting them into Elasticsearch. | MySQL, Kafka, and OBS | Using Logstash to Synchronize Data to Elasticsearch |
| Cloud Data Migration (CDM) | You need to perform a full migration of historical data. This method is code-free and guided by a step-by-step wizard. For example, you may ingest years of archived logs from OBS or historical orders from Oracle into Elasticsearch. | OBS (JSON/CSV), Oracle, and MySQL | Migrating Data Using CDM |
| Open-source Logstash | You want to migrate data from an on-premises IDC to Elasticsearch clusters on the cloud, during which you may use special plugins or deeply customized pipeline logic. For example, you may upload system logs from an on-premises data center to Elasticsearch via an SSH tunnel. | Any source supported by Logstash input, such as JSON, CSV, and text | Ingesting Data Using Self-Managed Logstash |
| Open-source APIs | You need to quickly write small amounts of data to Elasticsearch during development or debugging. For example, you can have Java/Python applications directly call Elasticsearch APIs to write data. | JSON | Ingesting Data Using Open-Source Elasticsearch APIs |
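
As an illustration of the open-source API method, the following minimal Python sketch writes a few JSON documents through the Elasticsearch `_bulk` API using only the standard library. The index name `orders-demo`, the sample documents, the endpoint `http://localhost:9200`, and the helper names `build_bulk_body` and `send_bulk` are assumptions made for this example and do not come from the CSS documentation. A security-enabled cluster would also require HTTPS and credentials, which this sketch omits.

```python
import json
import urllib.request


def build_bulk_body(index, docs):
    """Serialize (doc_id, source) pairs into the NDJSON body used by the _bulk API."""
    lines = []
    for doc_id, source in docs:
        # Action line: index this document under the given ID.
        lines.append(json.dumps({"index": {"_index": index, "_id": doc_id}}))
        # Source line: the JSON document itself.
        lines.append(json.dumps(source))
    # The bulk body must end with a newline character.
    return "\n".join(lines) + "\n"


def send_bulk(endpoint, body):
    """POST a bulk body to the cluster and return the parsed JSON response."""
    request = urllib.request.Request(
        url=f"{endpoint}/_bulk",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/x-ndjson"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    # Hypothetical sample data; replace with your own documents.
    sample_docs = [
        ("1", {"order_id": 1001, "status": "paid"}),
        ("2", {"order_id": 1002, "status": "shipped"}),
    ]
    body = build_bulk_body("orders-demo", sample_docs)
    print(body, end="")
    # To send the request, call send_bulk with your cluster endpoint
    # (assumed here to be an unauthenticated HTTP endpoint):
    # result = send_bulk("http://localhost:9200", body)
    # print(result["errors"])
```

Building the request body separately from sending it lets you inspect the payload during debugging before any data reaches the cluster. The `errors` field in the bulk response indicates whether any individual item failed.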