Updated on 2026-04-30 GMT+08:00

Data Ingestion Methods

When building an enterprise-grade search and analytics platform, you may need to ingest large volumes of heterogeneous data into an OpenSearch cluster from relational databases (such as MySQL and Oracle), messaging systems (such as Kafka), object storage services (such as OBS), and business applications. To meet varying requirements for data latency, data volume, and development cost, CSS OpenSearch clusters support multiple data ingestion methods. This topic compares these methods, including their applicable scenarios and characteristics, to help you choose the most suitable one.

Before ingesting large amounts of data, you can tune the destination OpenSearch cluster to improve its write throughput. For details, see Enhancing Data Ingestion Performance.

Table 1 Comparing different data ingestion methods for OpenSearch clusters

| Method | When to Use | Supported Source/Format | Details |
| --- | --- | --- | --- |
| CSS Logstash | You want to perform complex data cleaning (filtering) without building and managing your own Logstash server. For example, you may need to clean Nginx logs from Kafka before ingesting them into OpenSearch. | MySQL, Kafka, and OBS | Using Logstash to Synchronize Data to Elasticsearch |
| Cloud Data Migration (CDM) | You need to perform a full migration of historical data. This method is code-free and guided by a step-by-step wizard. For example, you may ingest years of archived logs from OBS or historical orders from Oracle into OpenSearch. | OBS (JSON/CSV), Oracle, and MySQL | Migrating Data Using CDM |
| Open-source Logstash | You want to migrate data from an on-premises IDC to OpenSearch clusters on the cloud, during which you may use special plugins or deeply customized pipeline logic. For example, you may upload system logs from an on-premises data center to OpenSearch via an SSH tunnel. | Any source supported by Logstash input, such as JSON, CSV, and text | Ingesting Data Using Self-Managed Logstash |
| Open-source APIs | You need to quickly write small amounts of data to OpenSearch during development or debugging. For example, you can have Java/Python applications directly call OpenSearch APIs to write data. | JSON | Ingesting Data Using Open-Source OpenSearch APIs |
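
As an illustration of the Open-source APIs method in Table 1, the following is a minimal Python sketch that writes a few JSON documents through the OpenSearch `_bulk` API. It uses only the Python standard library. The index name `orders-debug`, the sample documents, the endpoint `http://localhost:9200`, and the helper names `build_bulk_body` and `send_bulk` are hypothetical and chosen for this example only. The sketch also assumes a cluster that accepts plain HTTP without authentication; a cluster with security mode or HTTPS enabled would need credentials and certificate handling that are not shown here.

```python
import json
import urllib.request


def build_bulk_body(index, docs):
    """Build an NDJSON payload for the _bulk API.

    Each document becomes two lines: an action line that names the
    target index, followed by the document source itself.
    """
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(doc))
    # The _bulk API expects the payload to end with a newline.
    return "\n".join(lines) + "\n"


def send_bulk(endpoint, body):
    """POST the payload to <endpoint>/_bulk and return the parsed JSON reply.

    Assumes an unauthenticated HTTP endpoint (an assumption of this sketch).
    """
    request = urllib.request.Request(
        url=endpoint.rstrip("/") + "/_bulk",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/x-ndjson"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    # Hypothetical sample data for a debugging session.
    docs = [
        {"order_id": 1, "status": "paid"},
        {"order_id": 2, "status": "shipped"},
    ]
    body = build_bulk_body("orders-debug", docs)
    print(body, end="")

    # Hypothetical endpoint; uncomment to send to a reachable cluster.
    # result = send_bulk("http://localhost:9200", body)
    # print("errors:", result["errors"])
```

Keeping payload construction separate from the HTTP call makes it possible to inspect the request body during debugging before anything is written to the cluster. For larger or continuous workloads, the other methods in Table 1 are likely a better fit.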