Updated on 2026-05-20 GMT+08:00

Elasticsearch

Elasticsearch is a distributed search and analytics engine built on Apache Lucene. It is used for full-text retrieval, log analysis, real-time data query, and large-scale data aggregation.

Huawei Cloud Cloud Search Service (CSS) is a fully managed distributed search service powered by open-source Elasticsearch. You can use CSS links to migrate log files and database records to CSS, where they can be searched and analyzed with Elasticsearch.

DataArts Migration supports open-source Elasticsearch, which is compatible with Huawei Cloud CSS, and provides stable and efficient data integration capabilities.

Preparation and Constraints

  • Network requirements

    The Elasticsearch data source must be able to communicate with CDM so that data can be transmitted. For details, see Enabling Network Connectivity.

  • Required permissions
    • Huawei Cloud CSS permissions:
      • Read permission: DataArts Migration reads cluster information from CSS. You can assign the CSS ReadOnlyAccess policy or a custom read-only permission in IAM. This permission allows you to perform read operations, such as querying the cluster list, viewing cluster details, obtaining monitoring metrics, and viewing snapshot information.
      • Write permission: DataArts Migration creates or changes cluster resources in CSS. You can assign the CSS FullAccess policy or custom read and write permissions in IAM. These permissions also allow all read operations.
    • Open-source Elasticsearch permissions:
      • Read permission: DataArts Migration reads index data. You can assign the built-in read role and bind the role to the corresponding indexes in Elasticsearch.
      • Write permission: DataArts Migration writes, updates, and deletes documents. You can assign the write role (or index role) in Elasticsearch.
  • Enabling ports

    Elasticsearch port (9200): TCP 9200 must be enabled so that DataArts Migration can access Elasticsearch.
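
    Before configuring a job, it can help to confirm that TCP 9200 is reachable. The following is a minimal sketch that uses only the Python standard library. The function name can_reach and the sample host are hypothetical, and the check runs from the machine where the script is executed, so it only approximates what DataArts Migration sees from its own network.

```python
import socket


def can_reach(host: str, port: int = 9200, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

    For example, can_reach("es.example.internal") returns False if a security group or firewall blocks the port. A True result shows only that the port accepts connections, not that the account has the permissions listed above.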

Supported Data Types

The following table lists the supported Elasticsearch data types.

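| Category | Type | Read | Write |
| --- | --- | --- | --- |
| Character | keyword | | |
| | text | | |
| | string | | |
| Integer | short | | |
| | integer | | |
| | long | | |
| Numeric | double | | |
| | float | | |
| Boolean | boolean | | |
| Object | object | | |
| Nested type | nested | | |
| Date | date | | |
| Special type | ip | | |
| Array | string_array | | |
| | short_array | | |
| | integer_array | | |
| | long_array | | |
| | float_array | | |
| | double_array | | |
| Value range | completion | | |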
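
As a rough illustration of how the types in the table appear in an Elasticsearch index mapping, the sketch below builds a mapping body as a Python dictionary. All field names (order_id, items, and so on) are hypothetical. Elasticsearch has no dedicated array mapping type, because any field can hold multiple values of its mapped type, so the *_array entries in the table are assumed here to correspond to ordinary fields that contain lists.

```python
import json

# Hypothetical index mapping that uses several of the listed types.
mapping = {
    "mappings": {
        "properties": {
            "order_id": {"type": "keyword"},    # exact-match string
            "comment": {"type": "text"},        # analyzed full-text string
            "quantity": {"type": "integer"},
            "total": {"type": "double"},
            "paid": {"type": "boolean"},
            "created_at": {"type": "date"},
            "client_ip": {"type": "ip"},
            # A keyword field may hold a list such as ["a", "b"].
            "tags": {"type": "keyword"},
            "buyer": {
                "type": "object",
                "properties": {"name": {"type": "keyword"}},
            },
            "items": {
                "type": "nested",
                "properties": {
                    "sku": {"type": "keyword"},
                    "price": {"type": "float"},
                },
            },
            "suggest": {"type": "completion"},
        }
    }
}

# Serialized form, as it would be sent when creating the index.
mapping_json = json.dumps(mapping, indent=2)
```

The difference between object and nested matters when reading: nested items are indexed as separate hidden documents, so each item keeps its own sku and price pairing.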

Supported Migration Scenarios

DataArts Migration supports the following modes for synchronizing on-premises data:

  • Single table synchronization

    DataArts Migration synchronizes a single table or file when ingesting data into a data lake or migrating data to the cloud.

  • Database and table shard synchronization

    DataArts Migration synchronizes data from multiple databases and tables when ingesting data into a data lake or migrating data to the cloud.

  • Entire DB migration

    DataArts Migration synchronizes data from an entire on-premises database when ingesting data into a data lake or migrating data to the cloud.

Database and table shard synchronization and entire DB migration are not supported in all regions. The following table lists the supported Elasticsearch migration scenarios.

Supported Migration Scenario

Single Table Read

Single Table Write

Database/Table Shard Read

Database/Table Shard Write

Entire DB Read

Entire DB Write

Supported

x

x

x

Core Capabilities

  • Connection configuration

    | Configuration Item | Supported | Description |
    | --- | --- | --- |
    | Support for Secure Sockets Layer (SSL) | | SSL encryption ensures secure data transmission. Currently, this function is not supported. |

  • Read capabilities

    | Configuration Item | Supported | Description |
    | --- | --- | --- |
    | Incremental read | | The filter condition can be configured to enable incremental read. |
    | Shard concurrency | x | 3.x and later versions support concurrent shard read, fully utilizing resources and improving read performance. |
    | Custom fields | | You can add computed columns, constant columns, or masking functions for tasks to meet personalized service requirements. |
    | Dirty data processing | | Abnormal data can be written to the dirty data bucket to prevent job failures caused by a small amount of abnormal data. |

  • Write capabilities

    | Configuration Item | Supported | Description |
    | --- | --- | --- |
    | Data clearance before import | | Data can be cleansed and processed before it is imported. |
    | Conflict resolution | | UPSERT, UPDATE, INDEX, and CREATE operations can flexibly handle data conflicts. |
    | Concurrent write | | Concurrent write improves efficiency. |
    | Batch submission | | Commit Size can be set to submit data to the server in batches. |
    | Dirty data processing | | Abnormal data can be written to the dirty data bucket to prevent job failures caused by a small amount of abnormal data. |
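
The read filter and the write options above can be sketched in terms of standard Elasticsearch request bodies. This is an illustration under stated assumptions, not a description of how DataArts Migration is implemented: the function names, the updated_at field, and the mapping of UPSERT, UPDATE, INDEX, and CREATE to bulk API actions are all assumptions made for the example.

```python
import json
from typing import Iterable, Iterator, List


def incremental_query(field: str, start: str, end: str) -> dict:
    """Range filter that selects documents with start <= field < end."""
    return {"query": {"range": {field: {"gte": start, "lt": end}}}}


def bulk_lines(index: str, doc_id: str, doc: dict, mode: str) -> List[str]:
    """Build the two NDJSON lines of one bulk API operation.

    Assumed mapping: INDEX and CREATE use the bulk actions of the same
    name, UPDATE uses a partial-document update, and UPSERT is an update
    with doc_as_upsert enabled.
    """
    mode = mode.upper()
    target = {"_index": index, "_id": doc_id}
    if mode in ("INDEX", "CREATE"):
        action, body = {mode.lower(): target}, doc
    elif mode in ("UPDATE", "UPSERT"):
        action, body = {"update": target}, {"doc": doc}
        if mode == "UPSERT":
            body["doc_as_upsert"] = True
    else:
        raise ValueError(f"unsupported mode: {mode}")
    return [json.dumps(action), json.dumps(body)]


def batches(docs: Iterable, commit_size: int) -> Iterator[list]:
    """Group documents into lists of at most commit_size items."""
    batch = []
    for doc in docs:
        batch.append(doc)
        if len(batch) == commit_size:
            yield batch
            batch = []
    if batch:
        yield batch  # final, possibly smaller, batch
```

With these semantics, CREATE fails for a document ID that already exists, INDEX replaces the whole document, UPDATE fails if the document is missing, and UPSERT inserts or merges. That difference is what makes the choice of operation a conflict-resolution setting.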

Creating a Data Source

Create a data source in Management Center. For details, see Configuring Data Connection Parameters.

Creating an Offline Data Migration Job

Create an Elasticsearch migration job in DataArts Factory. For details, see Creating an Offline Processing Migration Job.