Updated on 2024-10-24 GMT+08:00

From Elasticsearch or CSS

If the source link of a job is one of the links described in Elasticsearch Link Parameters or CSS Link Parameters, configure the source job parameters as described in Table 1.

Table 1 Job parameters when Elasticsearch or CSS is the source

Category

Parameter

Description

Example Value

Basic parameters

Index

Elasticsearch index, which is similar to a database name in a relational database. The index name can contain only lowercase letters.

index

Type

Elasticsearch type, which is similar to a table name in a relational database. The type name can contain only lowercase letters.

NOTE:

Elasticsearch 7.x and later versions do not support custom types; only the _doc type can be used. For these versions, this parameter does not take effect even if it is set.

_doc

Advanced attributes

Split Nested Field

(Optional) Whether to split the JSON content of the nested fields. For example, a:{ b:{ c:1, d:{ e:2, f:3 } } } can be split into a.b.c, a.b.d.e, and a.b.d.f.

No
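The splitting behavior described for Split Nested Field can be sketched in Python as follows. This is only an illustration of the dotted-path naming and not CDM's actual implementation; the function name flatten and its parameters are hypothetical.

```python
def flatten(doc, parent="", sep="."):
    """Flatten nested dicts into dotted keys, e.g. {"a": {"b": 1}} -> {"a.b": 1}."""
    flat = {}
    for key, value in doc.items():
        path = f"{parent}{sep}{key}" if parent else key
        if isinstance(value, dict) and value:
            # Descend into non-empty nested objects and extend the path.
            flat.update(flatten(value, path, sep))
        else:
            # Leaf values (and empty objects) are kept under the full path.
            flat[path] = value
    return flat


doc = {"a": {"b": {"c": 1, "d": {"e": 2, "f": 3}}}}
print(flatten(doc))  # {'a.b.c': 1, 'a.b.d.e': 2, 'a.b.d.f': 3}
```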

Filter Conditions

(Optional) CDM migrates only the data that meets the filter conditions.
  • Currently, only the query string (q syntax) of Elasticsearch can be used to filter source data. The q syntax is used in the following way:
    • For exact matching, use the column:data format, where column is the field name and data is the value to match, for example, last_name:Smith.

      In addition, if data is a string containing spaces, it must be enclosed in double quotation marks. If column is not specified, all fields will be matched by data.

    • Multiple query conditions can be combined with connection words. The format is column1:data1 AND column2:data2. The connection words can be AND, OR, or NOT. They must be in uppercase, and there must be a space before and after each connection word.

      Example: first_name:Alec AND last_name:John

    • In range matching, you can directly use a condition expression to filter data. The expression is in column:>data format. The operator can be >, >=, <, or <=.

      For example, time:>=1636905600000 AND time:<1637078400000. A condition expression can also be used together with a macro variable of date and time, for example, createTime:>=${timestamp(dateformat(yyyyMMdd,-1,DAY))} AND createTime:<${timestamp(dateformat(yyyyMMdd))}.

    • In range matching, you can also use the range syntax to filter data. The format is column:{data1 TO data2}. { and } indicate that a value is not included. [ and ] indicate that a value is included. TO must be capitalized, and there must be a space before and after it. * indicates all data.

      For example, time:{1636992000000 TO *] selects all the data whose time field is greater than 1636992000000. The range syntax can also be used together with a macro variable of date and time, for example, createTime:[${timestamp(dateformat(yyyyMMdd,-1,DAY))} TO ${timestamp(dateformat(yyyyMMdd))}}.

  • Source data cannot be filtered using the query domain-specific language (DSL) of Elasticsearch.

last_name:Smith
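The q syntax rules above can be checked with a small Python sketch that assembles filter strings. The helper names (exact, compare, between, combine) are hypothetical and are not part of CDM or Elasticsearch; they only encode the formatting rules listed for Filter Conditions.

```python
def exact(column, data):
    """Build column:data; values containing spaces are double-quoted."""
    text = str(data)
    if " " in text:
        text = f'"{text}"'
    # Without a column, the value is matched against all fields.
    return f"{column}:{text}" if column else text


def compare(column, op, data):
    """Build a condition expression such as time:>=1636905600000."""
    if op not in (">", ">=", "<", "<="):
        raise ValueError(f"unsupported operator: {op}")
    return f"{column}:{op}{data}"


def between(column, low, high, include_low=True, include_high=False):
    """Build range syntax; [ ] include a bound, { } exclude it, * is unbounded."""
    left = "[" if include_low else "{"
    right = "]" if include_high else "}"
    return f"{column}:{left}{low} TO {high}{right}"


def combine(word, *conditions):
    """Join conditions with an uppercase connection word surrounded by spaces."""
    if word not in ("AND", "OR", "NOT"):
        raise ValueError("connection word must be AND, OR, or NOT")
    return f" {word} ".join(conditions)


print(exact("last_name", "Smith"))
print(combine("AND", exact("first_name", "Alec"), exact("last_name", "John")))
print(combine("AND", compare("time", ">=", 1636905600000),
              compare("time", "<", 1637078400000)))
print(between("time", 1636992000000, "*", include_low=False, include_high=True))
```

The last call produces time:{1636992000000 TO *], matching the range example in the description.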

Extract Meta-field

Whether to extract index meta-fields, such as _index, _type, _id, and _score.

Yes

Page size

Elasticsearch page size, that is, the number of records read from Elasticsearch per page.

1000

ScrollId Time Out

During an Elasticsearch scroll query, a scroll_id is recorded. When the query times out or is complete, the recorded scroll_id is cleared. Set this parameter to specify the timeout duration.

5
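The interaction between the page size and the scroll_id timeout can be sketched as a scroll loop against the Elasticsearch scroll API (an initial _search request with a scroll keep-alive, follow-up requests to _search/scroll, and a final DELETE to clear the scroll_id). This is an assumption-laden illustration, not how CDM is implemented: send is a hypothetical transport callable, make_fake_send is an in-memory stand-in used only so the example runs without a cluster, and the keep-alive "5m" assumes minutes because the table gives the value 5 without a unit.

```python
def scroll_all(send, index, page_size=1000, keep_alive="5m", q=None):
    """Yield every hit from `index`, page by page, using the scroll API.

    `send(method, path, params, body)` is a hypothetical callable that performs
    the HTTP request and returns the decoded JSON response.
    """
    params = {"scroll": keep_alive}
    if q:
        params["q"] = q  # optional q-syntax filter
    page = send("POST", f"/{index}/_search", params, {"size": page_size})
    scroll_id = page.get("_scroll_id")
    try:
        while page["hits"]["hits"]:
            yield from page["hits"]["hits"]
            # Each follow-up request renews the keep-alive of the scroll_id.
            page = send("POST", "/_search/scroll", {},
                        {"scroll": keep_alive, "scroll_id": scroll_id})
            scroll_id = page.get("_scroll_id", scroll_id)
    finally:
        # Clear the scroll_id once the query is complete (or abandoned).
        if scroll_id:
            send("DELETE", "/_search/scroll", {}, {"scroll_id": scroll_id})


def make_fake_send(docs):
    """In-memory stand-in for a cluster, for demonstration only."""
    state = {"pos": 0, "size": 0, "cleared": False}

    def send(method, path, params, body):
        if method == "DELETE":
            state["cleared"] = True
            return {"succeeded": True}
        if path.endswith("/_search"):
            state["size"] = body["size"]
            state["pos"] = 0
        start = state["pos"]
        state["pos"] = start + state["size"]
        return {"_scroll_id": "fake-scroll-id",
                "hits": {"hits": docs[start:state["pos"]]}}

    return send, state


send, state = make_fake_send([{"_id": str(i)} for i in range(5)])
hits = list(scroll_all(send, "index", page_size=2))
print(len(hits), state["cleared"])  # 5 True
```

A larger page size means fewer round trips per job, while the keep-alive only needs to cover the time between two consecutive page requests.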