Updated on 2024-04-19 GMT+08:00

string_split

The string_split function splits a target string into substrings based on the specified separator and returns a substring list.

Description

string_split(target, separator)
Table 1 string_split parameters

Parameter | Data Type | Description
----------|-----------|------------
target    | STRING    | Target string to be processed
separator | VARCHAR   | Separator. Currently, only single-character separators are supported.

NOTE:
  • If target is NULL, an empty line is returned.
  • If target contains two or more consecutive separators, empty substrings are returned between them.
  • If target does not contain the separator, the original target string is returned as the only substring.
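The behavior described in the notes above can be modeled with a short Python sketch. This is an illustrative analogy only, not DLI or Flink code; the function name model_string_split is hypothetical.

```python
def model_string_split(target, separator):
    """Model of the documented string_split semantics (illustrative only).

    - A NULL (None) target yields no substrings.
    - Consecutive separators yield empty substrings between them.
    - A target without the separator is returned unchanged as one item.
    """
    if target is None:
        return []  # NULL target: nothing to emit
    if len(separator) != 1:
        raise ValueError("only single-character separators are supported")
    return target.split(separator)  # str.split keeps empty substrings

print(model_string_split("one--two", "-"))  # consecutive separators -> ['one', '', 'two']
print(model_string_split("flink", "-"))     # no separator -> ['flink']
print(model_string_split(None, "-"))        # NULL target -> []
```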

Example

  1. Create a Flink OpenSource SQL job by referring to Kafka and Print, enter the following job running script, and submit the job.
    When you create a job, set Flink Version to 1.15 in the Running Parameters tab. Select Save Job Log, and specify the OBS bucket for saving job logs. Change the values of the parameters in bold as needed in the following script.
    CREATE TABLE kafkaSource (
      target STRING,
      separator VARCHAR
    ) WITH (
      'connector' = 'kafka',
      'topic' = 'KafkaTopic',
      'properties.bootstrap.servers' = 'KafkaAddress1:KafkaPort,KafkaAddress2:KafkaPort',
      'properties.group.id' = 'GroupId',
      'scan.startup.mode' = 'latest-offset',
      'format' = 'json'
    );

    CREATE TABLE printSink (
      target STRING,
      item STRING
    ) WITH (
      'connector' = 'print'
    );

    INSERT INTO printSink
    SELECT target, item
    FROM kafkaSource, LATERAL TABLE(string_split(target, separator)) AS T(item);
  2. Connect to the Kafka cluster and send the following test data to the Kafka topic:
    {"target":"test-flink","separator":"-"}
    {"target":"flink","separator":"-"}
    {"target":"one-two-ww-three","separator":"-"}

    The data is as follows:

    Table 2 Test table data

    target (STRING)  | separator (VARCHAR)
    -----------------|--------------------
    test-flink       | -
    flink            | -
    one-two-ww-three | -

  3. View output.
    • Method 1:
      1. Log in to the DLI console. In the navigation pane, choose Job Management > Flink Jobs.
      2. Locate the row that contains the target Flink job, and choose More > FlinkUI in the Operation column.
      3. On the Flink UI, choose Task Managers, click the task name, and select Stdout to view job logs.
    • Method 2: If you select Save Job Log on the Running Parameters tab before submitting the job, perform the following operations:
      1. Log in to the DLI console. In the navigation pane, choose Job Management > Flink Jobs.
      2. Click the name of the corresponding Flink job, choose Run Log, click OBS Bucket, and locate the folder of the log you want to view according to the date.
      3. Go to the folder of the date, find the folder whose name contains taskmanager, download the taskmanager.out file, and view result logs.
    The query result is as follows (+I indicates an insert change in the Flink changelog):
    +I(test-flink,test)
    +I(test-flink,flink)
    +I(flink,flink)
    +I(one-two-ww-three,one)
    +I(one-two-ww-three,two)
    +I(one-two-ww-three,ww)
    +I(one-two-ww-three,three)

    The output data is as follows:

    Table 3 Result table data

    target (STRING)  | item (STRING)
    -----------------|--------------
    test-flink       | test
    test-flink       | flink
    flink            | flink
    one-two-ww-three | one
    one-two-ww-three | two
    one-two-ww-three | ww
    one-two-ww-three | three
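The row expansion performed by the LATERAL TABLE join in the job script can be sketched in Python. This is an analogy only, assuming the single-character-separator semantics described earlier; the helper name split_rows is hypothetical.

```python
def split_rows(rows):
    """Emit one (target, item) pair per substring, mirroring the lateral table join."""
    for target, separator in rows:
        for item in target.split(separator):
            yield (target, item)

# The three test records sent to the Kafka topic
source = [
    ("test-flink", "-"),
    ("flink", "-"),
    ("one-two-ww-three", "-"),
]

# Prints one line per output row, in the same shape as the print sink logs
for target, item in split_rows(source):
    print(f"+I({target},{item})")
```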