Updated on 2025-08-15 GMT+08:00

How Do I Create a Flink Hive SQL Job Through DataArts Studio?

Symptom

How do I create a Flink Hive SQL job through DataArts Studio?

Solution

The following example creates and submits a Flink Hive SQL job through DataArts Studio. The job reads data from Kafka and writes it to Hive.

  1. Create a Hive table on the Hive client before creating the job.
    CREATE TABLE user_behavior_hive_tbl_no_partition (
      user_id STRING,
      item_id STRING,
      cat_id STRING,
      ts TIMESTAMP
    ) PARTITIONED BY (dy STRING, ho STRING, mi STRING)
    STORED AS textfile TBLPROPERTIES (
      'partition.time-extractor.timestamp-pattern' = '$dy $ho:$mi:00',
      'sink.partition-commit.trigger' = 'process-time',
      'sink.partition-commit.delay' = '0S',
      'sink.partition-commit.policy.kind' = 'metastore,success-file'
    );
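After the table is created, you can check its definition and, once the Flink job has committed data, its partitions on the Hive client. These are standard HiveQL commands using the table name from the example above:

```sql
-- Show the table definition, including the TBLPROPERTIES set above.
DESC FORMATTED user_behavior_hive_tbl_no_partition;

-- List the partitions committed by the Flink job.
SHOW PARTITIONS user_behavior_hive_tbl_no_partition;
```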
  2. Create a Flink Hive SQL job and submit it on DataArts Studio.
    CREATE TABLE test_kafka (
      user_id varchar,
      item_id varchar,
      cat_id varchar,
      zw_test timestamp
    ) WITH (
      'connector' = 'kafka',
      'topic' = 'zw_test_kafka',
      'format' = 'json',
      'properties.bootstrap.servers' = 'IP address of the Kafka broker instance:Kafka port',
      'properties.group.id' = 'example-group1',
      'scan.startup.mode' = 'latest-offset'
    );
    CREATE CATALOG myhive WITH (
      'type' = 'hive',
      'hive-version' = '3.1.0',
      'default-database' = 'default'
    );
    USE CATALOG myhive;
    INSERT INTO
      user_behavior_hive_tbl_no_partition
    SELECT
      user_id,
      item_id,
      cat_id,
      zw_test,
      DATE_FORMAT(zw_test, 'yyyy-MM-dd'),
      DATE_FORMAT(zw_test, 'HH'),
      DATE_FORMAT(zw_test, 'mm')
    FROM
      default_catalog.default_database.test_kafka;
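After the job runs and the configured partition commit delay elapses, you can verify on the Hive client that data has landed in the table. The partition values below are illustrative; substitute values that your job actually wrote:

```sql
-- Read back a few committed rows.
SELECT user_id, item_id, cat_id, ts
FROM user_behavior_hive_tbl_no_partition
LIMIT 10;

-- Count rows in a single partition (example partition values).
SELECT count(*)
FROM user_behavior_hive_tbl_no_partition
WHERE dy = '2024-01-01' AND ho = '12' AND mi = '00';
```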

    In MRS 3.2.0 and earlier versions, you need to specify the Hive configuration file path in the WITH clause when creating the catalog, for example, 'hive-conf-dir' = 'obs://test-bucket/tmp/hive/config'.
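For those versions, the catalog DDL from the example above would be extended as follows (the OBS path is the example path; replace it with the location where you uploaded the Hive configuration files):

```sql
CREATE CATALOG myhive WITH (
  'type' = 'hive',
  'hive-version' = '3.1.0',
  'default-database' = 'default',
  'hive-conf-dir' = 'obs://test-bucket/tmp/hive/config'
);
```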

    Obtain the Hive configuration files as follows:

    1. Log in to FusionInsight Manager and click the Homepage tab. In the upper-right corner, click Download Client. In the displayed dialog box, select Configuration Files Only, select a platform type and download location, and click OK.
    2. Decompress the downloaded client file FusionInsight_Cluster_*_Client.tar and copy all configuration files in the Hive/config directory to the path specified by hive-conf-dir.