How Do I Create a Flink Hive SQL Job Through DataArts Studio?
Symptom
How do I create a Flink Hive SQL job through DataArts Studio?
Solution
The following example creates and submits a Flink Hive SQL job through DataArts Studio. The job reads data from Kafka and writes it to Hive.
- Create a Hive table on the Hive client before creating the job.
CREATE TABLE user_behavior_hive_tbl_no_partition (
  user_id STRING,
  item_id STRING,
  cat_id STRING,
  ts TIMESTAMP
)
PARTITIONED BY (dy STRING, ho STRING, mi STRING)
STORED AS TEXTFILE
TBLPROPERTIES (
  'partition.time-extractor.timestamp-pattern' = '$dy $ho:$mi:00',
  'sink.partition-commit.trigger' = 'process-time',
  'sink.partition-commit.delay' = '0S',
  'sink.partition-commit.policy.kind' = 'metastore,success-file'
);
- Create a Flink Hive SQL job and submit it on DataArts Studio.
CREATE TABLE test_kafka (
  user_id varchar,
  item_id varchar,
  cat_id varchar,
  zw_test timestamp
) WITH (
  'connector' = 'kafka',
  'topic' = 'zw_test_kafka',
  'format' = 'json',
  'properties.bootstrap.servers' = 'IP address of the Kafka broker instance:Kafka port',
  'properties.group.id' = 'example-group1',
  'scan.startup.mode' = 'latest-offset'
);

CREATE CATALOG myhive WITH (
  'type' = 'hive',
  'hive-version' = '3.1.0',
  'default-database' = 'default'
);

USE CATALOG myhive;

INSERT INTO user_behavior_hive_tbl_no_partition
SELECT
  user_id,
  item_id,
  cat_id,
  zw_test,
  DATE_FORMAT(zw_test, 'yyyy-MM-dd'),
  DATE_FORMAT(zw_test, 'HH'),
  DATE_FORMAT(zw_test, 'mm')
FROM default_catalog.default_database.test_kafka;
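After the job is running and messages have arrived on the Kafka topic, one way to check the result is to query the table on the Hive client. The statements below are a minimal sketch and not part of the original procedure. They assume the table was created in the default database, and the LIMIT value is arbitrary. Because the streaming sink commits files and partitions when a checkpoint completes, rows may not be visible until checkpointing is enabled for the job and at least one checkpoint has finished.

```sql
-- Assumption: run on the Hive client against the database that holds the table.
-- List the partitions (dy/ho/mi) that the Flink job has committed to the metastore.
SHOW PARTITIONS user_behavior_hive_tbl_no_partition;

-- Inspect a few of the rows written by the job.
SELECT user_id, item_id, cat_id, ts, dy, ho, mi
FROM user_behavior_hive_tbl_no_partition
LIMIT 10;
```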
In MRS 3.2.0 and earlier versions, you must also specify the path to the Hive configuration files in the WITH clause when creating the catalog, for example, 'hive-conf-dir' = 'obs://test-bucket/tmp/hive/config'.
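For MRS 3.2.0 and earlier, the catalog definition might look like the following sketch. It combines the catalog options from the job above with the example path given in this section. The OBS path is only the example value, so replace it with the directory that actually holds your Hive configuration files.

```sql
-- Sketch for MRS 3.2.0 and earlier: the catalog also needs 'hive-conf-dir'.
-- The OBS path is the example value from this page, not a real location.
CREATE CATALOG myhive WITH (
  'type' = 'hive',
  'hive-version' = '3.1.0',
  'default-database' = 'default',
  'hive-conf-dir' = 'obs://test-bucket/tmp/hive/config'
);
```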
To obtain the Hive configuration files:
- Log in to FusionInsight Manager and click the Homepage tab. In the upper right area, click Download Client. In the displayed dialog box, select Configuration Files Only, select a platform type and download location, and click OK.
- Decompress the client file FusionInsight_Cluster_*_Client.tar and copy all configuration files in the Hive/config directory.