- What's New
- Function Overview
- Product Bulletin
- Service Overview
-
Billing
- Billing Overview
- Billing for Compute Resources
- Billing for Storage Resources
- Billing for Scanned Data
- Package Billing
- Billing Examples
- Renewing Subscriptions
- Bills
- Arrears
- Billing Termination
-
Billing FAQ
- What Billing Modes Does DLI Offer?
- When Is a Data Lake Queue Idle?
- How Do I Troubleshoot DLI Billing Issues?
- Why Am I Still Being Billed on a Pay-per-Use Basis After I Purchased a Package?
- How Do I View the Usage of a Package?
- How Do I View a Job's Scanned Data Volume?
- Would a Pay-Per-Use Elastic Resource Pool Not Be Billed if No Job Is Submitted for Execution?
- Do I Need to Pay Extra Fees for Purchasing a Queue Billed Based on the Scanned Data Volume?
- How Is the Usage Beyond the Package Limit Billed?
- What Are the Actual CUs, CU Range, and Specifications of an Elastic Resource Pool?
- Change History
- Getting Started
-
User Guide
- DLI Job Development Process
- Preparations
-
Creating an Elastic Resource Pool and Queues Within It
- Overview of DLI Elastic Resource Pools and Queues
- Creating an Elastic Resource Pool and Creating Queues Within It
- Managing Elastic Resource Pools
-
Managing Queues
- Viewing Basic Information About a Queue
- Queue Permission Management
- Allocating a Queue to an Enterprise Project
- Creating an SMN Topic
- Managing Queue Tags
- Setting Queue Properties
- Testing Address Connectivity
- Deleting a Queue
- Auto Scaling of Standard Queues
- Setting a Scheduled Auto Scaling Task for a Standard Queue
- Changing the CIDR Block for a Standard Queue
- Example Use Case: Creating an Elastic Resource Pool and Running Jobs
- Example Use Case: Configuring Scaling Policies for Queues in an Elastic Resource Pool
- Creating a Non-Elastic Resource Pool Queue (Discarded and Not Recommended)
- Creating Databases and Tables
-
Data Migration and Transmission
- Overview
-
Migrating Data from External Data Sources to DLI
- Overview of Data Migration Scenarios
- Using CDM to Migrate Data to DLI
- Example Typical Scenario: Migrating Data from Hive to DLI
- Example Typical Scenario: Migrating Data from Kafka to DLI
- Example Typical Scenario: Migrating Data from Elasticsearch to DLI
- Example Typical Scenario: Migrating Data from RDS to DLI
- Example Typical Scenario: Migrating Data from GaussDB(DWS) to DLI
-
Configuring DLI to Read and Write Data from and to External Data Sources
- Configuring DLI to Read and Write External Data Sources
- Configuring the Network Connection Between DLI and Data Sources (Enhanced Datasource Connection)
- Using DEW to Manage Access Credentials for Data Sources
- Using DLI Datasource Authentication to Manage Access Credentials for Data Sources
-
Managing Enhanced Datasource Connections
- Viewing Basic Information About an Enhanced Datasource Connection
- Enhanced Connection Permission Management
- Binding an Enhanced Datasource Connection to an Elastic Resource Pool
- Unbinding an Enhanced Datasource Connection from an Elastic Resource Pool
- Adding a Route for an Enhanced Datasource Connection
- Deleting the Route for an Enhanced Datasource Connection
- Modifying Host Information in an Elastic Resource Pool
- Enhanced Datasource Connection Tag Management
- Deleting an Enhanced Datasource Connection
- Example Typical Scenario: Connecting DLI to a Data Source on a Private Network
- Example Typical Scenario: Connecting DLI to a Data Source on a Public Network
- Configuring an Agency to Allow DLI to Access Other Cloud Services
- Submitting a SQL Job Using DLI
- Submitting a Flink Job Using DLI
- Submitting a Spark Job Using DLI
- Submitting a DLI Job Using a Notebook Instance
- Using Cloud Eye to Monitor DLI
- Using CTS to Audit DLI
- Permissions Management
- Common DLI Management Operations
-
Best Practices
- Overview
- Analyzing Driving Behavior Data in IoV Scenarios Using DLI
- Converting Data Format from CSV to Parquet
- Analyzing E-Commerce BI Reports Using DLI
- Analyzing Billing Consumption Data Using DLI
- Analyzing Real-time E-Commerce Business Data Using DLI
-
Connecting BI Tools to DLI for Data Analysis
- Overview
- Connecting DBeaver to DLI for Data Analysis and Query
- Connecting DBT to DLI for Data Scheduling and Analysis
- Connecting Yonghong BI to DLI for Data Query and Analysis
- Connecting Power BI to DLI Using Kyuubi for Data Analysis and Query
- Connecting FineBI to DLI Using Kyuubi for Data Analysis and Query
- Connecting Superset to DLI Using Kyuubi for Data Analysis and Query
- Connecting Tableau to DLI Using Kyuubi for Data Analysis and Query
- Connecting Beeline to DLI Using Kyuubi for Data Analysis and Query
-
Developer Guide
- Connecting to DLI Using a Client
- SQL Jobs
-
Flink Jobs
- Stream Ecosystem
-
Flink OpenSource SQL Jobs
- Reading Data from Kafka and Writing Data to RDS
- Reading Data from Kafka and Writing Data to GaussDB(DWS)
- Reading Data from Kafka and Writing Data to Elasticsearch
- Reading Data from MySQL CDC and Writing Data to GaussDB(DWS)
- Reading Data from PostgreSQL CDC and Writing Data to GaussDB(DWS)
- Configuring High-Reliability Flink Jobs (Automatic Restart upon Exceptions)
- Flink Jar Job Examples
- Writing Data to OBS Using Flink Jar
- Using Flink Jar to Connect to Kafka that Uses SASL_SSL Authentication
- Using Flink Jar to Read and Write Data from and to DIS
- Flink Job Agencies
-
Spark Jar Jobs
- Using Spark Jar Jobs to Read and Query OBS Data
- Using the Spark Job to Access DLI Metadata
- Using Spark Jobs to Access Data Sources of Datasource Connections
- Spark Jar Jobs Using DEW to Acquire Access Credentials for Reading and Writing Data from and to OBS
- Obtaining Temporary Credentials from a Spark Job's Agency for Accessing Other Cloud Services
-
SQL Syntax Reference
-
Spark SQL Syntax Reference
- Common Configuration Items
- Spark SQL Syntax
- Spark Open Source Commands
- Databases
-
Tables
- Creating an OBS Table
- Creating a DLI Table
- Deleting a Table
- Viewing a Table
- Modifying a Table
-
Partition-related Syntax
- Adding Partition Data (Only OBS Tables Supported)
- Renaming a Partition (Only OBS Tables Supported)
- Deleting a Partition
- Deleting Partitions by Specifying Filter Criteria (Only Supported on OBS Tables)
- Altering the Partition Location of a Table (Only OBS Tables Supported)
- Updating Partitioned Table Data (Only OBS Tables Supported)
- Updating Table Metadata with REFRESH TABLE
- Backing Up and Restoring Data of Multiple Versions
- Table Lifecycle Management
- Data
- Exporting Query Results
-
Datasource Connections
- Creating a Datasource Connection with an HBase Table
- Creating a Datasource Connection with an OpenTSDB Table
- Creating a Datasource Connection with a DWS Table
- Creating a Datasource Connection with an RDS Table
- Creating a Datasource Connection with a CSS Table
- Creating a Datasource Connection with a DCS Table
- Creating a Datasource Connection with a DDS Table
- Creating a Datasource Connection with an Oracle Table
- Views
- Viewing the Execution Plan
- Data Permissions
- Data Types
- User-Defined Functions
-
Built-In Functions
-
Date Functions
- Overview
- add_months
- current_date
- current_timestamp
- date_add
- dateadd
- date_sub
- date_format
- datediff
- datediff1
- datepart
- datetrunc
- day/dayofmonth
- from_unixtime
- from_utc_timestamp
- getdate
- hour
- isdate
- last_day
- lastday
- minute
- month
- months_between
- next_day
- quarter
- second
- to_char
- to_date
- to_date1
- to_utc_timestamp
- trunc
- unix_timestamp
- weekday
- weekofyear
- year
-
String Functions
- Overview
- ascii
- concat
- concat_ws
- char_matchcount
- encode
- find_in_set
- get_json_object
- instr
- instr1
- initcap
- keyvalue
- length
- lengthb
- levenshtein
- locate
- lower/lcase
- lpad
- ltrim
- parse_url
- printf
- regexp_count
- regexp_extract
- replace
- regexp_replace
- regexp_replace1
- regexp_instr
- regexp_substr
- repeat
- reverse
- rpad
- rtrim
- soundex
- space
- substr/substring
- substring_index
- split_part
- translate
- trim
- upper/ucase
- Mathematical Functions
- Aggregate Functions
- Window Functions
- Other Functions
-
Date Functions
- SELECT
-
Identifiers
- aggregate_func
- alias
- attr_expr
- attr_expr_list
- attrs_value_set_expr
- boolean_expression
- class_name
- col
- col_comment
- col_name
- col_name_list
- condition
- condition_list
- cte_name
- data_type
- db_comment
- db_name
- else_result_expression
- file_format
- file_path
- function_name
- groupby_expression
- having_condition
- hdfs_path
- input_expression
- input_format_classname
- jar_path
- join_condition
- non_equi_join_condition
- number
- num_buckets
- output_format_classname
- partition_col_name
- partition_col_value
- partition_specs
- property_name
- property_value
- regex_expression
- result_expression
- row_format
- select_statement
- separator
- serde_name
- sql_containing_cte_name
- sub_query
- table_comment
- table_name
- table_properties
- table_reference
- view_name
- view_properties
- when_expression
- where_condition
- window_function
- Operators
-
Flink SQL Syntax Reference
-
Flink OpenSource SQL 1.15 Syntax Reference
- Constraints and Definitions
- Overview
- Flink OpenSource SQL 1.15 Usage
- Formats
- Connectors
- DML Snytax
-
Functions
- UDFs
- Type Inference
- Parameter Transfer
-
Built-In Functions
- Comparison Functions
- Logical Functions
- Arithmetic Functions
- String Functions
- Temporal Functions
- Conditional Functions
- Type Conversion Functions
- Collection Functions
- JSON Functions
- Value Construction Functions
- Value Retrieval Functions
- Grouping Functions
- Hash Functions
- Aggregate Functions
- Table-Valued Functions
- Flink OpenSource SQL 1.12 Syntax Reference
-
Flink Opensource SQL 1.10 Syntax Reference
- Constraints and Definitions
- Flink OpenSource SQL 1.10 Syntax
-
Data Definition Language (DDL)
- Creating a Source Table
-
Creating a Result Table
- ClickHouse Result Table
- Kafka Result Table
- Upsert Kafka Result Table
- DIS Result Table
- JDBC Result Table
- GaussDB(DWS) Result Table
- Redis Result Table
- SMN Result Table
- HBase Result Table
- Elasticsearch Result Table
- OpenTSDB Result Table
- User-defined Result Table
- Print Result Table
- File System Result Table
- Creating a Dimension Table
- Data Manipulation Language (DML)
- Functions
-
Flink OpenSource SQL 1.15 Syntax Reference
-
HetuEngine SQL Syntax Reference
-
HetuEngine SQL Syntax
- Before You Start
- Data Type
-
DDL Syntax
- CREATE SCHEMA
- CREATE TABLE
- CREATE TABLE AS
- CREATE TABLE LIKE
- CREATE VIEW
- ALTER TABLE
- ALTER VIEW
- ALTER SCHEMA
- DROP SCHEMA
- DROP TABLE
- DROP VIEW
- TRUNCATE TABLE
- COMMENT
- VALUES
- SHOW Syntax Overview
- SHOW SCHEMAS (DATABASES)
- SHOW TABLES
- SHOW TBLPROPERTIES TABLE|VIEW
- SHOW TABLE/PARTITION EXTENDED
- SHOW FUNCTIONS
- SHOW PARTITIONS
- SHOW COLUMNS
- SHOW CREATE TABLE
- SHOW VIEWS
- SHOW CREATE VIEW
- DML Syntax
- DQL Syntax
- Auxiliary Command Syntax
- Reserved Keywords
-
SQL Functions and Operators
- Logical Operators
- Comparison Functions and Operators
- Condition Expression
- Lambda Expression
- Conversion Functions
- Mathematical Functions and Operators
- Bitwise Functions
- Decimal Functions and Operators
- String Functions and Operators
- Regular Expressions
- Binary Functions and Operators
- JSON Functions and Operators
- Date and Time Functions and Operators
- Aggregate Functions
- Window Functions
- Array Functions and Operators
- Map Functions and Operators
- URL Function
- UUID Function
- Color Function
- Teradata Function
- Data Masking Functions
- IP Address Functions
- Quantile Digest Functions
- T-Digest Functions
- Implicit Data Type Conversion
- Appendix
-
HetuEngine SQL Syntax
-
Hudi SQL Syntax Reference
- Hudi Table Overview
- DLI Hudi Metadata
- DLI Hudi Development Specifications
- Using Hudi to Develop Jobs in DLI
- DLI Hudi SQL Syntax Reference
- Spark DataSource API Syntax Reference
- Data Management and Maintenance
- Typical Hudi Configuration Parameters
- Delta SQL Syntax Reference
-
Spark SQL Syntax Reference
-
API Reference
- Before You Start
- Overview
- Calling APIs
- Getting Started
- Permission-related APIs
- Global Variable-related APIs
- APIs Related to Resource Tags
-
APIs Related to Enhanced Datasource Connections
- Creating an Enhanced Datasource Connection
- Deleting an Enhanced Datasource Connection
- Listing Enhanced Datasource Connections
- Querying an Enhanced Datasource Connection
- Binding a Queue
- Unbinding a Queue
- Modifying Host Information
- Querying Authorization of an Enhanced Datasource Connection
- Creating a Route
- Deleting a Route
- Datasource Authentication-related APIs
-
APIs Related to Elastic Resource Pools
- Creating an Elastic Resource Pool
- Querying All Elastic Resource Pools
- Deleting an Elastic Resource Pool
- Modifying Elastic Resource Pool Information
- Querying All Queues in an Elastic Resource Pool
- Associating a Queue with an Elastic Resource Pool
- Viewing Scaling History of an Elastic Resource Pool
- Modifying the Scaling Policy of a Queue Associated with an Elastic Resource Pool
- Queue-related APIs (Recommended)
- SQL Job-related APIs
- SQL Template-related APIs
-
Flink Job-related APIs
- Creating a SQL Job
- Updating a SQL Job
- Creating a Flink Jar job
- Updating a Flink Jar Job
- Running Jobs in Batches
- Listing Jobs
- Querying Job Details
- Querying the Job Execution Plan
- Stopping Jobs in Batches
- Deleting a Job
- Deleting Jobs in Batches
- Exporting a Flink Job
- Importing a Flink Job
- Generating a Static Stream Graph for a Flink SQL Job
- APIs Related to Flink Job Templates
- Flink Job Management APIs
- Spark Job-related APIs
- APIs Related to Spark Job Templates
- Permissions Policies and Supported Actions
-
Out-of-Date APIs
- Agency-related APIs (Discarded)
-
Package Group-related APIs (Discarded)
- Uploading a Package Group (Discarded)
- Listing Package Groups (Discarded)
- Uploading a JAR Package Group (Discarded)
- Uploading a PyFile Package Group (Discarded)
- Uploading a File Package Group (Discarded)
- Querying Resource Packages in a Group (Discarded)
- Deleting a Resource Package from a Group (Discarded)
- Changing the Owner of a Group or Resource Package (Discarded)
- APIs Related to Spark Batch Processing (Discarded)
- SQL Job-related APIs (Discarded)
- Resource-related APIs (Discarded)
- Permission-related APIs (Discarded)
- Queue-related APIs (Discarded)
- Datasource Authentication-related APIs (Discarded)
- APIs Related to Enhanced Datasource Connections (Discarded)
- Template-related APIs (Discarded)
- Table-related APIs (Discarded)
- APIs Related to SQL Jobs (Discarded)
- APIs Related to Data Upload (Discarded)
- Cluster-related APIs
- APIs Related to Flink Jobs (Discarded)
- Public Parameters
- SDK Reference
-
FAQs
-
DLI Basics
- What Are the Differences Between DLI Flink and MRS Flink?
- What Are the Differences Between MRS Spark and DLI Spark?
- How Do I Upgrade the Engine Version of a DLI Job?
- Where Can Data Be Stored in DLI?
- Can I Import OBS Bucket Data Shared by Other Tenants into DLI?
- Regions and AZs
- Can a Member Account Use Global Variables Created by Other Member Accounts?
- Is DLI Affected by the Apache Spark Command Injection Vulnerability (CVE-2022-33891)?
- How Do I Manage Jobs Running on DLI?
- How Do I Change the Field Names of an Existing Table on DLI?
-
DLI Elastic Resource Pools and Queues
- How Can I Check the Actual and Used CUs for an Elastic Resource Pool as Well as the Required CUs for a Job?
- How Do I Check for a Backlog of Jobs in the Current DLI Queue?
- How Do I View the Load of a DLI Queue?
- How Do I Monitor Job Exceptions on a DLI Queue?
- How Do I Migrate an Old Version Spark Queue to a General-Purpose Queue?
- How Do I Do If I Encounter a Timeout Exception When Executing DLI SQL Statements on the default Queue?
-
DLI Databases and Tables
- Why Am I Unable to Query a Table on the DLI Console?
- How Do I Do If the Compression Rate of an OBS Table Is High?
- How Do I Do If Inconsistent Character Encoding Leads to Garbled Characters?
- Do I Need to to Regrant Permissions to Users and Projects After Deleting and Recreating a Table With the Same Name?
- How Do I Do If Files Imported Into a DLI Partitioned Table Lack Data for the Partition Columns, Causing Query Failures After the Import Is Completed?
- How Do I Fix Incorrect Data in an OBS Foreign Table Caused by Newline Characters in OBS File Fields?
- How Do I Prevent a Cartesian Product Query and Resource Overload Due to Missing "ON" Conditions in Table Joins?
- How Do I Do If I Can't Query Data After Manually Adding It to the Partition Directory of an OBS Table?
- Why Does the "insert overwrite" Operation Affect All Data in a Partitioned Table Instead of Just the Targeted Partition?
- Why Does the "create_date" Field in an RDS Table (Datetime Data Type) Appear as a Timestamp in DLI Queries?
- How Do I Do If Renaming a Table After a SQL Job Causes Incorrect Data Size?
- How Can I Resolve Data Inconsistencies When Importing Data from DLI to OBS?
-
Enhanced Datasource Connections
- How Do I Do If I Can't Bind an Enhanced Datasource Connection to a Queue?
- How Do I Resolve a Failure in Connecting DLI to GaussDB(DWS) Through an Enhanced Datasource Connection?
- How Do I Do If the Datasource Connection Is Successfully Created but the Network Connectivity Test Fails?
- How Do I Configure Network Connectivity Between a DLI Queue and a Data Source?
- Why Is Creating a VPC Peering Connection Necessary for Enhanced Datasource Connections in DLI?
- How Do I Do If Creating a Datasource Connection in DLI Gets Stuck in the "Creating" State When Binding It to a Queue?
- How Do I Resolve the "communication link failure" Error When Using a Newly Created Datasource Connection That Appears to Be Activated?
- How Do I Troubleshoot a Connection Timeout Issue That Isn't Recorded in Logs When Accessing MRS HBase Through a Datasource Connection?
- How Do I Fix the "Failed to get subnet" Error When Creating a Datasource Connection in DLI?
- How Do I Do If I Encounter the "Incorrect string value" Error When Executing insert overwrite on a Datasource RDS Table?
- How Do I Resolve the Null Pointer Error When Creating an RDS Datasource Table?
- Error Message "org.postgresql.util.PSQLException: ERROR: tuple concurrently updated" Is Displayed When the System Executes insert overwrite on a Datasource GaussDB(DWS) Table
- RegionTooBusyException Is Reported When Data Is Imported to a CloudTable HBase Table Through a Datasource Table
- How Do I Do If A Null Value Is Written Into a Non-Null Field When Using a DLI Datasource Connection to Connect to a GaussDB(DWS) Table?
- How Do I Do If an Insert Operation Failed After the Schema of the GaussDB(DWS) Source Table Is Updated?
- How Do I Insert Data into an RDS Table with an Auto-Increment Primary Key Using DLI?
-
SQL Jobs
-
SQL Job Development
- SQL Jobs
- How Do I Merge Small Files?
- How Do I Use DLI to Access Data in an OBS Bucket?
- How Do I Specify an OBS Path When Creating an OBS Table?
- How Do I Create a Table Using JSON Data in an OBS Bucket?
- How Can I Use the count Function to Perform Aggregation?
- How Do I Synchronize DLI Table Data Across Regions?
- How Do I Insert Table Data into Specific Fields of a Table Using a SQL Job?
- How Do I Troubleshoot Slow SQL Jobs?
- How Do I View DLI SQL Logs?
- How Do I View SQL Execution Records in DLI?
- How Do I Do When Data Skew Occurs During the Execution of a SQL Job?
- Why Does a SQL Job That Has Join Operations Stay in the Running State?
- Why Is a SQL Job Stuck in the Submitting State?
-
SQL Job O&M
- Why Is Error "path obs://xxx already exists" Reported When Data Is Exported to OBS?
- Why Is Error "SQL_ANALYSIS_ERROR: Reference 't.id' is ambiguous, could be: t.id, t.id.;" Displayed When Two Tables Are Joined?
- Why Is Error "The current account does not have permission to perform this operation,the current account was restricted. Restricted for no budget." Reported when a SQL Statement Is Executed?
- Why Is Error "There should be at least one partition pruning predicate on partitioned table XX.YYY" Reported When a Query Statement Is Executed?
- Why Is Error "IllegalArgumentException: Buffer size too small. size" Reported When Data Is Loaded to an OBS Foreign Table?
- Why Is Error "DLI.0002 FileNotFoundException" Reported During SQL Job Running?
- Why Is a Schema Parsing Error Reported When I Create a Hive Table Using CTAS?
- Why Is Error "org.apache.hadoop.fs.obs.OBSIOException" Reported When I Run DLI SQL Scripts on DataArts Studio?
- Why Is Error "UQUERY_CONNECTOR_0001:Invoke DLI service api failed" Reported in the Job Log When I Use CDM to Migrate Data to DLI?
- Why Is Error "File not Found" Reported When I Access a SQL Job?
- Why Is Error "DLI.0003: AccessControlException XXX" Reported When I Access a SQL Job?
- Why Is Error "DLI.0001: org.apache.hadoop.security.AccessControlException: verifyBucketExists on {{bucket name}}: status [403]" Reported When I Access a SQL Job?
- Why Am I Seeing the Error Message "The current account does not have permission to perform this operation,the current account was restricted. Restricted for no budget" When Executing a SQL Statement?
-
SQL Job Development
-
Flink Jobs
-
Flink Job Consulting
- How Do I Authorize a Subuser to View Flink Jobs?
- How Do I Configure Auto Restart upon Exception for a Flink Job?
- How Do I Save Logs for Flink Jobs?
- Why Is Error "No such user. userName:xxxx." Reported on the Flink Job Management Page When I Grant Permission to a User?
- How Do I Restore a Flink Job from a Specific Checkpoint After Manually Stopping the Job?
- Why Is a Message Displayed Indicating That the SMN Topic Does Not Exist When I Use the SMN Topic in DLI?
-
Flink SQL Jobs
- How Do I Map an OBS Table to a DLI Partitioned Table?
- How Do I Change the Number of Kafka Partitions in a Flink SQL Job Without Stopping It?
- How Do I Fix the DLI.0005 Error When Using EL Expressions to Create a Table in a Flink SQL Job?
- Why Is No Data Queried in the DLI Table Created Using the OBS File Path When Data Is Written to OBS by a Flink Job Output Stream?
- Why Does a Flink SQL Job Fails to Be Executed, and Is "connect to DIS failed java.lang.IllegalArgumentException: Access key cannot be null" Displayed in the Log?
- Data Writing Fails After a Flink SQL Job Consumed Kafka and Sank Data to the Elasticsearch Cluster
- How Does Flink Opensource SQL Parse Nested JSON?
- Why Is the Time Read by a Flink OpenSource SQL Job from the RDS Database Is Different from the RDS Database Time?
- Why Does Job Submission Fail When the failure-handler Parameter of the Elasticsearch Result Table for a Flink Opensource SQL Job Is Set to retry_rejected?
- How Do I Configure Connection Retries for Kafka Sink If it is Disconnected?
- How Do I Write Data to Different Elasticsearch Clusters in a Flink Job?
- Why Does DIS Stream Not Exist During Job Semantic Check?
- Why Is Error "Timeout expired while fetching topic metadata" Repeatedly Reported in Flink JobManager Logs?
-
Flink Jar Jobs
- Can I Upload Configuration Files for Flink Jar Jobs?
- Why Does a Flink Jar Package Conflict Result in Job Submission Failure?
- Why Does a Flink Jar Job Fail to Access GaussDB(DWS) and a Message Is Displayed Indicating Too Many Client Connections?
- Why Is Error Message "Authentication failed" Displayed During Flink Jar Job Running?
- Why Is Error Invalid OBS Bucket Name Reported After a Flink Job Submission Failed?
- Why Does the Flink Submission Fail Due to Hadoop JAR File Conflict?
- How Do I Locate a Flink Job Submission Error?
-
Flink Job Performance Tuning
- What Is the Recommended Configuration for a Flink Job?
- Flink Job Performance Tuning
- How Do I Prevent Data Loss After Flink Job Restart?
- How Do I Locate a Flink Job Running Error?
- How Can I Check if a Flink Job Can Be Restored From a Checkpoint After Restarting It?
- Why Are Logs Not Written to the OBS Bucket After a DLI Flink Job Fails to Be Submitted for Running?
- Why Is the Flink Job Abnormal Due to Heartbeat Timeout Between JobManager and TaskManager?
-
Flink Job Consulting
-
Spark Jobs
-
Spark Job Development
- Spark Jobs
- How Do I Use Spark to Write Data into a DLI Table?
- How Do I Set Up AK/SK So That a General Queue Can Access Tables Stored in OBS?
- How Do I View the Resource Usage of DLI Spark Jobs?
- How Do I Use Python Scripts to Access the MySQL Database If the pymysql Module Is Missing from the Spark Job Results Stored in MySQL?
- How Do I Run a Complex PySpark Program in DLI?
- How Do I Use JDBC to Set the spark.sql.shuffle.partitions Parameter to Improve the Task Concurrency?
- How Do I Read Uploaded Files for a Spark Jar Job?
- Why Can't I Find the Specified Python Environment After Adding the Python Package?
- Why Is a Spark Jar Job Stuck in the Submitting State?
-
Spark Job O&M
- What Can I Do When Receiving java.lang.AbstractMethodError in the Spark Job?
- Why Do I Get "ResponseCode: 403" and "ResponseStatus: Forbidden" Errors When a Spark Job Accesses OBS Data?
- Why Do I Encounter the Error "verifyBucketExists on XXXX: status [403]" When Using a Spark Job to Access an OBS Bucket That I Have Permission to Access?
- Why Does a Job Running Timeout Occur When Processing a Large Amount of Data with a Spark Job?
- Why Does a Spark Job Fail to Execute with an Abnormal Access Directory Error When Accessing Files in SFTP?
- Why Does the Job Fail to Be Executed Due to Insufficient Database and Table Permissions?
- Why Is the global_temp Database Missing in the Job Log of Spark 3.x?
- Why Does Using DataSource Syntax to Create an OBS Table of Avro Type Fail When Accessing Metadata With Spark 2.3.x?
-
Spark Job Development
- DLI Resource Quotas
-
DLI Permissions Management
- How Do I Do If I Receive an Error Message Stating That I Do Not Have Sufficient Permissions When Creating a Table After Upgrading the Engine Version of a Queue?
- What Is Column-Level Authorization for DLI Partitioned Tables?
- How Do I Do If I Encounter Insufficient Permissions While Updating Packages?
- Why Is Error "DLI.0003: Permission denied for resource..." Reported When I Run a SQL Statement?
- How Do I Do If I Can't Query Table Data After Being Granted Table Permissions?
- Will Granting Duplicate Permissions to a Table After Inheriting Database Permissions Cause an Error?
- Why Can't I Query a View After I'm Granted the Select Table Permission on the View?
- How Do I Do If I Receive a Message Saying I Don't Have Sufficient Permissions to Submit My Jobs to the Job Bucket?
- How Do I Resolve an Unauthorized OBS Bucket Error?
- DLI APIs
-
DLI Basics
-
More Documents
-
User Guide (ME-Abu Dhabi Region)
- DLI Introduction
- Getting Started
- DLI Console Overview
- SQL Editor
- Job Management
- Queue Management
- Data Management
- Job Templates
- Datasource Connections
- Global Configuration
- UDFs
- Permissions Management
- Change History
-
API Reference (ME-Abu Dhabi Region)
- Before You Start
- Overview
- Calling APIs
- Getting Started
- Permission-related APIs
- Queue-related APIs (Recommended)
- APIs Related to SQL Jobs
- Package Group-related APIs
-
APIs Related to Flink Jobs
- Granting OBS Permissions to DLI
- Creating a SQL Job
- Updating a SQL Job
- Creating a Flink Jar job
- Updating a Flink Jar Job
- Running Jobs in Batches
- Querying the Job List
- Querying Job Details
- Querying the Job Execution Plan
- Stopping Jobs in Batches
- Deleting a Job
- Deleting Jobs in Batches
- Exporting a Flink Job
- Importing a Flink Job
- APIs Related to Spark jobs
- APIs Related to Flink Job Templates
-
APIs Related to Enhanced Datasource Connections
- Creating an Enhanced Datasource Connection
- Deleting an Enhanced Datasource Connection
- Querying an Enhanced Datasource Connection List
- Querying an Enhanced Datasource Connection
- Binding a Queue
- Unbinding a Queue
- Modifying the Host Information
- Querying Authorization of an Enhanced Datasource Connection
- Global Variable-related APIs
- Public Parameters
- Change History
-
SQL Syntax Reference (ME-Abu Dhabi Region)
-
Spark SQL Syntax Reference
- Common Configuration Items of Batch SQL Jobs
- SQL Syntax Overview of Batch Jobs
- Spark Open Source Commands
- Databases
- Creating an OBS Table
- Creating a DLI Table
- Deleting a Table
- Checking Tables
- Modifying a Table
-
Syntax for Partitioning a Table
- Adding Partition Data (Only OBS Tables Supported)
- Renaming a Partition (Only OBS Tables Supported)
- Deleting a Partition
- Deleting Partitions by Specifying Filter Criteria (Only Supported on OBS Tables)
- Altering the Partition Location of a Table (Only OBS Tables Supported)
- Updating Partitioned Table Data (Only OBS Tables Supported)
- Updating Table Metadata with REFRESH TABLE
- Importing Data to the Table
- Inserting Data
- Clearing Data
- Exporting Search Results
- Backing Up and Restoring Data of Multiple Versions
- Table Lifecycle Management
- Creating a Datasource Connection with an HBase Table
- Creating a Datasource Connection with an OpenTSDB Table
- Creating a Datasource Connection with a DWS table
- Creating a Datasource Connection with an RDS Table
- Creating a Datasource Connection with a CSS Table
- Creating a Datasource Connection with a DCS Table
- Creating a Datasource Connection with a DDS Table
- Creating a Datasource Connection with an Oracle Table
- Views
- Checking the Execution Plan
- Data Permissions Management
- Data Types
- User-Defined Functions
-
Built-in Functions
-
Date Functions
- Overview
- add_months
- current_date
- current_timestamp
- date_add
- dateadd
- date_sub
- date_format
- datediff
- datediff1
- datepart
- datetrunc
- day/dayofmonth
- from_unixtime
- from_utc_timestamp
- getdate
- hour
- isdate
- last_day
- lastday
- minute
- month
- months_between
- next_day
- quarter
- second
- to_char
- to_date
- to_date1
- to_utc_timestamp
- trunc
- unix_timestamp
- weekday
- weekofyear
- year
-
String Functions
- Overview
- ascii
- concat
- concat_ws
- char_matchcount
- encode
- find_in_set
- get_json_object
- instr
- instr1
- initcap
- keyvalue
- length
- lengthb
- levenshtein
- locate
- lower/lcase
- lpad
- ltrim
- parse_url
- printf
- regexp_count
- regexp_extract
- replace
- regexp_replace
- regexp_replace1
- regexp_instr
- regexp_substr
- repeat
- reverse
- rpad
- rtrim
- soundex
- space
- substr/substring
- substring_index
- split_part
- translate
- trim
- upper/ucase
- Mathematical Functions
- Aggregate Functions
- Window Functions
- Other Functions
-
Date Functions
- Basic SELECT Statements
- Filtering
- Sorting
- Grouping
- JOIN
- Subquery
- Alias
- Set Operations
- WITH...AS
- CASE...WHEN
- OVER Clause
- Flink OpenSource SQL 1.12 Syntax Reference
-
Flink Opensource SQL 1.10 Syntax Reference
- Constraints and Definitions
- Flink OpenSource SQL 1.10 Syntax
-
Data Definition Language (DDL)
- Creating a Source Table
-
Creating a Result Table
- ClickHouse Result Table
- Kafka Result Table
- Upsert Kafka Result Table
- DIS Result Table
- JDBC Result Table
- GaussDB(DWS) Result Table
- Redis Result Table
- SMN Result Table
- HBase Result Table
- Elasticsearch Result Table
- OpenTSDB Result Table
- User-defined Result Table
- Print Result Table
- File System Result Table
- Creating a Dimension Table
- Data Manipulation Language (DML)
- Functions
-
Historical Versions
-
Flink SQL Syntax
- SQL Syntax Constraints and Definitions
- SQL Syntax Overview of Stream Jobs
- Creating a Source Stream
-
Creating a Sink Stream
- MRS OpenTSDB Sink Stream
- CSS Elasticsearch Sink Stream
- DCS Sink Stream
- DDS Sink Stream
- DIS Sink Stream
- DMS Sink Stream
- DWS Sink Stream (JDBC Mode)
- DWS Sink Stream (OBS-based Dumping)
- MRS HBase Sink Stream
- MRS Kafka Sink Stream
- Open-Source Kafka Sink Stream
- File System Sink Stream (Recommended)
- OBS Sink Stream
- RDS Sink Stream
- SMN Sink Stream
- Creating a Temporary Stream
- Creating a Dimension Table
- Custom Stream Ecosystem
- Data Type
- Built-In Functions
- User-Defined Functions
- Geographical Functions
- SELECT
- Condition Expression
- Window
- JOIN Between Stream Data and Table Data
- Configuring Time Models
- Pattern Matching
- StreamingML
- Reserved Keywords
-
Flink SQL Syntax
-
Identifiers
- aggregate_func
- alias
- attr_expr
- attr_expr_list
- attrs_value_set_expr
- boolean_expression
- col
- col_comment
- col_name
- col_name_list
- condition
- condition_list
- cte_name
- data_type
- db_comment
- db_name
- else_result_expression
- file_format
- file_path
- function_name
- groupby_expression
- having_condition
- input_expression
- join_condition
- non_equi_join_condition
- number
- partition_col_name
- partition_col_value
- partition_specs
- property_name
- property_value
- regex_expression
- result_expression
- select_statement
- separator
- sql_containing_cte_name
- sub_query
- table_comment
- table_name
- table_properties
- table_reference
- when_expression
- where_condition
- window_function
- Operators
-
Spark SQL Syntax Reference
- User Guide (Paris Region)
-
API Reference (Paris Region)
- Before You Start
- Overview
- Calling APIs
- Getting Started
- Permission-related APIs
- Queue-related APIs (Recommended)
- APIs Related to SQL Jobs
- Package Group-related APIs
-
APIs Related to Flink Jobs
- Granting OBS Permissions to DLI
- Creating a SQL Job
- Updating a SQL Job
- Creating a Flink Jar job
- Updating a Flink Jar Job
- Running Jobs in Batches
- Querying the Job List
- Querying Job Details
- Querying the Job Execution Plan
- Stopping Jobs in Batches
- Deleting a Job
- Deleting Jobs in Batches
- Exporting a Flink Job
- Importing a Flink Job
- APIs Related to Spark jobs
- APIs Related to Flink Job Templates
-
APIs Related to Enhanced Datasource Connections
- Creating an Enhanced Datasource Connection
- Deleting an Enhanced Datasource Connection
- Querying an Enhanced Datasource Connection List
- Querying an Enhanced Datasource Connection
- Binding a Queue
- Unbinding a Queue
- Modifying the Host Information
- Querying Authorization of an Enhanced Datasource Connection
- Global Variable-related APIs
- Public Parameters
- Change History
-
SQL Syntax Reference (Paris Region)
-
Spark SQL Syntax Reference
- Common Configuration Items of Batch SQL Jobs
- SQL Syntax Overview of Batch Jobs
- Spark Open Source Commands
- Databases
- Creating an OBS Table
- Creating a DLI Table
- Deleting a Table
- Checking Tables
- Modifying a Table
-
Syntax for Partitioning a Table
- Adding Partition Data (Only OBS Tables Supported)
- Renaming a Partition (Only OBS Tables Supported)
- Deleting a Partition
- Deleting Partitions by Specifying Filter Criteria (Only Supported on OBS Tables)
- Altering the Partition Location of a Table (Only OBS Tables Supported)
- Updating Partitioned Table Data (Only OBS Tables Supported)
- Updating Table Metadata with REFRESH TABLE
- Importing Data to the Table
- Inserting Data
- Clearing Data
- Exporting Search Results
- Table Lifecycle Management
- Creating a Datasource Connection with an HBase Table
- Creating a Datasource Connection with an OpenTSDB Table
- Creating a Datasource Connection with a DWS table
- Creating a Datasource Connection with an RDS Table
- Creating a Datasource Connection with a CSS Table
- Creating a Datasource Connection with a DCS Table
- Creating a Datasource Connection with a DDS Table
- Creating a Datasource Connection with an Oracle Table
- Views
- Checking the Execution Plan
- Data Permissions Management
- Data Types
- User-Defined Functions
-
Built-in Functions
-
Date Functions
- Overview
- add_months
- current_date
- current_timestamp
- date_add
- dateadd
- date_sub
- date_format
- datediff
- datediff1
- datepart
- datetrunc
- day/dayofmonth
- from_unixtime
- from_utc_timestamp
- getdate
- hour
- isdate
- last_day
- lastday
- minute
- month
- months_between
- next_day
- quarter
- second
- to_char
- to_date
- to_date1
- to_utc_timestamp
- trunc
- unix_timestamp
- weekday
- weekofyear
- year
-
String Functions
- Overview
- ascii
- concat
- concat_ws
- char_matchcount
- encode
- find_in_set
- get_json_object
- instr
- instr1
- initcap
- keyvalue
- length
- lengthb
- levenshtein
- locate
- lower/lcase
- lpad
- ltrim
- parse_url
- printf
- regexp_count
- regexp_extract
- replace
- regexp_replace
- regexp_replace1
- regexp_instr
- regexp_substr
- repeat
- reverse
- rpad
- rtrim
- soundex
- space
- substr/substring
- substring_index
- split_part
- translate
- trim
- upper/ucase
- Mathematical Functions
- Aggregate Functions
- Window Functions
- Other Functions
-
Date Functions
- Basic SELECT Statements
- Filtering
- Sorting
- Grouping
- JOIN
- Subquery
- Alias
- Set Operations
- WITH...AS
- CASE...WHEN
- OVER Clause
- Flink OpenSource SQL 1.12 Syntax Reference
-
Flink Opensource SQL 1.10 Syntax Reference
- Constraints and Definitions
- Flink OpenSource SQL 1.10 Syntax
-
Data Definition Language (DDL)
- Creating a Source Table
-
Creating a Result Table
- ClickHouse Result Table
- Kafka Result Table
- Upsert Kafka Result Table
- DIS Result Table
- JDBC Result Table
- GaussDB(DWS) Result Table
- Redis Result Table
- SMN Result Table
- HBase Result Table
- Elasticsearch Result Table
- OpenTSDB Result Table
- User-defined Result Table
- Print Result Table
- File System Result Table
- Creating a Dimension Table
- Data Manipulation Language (DML)
- Functions
-
Historical Versions
-
Flink SQL Syntax
- SQL Syntax Constraints and Definitions
- SQL Syntax Overview of Stream Jobs
- Creating a Source Stream
-
Creating a Sink Stream
- MRS OpenTSDB Sink Stream
- CSS Elasticsearch Sink Stream
- DCS Sink Stream
- DDS Sink Stream
- DIS Sink Stream
- DMS Sink Stream
- DWS Sink Stream (JDBC Mode)
- DWS Sink Stream (OBS-based Dumping)
- MRS HBase Sink Stream
- MRS Kafka Sink Stream
- Open-Source Kafka Sink Stream
- File System Sink Stream (Recommended)
- OBS Sink Stream
- RDS Sink Stream
- SMN Sink Stream
- Creating a Temporary Stream
- Creating a Dimension Table
- Custom Stream Ecosystem
- Data Type
- Built-In Functions
- User-Defined Functions
- Geographical Functions
- SELECT
- Condition Expression
- Window
- JOIN Between Stream Data and Table Data
- Configuring Time Models
- Pattern Matching
- StreamingML
- Reserved Keywords
-
Flink SQL Syntax
-
Identifiers
- aggregate_func
- alias
- attr_expr
- attr_expr_list
- attrs_value_set_expr
- boolean_expression
- col
- col_comment
- col_name
- col_name_list
- condition
- condition_list
- cte_name
- data_type
- db_comment
- db_name
- else_result_expression
- file_format
- file_path
- function_name
- groupby_expression
- having_condition
- input_expression
- join_condition
- non_equi_join_condition
- number
- partition_col_name
- partition_col_value
- partition_specs
- property_name
- property_value
- regex_expression
- result_expression
- select_statement
- separator
- sql_containing_cte_name
- sub_query
- table_comment
- table_name
- table_properties
- table_reference
- when_expression
- where_condition
- window_function
- Operators
-
Spark SQL Syntax Reference
-
User Guide (Kuala Lumpur Region)
- DLI Introduction
- Getting Started
- DLI Console Overview
- SQL Editor
- Job Management
- Queue Management
- Data Management
- Job Templates
- Datasource Connections
- Global Configuration
- UDFs
- Permissions Management
- Change History
-
API Reference (Kuala Lumpur Region)
- Before You Start
- Overview
- Calling APIs
- Getting Started
- Permission-related APIs
- Queue-related APIs (Recommended)
- APIs Related to SQL Jobs
- Package Group-related APIs
-
APIs Related to Flink Jobs
- Granting OBS Permissions to DLI
- Creating a SQL Job
- Updating a SQL Job
- Creating a Flink Jar job
- Updating a Flink Jar Job
- Running Jobs in Batches
- Querying the Job List
- Querying Job Details
- Querying the Job Execution Plan
- Stopping Jobs in Batches
- Deleting a Job
- Deleting Jobs in Batches
- Exporting a Flink Job
- Importing a Flink Job
- APIs Related to Spark jobs
- APIs Related to Flink Job Templates
-
APIs Related to Enhanced Datasource Connections
- Creating an Enhanced Datasource Connection
- Deleting an Enhanced Datasource Connection
- Querying an Enhanced Datasource Connection List
- Querying an Enhanced Datasource Connection
- Binding a Queue
- Unbinding a Queue
- Modifying the Host Information
- Querying Authorization of an Enhanced Datasource Connection
- Global Variable-related APIs
- Permissions Policies and Supported Actions
- Public Parameters
- Change History
-
SQL Syntax Reference (Kuala Lumpur Region)
-
Spark SQL Syntax Reference
- Common Configuration Items of Batch SQL Jobs
- SQL Syntax Overview of Batch Jobs
- Spark Open Source Commands
- Databases
- Creating an OBS Table
- Creating a DLI Table
- Deleting a Table
- Checking Tables
- Modifying a Table
-
Syntax for Partitioning a Table
- Adding Partition Data (Only OBS Tables Supported)
- Renaming a Partition (Only OBS Tables Supported)
- Deleting a Partition
- Deleting Partitions by Specifying Filter Criteria (Only Supported on OBS Tables)
- Altering the Partition Location of a Table (Only OBS Tables Supported)
- Updating Partitioned Table Data (Only OBS Tables Supported)
- Updating Table Metadata with REFRESH TABLE
- Importing Data to the Table
- Inserting Data
- Clearing Data
- Exporting Search Results
- Backing Up and Restoring Data of Multiple Versions
- Table Lifecycle Management
- Creating a Datasource Connection with an HBase Table
- Creating a Datasource Connection with an OpenTSDB Table
- Creating a Datasource Connection with a DWS table
- Creating a Datasource Connection with an RDS Table
- Creating a Datasource Connection with a CSS Table
- Creating a Datasource Connection with a DCS Table
- Creating a Datasource Connection with a DDS Table
- Creating a Datasource Connection with an Oracle Table
- Views
- Checking the Execution Plan
- Data Permissions Management
- Data Types
- User-Defined Functions
-
Built-in Functions
-
Date Functions
- Overview
- add_months
- current_date
- current_timestamp
- date_add
- dateadd
- date_sub
- date_format
- datediff
- datediff1
- datepart
- datetrunc
- day/dayofmonth
- from_unixtime
- from_utc_timestamp
- getdate
- hour
- isdate
- last_day
- lastday
- minute
- month
- months_between
- next_day
- quarter
- second
- to_char
- to_date
- to_date1
- to_utc_timestamp
- trunc
- unix_timestamp
- weekday
- weekofyear
- year
-
String Functions
- Overview
- ascii
- concat
- concat_ws
- char_matchcount
- encode
- find_in_set
- get_json_object
- instr
- instr1
- initcap
- keyvalue
- length
- lengthb
- levenshtein
- locate
- lower/lcase
- lpad
- ltrim
- parse_url
- printf
- regexp_count
- regexp_extract
- replace
- regexp_replace
- regexp_replace1
- regexp_instr
- regexp_substr
- repeat
- reverse
- rpad
- rtrim
- soundex
- space
- substr/substring
- substring_index
- split_part
- translate
- trim
- upper/ucase
- Mathematical Functions
- Aggregate Functions
- Window Functions
- Other Functions
-
Date Functions
- Basic SELECT Statements
- Filtering
- Sorting
- Grouping
- JOIN
- Subquery
- Alias
- Set Operations
- WITH...AS
- CASE...WHEN
- OVER Clause
- Flink OpenSource SQL 1.12 Syntax Reference
-
Flink Opensource SQL 1.10 Syntax Reference
- Constraints and Definitions
- Flink OpenSource SQL 1.10 Syntax
-
Data Definition Language (DDL)
- Creating a Source Table
-
Creating a Result Table
- ClickHouse Result Table
- Kafka Result Table
- Upsert Kafka Result Table
- DIS Result Table
- JDBC Result Table
- GaussDB(DWS) Result Table
- Redis Result Table
- SMN Result Table
- HBase Result Table
- Elasticsearch Result Table
- OpenTSDB Result Table
- User-defined Result Table
- Print Result Table
- File System Result Table
- Creating a Dimension Table
- Data Manipulation Language (DML)
- Functions
-
Historical Versions
-
Flink SQL Syntax
- SQL Syntax Constraints and Definitions
- SQL Syntax Overview of Stream Jobs
- Creating a Source Stream
-
Creating a Sink Stream
- CloudTable HBase Sink Stream
- CloudTable OpenTSDB Sink Stream
- MRS OpenTSDB Sink Stream
- CSS Elasticsearch Sink Stream
- DCS Sink Stream
- DDS Sink Stream
- DIS Sink Stream
- DMS Sink Stream
- DWS Sink Stream (JDBC Mode)
- DWS Sink Stream (OBS-based Dumping)
- MRS HBase Sink Stream
- MRS Kafka Sink Stream
- Open-Source Kafka Sink Stream
- File System Sink Stream (Recommended)
- OBS Sink Stream
- RDS Sink Stream
- SMN Sink Stream
- Creating a Temporary Stream
- Creating a Dimension Table
- Custom Stream Ecosystem
- Data Type
- Built-In Functions
- User-Defined Functions
- Geographical Functions
- SELECT
- Condition Expression
- Window
- JOIN Between Stream Data and Table Data
- Configuring Time Models
- Pattern Matching
- StreamingML
- Reserved Keywords
-
Flink SQL Syntax
-
Identifiers
- aggregate_func
- alias
- attr_expr
- attr_expr_list
- attrs_value_set_expr
- boolean_expression
- col
- col_comment
- col_name
- col_name_list
- condition
- condition_list
- cte_name
- data_type
- db_comment
- db_name
- else_result_expression
- file_format
- file_path
- function_name
- groupby_expression
- having_condition
- input_expression
- join_condition
- non_equi_join_condition
- number
- partition_col_name
- partition_col_value
- partition_specs
- property_name
- property_value
- regex_expression
- result_expression
- select_statement
- separator
- sql_containing_cte_name
- sub_query
- table_comment
- table_name
- table_properties
- table_reference
- when_expression
- where_condition
- window_function
- Operators
-
Spark SQL Syntax Reference
-
User Guide (ME-Abu Dhabi Region)
- Videos
- General Reference
Copied.
Creating a Flink OpenSource SQL Job
This section describes how to create a Flink OpenSource SQL job.
DLI Flink OpenSource SQL jobs are fully compatible with the syntax of Flink provided by the community. In addition, Redis and GaussDB(DWS) data source types are added based on the community connector. For the syntax and constraints of Flink SQL DDL, DML, and functions, see Table API & SQL.
- For the Flink OpenSource SQL 1.15 syntax, see Flink OpenSource SQL 1.15 Syntax.
- For the Flink OpenSource SQL 1.12 syntax, see Flink OpenSource SQL 1.12 Syntax.
Prerequisites
- You have prepared the source and sink streams.
- A datasource connection has been created to enable the network between the queue where the job is about to run and external data sources.
- For details about the external data sources that can be accessed by Flink jobs, see Common Development Methods for DLI Cross-Source Analysis.
- For how to create a datasource connection, see Configuring the Network Connection Between DLI and Data Sources (Enhanced Datasource Connection).
On the Resources > Queue Management page, locate the queue you have created, click More in the Operation column, and select Test Address Connectivity to check if the network connection between the queue and the data source is normal. For details, see Testing Address Connectivity.
Precautions
Before creating jobs and submitting tasks, you are advised to enable CTS to record DLI operations for queries, audits, and tracking. Using CTS to Audit DLI lists DLI operations that can be recorded by CTS.
For how to enable CTS and view trace details, see the Cloud Trace Service Getting Started.
Creating a Flink OpenSource SQL Job
- In the left navigation pane of the DLI management console, choose Job Management > Flink Jobs. The Flink Jobs page is displayed.
- In the upper right corner of the Flink Jobs page, click Create Job.
Figure 1 Creating a Flink OpenSource SQL job
- Set job parameters.
Table 1 Job parameters Parameter
Description
Type
Set Type to Flink OpenSource SQL. You will need to start jobs by compiling SQL statements.
Name
Job name. Enter 1 to 57 characters. Only letters, numbers, hyphens (-), and underscores (_) are allowed.
NOTE:
The job name must be globally unique.
Description
Description of a job. It can contain a maximum of 512 characters.
Template Name
You can select a sample template or a custom job template. For details about templates, see Managing Flink Job Templates.
Tags
Tags used to identify cloud resources. A tag includes the tag key and tag value. If you want to use the same tag to identify multiple cloud resources, that is, to select the same tag from the drop-down list box for all services, you are advised to create predefined tags on the Tag Management Service (TMS).
If your organization has configured tag policies for DLI, add tags to resources based on the policies. If a tag does not comply with the tag policies, resource creation may fail. Contact your organization administrator to learn more about tag policies.
For details, see Tag Management Service User Guide.
NOTE:
- A maximum of 20 tags can be added.
- Only one tag value can be added to a tag key.
- The key name in each resource must be unique.
- Tag key: Enter a tag key name in the text box.
NOTE:
A tag key can contain a maximum of 128 characters. Only letters, numbers, spaces, and special characters (_.:+-@) are allowed, but the value cannot start or end with a space or start with _sys_.
- Tag value: Enter a tag value in the text box.
NOTE:
A tag value can contain a maximum of 255 characters. Only letters, numbers, spaces, and special characters (_.:+-@) are allowed.
- Click OK to enter the editing page.
- Edit an OpenSource SQL job.
Enter detailed SQL statements in the statement editing area. For details about SQL statements, see the Data Lake Insight Flink OpenSource SQL Syntax Reference.
- Click Check Semantics.
- You can Start a job only after the semantic verification is successful.
- If verification is successful, the message "The SQL semantic verification is complete. No error." will be displayed.
- If verification fails, a red "X" mark will be displayed in front of each SQL statement that produced an error. You can move the cursor to the "X" mark to view error details and change the SQL statement as prompted.
NOTE:
Flink 1.15 does not support syntax verification.
- Set job running parameters.
Figure 2 Setting running parameters for Flink OpenSource SQL
Table 2 Running parameters Parameter
Description
Queue
Select a queue to run the job.
UDF Jar
UDF JAR file, which contains UDFs that can be called in subsequent jobs.
There are the following ways to manage UDF JAR files:
- Upload packages to OBS: Upload Jar packages to an OBS bucket in advance and select the corresponding OBS path.
- Upload packages to DLI: Upload JAR files to an OBS bucket in advance and create a package on the Data Management > Package Management page of the DLI management console. For details, see Creating a DLI Package.
For Flink 1.15 or later, only OBS packages can be selected when creating jobs, and DLI packages are not supported.
Flink Version
Flink version used for job running. Flink versions have varying feature support.
If you choose to use Flink 1.15, make sure to configure the agency information for the cloud service that DLI is allowed to access in the job.
For the syntax of Flink 1.15, see Flink OpenSource SQL 1.15 Usage and Flink OpenSource SQL 1.15 Syntax.
For the syntax of Flink 1.12, see Flink OpenSource SQL 1.12 Syntax.
NOTE:
You are advised not to use Flink of different versions for a long time.
- Doing so can lead to code incompatibility, which can negatively impact job execution efficiency.
- Doing so may result in job execution failures due to conflicts in dependencies. Jobs rely on specific versions of libraries or components.
Agency
If you choose Flink 1.15 to execute your job, you can create a custom agency to allow DLI to access other services.
CUs
Sum of the number of compute units and job manager CUs of DLI. CU is also the billing unit of DLI. One CU equals 1 vCPU and 4 GB.
The value is the number of CUs required for job running and cannot exceed the number of CUs in the bound queue.
NOTE:
When Task Manager Config is selected, elastic resource pool queue management is optimized by automatically adjusting CUs to match Actual CUs after setting Slot(s) per TM.
CUs = Actual number of CUs = max[Job Manager CPUs + Task Manager CPU, (Job Manager Memory + Task Manager Memory/4)]
- Job Manager CPUs + Task Manager CPUs = Actual TMs x CU(s) per TM + Job Manager CUs.
- Job Manager Memory + Task Manager Memory = Actual TMs x Memory per TM + Job Manager Memory
- If Slot(s) per TM is set, then: Actual TMs = Parallelism/Slot(s) per TM.
- If Slot(s) per TM is not set, then: Actual TMs = (CUs – Job Manager CUs)/CU(s) per TM.
- If Memory per TM and Job Manager Memory in the optimization parameters are not set, then: Memory per TM = CU(s) per TM x 4. Job Manager Memory = Job Manager CUs x 4.
- The parallelism degree of Spark resources is jointly determined by the number of Executors and the number of Executor CPU cores.
Job Manager CUs
Number of CUs of the management unit.
Parallelism
Number of tasks concurrently executed by each operator in a job.
NOTE:
This value cannot be greater than four times the compute units (number of CUs minus the number of job manager CUs).
Task Manager Config
Whether Task Manager resource parameters are set
- If selected, you need to set the following parameters:
- CU(s) per TM: Number of resources occupied by each Task Manager.
- Slot(s) per TM: Number of slots contained in each Task Manager.
- If not selected, the system automatically uses the default values.
- CU(s) per TM: The default value is 1.
- Slot(s) per TM: The default value is (Parallelism x CU(s) per TM)/(CUs – Job Manager CUs).
OBS Bucket
OBS bucket to store job logs and checkpoint information. If the OBS bucket you selected is unauthorized, click Authorize.
Save Job Log
Whether job running logs are saved to OBS. The logs are saved in the following path: Bucket name/jobs/logs/Directory starting with the job ID.
CAUTION:You are advised to configure this parameter. Otherwise, no run log is generated after the job is executed. If the job fails, the run log cannot be obtained for fault locating.
If this option is selected, you need to set the following parameters:
OBS Bucket: Select an OBS bucket to store job logs. If the OBS bucket you selected is unauthorized, click Authorize.NOTE:
If Enable Checkpointing and Save Job Log are both selected, you only need to authorize OBS once.
Alarm on Job Exception
Whether to notify users of any job exceptions, such as running exceptions or arrears, via SMS or email.
If this option is selected, you need to set the following parameters:
SMN Topic
Select a custom SMN topic. For how to create a custom SMN topic, see Creating a Topic.
Enable Checkpointing
Whether to enable job snapshots. If this function is enabled, jobs can be restored based on the checkpoints.
If this option is selected, you need to set the following parameters:- Checkpoint Interval: interval for creating checkpoints, in seconds. The value ranges from 1 to 999999, and the default value is 30.
- Checkpoint Mode can be set to either of the following values:
- At least once: Events are processed at least once.
- Exactly once: Events are processed only once.
If you select Enable Checkpointing, you also need to set OBS Bucket.
OBS Bucket: Select an OBS bucket to store your checkpoints. If the OBS bucket you selected is unauthorized, click Authorize.
The checkpoint path is Bucket name/jobs/checkpoint/Directory starting with the job ID.NOTE:
If Enable Checkpointing and Save Job Log are both selected, you only need to authorize OBS once.
Auto Restart upon Exception
Whether automatic restart is enabled. If enabled, jobs will be automatically restarted and restored when exceptions occur.
If this option is selected, you need to set the following parameters:
- Max. Retry Attempts: maximum number of retries upon an exception. The unit is times/hour.
- Unlimited: The number of retries is unlimited.
- Limited: The number of retries is user-defined.
- Restore Job from Checkpoint: This parameter is available only when Enable Checkpointing is selected.
Idle State Retention Time
Clears intermediate states of operators such as GroupBy, RegularJoin, Rank, and Depulicate that have not been updated after the maximum retention time. The default value is 1 hour.
Dirty Data Policy
Policy for processing dirty data. The following policies are supported: Ignore, Trigger a job exception, and Save.
If you set this field to Save, the Dirty Data Dump Address must be set. Click the address box to select the OBS path for storing dirty data.
This parameter is available only when a DIS data source is used.
- (Optional) Set the runtime configuration as required. For details about related parameters, seeHow Do I Optimize Performance of a Flink Job?
Figure 3 Runtime configuration
- Click Save.
- Click Start. On the displayed Start Flink Jobs page, confirm the job specifications and the price, and click Start Now to start the job.
After the job is started, the system automatically switches to the Flink Jobs page, and the created job is displayed in the job list. You can view the job status in the Status column. Once a job is successfully submitted, its status changes from Submitting to Running. After the execution is complete, the status changes to Completed.
If the job status is Submission failed or Running exception, the job fails to submit or run. In this case, you can hover over the status icon in the Status column of the job list to view the error details. You can click
to copy these details. Rectify the fault based on the error information and resubmit the job.
NOTE:
Other buttons are as follows:
- Save As: Save the created job as a new job.
- Static Stream Graph: Provide the static concurrency estimation function and stream graph display function. See Figure 5.
- Simplified Stream Graph: Display the data processing flow from the source to the sink. See Figure 4.
- Format: Format the SQL statements in the editing box.
- Set as Template: Set the created SQL statements as a job template.
- Theme Settings: Set the theme related parameters, including Font Size, Wrap, and Page Style.
- Help: Redirect to the help center to provide you with the SQL syntax for stream jobs.
Simplified Stream Graph
On the OpenSource SQL job editing page, click Simplified Stream Graph.
Simplified stream graph viewing is only supported in Flink 1.12 and Flink 1.10.
Static Stream Graph
On the OpenSource SQL job editing page, click Static Stream Graph.
- Simplified stream graph viewing is only supported in Flink 1.12 and Flink 1.10.
- If you use a UDF in a Flink OpenSource SQL job, it is not possible to generate a static stream graph.
The Static Stream Graph page also allows you to:
- Estimate concurrencies. Click Estimate Concurrencies on the Static Stream Graph page to estimate concurrencies. Click Restore Initial Value to restore the initial value after concurrency estimation.
- Zoom in or out the page.
- Expand or merge operator chains.
- You can edit Parallelism, Output rate, and Rate factor.
- Parallelism: indicates the number of concurrent tasks.
- Output rate: indicates the data traffic of an operator. The unit is piece/s.
- Rate factor: indicates the retention rate after data is processed by operators. Rate factor = Data output volume of an operator/Data input volume of the operator (Unit: %)
Feedback
Was this page helpful?
Provide feedbackThank you very much for your feedback. We will continue working to improve the documentation.See the reply and handling status in My Cloud VOC.
For any further questions, feel free to contact us through the chatbot.
Chatbot