El contenido no se encuentra disponible en el idioma seleccionado. Estamos trabajando continuamente para agregar más idiomas. Gracias por su apoyo.
- What's New
- Function Overview
- Product Bulletin
- Service Overview
-
Billing
- Billing Overview
- Billing for Compute Resources
- Billing for Storage Resources
- Billing for Scanned Data
- Package Billing
- Billing Examples
- Renewing Subscriptions
- Bills
- Arrears
- Billing Termination
-
Billing FAQ
- What Billing Modes Does DLI Offer?
- When Is a Data Lake Queue Idle?
- How Do I Troubleshoot DLI Billing Issues?
- Why Am I Still Being Billed on a Pay-per-Use Basis After I Purchased a Package?
- How Do I View the Usage of a Package?
- How Do I View a Job's Scanned Data Volume?
- Would a Pay-Per-Use Elastic Resource Pool Not Be Billed if No Job Is Submitted for Execution?
- Do I Need to Pay Extra Fees for Purchasing a Queue Billed Based on the Scanned Data Volume?
- How Is the Usage Beyond the Package Limit Billed?
- What Are the Actual CUs, CU Range, and Specifications of an Elastic Resource Pool?
- Change History
- Getting Started
-
User Guide
- DLI Job Development Process
- Preparations
-
Creating an Elastic Resource Pool and Queues Within It
- Overview of DLI Elastic Resource Pools and Queues
- Creating an Elastic Resource Pool and Creating Queues Within It
- Managing Elastic Resource Pools
-
Managing Queues
- Viewing Basic Information About a Queue
- Queue Permission Management
- Allocating a Queue to an Enterprise Project
- Creating an SMN Topic
- Managing Queue Tags
- Setting Queue Properties
- Testing Address Connectivity
- Deleting a Queue
- Auto Scaling of Standard Queues
- Setting a Scheduled Auto Scaling Task for a Standard Queue
- Changing the CIDR Block for a Standard Queue
- Example Use Case: Creating an Elastic Resource Pool and Running Jobs
- Example Use Case: Configuring Scaling Policies for Queues in an Elastic Resource Pool
- Creating a Non-Elastic Resource Pool Queue (Discarded and Not Recommended)
- Creating Databases and Tables
-
Data Migration and Transmission
- Overview
-
Migrating Data from External Data Sources to DLI
- Overview of Data Migration Scenarios
- Using CDM to Migrate Data to DLI
- Example Typical Scenario: Migrating Data from Hive to DLI
- Example Typical Scenario: Migrating Data from Kafka to DLI
- Example Typical Scenario: Migrating Data from Elasticsearch to DLI
- Example Typical Scenario: Migrating Data from RDS to DLI
- Example Typical Scenario: Migrating Data from GaussDB(DWS) to DLI
-
Configuring DLI to Read and Write Data from and to External Data Sources
- Configuring DLI to Read and Write External Data Sources
- Configuring the Network Connection Between DLI and Data Sources (Enhanced Datasource Connection)
- Using DEW to Manage Access Credentials for Data Sources
- Using DLI Datasource Authentication to Manage Access Credentials for Data Sources
-
Managing Enhanced Datasource Connections
- Viewing Basic Information About an Enhanced Datasource Connection
- Enhanced Connection Permission Management
- Binding an Enhanced Datasource Connection to an Elastic Resource Pool
- Unbinding an Enhanced Datasource Connection from an Elastic Resource Pool
- Adding a Route for an Enhanced Datasource Connection
- Deleting the Route for an Enhanced Datasource Connection
- Modifying Host Information in an Elastic Resource Pool
- Enhanced Datasource Connection Tag Management
- Deleting an Enhanced Datasource Connection
- Example Typical Scenario: Connecting DLI to a Data Source on a Private Network
- Example Typical Scenario: Connecting DLI to a Data Source on a Public Network
- Configuring an Agency to Allow DLI to Access Other Cloud Services
- Submitting a SQL Job Using DLI
- Submitting a Flink Job Using DLI
- Submitting a Spark Job Using DLI
- Submitting a DLI Job Using a Notebook Instance
- Using Cloud Eye to Monitor DLI
- Using CTS to Audit DLI
- Permissions Management
- Common DLI Management Operations
-
Best Practices
- Overview
- Analyzing Driving Behavior Data in IoV Scenarios Using DLI
- Converting Data Format from CSV to Parquet
- Analyzing E-Commerce BI Reports Using DLI
- Analyzing Billing Consumption Data Using DLI
- Analyzing Real-time E-Commerce Business Data Using DLI
-
Connecting BI Tools to DLI for Data Analysis
- Overview
- Configuring DBeaver to Connect to DLI for Data Query and Analysis
- Configuring DBT to Connect to DLI for Data Scheduling and Analysis
- Configuring Yonghong BI to Connect to DLI for Data Query and Analysis
- Configuring Superset to Connect to DLI for Data Query and Analysis
- Configuring Power BI to Connect to DLI Using Kyuubi for Data Query and Analysis
- Configuring FineBI to Connect to DLI Using Kyuubi for Data Query and Analysis
- Configuring Tableau to Connect to DLI Using Kyuubi for Data Query and Analysis
- Configuring Beeline to Connect to DLI Using Kyuubi for Data Query and Analysis
-
Developer Guide
- Connecting to DLI Using a Client
- SQL Jobs
-
Flink Jobs
- Stream Ecosystem
-
Flink OpenSource SQL Jobs
- Reading Data from Kafka and Writing Data to RDS
- Reading Data from Kafka and Writing Data to GaussDB(DWS)
- Reading Data from Kafka and Writing Data to Elasticsearch
- Reading Data from MySQL CDC and Writing Data to GaussDB(DWS)
- Reading Data from PostgreSQL CDC and Writing Data to GaussDB(DWS)
- Configuring High-Reliability Flink Jobs (Automatic Restart upon Exceptions)
- Flink Jar Job Examples
- Writing Data to OBS Using Flink Jar
- Using Flink Jar to Connect to Kafka that Uses SASL_SSL Authentication
- Using Flink Jar to Read and Write Data from and to DIS
- Flink Job Agencies
-
Spark Jar Jobs
- Using Spark Jar Jobs to Read and Query OBS Data
- Using the Spark Job to Access DLI Metadata
- Using Spark Jobs to Access Data Sources of Datasource Connections
- Spark Jar Jobs Using DEW to Acquire Access Credentials for Reading and Writing Data from and to OBS
- Obtaining Temporary Credentials from a Spark Job's Agency for Accessing Other Cloud Services
-
SQL Syntax Reference
-
Spark SQL Syntax Reference
- Common Configuration Items
- Spark SQL Syntax
- Spark Open Source Commands
- Databases
-
Tables
- Creating an OBS Table
- Creating a DLI Table
- Deleting a Table
- Viewing a Table
- Modifying a Table
-
Partition-related Syntax
- Adding Partition Data (Only OBS Tables Supported)
- Renaming a Partition (Only OBS Tables Supported)
- Deleting a Partition
- Deleting Partitions by Specifying Filter Criteria (Only Supported on OBS Tables)
- Altering the Partition Location of a Table (Only OBS Tables Supported)
- Updating Partitioned Table Data (Only OBS Tables Supported)
- Updating Table Metadata with REFRESH TABLE
- Backing Up and Restoring Data of Multiple Versions
- Table Lifecycle Management
- Data
- Exporting Query Results
-
Datasource Connections
- Creating a Datasource Connection with an HBase Table
- Creating a Datasource Connection with an OpenTSDB Table
- Creating a Datasource Connection with a DWS Table
- Creating a Datasource Connection with an RDS Table
- Creating a Datasource Connection with a CSS Table
- Creating a Datasource Connection with a DCS Table
- Creating a Datasource Connection with a DDS Table
- Creating a Datasource Connection with an Oracle Table
- Views
- Viewing the Execution Plan
- Data Permissions
- Data Types
- User-Defined Functions
-
Built-In Functions
-
Date Functions
- Overview
- add_months
- current_date
- current_timestamp
- date_add
- dateadd
- date_sub
- date_format
- datediff
- datediff1
- datepart
- datetrunc
- day/dayofmonth
- from_unixtime
- from_utc_timestamp
- getdate
- hour
- isdate
- last_day
- lastday
- minute
- month
- months_between
- next_day
- quarter
- second
- to_char
- to_date
- to_date1
- to_utc_timestamp
- trunc
- unix_timestamp
- weekday
- weekofyear
- year
-
String Functions
- Overview
- ascii
- concat
- concat_ws
- char_matchcount
- encode
- find_in_set
- get_json_object
- instr
- instr1
- initcap
- keyvalue
- length
- lengthb
- levenshtein
- locate
- lower/lcase
- lpad
- ltrim
- parse_url
- printf
- regexp_count
- regexp_extract
- replace
- regexp_replace
- regexp_replace1
- regexp_instr
- regexp_substr
- repeat
- reverse
- rpad
- rtrim
- soundex
- space
- substr/substring
- substring_index
- split_part
- translate
- trim
- upper/ucase
- Mathematical Functions
- Aggregate Functions
- Window Functions
- Other Functions
-
Date Functions
- SELECT
-
Identifiers
- aggregate_func
- alias
- attr_expr
- attr_expr_list
- attrs_value_set_expr
- boolean_expression
- class_name
- col
- col_comment
- col_name
- col_name_list
- condition
- condition_list
- cte_name
- data_type
- db_comment
- db_name
- else_result_expression
- file_format
- file_path
- function_name
- groupby_expression
- having_condition
- hdfs_path
- input_expression
- input_format_classname
- jar_path
- join_condition
- non_equi_join_condition
- number
- num_buckets
- output_format_classname
- partition_col_name
- partition_col_value
- partition_specs
- property_name
- property_value
- regex_expression
- result_expression
- row_format
- select_statement
- separator
- serde_name
- sql_containing_cte_name
- sub_query
- table_comment
- table_name
- table_properties
- table_reference
- view_name
- view_properties
- when_expression
- where_condition
- window_function
- Operators
-
Flink SQL Syntax Reference
-
Flink OpenSource SQL 1.15 Syntax Reference
- Constraints and Definitions
- Overview
- Flink OpenSource SQL 1.15 Usage
- Formats
- Connectors
- DML Snytax
-
Functions
- UDFs
- Type Inference
- Parameter Transfer
-
Built-In Functions
- Comparison Functions
- Logical Functions
- Arithmetic Functions
- String Functions
- Temporal Functions
- Conditional Functions
- Type Conversion Functions
- Collection Functions
- JSON Functions
- Value Construction Functions
- Value Retrieval Functions
- Grouping Functions
- Hash Functions
- Aggregate Functions
- Table-Valued Functions
- Flink OpenSource SQL 1.12 Syntax Reference
-
Flink Opensource SQL 1.10 Syntax Reference
- Constraints and Definitions
- Flink OpenSource SQL 1.10 Syntax
-
Data Definition Language (DDL)
- Creating a Source Table
-
Creating a Result Table
- ClickHouse Result Table
- Kafka Result Table
- Upsert Kafka Result Table
- DIS Result Table
- JDBC Result Table
- GaussDB(DWS) Result Table
- Redis Result Table
- SMN Result Table
- HBase Result Table
- Elasticsearch Result Table
- OpenTSDB Result Table
- User-defined Result Table
- Print Result Table
- File System Result Table
- Creating a Dimension Table
- Data Manipulation Language (DML)
- Functions
-
Flink OpenSource SQL 1.15 Syntax Reference
-
HetuEngine SQL Syntax Reference
-
HetuEngine SQL Syntax
- Before You Start
- Data Type
-
DDL Syntax
- CREATE SCHEMA
- CREATE TABLE
- CREATE TABLE AS
- CREATE TABLE LIKE
- CREATE VIEW
- ALTER TABLE
- ALTER VIEW
- ALTER SCHEMA
- DROP SCHEMA
- DROP TABLE
- DROP VIEW
- TRUNCATE TABLE
- COMMENT
- VALUES
- SHOW Syntax Overview
- SHOW SCHEMAS (DATABASES)
- SHOW TABLES
- SHOW TBLPROPERTIES TABLE|VIEW
- SHOW TABLE/PARTITION EXTENDED
- SHOW FUNCTIONS
- SHOW PARTITIONS
- SHOW COLUMNS
- SHOW CREATE TABLE
- SHOW VIEWS
- SHOW CREATE VIEW
- DML Syntax
- DQL Syntax
- Auxiliary Command Syntax
- Reserved Keywords
-
SQL Functions and Operators
- Logical Operators
- Comparison Functions and Operators
- Condition Expression
- Lambda Expression
- Conversion Functions
- Mathematical Functions and Operators
- Bitwise Functions
- Decimal Functions and Operators
- String Functions and Operators
- Regular Expressions
- Binary Functions and Operators
- JSON Functions and Operators
- Date and Time Functions and Operators
- Aggregate Functions
- Window Functions
- Array Functions and Operators
- Map Functions and Operators
- URL Function
- UUID Function
- Color Function
- Teradata Function
- Data Masking Functions
- IP Address Functions
- Quantile Digest Functions
- T-Digest Functions
- Implicit Data Type Conversion
- Appendix
-
HetuEngine SQL Syntax
-
Hudi SQL Syntax Reference
- Hudi Table Overview
- DLI Hudi Metadata
- DLI Hudi Development Specifications
- Using Hudi to Develop Jobs in DLI
- DLI Hudi SQL Syntax Reference
- Spark DataSource API Syntax Reference
- Data Management and Maintenance
- Typical Hudi Configuration Parameters
- Delta SQL Syntax Reference
-
Spark SQL Syntax Reference
-
API Reference
- Before You Start
- Overview
- Calling APIs
- Getting Started
- Permission-related APIs
- Global Variable-related APIs
- APIs Related to Resource Tags
-
APIs Related to Enhanced Datasource Connections
- Creating an Enhanced Datasource Connection
- Deleting an Enhanced Datasource Connection
- Listing Enhanced Datasource Connections
- Querying an Enhanced Datasource Connection
- Binding a Queue
- Unbinding a Queue
- Modifying Host Information
- Querying Authorization of an Enhanced Datasource Connection
- Creating a Route
- Deleting a Route
- Datasource Authentication-related APIs
-
APIs Related to Elastic Resource Pools
- Creating an Elastic Resource Pool
- Querying All Elastic Resource Pools
- Deleting an Elastic Resource Pool
- Modifying Elastic Resource Pool Information
- Querying All Queues in an Elastic Resource Pool
- Associating a Queue with an Elastic Resource Pool
- Viewing Scaling History of an Elastic Resource Pool
- Modifying the Scaling Policy of a Queue Associated with an Elastic Resource Pool
- Queue-related APIs (Recommended)
- SQL Job-related APIs
- SQL Template-related APIs
-
Flink Job-related APIs
- Creating a SQL Job
- Updating a SQL Job
- Creating a Flink Jar job
- Updating a Flink Jar Job
- Running Jobs in Batches
- Listing Jobs
- Querying Job Details
- Querying the Job Execution Plan
- Stopping Jobs in Batches
- Deleting a Job
- Deleting Jobs in Batches
- Exporting a Flink Job
- Importing a Flink Job
- Generating a Static Stream Graph for a Flink SQL Job
- APIs Related to Flink Job Templates
- Flink Job Management APIs
- Spark Job-related APIs
- APIs Related to Spark Job Templates
- Permissions Policies and Supported Actions
-
Out-of-Date APIs
- Agency-related APIs (Discarded)
-
Package Group-related APIs (Discarded)
- Uploading a Package Group (Discarded)
- Listing Package Groups (Discarded)
- Uploading a JAR Package Group (Discarded)
- Uploading a PyFile Package Group (Discarded)
- Uploading a File Package Group (Discarded)
- Querying Resource Packages in a Group (Discarded)
- Deleting a Resource Package from a Group (Discarded)
- Changing the Owner of a Group or Resource Package (Discarded)
- APIs Related to Spark Batch Processing (Discarded)
- SQL Job-related APIs (Discarded)
- Resource-related APIs (Discarded)
- Permission-related APIs (Discarded)
- Queue-related APIs (Discarded)
- Datasource Authentication-related APIs (Discarded)
- APIs Related to Enhanced Datasource Connections (Discarded)
- Template-related APIs (Discarded)
- Table-related APIs (Discarded)
- APIs Related to SQL Jobs (Discarded)
- APIs Related to Data Upload (Discarded)
- Cluster-related APIs
- APIs Related to Flink Jobs (Discarded)
- Public Parameters
- SDK Reference
-
FAQs
-
DLI Basics
- What Are the Differences Between DLI Flink and MRS Flink?
- What Are the Differences Between MRS Spark and DLI Spark?
- How Do I Upgrade the Engine Version of a DLI Job?
- Where Can Data Be Stored in DLI?
- Can I Import OBS Bucket Data Shared by Other Tenants into DLI?
- Regions and AZs
- Can a Member Account Use Global Variables Created by Other Member Accounts?
- Is DLI Affected by the Apache Spark Command Injection Vulnerability (CVE-2022-33891)?
- How Do I Manage Jobs Running on DLI?
- How Do I Change the Field Names of an Existing Table on DLI?
-
DLI Elastic Resource Pools and Queues
- How Can I Check the Actual and Used CUs for an Elastic Resource Pool as Well as the Required CUs for a Job?
- How Do I Check for a Backlog of Jobs in the Current DLI Queue?
- How Do I View the Load of a DLI Queue?
- How Do I Monitor Job Exceptions on a DLI Queue?
- How Do I Migrate an Old Version Spark Queue to a General-Purpose Queue?
- How Do I Do If I Encounter a Timeout Exception When Executing DLI SQL Statements on the default Queue?
-
DLI Databases and Tables
- Why Am I Unable to Query a Table on the DLI Console?
- How Do I Do If the Compression Rate of an OBS Table Is High?
- How Do I Do If Inconsistent Character Encoding Leads to Garbled Characters?
- Do I Need to to Regrant Permissions to Users and Projects After Deleting and Recreating a Table With the Same Name?
- How Do I Do If Files Imported Into a DLI Partitioned Table Lack Data for the Partition Columns, Causing Query Failures After the Import Is Completed?
- How Do I Fix Incorrect Data in an OBS Foreign Table Caused by Newline Characters in OBS File Fields?
- How Do I Prevent a Cartesian Product Query and Resource Overload Due to Missing "ON" Conditions in Table Joins?
- How Do I Do If I Can't Query Data After Manually Adding It to the Partition Directory of an OBS Table?
- Why Does the "insert overwrite" Operation Affect All Data in a Partitioned Table Instead of Just the Targeted Partition?
- Why Does the "create_date" Field in an RDS Table (Datetime Data Type) Appear as a Timestamp in DLI Queries?
- How Do I Do If Renaming a Table After a SQL Job Causes Incorrect Data Size?
- How Can I Resolve Data Inconsistencies When Importing Data from DLI to OBS?
-
Enhanced Datasource Connections
- How Do I Do If I Can't Bind an Enhanced Datasource Connection to a Queue?
- How Do I Resolve a Failure in Connecting DLI to GaussDB(DWS) Through an Enhanced Datasource Connection?
- How Do I Do If the Datasource Connection Is Successfully Created but the Network Connectivity Test Fails?
- How Do I Configure Network Connectivity Between a DLI Queue and a Data Source?
- Why Is Creating a VPC Peering Connection Necessary for Enhanced Datasource Connections in DLI?
- How Do I Do If Creating a Datasource Connection in DLI Gets Stuck in the "Creating" State When Binding It to a Queue?
- How Do I Resolve the "communication link failure" Error When Using a Newly Created Datasource Connection That Appears to Be Activated?
- How Do I Troubleshoot a Connection Timeout Issue That Isn't Recorded in Logs When Accessing MRS HBase Through a Datasource Connection?
- How Do I Fix the "Failed to get subnet" Error When Creating a Datasource Connection in DLI?
- How Do I Do If I Encounter the "Incorrect string value" Error When Executing insert overwrite on a Datasource RDS Table?
- How Do I Resolve the Null Pointer Error When Creating an RDS Datasource Table?
- Error Message "org.postgresql.util.PSQLException: ERROR: tuple concurrently updated" Is Displayed When the System Executes insert overwrite on a Datasource GaussDB(DWS) Table
- RegionTooBusyException Is Reported When Data Is Imported to a CloudTable HBase Table Through a Datasource Table
- How Do I Do If A Null Value Is Written Into a Non-Null Field When Using a DLI Datasource Connection to Connect to a GaussDB(DWS) Table?
- How Do I Do If an Insert Operation Failed After the Schema of the GaussDB(DWS) Source Table Is Updated?
- How Do I Insert Data into an RDS Table with an Auto-Increment Primary Key Using DLI?
-
SQL Jobs
-
SQL Job Development
- SQL Jobs
- How Do I Merge Small Files?
- How Do I Use DLI to Access Data in an OBS Bucket?
- How Do I Specify an OBS Path When Creating an OBS Table?
- How Do I Create a Table Using JSON Data in an OBS Bucket?
- How Can I Use the count Function to Perform Aggregation?
- How Do I Synchronize DLI Table Data Across Regions?
- How Do I Insert Table Data into Specific Fields of a Table Using a SQL Job?
- How Do I Troubleshoot Slow SQL Jobs?
- How Do I View DLI SQL Logs?
- How Do I View SQL Execution Records in DLI?
- How Do I Do When Data Skew Occurs During the Execution of a SQL Job?
- Why Does a SQL Job That Has Join Operations Stay in the Running State?
- Why Is a SQL Job Stuck in the Submitting State?
-
SQL Job O&M
- Why Is Error "path obs://xxx already exists" Reported When Data Is Exported to OBS?
- Why Is Error "SQL_ANALYSIS_ERROR: Reference 't.id' is ambiguous, could be: t.id, t.id.;" Displayed When Two Tables Are Joined?
- Why Is Error "The current account does not have permission to perform this operation,the current account was restricted. Restricted for no budget." Reported when a SQL Statement Is Executed?
- Why Is Error "There should be at least one partition pruning predicate on partitioned table XX.YYY" Reported When a Query Statement Is Executed?
- Why Is Error "IllegalArgumentException: Buffer size too small. size" Reported When Data Is Loaded to an OBS Foreign Table?
- Why Is Error "DLI.0002 FileNotFoundException" Reported During SQL Job Running?
- Why Is a Schema Parsing Error Reported When I Create a Hive Table Using CTAS?
- Why Is Error "org.apache.hadoop.fs.obs.OBSIOException" Reported When I Run DLI SQL Scripts on DataArts Studio?
- Why Is Error "UQUERY_CONNECTOR_0001:Invoke DLI service api failed" Reported in the Job Log When I Use CDM to Migrate Data to DLI?
- Why Is Error "File not Found" Reported When I Access a SQL Job?
- Why Is Error "DLI.0003: AccessControlException XXX" Reported When I Access a SQL Job?
- Why Is Error "DLI.0001: org.apache.hadoop.security.AccessControlException: verifyBucketExists on {{bucket name}}: status [403]" Reported When I Access a SQL Job?
- Why Am I Seeing the Error Message "The current account does not have permission to perform this operation,the current account was restricted. Restricted for no budget" When Executing a SQL Statement?
-
SQL Job Development
-
Flink Jobs
-
Flink Job Consulting
- How Do I Authorize a Subuser to View Flink Jobs?
- How Do I Configure Auto Restart upon Exception for a Flink Job?
- How Do I Save Logs for Flink Jobs?
- Why Is Error "No such user. userName:xxxx." Reported on the Flink Job Management Page When I Grant Permission to a User?
- How Do I Restore a Flink Job from a Specific Checkpoint After Manually Stopping the Job?
- Why Is a Message Displayed Indicating That the SMN Topic Does Not Exist When I Use the SMN Topic in DLI?
-
Flink SQL Jobs
- How Do I Map an OBS Table to a DLI Partitioned Table?
- How Do I Change the Number of Kafka Partitions in a Flink SQL Job Without Stopping It?
- How Do I Fix the DLI.0005 Error When Using EL Expressions to Create a Table in a Flink SQL Job?
- Why Is No Data Queried in the DLI Table Created Using the OBS File Path When Data Is Written to OBS by a Flink Job Output Stream?
- Why Does a Flink SQL Job Fails to Be Executed, and Is "connect to DIS failed java.lang.IllegalArgumentException: Access key cannot be null" Displayed in the Log?
- Data Writing Fails After a Flink SQL Job Consumed Kafka and Sank Data to the Elasticsearch Cluster
- How Does Flink Opensource SQL Parse Nested JSON?
- Why Is the Time Read by a Flink OpenSource SQL Job from the RDS Database Is Different from the RDS Database Time?
- Why Does Job Submission Fail When the failure-handler Parameter of the Elasticsearch Result Table for a Flink Opensource SQL Job Is Set to retry_rejected?
- How Do I Configure Connection Retries for Kafka Sink If it is Disconnected?
- How Do I Write Data to Different Elasticsearch Clusters in a Flink Job?
- Why Does DIS Stream Not Exist During Job Semantic Check?
- Why Is Error "Timeout expired while fetching topic metadata" Repeatedly Reported in Flink JobManager Logs?
-
Flink Jar Jobs
- Can I Upload Configuration Files for Flink Jar Jobs?
- Why Does a Flink Jar Package Conflict Result in Job Submission Failure?
- Why Does a Flink Jar Job Fail to Access GaussDB(DWS) and a Message Is Displayed Indicating Too Many Client Connections?
- Why Is Error Message "Authentication failed" Displayed During Flink Jar Job Running?
- Why Is Error Invalid OBS Bucket Name Reported After a Flink Job Submission Failed?
- Why Does the Flink Submission Fail Due to Hadoop JAR File Conflict?
- How Do I Locate a Flink Job Submission Error?
-
Flink Job Performance Tuning
- What Is the Recommended Configuration for a Flink Job?
- Flink Job Performance Tuning
- How Do I Prevent Data Loss After Flink Job Restart?
- How Do I Locate a Flink Job Running Error?
- How Can I Check if a Flink Job Can Be Restored From a Checkpoint After Restarting It?
- Why Are Logs Not Written to the OBS Bucket After a DLI Flink Job Fails to Be Submitted for Running?
- Why Is the Flink Job Abnormal Due to Heartbeat Timeout Between JobManager and TaskManager?
-
Flink Job Consulting
-
Spark Jobs
-
Spark Job Development
- Spark Jobs
- How Do I Use Spark to Write Data into a DLI Table?
- How Do I Set Up AK/SK So That a General Queue Can Access Tables Stored in OBS?
- How Do I View the Resource Usage of DLI Spark Jobs?
- How Do I Use Python Scripts to Access the MySQL Database If the pymysql Module Is Missing from the Spark Job Results Stored in MySQL?
- How Do I Run a Complex PySpark Program in DLI?
- How Do I Use JDBC to Set the spark.sql.shuffle.partitions Parameter to Improve the Task Concurrency?
- How Do I Read Uploaded Files for a Spark Jar Job?
- Why Can't I Find the Specified Python Environment After Adding the Python Package?
- Why Is a Spark Jar Job Stuck in the Submitting State?
-
Spark Job O&M
- What Can I Do When Receiving java.lang.AbstractMethodError in the Spark Job?
- Why Do I Get "ResponseCode: 403" and "ResponseStatus: Forbidden" Errors When a Spark Job Accesses OBS Data?
- Why Do I Encounter the Error "verifyBucketExists on XXXX: status [403]" When Using a Spark Job to Access an OBS Bucket That I Have Permission to Access?
- Why Does a Job Running Timeout Occur When Processing a Large Amount of Data with a Spark Job?
- Why Does a Spark Job Fail to Execute with an Abnormal Access Directory Error When Accessing Files in SFTP?
- Why Does the Job Fail to Be Executed Due to Insufficient Database and Table Permissions?
- Why Is the global_temp Database Missing in the Job Log of Spark 3.x?
- Why Does Using DataSource Syntax to Create an OBS Table of Avro Type Fail When Accessing Metadata With Spark 2.3.x?
-
Spark Job Development
- DLI Resource Quotas
-
DLI Permissions Management
- How Do I Do If I Receive an Error Message Stating That I Do Not Have Sufficient Permissions When Creating a Table After Upgrading the Engine Version of a Queue?
- What Is Column-Level Authorization for DLI Partitioned Tables?
- How Do I Do If I Encounter Insufficient Permissions While Updating Packages?
- Why Is Error "DLI.0003: Permission denied for resource..." Reported When I Run a SQL Statement?
- How Do I Do If I Can't Query Table Data After Being Granted Table Permissions?
- Will Granting Duplicate Permissions to a Table After Inheriting Database Permissions Cause an Error?
- Why Can't I Query a View After I'm Granted the Select Table Permission on the View?
- How Do I Do If I Receive a Message Saying I Don't Have Sufficient Permissions to Submit My Jobs to the Job Bucket?
- How Do I Resolve an Unauthorized OBS Bucket Error?
- DLI APIs
-
DLI Basics
-
More Documents
-
User Guide (ME-Abu Dhabi Region)
- DLI Introduction
- Getting Started
- DLI Console Overview
- SQL Editor
- Job Management
- Queue Management
- Data Management
- Job Templates
- Datasource Connections
- Global Configuration
- UDFs
- Permissions Management
- Change History
-
API Reference (ME-Abu Dhabi Region)
- Before You Start
- Overview
- Calling APIs
- Getting Started
- Permission-related APIs
- Queue-related APIs (Recommended)
- APIs Related to SQL Jobs
- Package Group-related APIs
-
APIs Related to Flink Jobs
- Granting OBS Permissions to DLI
- Creating a SQL Job
- Updating a SQL Job
- Creating a Flink Jar job
- Updating a Flink Jar Job
- Running Jobs in Batches
- Querying the Job List
- Querying Job Details
- Querying the Job Execution Plan
- Stopping Jobs in Batches
- Deleting a Job
- Deleting Jobs in Batches
- Exporting a Flink Job
- Importing a Flink Job
- APIs Related to Spark jobs
- APIs Related to Flink Job Templates
-
APIs Related to Enhanced Datasource Connections
- Creating an Enhanced Datasource Connection
- Deleting an Enhanced Datasource Connection
- Querying an Enhanced Datasource Connection List
- Querying an Enhanced Datasource Connection
- Binding a Queue
- Unbinding a Queue
- Modifying the Host Information
- Querying Authorization of an Enhanced Datasource Connection
- Global Variable-related APIs
- Public Parameters
- Change History
-
SQL Syntax Reference (ME-Abu Dhabi Region)
-
Spark SQL Syntax Reference
- Common Configuration Items of Batch SQL Jobs
- SQL Syntax Overview of Batch Jobs
- Spark Open Source Commands
- Databases
- Creating an OBS Table
- Creating a DLI Table
- Deleting a Table
- Checking Tables
- Modifying a Table
-
Syntax for Partitioning a Table
- Adding Partition Data (Only OBS Tables Supported)
- Renaming a Partition (Only OBS Tables Supported)
- Deleting a Partition
- Deleting Partitions by Specifying Filter Criteria (Only Supported on OBS Tables)
- Altering the Partition Location of a Table (Only OBS Tables Supported)
- Updating Partitioned Table Data (Only OBS Tables Supported)
- Updating Table Metadata with REFRESH TABLE
- Importing Data to the Table
- Inserting Data
- Clearing Data
- Exporting Search Results
- Backing Up and Restoring Data of Multiple Versions
- Table Lifecycle Management
- Creating a Datasource Connection with an HBase Table
- Creating a Datasource Connection with an OpenTSDB Table
- Creating a Datasource Connection with a DWS table
- Creating a Datasource Connection with an RDS Table
- Creating a Datasource Connection with a CSS Table
- Creating a Datasource Connection with a DCS Table
- Creating a Datasource Connection with a DDS Table
- Creating a Datasource Connection with an Oracle Table
- Views
- Checking the Execution Plan
- Data Permissions Management
- Data Types
- User-Defined Functions
-
Built-in Functions
-
Date Functions
- Overview
- add_months
- current_date
- current_timestamp
- date_add
- dateadd
- date_sub
- date_format
- datediff
- datediff1
- datepart
- datetrunc
- day/dayofmonth
- from_unixtime
- from_utc_timestamp
- getdate
- hour
- isdate
- last_day
- lastday
- minute
- month
- months_between
- next_day
- quarter
- second
- to_char
- to_date
- to_date1
- to_utc_timestamp
- trunc
- unix_timestamp
- weekday
- weekofyear
- year
-
String Functions
- Overview
- ascii
- concat
- concat_ws
- char_matchcount
- encode
- find_in_set
- get_json_object
- instr
- instr1
- initcap
- keyvalue
- length
- lengthb
- levenshtein
- locate
- lower/lcase
- lpad
- ltrim
- parse_url
- printf
- regexp_count
- regexp_extract
- replace
- regexp_replace
- regexp_replace1
- regexp_instr
- regexp_substr
- repeat
- reverse
- rpad
- rtrim
- soundex
- space
- substr/substring
- substring_index
- split_part
- translate
- trim
- upper/ucase
- Mathematical Functions
- Aggregate Functions
- Window Functions
- Other Functions
-
Date Functions
- Basic SELECT Statements
- Filtering
- Sorting
- Grouping
- JOIN
- Subquery
- Alias
- Set Operations
- WITH...AS
- CASE...WHEN
- OVER Clause
- Flink OpenSource SQL 1.12 Syntax Reference
-
Flink Opensource SQL 1.10 Syntax Reference
- Constraints and Definitions
- Flink OpenSource SQL 1.10 Syntax
-
Data Definition Language (DDL)
- Creating a Source Table
-
Creating a Result Table
- ClickHouse Result Table
- Kafka Result Table
- Upsert Kafka Result Table
- DIS Result Table
- JDBC Result Table
- GaussDB(DWS) Result Table
- Redis Result Table
- SMN Result Table
- HBase Result Table
- Elasticsearch Result Table
- OpenTSDB Result Table
- User-defined Result Table
- Print Result Table
- File System Result Table
- Creating a Dimension Table
- Data Manipulation Language (DML)
- Functions
-
Historical Versions
-
Flink SQL Syntax
- SQL Syntax Constraints and Definitions
- SQL Syntax Overview of Stream Jobs
- Creating a Source Stream
-
Creating a Sink Stream
- MRS OpenTSDB Sink Stream
- CSS Elasticsearch Sink Stream
- DCS Sink Stream
- DDS Sink Stream
- DIS Sink Stream
- DMS Sink Stream
- DWS Sink Stream (JDBC Mode)
- DWS Sink Stream (OBS-based Dumping)
- MRS HBase Sink Stream
- MRS Kafka Sink Stream
- Open-Source Kafka Sink Stream
- File System Sink Stream (Recommended)
- OBS Sink Stream
- RDS Sink Stream
- SMN Sink Stream
- Creating a Temporary Stream
- Creating a Dimension Table
- Custom Stream Ecosystem
- Data Type
- Built-In Functions
- User-Defined Functions
- Geographical Functions
- SELECT
- Condition Expression
- Window
- JOIN Between Stream Data and Table Data
- Configuring Time Models
- Pattern Matching
- StreamingML
- Reserved Keywords
-
Flink SQL Syntax
-
Identifiers
- aggregate_func
- alias
- attr_expr
- attr_expr_list
- attrs_value_set_expr
- boolean_expression
- col
- col_comment
- col_name
- col_name_list
- condition
- condition_list
- cte_name
- data_type
- db_comment
- db_name
- else_result_expression
- file_format
- file_path
- function_name
- groupby_expression
- having_condition
- input_expression
- join_condition
- non_equi_join_condition
- number
- partition_col_name
- partition_col_value
- partition_specs
- property_name
- property_value
- regex_expression
- result_expression
- select_statement
- separator
- sql_containing_cte_name
- sub_query
- table_comment
- table_name
- table_properties
- table_reference
- when_expression
- where_condition
- window_function
- Operators
-
Spark SQL Syntax Reference
- User Guide (Paris Region)
-
API Reference (Paris Region)
- Before You Start
- Overview
- Calling APIs
- Getting Started
- Permission-related APIs
- Queue-related APIs (Recommended)
- APIs Related to SQL Jobs
- Package Group-related APIs
-
APIs Related to Flink Jobs
- Granting OBS Permissions to DLI
- Creating a SQL Job
- Updating a SQL Job
- Creating a Flink Jar job
- Updating a Flink Jar Job
- Running Jobs in Batches
- Querying the Job List
- Querying Job Details
- Querying the Job Execution Plan
- Stopping Jobs in Batches
- Deleting a Job
- Deleting Jobs in Batches
- Exporting a Flink Job
- Importing a Flink Job
- APIs Related to Spark jobs
- APIs Related to Flink Job Templates
-
APIs Related to Enhanced Datasource Connections
- Creating an Enhanced Datasource Connection
- Deleting an Enhanced Datasource Connection
- Querying an Enhanced Datasource Connection List
- Querying an Enhanced Datasource Connection
- Binding a Queue
- Unbinding a Queue
- Modifying the Host Information
- Querying Authorization of an Enhanced Datasource Connection
- Global Variable-related APIs
- Public Parameters
- Change History
-
SQL Syntax Reference (Paris Region)
-
Spark SQL Syntax Reference
- Common Configuration Items of Batch SQL Jobs
- SQL Syntax Overview of Batch Jobs
- Spark Open Source Commands
- Databases
- Creating an OBS Table
- Creating a DLI Table
- Deleting a Table
- Checking Tables
- Modifying a Table
-
Syntax for Partitioning a Table
- Adding Partition Data (Only OBS Tables Supported)
- Renaming a Partition (Only OBS Tables Supported)
- Deleting a Partition
- Deleting Partitions by Specifying Filter Criteria (Only Supported on OBS Tables)
- Altering the Partition Location of a Table (Only OBS Tables Supported)
- Updating Partitioned Table Data (Only OBS Tables Supported)
- Updating Table Metadata with REFRESH TABLE
- Importing Data to the Table
- Inserting Data
- Clearing Data
- Exporting Search Results
- Table Lifecycle Management
- Creating a Datasource Connection with an HBase Table
- Creating a Datasource Connection with an OpenTSDB Table
- Creating a Datasource Connection with a DWS table
- Creating a Datasource Connection with an RDS Table
- Creating a Datasource Connection with a CSS Table
- Creating a Datasource Connection with a DCS Table
- Creating a Datasource Connection with a DDS Table
- Creating a Datasource Connection with an Oracle Table
- Views
- Checking the Execution Plan
- Data Permissions Management
- Data Types
- User-Defined Functions
-
Built-in Functions
-
Date Functions
- Overview
- add_months
- current_date
- current_timestamp
- date_add
- dateadd
- date_sub
- date_format
- datediff
- datediff1
- datepart
- datetrunc
- day/dayofmonth
- from_unixtime
- from_utc_timestamp
- getdate
- hour
- isdate
- last_day
- lastday
- minute
- month
- months_between
- next_day
- quarter
- second
- to_char
- to_date
- to_date1
- to_utc_timestamp
- trunc
- unix_timestamp
- weekday
- weekofyear
- year
-
String Functions
- Overview
- ascii
- concat
- concat_ws
- char_matchcount
- encode
- find_in_set
- get_json_object
- instr
- instr1
- initcap
- keyvalue
- length
- lengthb
- levenshtein
- locate
- lower/lcase
- lpad
- ltrim
- parse_url
- printf
- regexp_count
- regexp_extract
- replace
- regexp_replace
- regexp_replace1
- regexp_instr
- regexp_substr
- repeat
- reverse
- rpad
- rtrim
- soundex
- space
- substr/substring
- substring_index
- split_part
- translate
- trim
- upper/ucase
- Mathematical Functions
- Aggregate Functions
- Window Functions
- Other Functions
-
Date Functions
- Basic SELECT Statements
- Filtering
- Sorting
- Grouping
- JOIN
- Subquery
- Alias
- Set Operations
- WITH...AS
- CASE...WHEN
- OVER Clause
- Flink OpenSource SQL 1.12 Syntax Reference
-
Flink Opensource SQL 1.10 Syntax Reference
- Constraints and Definitions
- Flink OpenSource SQL 1.10 Syntax
-
Data Definition Language (DDL)
- Creating a Source Table
-
Creating a Result Table
- ClickHouse Result Table
- Kafka Result Table
- Upsert Kafka Result Table
- DIS Result Table
- JDBC Result Table
- GaussDB(DWS) Result Table
- Redis Result Table
- SMN Result Table
- HBase Result Table
- Elasticsearch Result Table
- OpenTSDB Result Table
- User-defined Result Table
- Print Result Table
- File System Result Table
- Creating a Dimension Table
- Data Manipulation Language (DML)
- Functions
-
Historical Versions
-
Flink SQL Syntax
- SQL Syntax Constraints and Definitions
- SQL Syntax Overview of Stream Jobs
- Creating a Source Stream
-
Creating a Sink Stream
- MRS OpenTSDB Sink Stream
- CSS Elasticsearch Sink Stream
- DCS Sink Stream
- DDS Sink Stream
- DIS Sink Stream
- DMS Sink Stream
- DWS Sink Stream (JDBC Mode)
- DWS Sink Stream (OBS-based Dumping)
- MRS HBase Sink Stream
- MRS Kafka Sink Stream
- Open-Source Kafka Sink Stream
- File System Sink Stream (Recommended)
- OBS Sink Stream
- RDS Sink Stream
- SMN Sink Stream
- Creating a Temporary Stream
- Creating a Dimension Table
- Custom Stream Ecosystem
- Data Type
- Built-In Functions
- User-Defined Functions
- Geographical Functions
- SELECT
- Condition Expression
- Window
- JOIN Between Stream Data and Table Data
- Configuring Time Models
- Pattern Matching
- StreamingML
- Reserved Keywords
-
Flink SQL Syntax
-
Identifiers
- aggregate_func
- alias
- attr_expr
- attr_expr_list
- attrs_value_set_expr
- boolean_expression
- col
- col_comment
- col_name
- col_name_list
- condition
- condition_list
- cte_name
- data_type
- db_comment
- db_name
- else_result_expression
- file_format
- file_path
- function_name
- groupby_expression
- having_condition
- input_expression
- join_condition
- non_equi_join_condition
- number
- partition_col_name
- partition_col_value
- partition_specs
- property_name
- property_value
- regex_expression
- result_expression
- select_statement
- separator
- sql_containing_cte_name
- sub_query
- table_comment
- table_name
- table_properties
- table_reference
- when_expression
- where_condition
- window_function
- Operators
-
Spark SQL Syntax Reference
-
User Guide (Kuala Lumpur Region)
- DLI Introduction
- Getting Started
- DLI Console Overview
- SQL Editor
- Job Management
- Queue Management
- Data Management
- Job Templates
- Datasource Connections
- Global Configuration
- UDFs
- Permissions Management
- Change History
-
API Reference (Kuala Lumpur Region)
- Before You Start
- Overview
- Calling APIs
- Getting Started
- Permission-related APIs
- Queue-related APIs (Recommended)
- APIs Related to SQL Jobs
- Package Group-related APIs
-
APIs Related to Flink Jobs
- Granting OBS Permissions to DLI
- Creating a SQL Job
- Updating a SQL Job
- Creating a Flink Jar job
- Updating a Flink Jar Job
- Running Jobs in Batches
- Querying the Job List
- Querying Job Details
- Querying the Job Execution Plan
- Stopping Jobs in Batches
- Deleting a Job
- Deleting Jobs in Batches
- Exporting a Flink Job
- Importing a Flink Job
- APIs Related to Spark jobs
- APIs Related to Flink Job Templates
-
APIs Related to Enhanced Datasource Connections
- Creating an Enhanced Datasource Connection
- Deleting an Enhanced Datasource Connection
- Querying an Enhanced Datasource Connection List
- Querying an Enhanced Datasource Connection
- Binding a Queue
- Unbinding a Queue
- Modifying the Host Information
- Querying Authorization of an Enhanced Datasource Connection
- Global Variable-related APIs
- Permissions Policies and Supported Actions
- Public Parameters
- Change History
-
SQL Syntax Reference (Kuala Lumpur Region)
-
Spark SQL Syntax Reference
- Common Configuration Items of Batch SQL Jobs
- SQL Syntax Overview of Batch Jobs
- Spark Open Source Commands
- Databases
- Creating an OBS Table
- Creating a DLI Table
- Deleting a Table
- Checking Tables
- Modifying a Table
-
Syntax for Partitioning a Table
- Adding Partition Data (Only OBS Tables Supported)
- Renaming a Partition (Only OBS Tables Supported)
- Deleting a Partition
- Deleting Partitions by Specifying Filter Criteria (Only Supported on OBS Tables)
- Altering the Partition Location of a Table (Only OBS Tables Supported)
- Updating Partitioned Table Data (Only OBS Tables Supported)
- Updating Table Metadata with REFRESH TABLE
- Importing Data to the Table
- Inserting Data
- Clearing Data
- Exporting Search Results
- Backing Up and Restoring Data of Multiple Versions
- Table Lifecycle Management
- Creating a Datasource Connection with an HBase Table
- Creating a Datasource Connection with an OpenTSDB Table
- Creating a Datasource Connection with a DWS table
- Creating a Datasource Connection with an RDS Table
- Creating a Datasource Connection with a CSS Table
- Creating a Datasource Connection with a DCS Table
- Creating a Datasource Connection with a DDS Table
- Creating a Datasource Connection with an Oracle Table
- Views
- Checking the Execution Plan
- Data Permissions Management
- Data Types
- User-Defined Functions
-
Built-in Functions
-
Date Functions
- Overview
- add_months
- current_date
- current_timestamp
- date_add
- dateadd
- date_sub
- date_format
- datediff
- datediff1
- datepart
- datetrunc
- day/dayofmonth
- from_unixtime
- from_utc_timestamp
- getdate
- hour
- isdate
- last_day
- lastday
- minute
- month
- months_between
- next_day
- quarter
- second
- to_char
- to_date
- to_date1
- to_utc_timestamp
- trunc
- unix_timestamp
- weekday
- weekofyear
- year
-
String Functions
- Overview
- ascii
- concat
- concat_ws
- char_matchcount
- encode
- find_in_set
- get_json_object
- instr
- instr1
- initcap
- keyvalue
- length
- lengthb
- levenshtein
- locate
- lower/lcase
- lpad
- ltrim
- parse_url
- printf
- regexp_count
- regexp_extract
- replace
- regexp_replace
- regexp_replace1
- regexp_instr
- regexp_substr
- repeat
- reverse
- rpad
- rtrim
- soundex
- space
- substr/substring
- substring_index
- split_part
- translate
- trim
- upper/ucase
- Mathematical Functions
- Aggregate Functions
- Window Functions
- Other Functions
-
Date Functions
- Basic SELECT Statements
- Filtering
- Sorting
- Grouping
- JOIN
- Subquery
- Alias
- Set Operations
- WITH...AS
- CASE...WHEN
- OVER Clause
- Flink OpenSource SQL 1.12 Syntax Reference
-
Flink Opensource SQL 1.10 Syntax Reference
- Constraints and Definitions
- Flink OpenSource SQL 1.10 Syntax
-
Data Definition Language (DDL)
- Creating a Source Table
-
Creating a Result Table
- ClickHouse Result Table
- Kafka Result Table
- Upsert Kafka Result Table
- DIS Result Table
- JDBC Result Table
- GaussDB(DWS) Result Table
- Redis Result Table
- SMN Result Table
- HBase Result Table
- Elasticsearch Result Table
- OpenTSDB Result Table
- User-defined Result Table
- Print Result Table
- File System Result Table
- Creating a Dimension Table
- Data Manipulation Language (DML)
- Functions
-
Historical Versions
-
Flink SQL Syntax
- SQL Syntax Constraints and Definitions
- SQL Syntax Overview of Stream Jobs
- Creating a Source Stream
-
Creating a Sink Stream
- CloudTable HBase Sink Stream
- CloudTable OpenTSDB Sink Stream
- MRS OpenTSDB Sink Stream
- CSS Elasticsearch Sink Stream
- DCS Sink Stream
- DDS Sink Stream
- DIS Sink Stream
- DMS Sink Stream
- DWS Sink Stream (JDBC Mode)
- DWS Sink Stream (OBS-based Dumping)
- MRS HBase Sink Stream
- MRS Kafka Sink Stream
- Open-Source Kafka Sink Stream
- File System Sink Stream (Recommended)
- OBS Sink Stream
- RDS Sink Stream
- SMN Sink Stream
- Creating a Temporary Stream
- Creating a Dimension Table
- Custom Stream Ecosystem
- Data Type
- Built-In Functions
- User-Defined Functions
- Geographical Functions
- SELECT
- Condition Expression
- Window
- JOIN Between Stream Data and Table Data
- Configuring Time Models
- Pattern Matching
- StreamingML
- Reserved Keywords
-
Flink SQL Syntax
-
Identifiers
- aggregate_func
- alias
- attr_expr
- attr_expr_list
- attrs_value_set_expr
- boolean_expression
- col
- col_comment
- col_name
- col_name_list
- condition
- condition_list
- cte_name
- data_type
- db_comment
- db_name
- else_result_expression
- file_format
- file_path
- function_name
- groupby_expression
- having_condition
- input_expression
- join_condition
- non_equi_join_condition
- number
- partition_col_name
- partition_col_value
- partition_specs
- property_name
- property_value
- regex_expression
- result_expression
- select_statement
- separator
- sql_containing_cte_name
- sub_query
- table_comment
- table_name
- table_properties
- table_reference
- when_expression
- where_condition
- window_function
- Operators
-
Spark SQL Syntax Reference
-
User Guide (ME-Abu Dhabi Region)
- Videos
- General Reference
- Scenario
- Solution Overview
- Process
- Solution Advantages
- Resource Planning and Costs
- Example Data
- Step 1: Creating Resources
- Step 2: Obtaining the DMS Connection Address and Creating a Topic
- Step 3: Creating an RDS Database Table
- Step 4: Creating an Enhanced Datasource Connection
- Step 5: Creating and Submitting a Flink Job
- Step 6: Querying the Result
Show all
Copied.
Analyzing Real-time E-Commerce Business Data Using DLI
Scenario
Online shopping is very popular for its convenience and flexibility. e-Commerce platform can be accessed via an array of methods, such as visiting the websites, using shopping apps, and accessing through mini-programs. A large volume of statistics data such as the real-time access volume, number of orders, and number of visitors needs to be collected and analyzed on each e-commerce platform every day. These data needs to be displayed in an intuitive way and updated in real time to help managers learn about data changes in a timely manner and adjust marketing policies accordingly. How can we efficiently and quickly collect statistics based on these metrics?
Assume the order information of each offering is written into Kafka in real time. The information includes the order ID, channel (websites or apps), order creation time, amount, actual payment amount after discount, payment time, user ID, username, and region ID. We need to collect statistics on such information based on metrics of each sales channel in real time, store the statistics in a database, and display the statistics on screens.
Solution Overview
The following figure gives you an overview to user DLI Flink to analyze and process real-time e-commerce business data and sales data of all channels.

Process
To analyze real-time e-commerce data with DLI Flink, perform the following steps:
Step 1: Creating Resources. Create resources required for creating jobs belong to your account, including VPC, DMS, DLI, and RDS.
Step 2: Obtaining the DMS Connection Address and Creating a Topic. Obtain the connection address of the DMS Kafka instance and create a DMS topic.
Step 3: Creating an RDS Database Table. Obtain the private IP address of the RDS DB instance and log in to the instance to create an RDS database and MySQL table.
Step 4: Creating an Enhanced Datasource Connection. Create an enhanced datasource connection for the queue and test the connectivity between the queue and the RDS instance and the queue and the DMS instance, respectively.
Step 5: Creating and Submitting a Flink Job. Create a DLI Flink OpenSource SQL job and run it.
Step 6: Querying the Result. Query the Flink job results and display the results on a screen in DLV.
Solution Advantages
- Cross-source analysis: You can perform association analysis on sales summary data of each channel stored in OBS. There is no need for data migration.
- SQL only: DLI has interconnected with multiple data sources. You can create tables using SQL statements to complete data source mapping.
Resource Planning and Costs
Resource |
Description |
Cost |
---|---|---|
OBS |
You need to create an OBS bucket and upload data to OBS for data analysis using DLI. |
You will be charged for using the following OBS resources:
The actual fee depends on the size of the stored file, the number of user access requests, and the traffic volume. Estimate the fee based on your service requirements. |
DLI |
Before creating a SQL job, you need to purchase a queue. When using queue resources, you are billed based on the CUH of the queue. |
For example, if you purchase a pay-per-use queue, you will be billed based on the number of CUHs used by the queue. Usage is billed by the hour, with any usage less than one hour being rounded up to one hour. The number of hours is calculated on the hour. CUH pay-per-use billing = Unit price x Number of CUs x Number of hours. |
VPC |
You can customize subnets, security groups, network ACLs, and assign EIPs and bandwidths. |
The VPC service is free of charge. EIPs are required if your resources need to access the Internet. EIP supports two billing modes: pay-per-use and yearly/monthly. For details, see VPC Billing. |
DMS Kafka |
Kafka provides premium instances with computing, storage, and exclusive bandwidth resources. |
Kafka supports two billing modes: pay-per-use and yearly/monthly. Billing items include Kafka instances and Kafka disk storage space. For details, see DMS for Kafka Billing. |
RDS MySQL |
RDS for MySQL provides online cloud database services. |
You are billed for RDS DB instances, database storage, and backup storage (optional). For details, see RDS Billing. |
DLV |
DLV adapts to a wide range of on-premise and cloud data sources, and provides diverse visualized components for you to quickly customize your data screens. |
If you use the DLV service, you will be charged for the purchased yearly/monthly DLV package. |
Example Data
- Order details wide table
Field
Data Type
Description
order_id
string
Order ID.
order_channel
string
Order channel (websites or apps)
order_time
string
Time
pay_amount
double
Order amount
real_pay
double
Actual amount paid
pay_time
string
Payment time
user_id
string
User ID
user_name
string
Username
area_id
string
Region ID
- Result table: real-time statistics of the total sales amount in each channel
Field
Data Type
Description
begin_time
varchar(32)
Start time for collecting statistics on metrics
channel_code
varchar(32)
Channel code
channel_name
varchar(32)
Channel
cur_gmv
double
Gross merchandises value (GMV) of the day
cur_order_user_count
bigint
Number of users who settled the payment in the day
cur_order_count
bigint
Number of orders paid on the day
last_pay_time
varchar(32)
Latest settlement time
flink_current_time
varchar(32)
Flink data processing time
Step 1: Creating Resources
Create VPC, DMS, RDS, DLI, and DLV resources listed in Table 2.
Resource |
Description |
Instructions |
---|---|---|
VPC |
A VPC manages network resources on the cloud. The network planning is described as follows:
|
|
DMS Kafka |
In this example, the DMS for Kafka instance is the data source. |
|
RDS MySQL |
In this example, an RDS for MySQL instance provides the cloud database service. |
|
DLI |
DLI provides real-time data analysis. Create a general-purpose queue that uses dedicated resources in yearly/monthly or pay-per-use billing mode. Otherwise, an enhanced network connection cannot be created. |
|
DLV |
DLV displays the result data processed by the DLI queue in real time. |
Step 2: Obtaining the DMS Connection Address and Creating a Topic
- Hover the mouth on the Service List icon and choose Distributed Message Service in Application. The DMS console is displayed. On the DMS for Kafka page, locate the Kafka instance you have created.
Figure 2 Kafka instances
- The instance details page is displayed. Obtain the Instance Address (Private Network) in the Connection pane.
Figure 3 Connection address
- Create a topic and name it trade_order_detail_info.
Figure 4 Creating a topic
Configure the required topic parameters as follows:
- Partitions: Set it to 1.
- Replicas: Set it to 1.
- Aging Time: Set it to 72 hours.
- Synchronous Flushing: Disable this function.
Step 3: Creating an RDS Database Table
- Log in to the console, hover your mouse over the service list icon and choose Relational Database Service in Databases. The RDS console is displayed. On the Instances page, locate the created DB instance and view its floating IP address.
Figure 5 Viewing the floating IP address
- Click More > Log In in the Operation column. On the displayed page, enter the username and password for logging in to the instance and click Test Connection. After Connection is successful is displayed, click Log In.
Figure 6 Logging in to an Instance
- Click Create Database. In the displayed dialog box, enter database name dli-demo. Then, click OK.
Figure 7 Creating a database
- Choose SQL Operation > SQL Query and run the following SQL statement to create a MySQL table for test (Example Data describes the fields):
DROP TABLE `dli-demo`.`trade_channel_collect`; CREATE TABLE `dli-demo`.`trade_channel_collect` ( `begin_time` VARCHAR(32) NOT NULL, `channel_code` VARCHAR(32) NOT NULL, `channel_name` VARCHAR(32) NULL, `cur_gmv` DOUBLE UNSIGNED NULL, `cur_order_user_count` BIGINT UNSIGNED NULL, `cur_order_count` BIGINT UNSIGNED NULL, `last_pay_time` VARCHAR(32) NULL, `flink_current_time` VARCHAR(32) NULL, PRIMARY KEY (`begin_time`, `channel_code`) ) ENGINE = InnoDB DEFAULT CHARACTER SET = utf8mb4 COLLATE = utf8mb4_general_ci COMMENT = 'Real-time statistics on the total sales amount of each channel';
Figure 8 Creating a table
Step 4: Creating an Enhanced Datasource Connection
- On the management console, hover the mouse on the service list icon and choose Analytics > Data Lake Insight. The DLI management console is displayed. Choose Resources > Queue Management to query the created DLI queue.
Figure 9 Queue list
- In the navigation pane of the DLI management console, choose Global Configuration > Service Authorization. On the displayed page, select VPC Administrator, and click Update to grant the DLI user the permission to access VPC resources. The permission is used to create a VPC peering connection.
Figure 10 Updating agency permissions
- Choose Datasource Connections. On the displayed Enhanced tab, click Create. Configure the following parameters, and click OK.
- Connection Name: Enter a name.
- Resource Pool: Select the general-purpose queue you have created.
- VPC: Select the VPC where the Kafka and MySQL instances are.
- Subnet: Select the subnet where the Kafka and MySQL instances are.
Figure 11 Creating an enhanced datasourceThe status of the created datasource connection is Active in the Enhanced tab.
Click the name of the datasource connection. On the details page, the connection status is ACTIVE.
Figure 12 Connection statusFigure 13 Details - Test whether the queue can connect to RDS for MySQL and DMS for Kafka instances, respectively.
- On the Queue Management page, locate the target queue. In the Operation column, click More > Test Address Connectivity.
Figure 14 Testing address connectivity
- Enter the connection address of the DMS for Kafka instance and the private IP address of the RDS for MySQL instance to test the connectivity.
If the test is successful, the DLI queue can connect to the Kafka and MySQL instances.
Figure 15 Testing address connectivityIf the test fails, modify the security group rules of the VPC where the Kafka and MySQL instances are to allow DLI queue access on ports 9092 and 3306. You can obtain the network segment of the queue on its details page.
Figure 16 VPC security group rules
- On the Queue Management page, locate the target queue. In the Operation column, click More > Test Address Connectivity.
Step 5: Creating and Submitting a Flink Job
- In the navigation pane on the left, choose Job Management > Flink Jobs. Click Create Job.
- Type: Select Flink OpenSource SQL.
- Name: Enter a name.
Figure 17 Creating a Flink Job - Click OK. The job editing page is displayed. The following is a simple SQL statement. You need to modify some parameter values based on the RDS and DMS instance information.
--********************************************************************-- -- Data source: trade_order_detail_info (order details wide table) --********************************************************************-- create table trade_order_detail ( order_id string, -- Order ID order_channel string, -- Channel order_time string, -- Order creation time pay_amount double, -- Order amount real_pay double, -- Actual payment amount pay_time string, -- Payment time user_id string, -- User ID user_name string, -- Username area_id string -- Region ID ) with ( "connector.type" = "kafka", "connector.version" = "0.10", "connector.properties.bootstrap.servers" = "xxxx:9092,xxxx:9092,xxxx:9092", -- Kafka connection address "connector.properties.group.id" = "trade_order", -- Kafka groupID "connector.topic" = "trade_order_detail_info", -- Kafka topic "format.type" = "json", "connector.startup-mode" = "latest-offset" ); --********************************************************************-- -- Result table: trade_channel_collect (real-time statistics on the total sales amount of each channel) --********************************************************************-- create table trade_channel_collect( begin_time string, --Start time of statistics collection channel_code string, -- Channel ID channel_name string, -- Channel name cur_gmv double, -- GMV cur_order_user_count bigint, -- Number of payers cur_order_count bigint, -- Number of orders paid on the day last_pay_time string, -- Latest settlement time flink_current_time string, primary key (begin_time, channel_code) not enforced ) with ( "connector.type" = "jdbc", "connector.url" = "jdbc:mysql://xxxx:3306/xxxx", -- MySQL connection address, in JDBC format "connector.table" = "xxxx", -- MySQL table name "connector.driver" = "com.mysql.jdbc.Driver", 'pwd_auth_name'= 'xxxxx', -- Name of the datasource authentication of the password type created on DLI. If datasource authentication is used, you do not need to set the username and password for the job. "connector.write.flush.max-rows" = "1000", "connector.write.flush.interval" = "1s" ); --********************************************************************-- -- Temporary intermediate table --********************************************************************-- create view tmp_order_detail as select * , case when t.order_channel not in ("webShop", "appShop", "miniAppShop") then "other" else t.order_channel end as channel_code -- Redefine channels. Only four enumeration values are available: webShop, appShop, miniAppShop, and other. , case when t.order_channel = "webShop" then _UTF16"Website" when t.order_channel = "appShop" then _UTF16"Shopping App" when t.order_channel = "miniAppShop" then _UTF16" Miniprogram" else _UTF16"Other" end as channel_name -- Channel name from ( select * , row_number() over(partition by order_id order by order_time desc ) as rn -- Remove duplicate order data , concat(substr("2021-03-25 12:03:00", 1, 10), " 00:00:00") as begin_time , concat(substr("2021-03-25 12:03:00", 1, 10), " 23:59:59") as end_time from trade_order_detail where pay_time >= concat(substr("2021-03-25 12:03:00", 1, 10), " 00:00:00") --Obtain the data of the current day. To accelerate running, 2021-03-25 12:03:00 is used to replace cast(LOCALTIMESTAMP as string). and real_pay is not null ) t where t.rn = 1; -- Collect data statistics by channel. insert into trade_channel_collect select begin_time --Start time of statistics collection , channel_code , channel_name , cast(COALESCE(sum(real_pay), 0) as double) as cur_gmv -- GMV , count(distinct user_id) as cur_order_user_count -- Number of payers , count(1) as cur_order_count -- Number of orders paid on the day , max(pay_time) as last_pay_time -- Settlement time , cast(LOCALTIMESTAMP as string) as flink_current_time -- Current time of the flink task from tmp_order_detail where pay_time >= concat(substr("2021-03-25 12:03:00", 1, 10), " 00:00:00") group by begin_time, channel_code, channel_name;
NOTE:
Job logic
- Create a Kafka source table to read consumption data from a specified Kafka topic.
- Create a result table to write result data into MySQL through JDBC.
- Implement the processing logic to collect statistics on each metric.
To simplify the final processing logic, create a view to preprocess the data.
- Use over window condition and filters to remove duplicate data (the top N method is used). In addition, the built-in functions concat and substr are used to set 00:00:00 as the start time and 23:59:59 of the same day as the end time, and to collect statistics on orders paid later than 00:00:00 on the day. (To facilitate data simulation, replace cast(LOCALTIMESTAMP as string) with 2021-03-25 12:03:00.)
- Based on the channels of the order data, the built-in condition function is used to set channel_code and channel_name to obtain the field information in the source table and the values of begin_time, end_time, channel_code, and channel_name.
- Collect statistics on the required metrics, filter the data as required, and write the results to the result table.
- Select the created general-purpose queue and submit the job.
- Wait until the job status changes to Running. Click the job name to view the details.
Figure 19 Job status
- Use the Kafka client to send data to a specified topic to simulate real-time data streaming.
For details, see Connecting to an Instance Without SASL.
Figure 20 Real-time data streaming
- Run the following command:
sh kafka_2.11-2.3.0/bin/kafka-console-producer.sh --broker-list KafKa connection address --topic Topic name
Example data is as follows:
{"order_id":"202103241000000001", "order_channel":"webShop", "order_time":"2021-03-24 10:00:00", "pay_amount":"100.00", "real_pay":"100.00", "pay_time":"2021-03-24 10:02:03", "user_id":"0001", "user_name":"Alice", "area_id":"330106"} {"order_id":"202103241606060001", "order_channel":"appShop", "order_time":"2021-03-24 16:06:06", "pay_amount":"200.00", "real_pay":"180.00", "pay_time":"2021-03-24 16:10:06", "user_id":"0001", "user_name":"Alice", "area_id":"330106"} {"order_id":"202103251202020001", "order_channel":"miniAppShop", "order_time":"2021-03-25 12:02:02", "pay_amount":"60.00", "real_pay":"60.00", "pay_time":"2021-03-25 12:03:00", "user_id":"0002", "user_name":"Bob", "area_id":"330110"} {"order_id":"202103251505050001", "order_channel":"qqShop", "order_time":"2021-03-25 15:05:05", "pay_amount":"500.00", "real_pay":"400.00", "pay_time":"2021-03-25 15:10:00", "user_id":"0003", "user_name":"Cindy", "area_id":"330108"} {"order_id":"202103252020200001", "order_channel":"webShop", "order_time":"2021-03-24 20:20:20", "pay_amount":"600.00", "real_pay":"480.00", "pay_time":"2021-03-25 00:00:00", "user_id":"0004", "user_name":"Daisy", "area_id":"330102"} {"order_id":"202103260808080001", "order_channel":"webShop", "order_time":"2021-03-25 08:08:08", "pay_amount":"300.00", "real_pay":"240.00", "pay_time":"2021-03-25 08:10:00", "user_id":"0004", "user_name":"Daisy", "area_id":"330102"} {"order_id":"202103261313130001", "order_channel":"webShop", "order_time":"2021-03-25 13:13:13", "pay_amount":"100.00", "real_pay":"100.00", "pay_time":"2021-03-25 16:16:16", "user_id":"0004", "user_name":"Daisy", "area_id":"330102"} {"order_id":"202103270606060001", "order_channel":"appShop", "order_time":"2021-03-25 06:06:06", "pay_amount":"50.50", "real_pay":"50.50", "pay_time":"2021-03-25 06:07:00", "user_id":"0001", "user_name":"Alice", "area_id":"330106"} {"order_id":"202103270606060002", "order_channel":"webShop", "order_time":"2021-03-25 06:06:06", "pay_amount":"66.60", "real_pay":"66.60", "pay_time":"2021-03-25 06:07:00", "user_id":"0002", "user_name":"Bob", "area_id":"330110"} {"order_id":"202103270606060003", "order_channel":"miniAppShop", "order_time":"2021-03-25 06:06:06", "pay_amount":"88.80", "real_pay":"88.80", "pay_time":"2021-03-25 06:07:00", "user_id":"0003", "user_name":"Cindy", "area_id":"330108"} {"order_id":"202103270606060004", "order_channel":"webShop", "order_time":"2021-03-25 06:06:06", "pay_amount":"99.90", "real_pay":"99.90", "pay_time":"2021-03-25 06:07:00", "user_id":"0004", "user_name":"Daisy", "area_id":"330102"}
- In the navigation pane on the left, choose Job Management > Flink Jobs, and click the job submitted in 3. On the job details page, view the number of processed data records.
Figure 21 Job details
Step 6: Querying the Result
- Log in to the MySQL instance by referring to 2 and run the following SQL statement to query the result data processed by the Flink job:
SELECT * FROM `dli-demo`.`trade_channel_collect`;
Figure 22 Querying results - Log in to the DLV console, configure a DLV screen, and run SQL statements to query data in the RDS for MySQL instance to display the data on the screen.
For details, see Editing Screens.
Figure 23 DLV screen
Feedback
Was this page helpful?
Provide feedbackThank you very much for your feedback. We will continue working to improve the documentation.See the reply and handling status in My Cloud VOC.
For any further questions, feel free to contact us through the chatbot.
Chatbot