
DLI Security Best Practices

Data Lake Insight (DLI) is a serverless big data service offering real-time, offline, and interactive data analysis. It is fully compatible with the Apache Spark, Apache Flink, and HetuEngine ecosystems, requires no server management, and is ready for immediate use.

This section provides actionable best practices for enhancing DLI security. Use them to continuously evaluate the security posture of your DLI resources and to combine the security features DLI provides, so that the data you store in DLI is protected against leakage and tampering, both at rest and in transit.

To secure your data and workloads on DLI, we recommend that you follow the best practices below:

  • Enhance permission management
  • Back up and restore data
  • Encrypt data at rest
  • Access other services using an agency
  • Enable log auditing
  • Enable SQL Inspector
  • Upgrade compute engines to their latest versions

Enhancing Permission Management

DLI allows you to use IAM to implement fine-grained user permission management and better resource isolation.

With IAM, you can use your Huawei Cloud account to create IAM users for your employees, and assign permissions to the users to control their access to specific resource types. For example, you can create IAM users for some software developers in your organization to allow them to use DLI resources but not to delete resources.
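For example, the following is a minimal sketch of a custom policy that grants job execution and query permissions while explicitly denying deletion. It assumes Huawei Cloud's custom policy structure, and the DLI action names shown are illustrative placeholders; look up the exact actions in the IAM documentation before creating the policy.

```python
import json

# A minimal sketch of a custom IAM policy that lets developers use DLI
# resources but not delete them. The action names below are illustrative
# placeholders, not confirmed DLI actions.
developer_policy = {
    "Version": "1.1",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "dli:queue:submitJob",   # illustrative: run jobs on a queue
                "dli:table:select"       # illustrative: query table data
            ]
        },
        {
            "Effect": "Deny",
            "Action": [
                "dli:*:delete*"          # illustrative: block all delete operations
            ]
        }
    ]
}

print(json.dumps(developer_policy, indent=2))
```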

Table 1 describes the DLI permission types. For details about the resources that can be controlled by DLI, see DLI Resources and Their Paths.

Table 1 DLI permission types

| Permission Type | Subtype | Console Operation | SQL Syntax |
| --- | --- | --- | --- |
| Queue permission | Queue management permission; Queue usage permission | See Queue Permission Management. | None |
| Data permission | Database permission; Table permission; Column permission | See Configuring Database Permissions on the DLI Console and Configuring Table Permissions on the DLI Console. | See Data Permissions List. |
| Job permission | Flink job permission | See Configuring Flink Job Permissions. | None |
| Package permission | Package group permission; Package permission | See Configuring DLI Package Permissions. | None |
| Datasource authentication permission | Datasource authentication permission | See Datasource Authentication Permission Management. | None |

Backing Up and Restoring Data

DLI provides APIs for data import and export. Regularly export critical data to OBS for backup, daily or weekly depending on how often the data changes. If data is lost or damaged, use the import API to restore the backup from OBS to DLI, ensuring data reliability.
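The following sketch shows the shape of such a scheduled backup routine: each run exports a table to a dated OBS prefix so that any day's backup can be restored later. The export_dli_table() helper is a hypothetical placeholder for the actual DLI export API call.

```python
from datetime import date

# Hypothetical placeholder for the DLI data export API call; replace the
# body with a real call from the DLI API reference or SDK.
def export_dli_table(database: str, table: str, obs_path: str) -> None:
    raise NotImplementedError

def backup_table(database: str, table: str, bucket: str) -> str:
    # Write each backup to a dated OBS prefix so earlier backups are never
    # overwritten and a specific day can be restored.
    obs_path = f"obs://{bucket}/backups/{database}/{table}/{date.today():%Y-%m-%d}/"
    export_dli_table(database, table, obs_path)
    return obs_path

# Example: back up one critical table; schedule this script daily or weekly.
# backup_table("sales_db", "orders", "my-backup-bucket")
```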

Encrypting Data at Rest

To enhance user data security, DLI allows you to store data tables using encrypted OBS buckets. It is advisable to use an encrypted OBS bucket when creating a DLI table to store sensitive data.
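For example, a table for sensitive data can point its storage path at an encrypted OBS bucket. The statement below follows DLI's Spark SQL style for OBS tables; verify the exact CREATE TABLE syntax in the DLI SQL reference. Note that encryption itself is enabled on the OBS bucket, not in the SQL.

```python
# A minimal sketch: create a DLI table whose data lives in an OBS bucket
# with server-side encryption enabled. The syntax mirrors DLI's Spark SQL
# style for OBS tables and should be verified against the DLI SQL
# reference; bucket and table names are examples.
create_sensitive_table = """
CREATE TABLE IF NOT EXISTS customer_pii (
    customer_id STRING,
    id_number   STRING
)
USING parquet
OPTIONS (path 'obs://my-encrypted-bucket/warehouse/customer_pii/')
"""

# Encryption is configured on the OBS bucket itself (enable default
# encryption when creating the bucket); DLI then reads and writes only
# encrypted objects under that path.
```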

Accessing Other Services Using an Agency

Cloud services often interact with each other, and some depend on other services. You can create an agency to delegate DLI to use other cloud services and perform resource O&M operations on your behalf. For example, the AK/SK required by a DLI Flink job may be stored in DEW. To allow DLI to access the DEW data during job execution, you need to provide an IAM agency that delegates the permissions to perform operations on DEW data to DLI.

When Flink 1.15, Spark 3.3, or a later version is used to execute jobs, you can add the name of your custom agency to the job configuration. During execution, the job program obtains the custom agency information you configured and uses that agency to access other cloud services. Follow the principle of least privilege: grant custom agencies only the permissions required for the Flink and Spark job operations in question.
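For example, a custom agency is typically attached through a job configuration property. The property names below are assumptions based on DLI's custom agency configuration for Spark 3.3+ and Flink 1.15+; confirm them against the DLI documentation for your engine version. The agency name is an example.

```python
# A minimal sketch of attaching a custom agency to job configuration.
# The property names below are assumptions; confirm them in the DLI
# documentation for your engine version. "dli_dew_readonly_agency" is an
# example agency that should hold only the DEW read permissions the job
# actually needs.
spark_job_conf = {
    "spark.dli.job.agency.name": "dli_dew_readonly_agency",
}

flink_job_conf = {
    "flink.dli.job.agency.name": "dli_dew_readonly_agency",
}
```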

Enabling Log Auditing

Security audit logs record all user operations on data. They can be used to analyze user behavior, generate compliance reports, and trace the root cause of an incident.

More information: Auditing DLI Using CTS
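As an illustration of audit analysis, the following sketch scans exported audit records for delete operations. The field and trace names are assumptions about the general shape of CTS traces; adjust them to the actual export format.

```python
import json

# Example trace names for risky operations; replace with the actual DLI
# trace names recorded by CTS.
RISKY_OPERATIONS = {"deleteQueue", "deleteTable", "deleteDatabase"}

def find_risky_operations(trace_file: str) -> list[dict]:
    # Assumes the export is a JSON array of trace records with a
    # "trace_name" field; adjust to the real export format.
    with open(trace_file, encoding="utf-8") as f:
        traces = json.load(f)
    return [t for t in traces if t.get("trace_name") in RISKY_OPERATIONS]

# Example: review who deleted resources and when.
# for t in find_risky_operations("dli_traces.json"):
#     print(t["time"], t["user"], t["trace_name"])
```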

Enabling SQL Inspector

Big data platforms use SQL for data analysis and processing. SQL makes analysis easier, but it also introduces issues such as uneven quality of input SQL statements, difficulty in locating SQL problems, and excessive resource consumption by large SQL statements. Low-quality SQL statements can have unforeseen impacts on the data analysis platform, degrading system performance or platform stability.

DLI allows you to create inspection rules for the Spark SQL engine. These rules help prevent common issues such as large or low-quality SQL statements by notifying you about or blocking risky statements before execution and by circuit breaking during execution. You do not need to change how you submit SQL statements or their syntax, so the feature is easy to adopt without affecting service operations.

More information: Creating a SQL Inspection Rule
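To illustrate what an inspection rule does, the following sketch implements a simple client-side pre-check that rejects obviously risky statements before submission. This is not DLI's implementation; DLI evaluates its rules inside the Spark SQL engine, and the thresholds here are arbitrary examples.

```python
# Illustration only: a client-side stand-in for the kind of checks a SQL
# inspection rule performs. Thresholds are arbitrary examples.
MAX_SQL_LENGTH = 50_000  # block extremely large statements

def precheck_sql(sql: str) -> None:
    statement = sql.strip()
    if len(statement) > MAX_SQL_LENGTH:
        raise ValueError("SQL statement too large; split it or narrow its scope")
    if statement.lower().startswith("select *") and "where" not in statement.lower():
        # Full scans without filters often consume excessive resources.
        raise ValueError("Unfiltered SELECT * is not allowed; add a WHERE clause")

precheck_sql("SELECT * FROM sales WHERE dt = '2025-04-30'")  # passes
```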

Upgrading Compute Engines to Their Latest Versions

DLI may release new kernel versions for Spark and Flink to fix newly discovered vulnerabilities disclosed in the open-source community. To enhance usability and security, it is advisable to establish a regular version check mechanism and upgrade the engines used for big data jobs to their latest versions.

More information: Upgrading the Engine Version of a DLI Job
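A regular version check can be as simple as comparing each job's engine version against the latest available one. The following sketch assumes a hand-maintained map of latest versions and an example job list; in practice, populate both from the DLI console or API for your region.

```python
# Example values; replace with the latest engine versions available in
# your region and the jobs you actually run.
LATEST_ENGINE_VERSIONS = {"spark": "3.3", "flink": "1.15"}

jobs = [
    {"name": "daily_etl", "engine": "spark", "version": "2.4"},
    {"name": "realtime_alerts", "engine": "flink", "version": "1.15"},
]

# Flag every job still running an outdated engine version.
for job in jobs:
    latest = LATEST_ENGINE_VERSIONS[job["engine"]]
    if job["version"] != latest:
        print(f"{job['name']}: {job['engine']} {job['version']} -> upgrade to {latest}")
```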