Updated on 2025-09-08 GMT+08:00

Use Cases

DLI is applicable to large-scale log analysis, federated analysis of heterogeneous data sources, big data ETL processing, and geographic big data analysis.

Large-scale Log Analysis

  • Gaming operations data analysis

    Different departments of a game company analyze newly generated daily logs on the game data analysis platform to obtain the metrics they need and make decisions based on them. For example, the operations department tracks metrics such as new players, active players, retention rate, churn rate, and payment rate to understand the current state of the game and determine follow-up actions. The placement department identifies the channel sources of new and active players to decide which platforms to use for placement in the next cycle.

  • Advantages
    • Efficient Spark programming model: DLI directly ingests data from DIS and performs preprocessing such as data cleaning. You only need to write the processing logic and do not need to manage the multi-threading model.
    • Ease of use: You can write metric analysis logic in standard SQL without dealing with the complexity of the distributed computing platform.
    • Pay-per-use: Log analysis is scheduled periodically according to timeliness requirements, with long idle periods between consecutive runs. In pay-per-use mode, DLI bills you only for the resources used during each scheduled run, which reduces costs by more than 50% compared with the dedicated queue mode.
  • It is recommended that you use the following related services:

    OBS, DIS, GaussDB(DWS), and RDS

Figure 1 Gaming operations data analysis
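
The operations metrics mentioned above can be illustrated with a minimal Python sketch. It is not DLI code: the log layout (player ID plus login date), the sample records, and the function names are all assumptions made for illustration, and "retention" here means next-day retention of players first seen on a given day.

```python
from datetime import date

# Hypothetical login log: (player_id, login_date). The layout is an assumption.
logins = [
    ("p1", date(2025, 9, 1)),
    ("p2", date(2025, 9, 1)),
    ("p3", date(2025, 9, 1)),
    ("p1", date(2025, 9, 2)),
    ("p3", date(2025, 9, 2)),
    ("p4", date(2025, 9, 2)),
]


def active_players(day):
    """Players with at least one login on the given day."""
    return {pid for pid, d in logins if d == day}


def new_players(day):
    """Players whose earliest login in the log falls on the given day."""
    first_seen = {}
    for pid, d in logins:
        if pid not in first_seen or d < first_seen[pid]:
            first_seen[pid] = d
    return {pid for pid, d in first_seen.items() if d == day}


def next_day_retention(day, next_day):
    """Share of the day's new players who log in again on next_day."""
    cohort = new_players(day)
    if not cohort:
        return 0.0
    return len(cohort & active_players(next_day)) / len(cohort)


day1, day2 = date(2025, 9, 1), date(2025, 9, 2)
print(sorted(new_players(day1)), len(active_players(day2)))
print(round(next_day_retention(day1, day2), 3))
```

In practice the same logic would more likely be written as SQL aggregations over the ingested logs; the sketch only makes the metric definitions concrete.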

Federated Analysis of Heterogeneous Data Sources

  • Digital service transformation of automotive enterprises

    Facing new competitive pressure in the market and the continuous transformation of travel services, automotive enterprises are building IoV cloud platforms and vehicle operating systems to connect Internet applications with in-vehicle use cases and complete their digital transformation. This lets them offer vehicle owners a better smart travel experience, strengthen their competitiveness, and promote sales growth. For example, DLI can be used to collect and analyze daily vehicle indicator data (such as battery, engine, tire pressure, and airbag health status) and provide maintenance suggestions to vehicle owners in a timely manner.

  • Advantages
    • No need for migration in multi-source data analysis: RDS stores the basic information about vehicles and vehicle owners, CloudTable (a table store) holds real-time vehicle location and health status, and GaussDB(DWS) stores periodic metric statistics. DLI allows federated analysis of data from multiple sources without data migration.
    • Tiered data storage: Car companies need to retain all historical data to support auditing and other services that require infrequent data access. Warm and cold data is stored in OBS and frequently accessed data is stored in CloudTable and GaussDB(DWS), reducing the overall storage cost.
    • Rapid and agile alarm triggering: Alarm triggering imposes no special requirements on CPU, memory, hard disk space, or bandwidth.
  • It is recommended that you use the following related services:

    DIS, CDM, OBS, GaussDB(DWS), RDS, and CloudTable

Figure 2 Digital service transformation for car companies
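
The idea of joining data that lives in separate stores, without first copying it into one place, can be sketched with Python's built-in sqlite3 module. This is a stand-in technique, not DLI's datasource connection syntax: two independent in-memory SQLite databases play the roles of RDS and CloudTable, and the table names, column names, sample rows, and the pressure threshold are all hypothetical.

```python
import sqlite3

# The main database stands in for RDS (vehicle and owner master data).
conn = sqlite3.connect(":memory:")
# A second, separate in-memory database stands in for CloudTable (health status).
conn.execute("ATTACH DATABASE ':memory:' AS telemetry")

conn.execute("CREATE TABLE main.vehicles (vin TEXT PRIMARY KEY, owner TEXT)")
conn.execute("CREATE TABLE telemetry.health (vin TEXT, tire_pressure_kpa REAL)")

conn.executemany(
    "INSERT INTO main.vehicles VALUES (?, ?)",
    [("VIN001", "Alice"), ("VIN002", "Bob")],
)
conn.executemany(
    "INSERT INTO telemetry.health VALUES (?, ?)",
    [("VIN001", 230.0), ("VIN002", 170.0)],
)

LOW_PRESSURE_KPA = 180.0  # hypothetical maintenance threshold

# One query spans both databases; no rows are copied between them beforehand.
rows = conn.execute(
    """
    SELECT v.owner, h.vin, h.tire_pressure_kpa
    FROM main.vehicles AS v
    JOIN telemetry.health AS h ON h.vin = v.vin
    WHERE h.tire_pressure_kpa < ?
    ORDER BY h.vin
    """,
    (LOW_PRESSURE_KPA,),
).fetchall()

for owner, vin, kpa in rows:
    print(f"Suggest tire check for {owner} ({vin}): {kpa} kPa")
```

The point of the sketch is the shape of the query: owner data and health data are addressed as separate sources and combined at query time.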

Big Data ETL Processing

  • Carrier big data analysis

    Carriers typically require petabytes, or even exabytes, of storage for both structured data (such as base station details) and unstructured data (such as messages and communications). They need to access this data with extremely low latency, and extracting value from it efficiently is a major challenge. DLI provides multi-mode engines such as batch processing and stream processing to break down data silos and perform unified data analysis.

  • Advantages
    • Big data ETL: DLI provides TB- to EB-scale data governance capabilities so that you can quickly perform ETL processing on massive carrier data. Distributed datasets are provided for batch processing.
    • High throughput and low latency: The Apache Flink dataflow model is used with high-performance compute resources to consume data from user-created Kafka, MRS-Kafka, and DMS-Kafka. A single CU can process 1,000 to 20,000 messages per second.
    • Fine-grained permission management: For example, within company P there are N sub-departments that require both data sharing and isolation among them. DLI enables compute resource isolation per tenant to guarantee job SLAs. It also facilitates data permission controls down to the table/column level, assisting businesses in achieving interdepartmental data sharing and effective permission management.
  • It is recommended that you use the following related services:

    OBS, DIS, and DataArts Studio

Figure 3 Carrier big data analysis
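
The per-CU throughput figure quoted above (1,000 to 20,000 messages per second) can be turned into a rough sizing estimate. The sketch below assumes that throughput scales linearly with the number of CUs, which the text does not state; the function name and the example target rate are likewise illustrative.

```python
import math

# Per-CU throughput bounds quoted in the text (messages per second).
MIN_MSGS_PER_CU = 1_000
MAX_MSGS_PER_CU = 20_000


def cu_range(target_msgs_per_sec):
    """Return (fewest, most) CUs for a target rate, assuming linear scaling."""
    if target_msgs_per_sec <= 0:
        raise ValueError("target rate must be positive")
    fewest = math.ceil(target_msgs_per_sec / MAX_MSGS_PER_CU)  # best case per CU
    most = math.ceil(target_msgs_per_sec / MIN_MSGS_PER_CU)    # worst case per CU
    return fewest, most


fewest, most = cu_range(50_000)
print(f"50,000 msg/s needs between {fewest} and {most} CUs")
```

The wide gap between the two bounds reflects how much per-CU throughput depends on message size and processing logic, so a measured rate for the actual job is a better basis for sizing than either bound.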

Geographic Big Data Analysis

  • Geographic big data analysis

    Geographic big data is typically very large. For example, global satellite remote sensing imagery can reach petabytes. The data is also varied and includes structured remote sensing image grid data, vector data, unstructured spatial location data, and 3D modeling data. Efficient mining tools and methods are therefore essential in this scenario.

  • Advantages
    • Spatial Data Analysis Operators: With full-stack Spark capabilities and rich Spark spatial data analysis algorithm operators, DLI delivers comprehensive support for both real-time processing of dynamic streaming data with location attributes and offline batch processing. DLI can handle massive data, including structured remote sensing image data, unstructured 3D modeling, and laser point cloud data.
    • CEP SQL: DLI delivers geographical location analysis functions to analyze geospatial data in real time. You can implement yaw detection and geo-fencing with SQL statements.
    • Big Data Processing: DLI allows you to quickly migrate remote sensing image data at the TB to EB scale to the cloud and perform image data slicing to offer resilient distributed datasets (RDDs) for distributed batch computing.
  • It is recommended that you use the following related services:

    DIS, CDM, DES, OBS, RDS, and CloudTable

Figure 4 Geographic big data analysis
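
Geo-fencing, mentioned above as something DLI expresses in SQL, comes down to testing whether each position report falls inside a region and reporting the transitions. The Python sketch below is not DLI's CEP SQL: it assumes a circular fence, uses the haversine great-circle distance on a spherical Earth, and its function names, track layout (timestamp, latitude, longitude), and sample coordinates are illustrative.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, spherical approximation


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def inside_fence(lat, lon, center, radius_m):
    """True if the point lies within radius_m of the fence center."""
    return haversine_m(lat, lon, center[0], center[1]) <= radius_m


def fence_events(track, center, radius_m):
    """Emit (timestamp, 'enter' | 'exit') whenever the track crosses the fence."""
    events = []
    was_inside = None  # unknown before the first report
    for ts, lat, lon in track:
        now_inside = inside_fence(lat, lon, center, radius_m)
        if was_inside is not None and now_inside != was_inside:
            events.append((ts, "enter" if now_inside else "exit"))
        was_inside = now_inside
    return events


# Hypothetical track: (timestamp, latitude, longitude) on the equator.
track = [(0, 0.0, 0.0), (1, 0.0, 0.02), (2, 0.0, 0.005)]
print(fence_events(track, center=(0.0, 0.0), radius_m=1_000.0))
```

A streaming engine would evaluate the same predicate per incoming record and keep the previous inside/outside state per vehicle; polygon fences would replace the distance test with a point-in-polygon test.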