Updated on 2025-03-24 GMT+08:00

DLI Datasets

A DLI dataset is a dataset created from data in Data Lake Insight (DLI).

Prerequisites

The data source to be connected has been created. For details, see DLI Data Sources.

Building Data in DLI

  1. Log in to the DLI console.
  2. Create a queue, database, and table. For details, see Using DLI to Submit a SQL Job to Query OBS Data.

    In this example, the tpch database has been created and already contains the required table.

    Figure 1 tpch database

  3. Click the tpch database to access it.
  4. (Optional) Double-click the required table to display its query statement on the right, and then click Settings to add a label.

    Huawei Cloud Astro Canvas uses the label to identify the job and displays the execution result of that job. If no label is set, the execution result of the latest job is displayed by default.
    Figure 2 Setting a label

  5. Click Execute to build the data.

    Figure 3 Build data

  6. In the navigation pane, choose Job Management > SQL Jobs to view the build result.

    Figure 4 Checking the build result

Creating a DLI Dataset

  1. Log in to Huawei Cloud Astro Canvas by referring to Logging In to Huawei Cloud Astro Canvas.
  2. Choose Data Center from the main menu.
  3. Choose Datasets > All in the navigation pane.
  4. On the Dataset Management page, click Create.
  5. Set the dataset name, specify the data type, data source, and folder, and click Save.

    Figure 5 Creating a DLI dataset
    • Dataset Name: A dataset is identified by its name. The name contains 1 to 60 characters, including letters, digits, and underscores (_).
    • Data Type: Select DLI.
    • Data Source: Select the data source created in DLI Data Sources.
    • Folder: Set the folder for storing the dataset. You can select the folder created in (Optional) Creating a Folder or click New Folder.
    • Owner: Creator of the dataset.
    • Description: Description of the new dataset, typically what the dataset is used for.

  6. Configure dataset parameters.

    • Label: Label of an SQL job. Click the expand button next to Label and set the key and value of the label. Use the label that you set in Building Data in DLI.
      • If a label is set, data generated within the last 48 hours is obtained from the SQL job with that label (jobs expire after 48 hours).
      • If no label is set, data is obtained from the most recently executed job by default.
    • Click the refresh data button to preview data.

  7. Click Save. The DLI dataset is created.
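
The naming rule in step 5 and the label behavior in step 6 can be sketched in Python as follows. This is an illustrative model only, not code from Huawei Cloud Astro Canvas or DLI. The function names, the job record fields (id, finished_at, labels), and the sample jobs are hypothetical. The sketch also assumes that a name may contain only ASCII letters, digits, and underscores, that the 48-hour limit applies to unlabeled jobs too, and that the most recent job wins when several jobs carry the same label.

```python
import re
from datetime import datetime, timedelta

# Assumption: only ASCII letters, digits, and underscores are allowed, 1 to 60 characters.
NAME_PATTERN = re.compile(r"[A-Za-z0-9_]{1,60}")

# Jobs expire after 48 hours, so older jobs are never candidates.
JOB_RETENTION = timedelta(hours=48)


def is_valid_dataset_name(name):
    """Return True if the name satisfies the assumed dataset naming rule."""
    return NAME_PATTERN.fullmatch(name) is not None


def pick_job(jobs, now, label=None):
    """Pick the job whose result a dataset would read (hypothetical model).

    jobs:  list of dicts with 'id', 'finished_at' (datetime), 'labels' (dict)
    label: optional (key, value) tuple; None means no label is set
    """
    # Keep only jobs that finished within the retention window.
    candidates = [j for j in jobs if now - j["finished_at"] <= JOB_RETENTION]
    if label is not None:
        key, value = label
        # With a label set, only jobs carrying that key/value pair qualify.
        candidates = [j for j in candidates if j["labels"].get(key) == value]
    if not candidates:
        return None
    # Assumption: the most recently finished candidate is used.
    return max(candidates, key=lambda j: j["finished_at"])


# Hypothetical sample data for illustration.
now = datetime(2025, 3, 24, 12, 0)
jobs = [
    {"id": "job-a", "finished_at": now - timedelta(hours=1), "labels": {}},
    {"id": "job-b", "finished_at": now - timedelta(hours=5), "labels": {"report": "daily"}},
    {"id": "job-c", "finished_at": now - timedelta(hours=60), "labels": {"report": "daily"}},
]

latest = pick_job(jobs, now)                         # no label: latest job
labeled = pick_job(jobs, now, ("report", "daily"))   # label set: matching job
missing = pick_job(jobs, now, ("report", "weekly"))  # no matching job in 48 hours
```

In this model, job-c is ignored in both cases because it finished more than 48 hours ago, which mirrors the note that jobs expire after 48 hours.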