Updated on 2025-04-02 GMT+08:00

Configuring Superset to Connect to DLI for Data Query and Analysis

Superset is an open source platform for data exploration and visualization. It allows for fast and intuitive exploration of data, as well as the creation of rich data visualizations and interactive dashboards.

By connecting Superset to DLI, you can query and analyze data seamlessly. This streamlines the data access process, offers unified data management and analysis capabilities, and empowers you to uncover deeper insights from the data.

This section describes how to configure Superset to connect to DLI.

Preparations

  • Obtaining the DLI driver

    Download the dli-sdk-python driver from the DLI management console.

  • Preparing connection information
    Table 1 Connection information

    | Item | Description | How to Obtain |
    | DLI AK/SK | AK/SK-based authentication uses an AK/SK pair to sign requests for identity authentication. | Obtaining an AK/SK |
    | DLI's endpoint address | Endpoint of a cloud service in a region. | Obtaining an Endpoint |
    | DLI's project ID | Project ID, which is used for resource isolation. | Obtaining a Project ID |
    | DLI's region information | Region where the DLI resources are located. | Regions and Endpoints |
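
Before moving on, it can help to confirm that every item in Table 1 has been collected. The following is a minimal sketch, assuming the values are kept in environment variables rather than hard-coded. The variable names (DLI_AK, DLI_SK, and so on) are hypothetical and are not defined by DLI or Superset.

```python
import os

# Hypothetical environment variable names mapped to the Table 1 items.
# Neither DLI nor Superset defines these names; rename them as needed.
REQUIRED_ITEMS = {
    "DLI_AK": "access key (AK)",
    "DLI_SK": "secret access key (SK)",
    "DLI_ENDPOINT": "endpoint address",
    "DLI_PROJECT_ID": "project ID",
    "DLI_REGION": "region",
}


def missing_connection_items(env=None):
    """Return descriptions of the Table 1 items that are unset or empty."""
    env = os.environ if env is None else env
    return [
        description
        for name, description in REQUIRED_ITEMS.items()
        if not env.get(name, "").strip()
    ]


# Report which items still need to be obtained in the current environment.
missing = missing_connection_items()
if missing:
    print("Missing connection items: " + ", ".join(missing))
else:
    print("All connection items are set.")
```

Passing a plain dictionary instead of relying on os.environ makes the check easy to reuse in scripts that load the values from another source.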

Step 1: Install Superset and the Data Connection Driver

  1. Download and install Superset.

    For details, see Installing Superset.

    The following example installs Superset in Docker:

    1. Install Docker on the current host.
    2. Pull the Superset image for Docker.
      docker pull apache/superset
    3. Start the Superset container.
      docker run -p 8088:8088 apache/superset

      This command starts the Superset container and maps port 8088 of the container to port 8088 of the host machine.

    4. Access Superset.

      In the address box of a browser, visit http://<IP address>:8088 (where <IP address> is the IP address of the host where Superset is deployed) and log in to Superset using the username and password set during Superset installation.

  2. Install and configure the DLI driver in Superset to connect to the database.

    The driver must be placed in Superset's classpath, such as the superset-classpath directory.

    Extract the installation package and install the DLI driver in the Superset client.

    Run python setup.py install to install dli-sdk-python in the local environment.

    Figure 1 Installing the JDBC driver in the Superset client

  3. After the driver is installed and configured, restart the Superset service to ensure the installed driver takes effect.
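
After the restart, one way to confirm that the driver is visible to Superset is to check whether Python can locate it. The following is a minimal sketch under two assumptions: that it is run with the same Python interpreter that Superset uses (for example, inside the Superset container), and that dli-sdk-python installs a top-level module named dli. The module name is an assumption; adjust it if your package exposes a different name.

```python
import importlib.util


def is_module_installed(module_name):
    """Return True if Python can locate a top-level module with this name."""
    return importlib.util.find_spec(module_name) is not None


# "dli" is an assumed module name for dli-sdk-python, not confirmed by this guide.
print("DLI SDK importable:", is_module_installed("dli"))
```

If the check reports False, the SDK was probably installed into a different Python environment from the one Superset runs in.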

Step 2: Configure Superset to Connect to DLI

In Superset, follow the steps below to add a database connection.

  1. Open and log in to Superset.
  2. Choose Settings > Database Connections and click + DATABASE.
    Figure 2 Choosing Settings
    Figure 3 Clicking + DATABASE
  3. In the dialog box that appears, set SUPPORTED DATABASES to DLI.
    Figure 4 Selecting DLI
  4. Configure data connection information.
    • DISPLAY NAME: Enter a data connection name.
    • SQL ALCHEMY URI: Enter the data connection URL.

      Format of a data connection URL:

      dli://<accesskey_id>:<accesskey_secret>@<region_id>/?projectid=<project_id>&queuename=<dli_queue_name>&databasename=<dli_default_database_name>&enginetype=<engine_type>&catalog=<lakeformation_catalog_name>

      Table 2 Parameters for connecting Superset to DLI

      | Parameter | Mandatory | Description | Example Value |
      | accesskey_id and accesskey_secret | Yes | AK/SK pair used for authentication. | - |
      | region_id | Yes | Region name. | ap-southeast-2 |
      | projectid | Yes | ID of the project where the DLI resources are located. | 0b33ea2a7e0010802fe4c009bb05076d |
      | queuename | Yes | DLI queue name. | dli_test |
      | databasename | Yes | Default database to be accessed. | dli |
      | enginetype | No | DLI queue type. The options are spark (Spark queue) and hetuEngine (HetuEngine queue). The default value is spark. | spark |
      | catalog | No | Metadata catalog name. It is mandatory when a LakeFormation catalog is used, in which case it is the name of that catalog, and the catalog must contain a default database. If left unset, the DLI catalog is used by default, so you do not need to set this parameter when using the DLI catalog. | catalog=lfcatalog (for a LakeFormation catalog named lfcatalog) |

      Figure 5 Configuring URL information
  5. After filling in the connection information, click TEST CONNECTION to test whether the data source is successfully connected. If the message Connection looks good! is displayed, the connection is successful.
  6. If the test connection is successful, click CONNECT to establish a data connection with DLI.
  7. Click OK to save the connection.
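
The URL format in step 4 is easy to mistype by hand. The following is a minimal sketch of a helper that assembles it from the parameters in Table 2. The function name build_dli_uri is hypothetical, and "AK" and "SK" are placeholders rather than real credentials. The credentials are percent-encoded on the assumption that the URL is parsed by SQLAlchemy, which decodes the username and password; check this against your driver if your keys contain special characters.

```python
from urllib.parse import quote, urlencode


def build_dli_uri(access_key, secret_key, region_id, project_id,
                  queue_name, database_name, engine_type=None, catalog=None):
    """Assemble a DLI data connection URL in the format shown in step 4."""
    # Mandatory query parameters, in the order used by the documented format.
    params = {
        "projectid": project_id,
        "queuename": queue_name,
        "databasename": database_name,
    }
    # Optional parameters are added only when a value is supplied.
    if engine_type:
        params["enginetype"] = engine_type
    if catalog:
        params["catalog"] = catalog

    # Percent-encode the credentials so characters such as "/" or "@"
    # cannot be confused with URL delimiters (assumes SQLAlchemy-style parsing).
    user = quote(access_key, safe="")
    password = quote(secret_key, safe="")
    return f"dli://{user}:{password}@{region_id}/?{urlencode(params)}"


# Example using the sample values from Table 2; "AK" and "SK" are placeholders.
print(build_dli_uri("AK", "SK", "ap-southeast-2",
                    "0b33ea2a7e0010802fe4c009bb05076d",
                    "dli_test", "dli", engine_type="spark"))
```

The resulting string can be pasted into the SQL ALCHEMY URI field. Leaving engine_type and catalog unset omits them from the URL, so the defaults described in Table 2 apply.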

Step 3: Query and Analyze Data Using Superset

  1. Viewing table information

    On the top menu of Superset, choose Datasets, click + DATASET on the right side of the displayed page, and set DATABASE, SCHEMA, and TABLE to preview the table information.

    Figure 6 Previewing table information in Superset

  2. Creating a dataset

    Click CREATE DATASET AND CREATE CHART in the lower right corner.

    Figure 7 CREATE DATASET AND CREATE CHART

  3. Visualizing and analyzing data

    On the dataset page, select the target table, and configure the chart type and dimensions to display the data analysis chart.

    Figure 8 Visualizing and analyzing data using Superset