Help Center/ Data Lake Insight/ Best Practices/ Connecting BI Tools to DLI for Data Analysis/ Configuring Power BI to Connect to DLI Using Kyuubi for Data Query and Analysis
Updated on 2025-02-26 GMT+08:00

Configuring Power BI to Connect to DLI Using Kyuubi for Data Query and Analysis

Power BI offers features such as data integration, warehousing, reporting, and visualization. It can transform complex data into visual charts and dashboards that are easy to understand and interact with, helping businesses make data-driven decisions.

Kyuubi is a distributed SQL query engine that allows you to access and analyze data using standard SQL APIs.

Connect Power BI to Kyuubi and use Kyuubi to access DLI for data query and analysis. This simplifies the data access process, provides unified data management and analysis capabilities, and gains in-depth data insights.

This section describes how to configure Power BI to connect to DLI using Kyuubi to access and analyze data in DLI.

Procedure

Figure 1 Process

Step 1: Install and Connect Kyuubi to DLI

To access Kyuubi from an external network, ensure that an EIP is bound to the ECS and ports 10009 and 3309 are enabled in the inbound rules of the security group.

  1. Install JDK.

    Install JDK in your development environment before you can install and use Kyuubi.

    Java SDKs require JDK 1.8 or later. To ensure compatibility with future versions, you are advised to use version 1.8.

    1. Download the JDK.

      Download the JDK 1.8 installation package from Oracle official website and install it.

      In this example, jdk-8u261-linux-x64.tar.gz is used.

    2. Upload the JDK package to the corresponding directory of the Linux server and run the extraction command. In this example, the package is uploaded to /usr/local.
      sudo tar -xzf jdk-8u261-linux-x64.tar.gz -C /usr/local/
    3. Configure environment variables.

      Add the following content to the .bashrc or .profile file:

      export JAVA_HOME=/usr/local/jdk-1.8.0_261
      export PATH=$PATH:$JAVA_HOME/bin
    1. Apply the environment variables.
      source ~/.bashrc
    2. Run java -version to check whether the installation is successful. If the version number is displayed in the command output, the Java environment is successfully installed.
      java version "1.8.0_261"
      Java(TM) SE Runtime Environment (build 1.8.0_261-b12)
      Java HotSpot(TM) 64-Bit Server VM (build 25.261-b12, mixed mode)

  2. Install Kyuubi.

    1. Download the Kyuubi installation package. Refer to Deploying Kyuubi to learn more how to install Kyuubi.
    2. Extract the downloaded package.
      tar -xzf kyuubi-{version}-bin.tar.gz
    3. (Optional) Configure environment variables.

      Add the bin directory of Kyuubi to PATH environment variables to ensure that Kyuubi scripts can be called at any location.

  3. Connect Kyuubi to DLI.

    1. Add the DLI driver to the root directory of Kyuubi.

      Place the driver in Kyuubi's root directory /externals/engines/jdbc.

      Ensure that the user group and permissions of the plug-in are the same as those of other JAR files.

    2. Modify the Kyuubi configuration file.

      cd $KYUUBI_HOME/confvi kyuubi-defaults.conf

      Table 1 describes the parameters.

      Table 1 Kyuubi parameters

      Parameter

      Description

      Mandatory

      Example Value

      kyuubi.engine.type

      JDBC service type. Set it to dli.

      Yes

      jdbc

      kyuubi.engine.jdbc.type

      Engine type. Set it to dli.

      Yes

      dli

      kyuubi.engine.jdbc.driver.class

      Name of the driver class used to connect JDBC. Set it to com.huawei.dli.jdbc.DliDriver.

      Yes

      com.huawei.dli.jdbc.DliDriver

      kyuubi.engine.jdbc.connection.url

      URL used to connect JDBC.

      Format: jdbc:dli://{dliendpoint}/{projectId}.

      Yes

      jdbc:dli://{dliendpoint} /{projectId}

      kyuubi.engine.jdbc.session.initialize.sql

      Initialization SQL statements executed when a JDBC session is established.

      No

      select 1

      If select 1 is displayed on the DLI management console, the initialization is successful.

      kyuubi.frontend.protocols

      Frontend protocol supported by Kyuubi. Kyuubi supports various frontend protocols, allowing you to interact with Kyuubi through different APIs.

      Yes

      • mysql
      • thrift_binary

      kyuubi.engine.dli.schema.show.name

      How the Kyuubi engine shows the schema name of data source APIs when you run show schemas or show databases.

      • true: The DLI name is included as the prefix when the schema name is shown.
      • false: The DLI name is not included when the schema name is shown.

      For example, if it is set to true and the DLI name is hive, the output is in the hive.default format when you run show schemas.

      If it is set to false, the output is in the default format.

      No

      • true
      • false

      kyuubi.engine.dli.jdbc.connection.region

      Region name.

      Yes

      regionname=ap-southeast-2

      kyuubi.engine.dli.jdbc.connection.queue

      DLI queue name.

      Yes

      dli_test

      kyuubi.engine.dli.jdbc.connection.database

      Default database name used when the Kyuubi engine connects to the DLI data source through JDBC.

      Yes

      tpch

      kyuubi.engine.dli.jdbc.connection.ak

      AK that acts as the authentication key.

      Set it if authenticationmode is aksk.

      Yes

      accesskey=your-access-key

      kyuubi.engine.dli.jdbc.connection.sk

      Region name.

      Set it if authenticationmode is aksk.

      Yes

      secretkey=your-secret-key

      kyuubi.engine.dli.jdbc.connection.project

      ID of the project where DLI resources are.

      Yes

      0b33ea2a7e0010802fe4c009bb05076d

      kyuubi.engine.dli.sql.limit.time.sec

      Execution duration limit of a SQL query.

      The default value is 600 seconds.

      No

      300

      kyuubi.engine.dli.result.line.num.limit

      Maximum number of data records returned for a SQL query.

      By default, 100,000 data records are returned.

      Setting it to -1 means that the number of returned data records is not limited.

      Yes

      50000

      kyuubi.engine.dli.small.file.merge

      Whether to enable automatic small file merging. The default value is false, indicating that the function is disabled.

      • true: Enable
      • false: Disable

      Yes

      true

      kyuubi.engine.dli.bi.type

      BI tool type.

      The options are fine, grafana, superset, tableau, power, dbt, and yongHong.

      Yes

      fine

      kyuubi.engine.dli.boolean.type.to.int

      Whether DLI's Boolean data is returned as 1/0 or true/false.

      If the BI tool type is grafana, set this parameter to true.

      • true: returned as 1/0 (1: true; 0: false)
      • false: returned as true/false.

      The default value is false.

      No

      false

      kyuubi.engine.dli.set.conf.transform.to.annotation

      Allows you to set set spark in SQL statements.

      Set it to true for Power BI, FineBI, Superset, and DBT.

      No

      true

      kyuubi.engine.dli.set.conf.sql.suffix

      Allows you to set set spark at the end of SQL statements.

      Set it to true for Power BI and DBT.

      No

      true

      kyuubi.engine.dli.result.cache.enable

      Whether to enable database table data caching. Once enabled, database table data is automatically cached. The default value is true.

      • true: Enable
      • false: Disable

      No

      true

      kyuubi.engine.dli.cache.limit.line.num

      Maximum number of data records that can be cached.

      By default, 100,000 data records are cached.

      Setting it to -1 means that the number of cached data records is not limited.

      No

      1000

      kyuubi.engine.dli.cache.time.sec

      Cache time.

      The default value is 1800 seconds.

      No

      1800

      kyuubi.operation.incremental.collect

      Kyuubi pre-loads select result data into the cache to accelerate data reading. You are advised to disable this function to avoid memory OOM issues when dealing with large volumes of data.

      No

      false

      Setting it to false indicates disabling preloading.

      kyuubi.engine.jdbc.memory

      Memory of the JDBC engine process.

      The default value is 1g. You are advised to set it to at least 5g to increase memory for the JDBC engine process.

      No

      5g

    3. Quickly start Kyuubi.

      Go to the root directory /bin of the ECS and run the following command to start Kyuubi:

      cd /bin
      ./kyuubi start  restart

      Once the start is successful, run SQL statements to check whether the connection between Kyuubi and DLI is normal.

  4. (Optional) Configure the hosts file to speed up access to Kyuubi.

    To make access to Kyuubi faster, you are advised to add the Kyuubi host IP address to the /etc/hosts file on the host.

    1. Run ifconfig to check the IP address of the host.
      Figure 2 Checking the host IP address
    2. Add the IP address to the /etc/hosts file.
      Figure 3 Adding the host IP address to the /etc/hosts file

Step 2: Connect ODBC to Kyuubi

  1. Install the ODBC driver.

    Install the ODBC driver on the local host based on the database type. In this example, the Hive database is used.

  2. Connect ODBC to Kyuubi.

    1. In Windows, click Control Panel, double-click Administrative Tools, and double-click Data Sources (ODBC).
    2. Configure an ODBC data source.
      1. Click User DSN.
      2. Click Add.
      3. Select Hive ODBC Driver and click OK.
      Figure 4 Creating an ODBC data source connection
    3. In the dialog box that appears, enter information about the Kyuubi server.
      • Database: Enter the DLI database name.
      • Host(s): Enter the EIP of the Kyuubi server.
      • Port: listening port of Kyuubi. The Hive Thrift protocol is used. The default port is 10009.
      • User Name and Password: Configure the username and password of the Kyuubi server as required.

      Set other advanced options and save the settings.

      Figure 5 ODBC data source connection parameters
    4. Click Test to check whether the data source connection is successful. If successful, click OK to save the connection.

Step 3: Connect Power BI to Kyuubi Using ODBC

  1. Click and install Power BI. Download the Power BI installation package.
  2. Start Power BI Desktop.
  3. On the Home tab, click Get data.
  4. Click More... to see other types of data sources available.
  5. Select ODBC from the list as the data source type and click Connect.
  6. In the ODBC Driver Manager window that appears, select the ODBC data source configured in Step 2: Connect ODBC to Kyuubi and click OK.

    Power BI connects to Kyuubi using ODBC and allows you to preview and select tables and views in the database.

    When previewing database tables, select limit. Otherwise, all partitioned tables are scanned.

Common Operations: Setting SQL Job Parameters

Select a configuration method based on the type of the installed ODBC driver.

  • Use the Cloudera Hive ODBC (v2.5.12) driver.
    You only need to add annotation parameters to the end of SQL statements.
    -- @set Parameter
    Example:
    -- @set dli.sql.current.database=tpch
    -- @set dli.sql.shuffle.partitions=100
  • Use the Microsoft Hive ODBC (v2.6.12.1012) driver.
    1. Make sure that the following parameters are enabled in the /conf/kyuubi-defaults.conf file of Kyuubi:
      kyuubi.engine.dli.set.conf.transform.to.annotation=true
      kyuubi.engine.dli.set.conf.sql.suffix=true
    2. Add annotation parameters to the end of SQL statements.
      set Parameter
      Example:
      set dli.sql.current.database=tpch
      set dli.sql.shuffle.partitions=100

The execution result in the SQL Editor of DLI is as follows:

Figure 6 Viewing the configured parameters in the SQL Editor of DLI